UK Web Focus

Innovation and best practices for the Web

whois++ and IAFA templates

Posted by Brian Kelly (UK Web Focus) on 10 June 2008

SCA Home Nations Forum

I recently facilitated a series of breakout sessions on Standards at the SCA Home Nations Forums, held in Edinburgh, Belfast and Cardiff. The aim of the sessions was to discuss the approaches being taken to the use of standards by SCA partners in Scotland, Northern Ireland and Wales.

The first event included a plenary talk on “The Standards Dilemma” given by Alastair Dunning, JISC, and I’ve embedded his slides in my blog post.

Alastair’s blog post about the first event, entitled “Digital Standards: Going beyond Stalin”, summarised some of the difficulties which have been experienced in seeking to deploy open standards in digital library development work.

eLib Standards Document

These concerns were reflected in the breakout sessions at the three events. And when I was preparing the breakout session I thought it would be useful to review my involvement in standards work, which dates back to my contribution to the eLib Standards document, published in February 1996.

In that document I was fascinated to discover some of the open standards which we thought would lead to interoperability for eLib projects. The document mentioned the Open Document Architecture (ODA) standard but went on to (correctly) predict that “It is unclear what future there is for the ODA standard” and stated that “It is not recommended for use in the eLib programme”.

Rather than using ODA, the standards document “anticipated that SGML will be a key standard for eLib”. The document “encouraged [projects] to work together to agree or, where necessary, develop document type definitions”. Although SGML was used by a number of projects (such as, I think, the projects which used the TEI DTD), SGML did not have a significant role to play for many of the eLib projects until a simplified version of SGML, XML, became available. The exception to that generalisation was HTML. My contribution to the eLib standards document was to write: “Hypertext Mark-up Language (HTML) is simply a DTD which prescribes formats for presentation and display. Hypertext documents in the World Wide Web are written in HTML. eLib projects will make heavy use of HTML and should use HTML 2 and HTML 3 when it is stable. Netscape and other vendor-specific extensions are deprecated.”

It is in the areas of identifiers, metadata and searching that the recommendations are most interesting. The document (correctly) stated that “eLib projects should be able to supply a URL for public services” – although in retrospect we should have said “a static and stable URL”. The same sentence then went on to say “… and be prepared to adopt URNs when they are stabilised”. The URN (Uniform Resource Name) was envisaged as “a persistent object identifier, assigned by a ‘publisher’ or some authorising agent”. Now, 12 years later, project Web sites still have a URL for their resources, with other approaches to identifiers (such as DOIs) only being used in specialised areas, such as providing identifiers for journal articles or, in projects such as E-Bank, molecules.
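To make the contrast concrete, here is a brief sketch (in Python, with invented values – none of these identifiers belongs to a real eLib resource) of the form each type of identifier takes: a URL locates a resource, a URN names it independently of its location, and a DOI is a name which is made actionable through a resolver service.

# Illustrative identifier forms; all values here are invented for the example.
url = "http://www.example.ac.uk/project/report.html"   # a location: breaks if the page moves
urn = "urn:isbn:0140449132"                            # a URN: names a resource, needs a resolver service
doi = "10.1000/182"                                    # a DOI name, of the kind used for journal articles
actionable_doi = "http://dx.doi.org/" + doi            # the resolvable form of the DOI

for identifier in (url, urn, doi, actionable_doi):
    print(identifier)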

Regarding metadata standards, the document stated:

Relevant standards for resource description: US-MARC, IAFA, TEI headers

although it immediately added the caveat that “This is an area in which there is still much research and development and where it is premature to suggest one preferred approach”.

The document also suggested that the WHOIS++ cross-search protocol could have an important role to play in searching metadata held in IAFA templates. Indeed the eLib-funded ROADS open source software, which underpinned several of the eLib Subject-Based Information Gateways (such as SOSIG and OMNI), was based on this approach.
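For those who never saw one, an IAFA/ROADS template was simply a plain text record of attribute/value pairs describing a resource, which could then be indexed and cross-searched over WHOIS++. The sketch below is mine rather than a real SOSIG or OMNI record – the field names and values are purely illustrative – but it shows the general shape of such a record and how easily it could be processed:

# A minimal sketch of an IAFA/ROADS-style template record and a parser for it.
# Field names and values are illustrative, not taken from a real gateway record.

sample_record = """\
Template-Type: DOCUMENT
Title: Example subject gateway entry
URI: http://www.example.ac.uk/resource/
Description: A short description of the resource.
Keywords: social science, gateway, example
"""

def parse_template(text):
    """Turn 'Attribute: value' lines into a dictionary."""
    record = {}
    for line in text.splitlines():
        if ":" in line:
            attribute, value = line.split(":", 1)
            record[attribute.strip()] = value.strip()
    return record

for attribute, value in parse_template(sample_record).items():
    print(attribute, "=", value)

If I remember correctly, the cross-searching itself relied on each WHOIS++ server publishing a summary (a ‘centroid’) of the terms it held, so that a query could be routed to the gateways likely to hold relevant records.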

Discussion

I feel there is much which can be learnt by reviewing the experiences of digital library programmes such as eLib – indeed eLib projects were themselves expected to be open in reviewing their experiences, both positive and negative. Looking at the standards document with the benefit of 12 years of hindsight we can smile at its naivety. But we should also ask why certain standards which failed to gain acceptance were encouraged in the first place. An answer, perhaps, is to be found in the interests of the contributors to the standards document. Anne Mumford (a former colleague of mine from when I worked at Loughborough University) was actively involved in the development of the CGM (Computer Graphics Metafile) standard, so it’s perhaps not surprising that this standard was included in the document.

What have we learnt since 1996? Do we now ensure that we have more disinterested processes for making recommendations? A recent Tweet from Owen Stephens, relating to a TechWatch report on “Metadata for digital libraries: state of the art and future directions”, suggested that this is not the case: “[I] was surprised how pro-METS [the report] was until I noted ‘Richard Gartner is [...] is a member of the editorial board for the METS’”. Which current exciting new standard will turn out to be tomorrow’s whois++, I wonder?

Posted in standards