whois++ and IAFA templates
Posted by Brian Kelly (UK Web Focus) on 10 June 2008
SCA Home Nations Forum
I recently facilitated a series of breakout sessions on Standards at the SCA Home Nations Forums, held in Edinburgh, Belfast and Cardiff. The aim of the sessions was to discuss the approaches which are being taken to the use of standards by SCA partners in Scotland, Northern Ireland and Wales.
The first event included a plenary talk on “The Standards Dilemma” given by Alastair Dunning, JISC, and I’ve embedded his slides in my blog post.
Alastair’s blog post about the first event, entitled “Digital Standards: Going beyond Stalin”, summarised some of the difficulties which have been experienced in seeking to deploy open standards in digital library development work.
eLib Standards Document
These concerns were reflected in the breakout sessions at the three events. When I was preparing the breakout session I thought it would be useful to review my involvement in standards work, which dates back to my contribution to the eLib Standards document, published in February 1996.
In that document I was fascinated to discover some of the open standards which we thought would lead to interoperability for eLib projects. The document mentioned the Open Document Architecture (ODA) standard but went on to (correctly) predict that “It is unclear what future there is for the ODA standard” and stated that “It is not recommended for use in the eLib programme”.
Rather than using ODA, the standards document “anticipated that SGML will be a key standard for eLib”. The document “encouraged [projects] to work together to agree or, where necessary, develop document type definitions”. Although SGML was used by a number of projects (such as, I think, projects which used the TEI DTD), SGML did not have a significant role to play for many of the eLib projects until a simplified version of SGML, XML, became available. The exception to that generalisation was HTML. My contribution to the eLib standards document was to write: “Hypertext Mark-up Language (HTML) is simply a DTD which prescribes formats for presentation and display. Hypertext documents in the World Wide Web are written in HTML. eLib projects will make heavy use of HTML and should use HTML 2 and HTML 3 when it is stable. Netscape and other vendor-specific extensions are deprecated.”
It is in the areas of identifiers, metadata and searching that the recommendations are most interesting. The document (correctly) stated that “eLib projects should be able to supply a URL for public services” - although in retrospect we should have said “a static and stable URL”. But the sentence then went on to say “… and be prepared to adopt URNs when they are stabilised”. The URN (Uniform Resource Name) was envisaged as “a persistent object identifier, assigned by a ‘publisher’ or some authorising agent”. Today, 12 years later, project Web sites still have a URL for their resources, with other approaches to identifiers (such as DOIs) only being used in specialised areas, such as providing identifiers for journal articles or, in projects such as E-Bank, molecules.
Regarding metadata standards, the document stated:
Relevant standards for resource description: US-MARC, IAFA, TEI headers
although it immediately added the caveat that “This is an area in which there is still much research and development and where it is premature to suggest one preferred approach”.
The document also suggested that the WHOIS++ cross-search protocol could have an important role to play for searching metadata held in the IAFA templates. Indeed the eLib-funded ROADS open source software, which underpinned several of the eLib Subject-Based Information Gateways (such as SOSIG and OMNI), was based on this approach.
Discussion
I feel there is much which can be learnt by reviewing the experiences of digital library programmes such as eLib - indeed eLib projects were themselves expected to be open in reviewing their experiences, both positive and negative. Looking at the standards document with the benefit of 12 years of hindsight we can smile at its naivety. But we should also ask why certain standards, which failed to gain acceptance, were encouraged in the first place. An answer, perhaps, is to be found in the interests of the contributors to the standards document. Anne Mumford (a former colleague of mine when I worked at Loughborough University) was actively involved in the development of the CGM (Computer Graphics Metafile) standard, so it’s perhaps not surprising that this standard was included in the standards document.
What have we learnt since 1996? Do we ensure that we have more disinterested processes for recommendations? A recent Tweet from Owen Stephens, related to a TechWatch report on “Metadata for digital libraries: state of the art and future directions”, suggested that this is not the case: “[I] was suprised how pro-METS [the report] was until I noted ‘Richard Gartner is [...] is a member of the editorial board for the METS’”. Which current exciting new standard will turn out to be tomorrow’s whois++, I wonder?
10 June 2008 at 4:26 pm
Yes, I remember that document. I thought I had a copy, but I don’t (perhaps because it was HTML, not a great format for keeping documents in…). I found an email about version 2 from 1998, by which time XML had been added and CGM seems to be listed in the raster section! I also found an email from Anthony Watkinson wanting to reference the document. I wrote back “Anthony, they are just guidelines for the eLib projects. They have just been updated, and should be updated regularly, but it is very difficult to achieve this. One of the problems is that you can’t define standards for projects which are attempting to explore new territory!”
I think that last point about standards versus what we would now call innovation is a very real concern.
At one point as we went through the series of MODELS workshops, it seemed that every year we had a new “magic bullet” that was going to solve our problems. Whois++ was it one year (and I still mourn the centroid!), Z39.50 came soon after, collection descriptions followed, and so it went on. But you’re right, it’s interesting to speculate which of today’s hot buzz is tomorrow’s cold leftover curry!
10 June 2008 at 4:34 pm
Hi Chris - thanks for the comments. I can recall a meeting at Centre Point in the late 1990s at which I was very excited about the latest W3C initiative - RDF and the Semantic Web. You suggested caution at the time, I remember. I now know you were right! Perhaps standards need to simmer for 5-10 years before they’re ready for widespread adoption?
10 June 2008 at 4:53 pm
Seems like yesterday; as soon as that comment from CR came up on the RSS feed, the thought “Imesh Toolkit!” popped into my head.
This paper on things WHOIS++ was a nice collaborative production between the various ROADS gateways at the time: