Microformats and RDFa: Adding Richer Structure To Your HTML Pages
Posted by Brian Kelly (UK Web Focus) on 25 March 2010
If you visit my presentations page you will see an HTML listing of the various talks I’ve given since I started working at UKOLN in 1996. The image shown below gives a slightly different display from the one you will see, with a number of Firefox plugins providing additional ways of viewing and processing this information.
This page contains microformat information about the events. We first made use of microformats on an event Web site at UKOLN’s IWMW 2006 event, where they were used to mark up the HTML representation of the speakers and workshop facilitators together with the timings for the various sessions. At the event Phil Wilson ran a session on “Exposing yourself on the Web with Microformats!”. There was much interest in the potential of microformats back in 2006, when they were the hot new idea. Since then I have continued to use microformats to provide richer structural information for my events and talks. I’ll now provide a summary of the ways in which the microformats can be used, based on the image shown above.
The Operator sidebar (labelled A in the image) shows the Operator Firefox plugin which “leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services”. The plugin detects various microformats embedded in a Web page and supports various actions – as illustrated, for events the date, time, location and summary of the event can be added to various services such as Google and Yahoo! Calendar.
<p><a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/"> <img src="http://creativecommons.org/images/public/somerights20.gif" alt="Creative Commons License" /></a>This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">Creative Commons License</a>.</p>
The power is in the rel="license" attribute which assigns ‘meaning’ to the hypertext link.
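To illustrate how little work software needs to do to pick up this ‘meaning’, here is a minimal Python sketch that lists the licence links in a page. It uses only the standard library's html.parser module; the class and function names are my own invention for the purposes of illustration, and the sample input is the licence markup shown above.

```python
from html.parser import HTMLParser


class LicenseLinkFinder(HTMLParser):
    """Collect the href of every <a> or <link> whose rel includes 'license'."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel holds a space-separated list of tokens, so split before comparing.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        # Record each licence URL once, even if the page links to it repeatedly.
        if "license" in rel_tokens and href and href not in self.licenses:
            self.licenses.append(href)


def find_licenses(html):
    finder = LicenseLinkFinder()
    finder.feed(html)
    return finder.licenses


SAMPLE = (
    '<p><a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">'
    '<img src="http://creativecommons.org/images/public/somerights20.gif" '
    'alt="Creative Commons License" /></a>This work is licensed under a '
    '<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">'
    "Creative Commons License</a>.</p>"
)

print(find_licenses(SAMPLE))
# ['http://creativecommons.org/licenses/by-nc-sa/2.0/']
```

Both links in the sample point at the same licence, so the URL is reported once.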
The link to my Google Calendar for each of the events (labelled C) is provided by the Google hCalendar Greasemonkey script. Clicking on the Google Calendar icon (which is embedded in the Web page if hCalendar microformatting markup is detected – although I disable this feature if necessary) will allow the details to be added to my Google Calendar without me having to copy and paste the information.
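Detecting hCalendar markup of the kind these tools rely on is not difficult. The following Python sketch is not the code used by Operator or by the Greasemonkey script; it is simply a rough illustration using the standard library. The class names vevent, summary, dtstart and location come from the hCalendar microformat, whilst the parser class and the event details in the sample are placeholders I have made up.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
FIELDS = ("summary", "dtstart", "location")


class HCalendarParser(HTMLParser):
    """Extract a few hCalendar properties from each class="vevent" element."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._event = None       # the event currently being filled in
        self._depth = 0          # nesting depth inside the current vevent
        self._field = None       # property whose text is being collected
        self._field_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self._event is None:
            if "vevent" in classes:
                self._event, self._depth = {}, 1
            return
        self._depth += 1
        if self._field is not None:
            return
        for name in FIELDS:
            if name in classes:
                if tag == "abbr" and attributes.get("title"):
                    # abbr pattern: the machine-readable value is in title.
                    self._event[name] = attributes["title"]
                else:
                    self._field, self._field_depth = name, self._depth
                    self._event[name] = ""
                break

    def handle_data(self, data):
        if self._field is not None:
            self._event[self._field] += data

    def handle_endtag(self, tag):
        if self._event is None or tag in VOID_TAGS:
            return
        if self._field is not None and self._depth == self._field_depth:
            self._event[self._field] = self._event[self._field].strip()
            self._field = None
        self._depth -= 1
        if self._depth == 0:
            self.events.append(self._event)
            self._event = None


# Placeholder event details, invented for this example.
SAMPLE = """
<div class="vevent">
  <span class="summary">Example talk</span>,
  <abbr class="dtstart" title="2010-07-12">12 July 2010</abbr>,
  <span class="location">Example University</span>
</div>
"""

parser = HCalendarParser()
parser.feed(SAMPLE)
print(parser.events)
```

A real tool would need to handle many more properties and patterns than this, but the principle is the same: the class attributes tell the software which piece of text is the date, which is the location and so on.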
The additional icons in the browser status bar (labelled D) appear to be intended for debugging of RDFa – and I haven’t yet found a use for them.
The floating RSS Panel (labelled E) is another Greasemonkey script. In this case the panel does not process microformats or RDFa but autodetectable links to RSS feeds. I’m mentioning it in this blog post in order to provide another example of how richer structure in HTML pages can provide benefits to an end user. In this case it provides a floating panel in which RSS content can be displayed.
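Autodetectable feeds are normally advertised with a link element of the form rel="alternate" type="application/rss+xml" in the head of the page. As a rough sketch of what a tool such as the RSS Panel has to look for (this is not the script's own code, and the feed title and URL below are placeholders), the detection can be written in a few lines of Python:

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect (title, href) for each autodiscoverable RSS feed in a page."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower().strip()
        href = attributes.get("href")
        if "alternate" in rel_tokens and media_type == "application/rss+xml" and href:
            self.feeds.append((attributes.get("title") or "", href))


# Placeholder page: the feed title and URL are made up for this example.
SAMPLE = """
<html><head>
  <title>Example page</title>
  <link rel="stylesheet" type="text/css" href="style.css">
  <link rel="alternate" type="application/rss+xml"
        title="Example feed" href="http://example.org/feed.rss">
</head><body></body></html>
"""

finder = FeedLinkFinder()
finder.feed(SAMPLE)
print(finder.feeds)
# [('Example feed', 'http://example.org/feed.rss')]
```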
RDFa – Beyond Microformats
The approaches I’ve described above date back to 2006, when microformats were the hot new idea. But now there is more interest in technologies such as Linked Data and RDF. Those responsible for managing Web sites with an interest in emerging new ways of enhancing HTML pages are likely to have an interest in RDFa: a means of including RDF in HTML resources.
The RDFa Primer is sub-titled “Bridging the Human and Data Webs”. This sums up nicely what RDFa tries to achieve – it enables Web editors to provide HTML resources for viewing by humans whilst simultaneously providing access to structured data for processing by software. Microformats provided an initial attempt at doing this, as I’ve shown above. RDFa is positioned as providing similar functionality, but coexisting with developments in the Linked Data area.
The RDFa Primer provides some examples which illustrate a number of use cases. My interest is in seeing ways in which RDFa might be used to support Web sites I am involved in building, including this year’s IWMW 2010 Web site.
The first example provided in the primer shows how RDFa can be used to state that a Creative Commons licence applies to a Web page, an approach which I have described previously.
The primer goes on to describe how to provide structured and machine-understandable contact information, this time using the FOAF (Friend of a Friend) vocabulary:
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/"> <p property="foaf:name">Alice Birpemswick</p> <p>Email: <a rel="foaf:mbox" href="mailto:firstname.lastname@example.org">email@example.com</a></p> <p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p> </div>
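To give a feel for what software can recover from markup like this, here is a deliberately simplified Python sketch which reads the typeof, property and rel attributes and expands the foaf: prefix into full URIs. It is not a conformant RDFa processor: it ignores subjects, nesting and most of the RDFa processing rules, and it assumes that an element carrying a property attribute contains only text. The class name is my own, and the sample is a trimmed copy of the snippet above (name and phone only).

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class SimpleRDFaReader(HTMLParser):
    """Collect (predicate, object) pairs from a small subset of RDFa."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}
        self.statements = []
        self._literal_predicate = None   # set while reading a property's text
        self._literal_text = []

    def _expand(self, curie):
        # Turn e.g. "foaf:name" into a full URI using the declared prefixes.
        prefix, _, local = curie.partition(":")
        base = self.prefixes.get(prefix)
        return base + local if base is not None else curie

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Register prefix declarations first so they apply to this element too.
        for name, value in attributes.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if attributes.get("typeof"):
            self.statements.append((RDF_TYPE, self._expand(attributes["typeof"])))
        if attributes.get("rel") and attributes.get("href"):
            self.statements.append((self._expand(attributes["rel"]), attributes["href"]))
        if attributes.get("property"):
            self._literal_predicate = self._expand(attributes["property"])
            self._literal_text = []

    def handle_data(self, data):
        if self._literal_predicate is not None:
            self._literal_text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the literal being read.
        if self._literal_predicate is not None:
            text = "".join(self._literal_text).strip()
            self.statements.append((self._literal_predicate, text))
            self._literal_predicate = None


SAMPLE = (
    '<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">'
    '<p property="foaf:name">Alice Birpemswick</p>'
    '<p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p>'
    "</div>"
)

reader = SimpleRDFaReader()
reader.feed(SAMPLE)
for predicate, obj in reader.statements:
    print(predicate, "->", obj)
```

Running this prints three statements: that the thing described is a foaf:Person, that its name is Alice Birpemswick and that its phone is tel:+1-617-555-7332, each with the FOAF property written out as a full URI.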
In previous years we have marked up contact information for the IWMW event’s programme committee using hCard microformats. We might be in a position now to use RDFa. If we followed the example in the primer we might use RDFa to provide information about the friends of the organisers:
<div xmlns:foaf="http://xmlns.com/foaf/0.1/"> <ul> <li typeof="foaf:Person"> <a rel="foaf:homepage" href="http://example.com/bob/">Bob</a> </li> <li typeof="foaf:Person"> <a rel="foaf:homepage" href="http://example.com/eve/">Eve</a> </li> <li typeof="foaf:Person"> <a rel="foaf:homepage" href="http://example.com/menu/">Menu</a> </li> </ul></div>
However this would not be appropriate for an event. What would be useful would be to provide information on the host institutions of the speakers and workshop facilitators. In previous years such information has been provided in HTML, with no formal structure which would allow automated tools to process such institutional information. If RDFa were used to provide such information for the 13 years since the event was first launched, an automated tool could process the event Web sites and provide various reports on the affiliations of the speakers. We might then have a mechanism for answering the query “Which institution has provided the highest number of (different) speakers or facilitators at IWMW events?”. I can remember that Phil Wilson, Andrew Male and Alison Kerwin (née Wildish) from the University of Bath have spoken at events, but who else? And what about the universities with which I am unfamiliar? This query could be answered if the data was stored in a backend database, but as the information is publicly available on the Web site, might not using slightly more structured content on the Web site be a better approach?
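Once the affiliations had been harvested from the pages, by whatever means, answering the query itself would be straightforward. As a sketch, assuming the harvested data arrives as (event, speaker, institution) records, the count of different speakers per institution could be produced as follows. The event labels and the “Example University” entries are placeholders, and the function name is my own.

```python
from collections import defaultdict


def speakers_by_institution(records):
    """Return (institution, number of different speakers), highest count first."""
    speakers = defaultdict(set)
    for _event, speaker, institution in records:
        # A set ensures somebody who speaks at several events is counted once.
        speakers[institution].add(speaker)
    counts = ((institution, len(names)) for institution, names in speakers.items())
    # Sort by descending count, then alphabetically to make ties predictable.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Placeholder records: the event labels and "Example University" are invented.
RECORDS = [
    ("event-1", "Phil Wilson", "University of Bath"),
    ("event-2", "Phil Wilson", "University of Bath"),
    ("event-2", "Andrew Male", "University of Bath"),
    ("event-3", "Alison Kerwin", "University of Bath"),
    ("event-1", "Speaker A", "Example University"),
    ("event-3", "Speaker B", "Example University"),
]

print(speakers_by_institution(RECORDS))
# [('University of Bath', 3), ('Example University', 2)]
```

The hard part is therefore not the query but getting consistent, machine-readable affiliation data out of the pages in the first place.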
When we first started making use of microformats I envisaged that significant numbers of users would be using various tools on the browser to process such information. However I don’t think this is the case (and I would like to hear from anybody who does make regular use of such tools). I have to admit that although I have been providing microformats for my event information, I have not consumed microformats provided by others (and this includes the microformats provided on the events page on the JISC Web site).
This isn’t, however, necessarily an argument that microformats – or RDFa – might not be useful. It may be that the prime use of such information is by server-side tools which harvest such information from a variety of sources. In May 2009, for example, Google announced that Google Search Now Supports Microformats and Adds “Rich Snippets” to Search Results. Yahoo!’s SearchMonkey service also claims to support structured search queries.
But before investing time and energy into using RDFa across an event Web site the Web manager will need answers to the questions:
- What benefits can this provide? I’ve given one use case, but I’d be interested in hearing more.
- What vocabularies do we need to use and how should the data be described? The RDFa Primer provides some examples, but I am unsure as to how to use RDFa to state that, for example, Brian Kelly is based at the University of Bath, to enable structured searches of all speakers from the University of Bath.
- What tools are available which can process the RDFa which we may choose to create?
Anyone have answers to these questions?