RDFa and WordPress
Posted by Brian Kelly (UK Web Focus) on 5 April 2011
RDFa: A Brief Recap
RDFa (Resource Description Framework – in – attributes) is a W3C Recommendation that adds a set of attribute-level extensions to XHTML for embedding rich metadata within Web documents.
As described in the Wikipedia entry for RDFa, five “principles of interoperable metadata” are met by RDFa:
- Publisher Independence: each site can use its own standards
- Data Reuse: data is not duplicated. Separate XML and HTML sections are not required for the same content.
- Self Containment: the HTML and the RDF are separated
- Schema Modularity: the attributes are reusable
- Evolvability: additional fields can be added and XML transforms can extract the semantics of the data from an XHTML file
Additionally RDFa may benefit Web accessibility as more information is available to assistive technology.
But how does one go about evaluating the potential of RDFa? Last year I wrote a post on Experiments With RDFa which was based on manual inclusion of RDFa markup in a Web page. Although this highlighted a number of issues, including the validity of pages containing RDFa, this is not a scalable approach for significant deployment of RDFa. What is needed is a content management system which can be used to deploy RDFa on existing content in order to evaluate its potential and understand deployment issues.
The Potential for WordPress
WordPress as a Blog Platform and a CMS
WordPress provides a blog platform which can be used for large-scale management of blogs which are hosted at wordpress.com. In addition the software is available under an open source licence and can be deployed within an institution. There is increasing interest in use of WordPress within the higher education sector, as can be seen from the recent launch of a WORDPRESS JISCMail list (which is aimed primarily at the UK HE sector), with some further examples of interest in use of WordPress being available on the University Web Developers group.
A recent discussion on the WORDPRESS JISCMail list addressed the potential of WordPress as a CMS rather than a blogging platform. Such uses were also outlined recently in a post on the College Web Editor blog which suggested reasons why WordPress can be the right CMS for #highered websites. In light of the growing interest in use of WordPress as a CMS it would seem that this platform could have a role to play in the deployment of new HTML developments such as RDFa.
The wp-RDFa WordPress Plugin
A strength of WordPress is its extensible architecture, which allows plugins to be developed by third parties and deployed on local installations of the software. One such development is the wp-RDFa plugin which supports FOAF and Dublin Core metadata. The plugin uses Dublin Core markup to tag posts with the title, creator and date elements. In addition wp-RDFa can be configured to make use of FOAF to “relate your personal information to your blog and to relate other users of your blog to you building up a semantic map of your relationships in the online world”.
Initial Experiments With wp-RDFa
Dublin Core Metadata
UKOLN’s Cultural Heritage blog has been closed recently, with no new posts planned for publication. The blog will however continue to be hosted and can provide a test bed for experiments such as use of the wp-RDFa plugin.
In an initial experiment we found that although the title of each blog post was described using Dublin Core metadata, the title was replicated in the blog display. Since this was not acceptable we disabled the use of Dublin Core metadata and repeated the experiment on a private backup copy of the UK Web Focus blog. This time there were no changes in how the blog posts were displayed.
The underlying HTML code made use of the Dublin Core namespace:
with each individual blog post containing the title and publication date provided as RDFa:
<span property="dc:date" content="2010-04-27 08:17:53" resource="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/" />
<span rel="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/" property="dc:title" resource="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/">Workshop on Engagement, Impact, Value</span></a></h3>
It therefore does appear that the plugin can be deployed on local WordPress installations in order to provide richer semantic markup for existing content. I suspect that the problem with the display in the original experiment may be due to an incompatibility with the theme which is being used (Andreas09). I have reported this problem to the developer of the wp-RDFa plugin.
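For anyone wanting to check what such markup exposes without installing a browser plugin, the following is a minimal sketch in Python using only the standard library. It is not a conforming RDFa processor: it simply collects elements carrying a property attribute and, as a simplifying assumption, treats the resource attribute as the thing being described, which mirrors how the markup above appears to be intended. The names PropertyExtractor, extract_properties, POST_URL and SAMPLE_MARKUP are illustrative and do not come from the plugin.

```python
from html.parser import HTMLParser

# Hypothetical sample: the markup shown above, with the same placeholder URL.
POST_URL = "http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/"
SAMPLE_MARKUP = (
    '<span property="dc:date" content="2010-04-27 08:17:53" '
    'resource="' + POST_URL + '" />'
    '<span property="dc:title" resource="' + POST_URL + '">'
    "Workshop on Engagement, Impact, Value</span>"
)


class PropertyExtractor(HTMLParser):
    """Collect (subject, property, value) tuples from elements with a property attribute."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._pending = None  # (subject, property, text parts) awaiting element text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        # Assumption: resource names the described item (not full RDFa semantics).
        subject = attributes.get("resource")
        if "content" in attributes:
            # The value is given explicitly in the content attribute.
            self.triples.append((subject, prop, attributes["content"]))
        else:
            # Otherwise the value is the element's text, gathered until it closes.
            self._pending = (subject, prop, [])

    def handle_data(self, data):
        if self._pending is not None:
            self._pending[2].append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            subject, prop, parts = self._pending
            self.triples.append((subject, prop, "".join(parts).strip()))
            self._pending = None


def extract_properties(markup):
    """Return the list of (subject, property, value) tuples found in the markup."""
    parser = PropertyExtractor()
    parser.feed(markup)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    for triple in extract_properties(SAMPLE_MARKUP):
        print(triple)
```

The sketch does not handle nested elements or resolve the dc: prefix against its namespace declaration, so a proper RDFa parser would be needed for anything beyond a quick sanity check.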
I had not expected an RDFa plugin to provide support for FOAF, the Friends-of-a-Friend vocabulary. However since my work with FOAF dates back to at least 2004 I had an interest in seeing how it might be used in the context of a blog.
I had expected that information about the blog authors and commenters would be displayed in some way using an RDFa viewer such as the Firefox Operator plugin. However nothing seemed to be displayed using this plugin. In addition use of the RDFa Viewer and the RDFa Developer plugin also failed to detect FOAF markup embedded as RDFa. I subsequently found that the FOAF information was provided as an external file. Use of the FOAF Explorer service provides a display of the FOAF information which has been created by the plugin.
What surprised me with the initial display of the FOAF content was the list of names which I did not recognise. It seems that these are authors and contributors to a variety of other blogs hosted on UKOLN’s WordPress MU (multi-user) server. I wonder whether the plugin was written for a previous version of WordPress, for which there was one blog per installation? In any case a decision has been made to provide access to a FOAF resource which contains details of the blog authors only, as illustrated.
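Since the FOAF information lives in a separate file rather than in the page markup, one way to spot unexpected names is to list the people it describes and compare them with the known blog authors. The sketch below assumes the file is RDF/XML containing foaf:Person elements with a foaf:name child; the post does not confirm the serialisation, and the sample data, names and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and FOAF.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Hypothetical FOAF document; the real file produced by the plugin may differ.
SAMPLE_FOAF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Example Author One</foaf:name>
  </foaf:Person>
  <foaf:Person>
    <foaf:name>Example Author Two</foaf:name>
  </foaf:Person>
</rdf:RDF>"""


def list_foaf_names(rdf_xml):
    """Return the foaf:name of every foaf:Person found in the RDF/XML text."""
    root = ET.fromstring(rdf_xml)
    names = []
    for person in root.iter("{%s}Person" % NAMESPACES["foaf"]):
        name = person.findtext("foaf:name", namespaces=NAMESPACES)
        if name:
            names.append(name.strip())
    return names


def unexpected_names(rdf_xml, blog_authors):
    """Return names in the FOAF data that are not in the set of known blog authors."""
    known = set(blog_authors)
    return [name for name in list_foaf_names(rdf_xml) if name not in known]


if __name__ == "__main__":
    print(list_foaf_names(SAMPLE_FOAF))
    print(unexpected_names(SAMPLE_FOAF, ["Example Author One"]))
```

A check of this kind could be run before publishing the FOAF resource, so that only the intended authors are exposed.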
A post on Microformats and RDFa deployment across the Web recently surveyed take-up of RDFa based on an analysis of 12 billion web pages indexed by Yahoo! Search and shows that we are seeing a growth in the take-up of semantic markup in Web pages. As content management systems (such as Drupal 7, which supports RDFa ‘out of the box’ – link updated in light of comment) begin to provide RDFa support we might expect to see a sharp growth in Web pages which provide content which can be processed by software as well as being read by humans. For those institutions which host a local WordPress installation it appears that it is now possible to begin exploring use of RDFa. As described in a post by Mark Birbeck on RDFa and SEO, an important role for RDFa will be to provide improvements to searching. But in addition the ability to use wp-RDFa to create FOAF files makes me wonder whether this approach might be useful in describing relationships between contributors to blogs and perhaps provide the hooks to facilitate data-mining of the blogosphere.
It would be a mistake, however, to focus on one single tool for creating RDFa markup. On the WORDPRESS JISCMail list Pat Lockley mentioned that he is also developing an RDFa plugin for WordPress and invited feedback on further developments. Here are some of my thoughts:
- There is a need for a clear understanding of how the semantic markup will be applied and the use cases it aims to address.
- There will also be a need to understand how such semantic markup would be used in non-blogging uses of WordPress, where the notions of a blog post, blog author and blog commenters may not apply.
- There will be a need to ensure that different plugins which create RDFa markup are interoperable, i.e. if a plugin is replaced by an alternative, applications which process the RDFa should give consistent results.
- Consideration should be given to privacy implications of exposing personal data (in particular) in semantic markup.
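The interoperability point above could be checked in a simple way: extract the statements produced for the same post by two plugins and compare them. The following sketch assumes the statements have already been extracted as (subject, property, value) tuples by some RDFa processor; the function name, the placeholder URL and the sample values are hypothetical.

```python
def compare_triples(old_output, new_output):
    """Report statements lost or gained when one plugin's output replaces another's."""
    old_set, new_set = set(old_output), set(new_output)
    return {
        "missing": sorted(old_set - new_set, key=repr),  # in the old output only
        "added": sorted(new_set - old_set, key=repr),    # in the new output only
    }


if __name__ == "__main__":
    post = "http://example.org/2010/04/27/sample-post/"  # placeholder URL
    plugin_a = [
        (post, "dc:title", "Sample Post"),
        (post, "dc:date", "2010-04-27 08:17:53"),
    ]
    plugin_b = [
        (post, "dc:title", "Sample Post"),
        (post, "dc:date", "2010-04-27T08:17:53"),  # same date, different format
    ]
    print(compare_triples(plugin_a, plugin_b))
```

An empty result for both keys would suggest that a consuming application sees the same data from either plugin; differences such as the date format in the example are the kind of inconsistency that would need to be agreed between plugin developers.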
Is anyone making use of RDFa in WordPress who has experiences to share? And are there any further suggestions which can be provided for those who are involved in related development work?