UK Web Focus (Brian Kelly)

Innovation and best practices for the Web

RDFa and WordPress

Posted by Brian Kelly on 5 Apr 2011

RDFa: A Brief Recap

RDFa (Resource Description Framework – in – attributes) is a W3C Recommendation that adds a set of attribute level extensions to XHTML for embedding rich metadata within Web documents.

As described in the Wikipedia entry for RDFa, five “principles of interoperable metadata” are met by RDFa:

  1. Publisher Independence: each site can use its own standards
  2. Data Reuse: data is not duplicated. Separate XML and HTML sections are not required for the same content.
  3. Self Containment: the HTML and the RDF are separated
  4. Schema Modularity: the attributes are reusable
  5. Evolvability: additional fields can be added and XML transforms can extract the semantics of the data from an XHTML file

Additionally RDFa may benefit Web accessibility as more information is available to assistive technology.

But how does one go about evaluating the potential of RDFa? Last year I wrote a post on Experiments With RDFa which was based on manual inclusion of RDFa markup in a Web page. Although this highlighted a number of issues, including the validity of pages containing RDFa, manual markup is not a scalable approach for significant deployment of RDFa. What is needed is a content management system which can be used to deploy RDFa on existing content in order to evaluate its potential and understand deployment issues.

The Potential for WordPress

WordPress as a Blog Platform and a CMS

WordPress provides a blog platform which can be used for large-scale management of blogs which are hosted at wordpress.com. In addition the software is available under an open source licence and can be deployed within an institution. There is increasing interest in use of WordPress within the higher education sector, as can be seen from the recent launch of a WORDPRESS JISCMail list (which is aimed primarily at the UK HE sector). Further examples of interest in use of WordPress are available on the University Web Developers group.

A recent discussion on the WORDPRESS JISCMail list addressed the potential of WordPress as a CMS rather than a blogging platform. Such uses were also outlined recently in a post on the College Web Editor blog which suggested reasons why WordPress can be the right CMS for #highered websites. In light of the growing interest in use of WordPress as a CMS it would seem that this platform could have a role to play in the deployment of new HTML developments such as RDFa.

The wp-RDFa WordPress Plugin

A strength of WordPress is its extensible architecture which allows plugins to be developed by third parties and deployed on local installations of the software. One such development is the wp-RDFa plugin which supports FOAF and Dublin Core metadata. The plugin uses Dublin Core markup to tag posts with the title, creator and date elements. In addition wp-RDFa can be configured to make use of FOAF to “relate your personal information to your blog and to relate other users of your blog to you building up a semantic map of your relationships in the online world”.
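To make this concrete, here is a minimal sketch in Python of the kind of Dublin Core markup a plugin of this sort might generate for a post. It is not wp-RDFa's code (WordPress plugins are written in PHP) and the plugin's real output differs. The function name rdfa_post_header, the example.org URL and the element layout are all assumptions made for illustration.

```python
from html import escape


def rdfa_post_header(url, title, creator, date):
    """Return an HTML fragment describing one post with Dublin Core terms.

    Hypothetical helper for illustration only; not the wp-RDFa plugin's output.
    """
    about = escape(url, quote=True)
    return (
        f'<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="{about}">\n'
        f'  <h3 property="dc:title">{escape(title)}</h3>\n'
        f'  <span property="dc:creator">{escape(creator)}</span>\n'
        f'  <span property="dc:date" content="{escape(date, quote=True)}"></span>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # The URL below is a placeholder, not a real post address.
    print(rdfa_post_header(
        "http://example.org/2011/04/05/rdfa-and-wordpress/",
        "RDFa and WordPress",
        "Brian Kelly",
        "2011-04-05",
    ))
```

The title and creator appear as visible text, so the same strings serve both the reader and the metadata, while the date is carried in a content attribute and need not be displayed.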

Initial Experiments With wp-RDFa

Dublin Core Metadata

UKOLN’s Cultural Heritage blog has been closed recently, with no new posts planned for publication.  The blog will however continue to be hosted and can provide a test bed for experiments such as use of the wp-RDFa plugin.

In an initial experiment we found that although the titles of each blog post were described using Dublin Core metadata, the title was replicated in the blog display. Since this was not acceptable we disabled the use of Dublin Core metadata and repeated the experiment on a private backup copy of the UK Web Focus blog. This time there were no changes in how the blog posts were displayed.

The underlying HTML code made use of the Dublin Core namespace:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">

with each individual blog post containing the title and publication date provided as RDFa:

<h3 class="storytitle">
<span property="dc:date" content="2010-04-27 08:17:53" resource="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/" />
<span rel="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/" property="dc:title" resource="http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/">Workshop on Engagement, Impact, Value</span></a></h3>
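As a rough check on what software might read from the markup above, the following Python sketch pulls out each element carrying a property attribute. It is a simplification and not a conforming RDFa processor. In particular it assumes the resource attribute identifies the post being described, which seems to be the plugin's intent, rather than applying the full RDFa processing rules. The class and function names are my own.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (resource, property, value) tuples from elements that carry
    a `property` attribute. Assumption: `resource` (or `about`) names the post."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._open = None   # (resource, property) waiting for its text value
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property")
        if prop is None:
            return
        subject = a.get("resource") or a.get("about")
        if a.get("content") is not None:
            # The value is given explicitly in the content attribute.
            self.triples.append((subject, prop, a["content"]))
        else:
            # The value is the element's text, collected until the next end tag.
            self._open = (subject, prop)
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None:
            subject, prop = self._open
            self.triples.append((subject, prop, "".join(self._text).strip()))
            self._open = None


def extract(markup):
    parser = PropertyExtractor()
    parser.feed(markup)
    parser.close()
    return parser.triples


POST = "http://blogs.ukoln.ac.uk/xxxxx/2010/04/27/workshop-on-engagement-impact-value/"
SAMPLE = (
    '<h3 class="storytitle">\n'
    f'<span property="dc:date" content="2010-04-27 08:17:53" resource="{POST}" />\n'
    f'<span rel="{POST}" property="dc:title" resource="{POST}">'
    'Workshop on Engagement, Impact, Value</span></a></h3>'
)

if __name__ == "__main__":
    for triple in extract(SAMPLE):
        print(triple)
```

Run against the fragment, this yields one dc:date value and one dc:title value for the post, which matches the description of the title and publication date being provided as RDFa.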

It therefore does appear that the plugin can be deployed on local WordPress installations in order to provide richer semantic markup for existing content. I suspect that the problem with the display in the original experiment may be due to an incompatibility with the theme which is being used (Andreas09). I have reported this problem to the developer of the wp-RDFa plugin.

FOAF (Friends-of-a-Friend)

I had not expected an RDFa plugin to provide support for FOAF, the Friends-of-a-Friend vocabulary.  However since my work with FOAF dates back to at least 2004 I had an interest in seeing how it might be used in the context of a blog.

I had expected that information about the blog authors and commenters would be displayed in some way using an RDFa viewer such as the Firefox Operator plugin. However nothing seemed to be displayed using this plugin. In addition use of the RDFa Viewer and the RDFa Developer plugin also failed to detect FOAF markup embedded as RDFa. I subsequently found that the FOAF information was provided as an external file. Use of the FOAF Explorer service provides a display of the FOAF information which has been created by the plugin.

What surprised me with the initial display of the FOAF content was the list of names which I did not recognise.  It seems that these are authors and contributors to a variety of other blogs hosted on UKOLN’s WordPress MU (multi-user) server. I wonder whether the plugin was written for a previous version of WordPress, for which there was one blog per installation? In any case a decision has been made to provide access to a FOAF resource which contains details of the blog authors only, as illustrated.
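An external FOAF file of this kind can also be inspected without a dedicated viewer. The Python sketch below lists the names found in a small RDF/XML document using only the standard library. The sample document and the names in it are invented for illustration, and I am assuming the common foaf:Person / foaf:name pattern; the file actually produced by the plugin may be structured differently.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and FOAF.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Invented sample data: not taken from the blog's real FOAF file.
SAMPLE_FOAF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Example Author</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Example Commenter</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""


def foaf_names(rdf_xml):
    """Return the foaf:name of every foaf:Person in the document, in order."""
    root = ET.fromstring(rdf_xml)
    return [
        el.text.strip()
        for el in root.iterfind(".//foaf:Person/foaf:name", NS)
        if el.text
    ]


if __name__ == "__main__":
    for name in foaf_names(SAMPLE_FOAF):
        print(name)
```

A listing like this is a quick way to spot unexpected names, such as people who contribute to other blogs on the same multi-user server.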

Emerging Issues

A post on Microformats and RDFa deployment across the Web recently surveyed take-up of RDFa based on an analysis of 12 billion web pages indexed by Yahoo! Search and shows that we are seeing a growth in the take-up of semantic markup in Web pages. As content management systems (such as Drupal 7 which supports RDFa ‘out of the box’ – link updated in light of comment) begin to provide RDFa support we might expect to see a sharp growth in Web pages which provide content which can be processed by software as well as being read by humans. For those institutions which host a local WordPress installation it appears that it is now possible to begin exploring use of RDFa. As described in a post by Mark Birbeck on RDFa and SEO an important role for RDFa will be to provide improvements to searching. But in addition the ability to use wp-RDFa to create FOAF files makes me wonder whether this approach might be useful in describing relationships between contributors to blogs and perhaps provide the hooks to facilitate data-mining of the blogosphere.

It would be a mistake, however, to focus on one single tool for creating RDFa markup.  On the WORDPRESS JISCMail list Pat Lockley  mentioned that he is also developing an RDFa plugin for WordPress and invited feedback on further developments.  Here are some of my thoughts:

  • There is a need for a clear understanding of how the semantic markup will be applied and the use cases it aims to address.
  • There will also be a need to understand how such semantic markup would be used in non-blogging uses of WordPress, where the notions of a blog post, blog author and blog commenters may not apply.
  • There will be a need to ensure that different plugins which create RDFa markup are interoperable, i.e. if a plugin is replaced by an alternative, applications which process the RDFa should give consistent results.
  • Consideration should be given to privacy implications of exposing personal data (in particular) in semantic markup.
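The interoperability point can be tested in a simple way: two plugins are consistent, for a given post, if software reads the same property values from both outputs even though the HTML differs. The Python sketch below illustrates that check. The two markup fragments are invented and do not represent the output of any real plugin, and the extraction is a simplification which ignores subjects and most of the RDFa processing rules.

```python
from html.parser import HTMLParser


class _Props(HTMLParser):
    """Gather (property, value) pairs; a simplification, not full RDFa."""

    def __init__(self):
        super().__init__()
        self.pairs = set()
        self._prop = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "property" not in a:
            return
        if a.get("content") is not None:
            # An explicit content attribute overrides the element's text.
            self.pairs.add((a["property"], a["content"]))
        else:
            self._prop, self._text = a["property"], []

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._prop is not None:
            self.pairs.add((self._prop, "".join(self._text).strip()))
            self._prop = None


def properties(markup):
    parser = _Props()
    parser.feed(markup)
    parser.close()
    return parser.pairs


def consistent(markup_a, markup_b):
    """True if both fragments expose the same set of property values."""
    return properties(markup_a) == properties(markup_b)


# Hypothetical outputs from two different plugins for the same post.
PLUGIN_A = '<h3><span property="dc:title">Workshop on Engagement, Impact, Value</span></h3>'
PLUGIN_B = '<h2 property="dc:title" content="Workshop on Engagement, Impact, Value">Workshop</h2>'

if __name__ == "__main__":
    print(consistent(PLUGIN_A, PLUGIN_B))
```

A fuller comparison would use a proper RDFa parser and compare complete triples, but even a check at this level could be run when one plugin is swapped for another.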

Is anyone making use of RDFa in WordPress who has experiences to share?  And are there any further suggestions which can be provided for those who are involved in related development work?

9 Responses to “RDFa and WordPress”

  1. http://webscience.org/person/2.html

    Have a look at how the triples are embedded in the above page. This approach, using a script tag, does not cock up the HTML. Trying to use the same strings of text for both data and HTML is seen as something elegant.

    it is not.

    What it results in is trying to rejig a document flow to support the RDF. That’s nuts. One is a tree, the other a graph. Stop it right now you silly silly people.

    RDFa is really bloody hard to write. It’s possible for software tools to get it right, but it’s insane to expect normal people editing content to do anything less than make an utter hash of it.

    Much as I think it’s a bad idea, I hate to see it done wrong, so made this tool: Stuff 2 RDF which converts many formats to many types of RDF. One option lets you put in triples and Cut-and-Paste some RDFa. Let me know if there are any bugs, cjg@ecs.soton.ac.uk

    • Hi Chris
      I completely agree with you that RDFa shouldn’t be hand-coded. But this post is about a software tool (wp-RDFa) which automates that process. So I’m not sure of the point you’re trying to make – aren’t we in agreement?

      • I think I was sleepy and ranty and was agreeing with you as an excuse to trot out my old RDFa rant :)

        I think in software it’s only slightly flawed *grin*. But embedding the triples in HTML in some fashion *is* a good idea. Probably.

      • :-) For the sake of completeness I agree that there is an argument that embedding Linked Data within HTML (as RDFa) may have flaws. In the context of this particular post I was exploring ways in which WordPress plugins can expose richer structure for the content. The main focus was the structure described as RDFa but as we saw in the case of use of wp-RDFa to create FOAF data that isn’t the only possible solution – the FOAF file is, in this case, a separate RDF file. I assume this would be your preferred approach?

  2. Steph Gray said

    Interesting Brian – I did use WordPress custom fields and page templates (pre WordPress 3.0) to support RDFa for government consultations, when the BIS site was still on WordPress (no longer).

    Written up here:
    http://www.helpfultechnology.com/helpful-blog/2010/01/adding-rdfa-to-a-consultation/

  3. Pat Lockley said

    I see it as a problem with WordPress themes.

    Using http://codex.wordpress.org/Plugin_API/Filter_Reference I can make lots of stuff RDFa rich and beautiful, but to not break a theme, I might need to use javascript – which will be fine for an end user in a browser, but not for a robot?

    Choices?

  4. Pat Lockley said

    Answering own question – have a robot detector that wrecks the theme – robots don’t have aesthetic eyes after all – poor things.

  5. Lin Clark said

    Just wanted to point out that the link that you have there for Drupal actually points to an issue in the issue queue for Views, a contributed module. While Views does plan to use core’s RDF Mapping API to produce RDFa, that isn’t the RDF support that most people are talking about.

    The issue that was active for RDF support in core is at http://drupal.org/node/493030, and Dries announced the completion of development at DrupalCon SF 2010, http://buytaert.net/semantic-web-and-drupal-video

    Thanks for the interesting post, it’s great to see that people are pushing forward data interoperability in WordPress too.
