UK Web Focus

Innovation and best practices for the Web

Archive for January, 2013

Evolving Rules of Grammar

Posted by Brian Kelly on 31 January 2013

Is “Why every researcher should sign up for their ORCID ID” Grammatically Incorrect?

Yesterday a post of mine entitled “Why every researcher should sign up for their ORCID ID” was republished on the LSE Impact of Social Sciences blog. The announcement made by @lseimpactblog was subsequently widely retweeted, as illustrated.

It was subsequently pointed out that the sentence contained a grammatical error: “every researcher” is singular and therefore shouldn’t be followed by the plural form of the pronoun: “their ORCID ID“. Coincidentally, yesterday I came across a tweet which linked to the announcement that “The [University of Washington] Daily adopts gender-neutral pronoun“. I responded to the tweet questioning whether this was a wise decision:

Univ of WA adopts gender neutral pronouns – they as singular pronoun: ow.ly/hgwD6 Surely a thumbs down?

In response it seems that several people were in agreement with the decision taken at the University of Washington that “The Daily will join the efforts of these organizations by implementing gender-neutral language, using “they” as a singular pronoun when applicable“. I received several responses shortly after publishing my tweet:

  • It was good enough for Jane Austen! :)
  • Meh. It’s been around since at least 1595 – better than ubiquitous ‘he’, generally less clumsy than ‘he/she’, so why not?
  • I use ‘they’ as a gender neutral pronoun. Better than s/he surely?
  • why? I can live with it for the sake of less gendered conversations (and have been doing it for years anyway)

However one person made the point that:

  • I really HATE the use of “they” as a singular pronoun!

I would agree with the view that “Why every researcher should sign up for his/her ORCID ID” is ugly. I also feel that “Why every researcher should sign up for his ORCID ID” is sexist, and that “Why every researcher should sign up for her ORCID ID” seeks to make a political point which, although I might be sympathetic towards it, will distract from the purpose of the sentence.

In light of the comments and subsequent discussion on Twitter this morning I now agree that this construct is acceptable. However, as a comment made on the English StackExchange forum put it:

It’s not ungrammatical per se on the basis of analysis of actual usage using reasonable linguistic methods. But use it at your own risk of being criticized by the self-righteous but misinformed.

The question seems no longer to be one of understanding correct and incorrect language use but of one’s willingness to potentially alienate the “self-righteous but misinformed“. And before anyone suggests that there is no such thing as incorrect language use, I’ll highlight a tweet I saw this morning which provided an ironic perspective on language misuse:

Somewhere, someone who writes “should of” instead of “should have” gets paid more than me.

The particular example discussed in this post clearly has ‘political’ connotations, as one form which was popular in the past makes 50% of the population invisible (it was interesting to observe, by the way, that 4 of the 5 initial responses were from women). It would be possible to sidestep such controversy by restructuring the sentence, e.g. “Why all researchers should sign up for an ORCID ID” or “Why all researchers should sign up for their ORCID ID“. But what about the more general question regarding changing rules of grammar?

“Data Is” or “Data Are”?

As recorded in a Storify summary of the subsequent Twitter discussion, last year a reviewer of a paper which asked “Can Linkedin and Academia.edu Enhance Access to Open Repositories?” commented that:

the word ‘data’ is still a plural noun, no matter how many times people may erroneously use it in the singular

My co-author Jenny Delasalle and I disagreed, and the paper was published containing the sentence:

As described by Delasalle [8] the data for Academia.edu was obtained by entering the institution’s name in the search box; the number of entries were then displayed

But what if reviewers or editors insist that text must conform with specific house rules in order for a submitted article to be published? Should one’s approach to writing and grammar be based on one’s own views on what is appropriate or on what may be appropriate for the readers? And if the latter, whose opinions should one prioritise: the editors and reviewers or general readers?

It seems to me that it can be helpful to gauge opinion on such matters. I have therefore set up two surveys to solicit views on whether the following grammatical constructs are felt to be appropriate in scholarly works: (a) “Anyone who loves the English language should have a copy of this book in their bookcase” and (b) “The data was obtained from an online survey“.

I invite responses to the surveys and comments on this topic.


View Twitter conversation from: [Topsy] | View Twitter statistics from: [TweetReach] – [Bit.ly]

Posted in General | 8 Comments »

Jisc Report on Sustaining Our Digital Future: Institutional Strategies for Digital Content

Posted by Brian Kelly on 30 January 2013

Earlier today the Jisc announced the launch of a report on Sustaining Our Digital Future: Institutional Strategies for Digital Content.

This report, which provides a close look at three institutions in the United Kingdom (UCL, Imperial War Museums and the National Library of Wales), confirms:

  • How fragmented the digital landscape is at universities and within other organisations.
  • How there are examples of good practice within and outside higher education that all can learn from but that greater co-ordination is required to deliver this at a UK level.
  • How little the topic of post-build sustainability comes up at the higher levels of administration.
  • How risk is present within the current system, concerning the sustainability of digital content.

The report (which is available in PDF format) is substantial, containing 88 pages. In addition to this main report a second document (also available in PDF format) provides a “Sustainability Health Check Tool for Digital Content Projects“.

The report is very timely, arriving as we are seeing reductions in the levels of funding available across public sector organisations in the UK, which will lead to questions regarding the sustainability of existing online services and digital resources.

The report is based on a study conducted by Ithaka S+R, with funding from the Jisc-led Strategic Content Alliance, which reported on findings of earlier studies showing that both funders and project leaders rely heavily on their host institutions to support and sustain digital content, beyond the end of the grant. But what will happen when the host institutions have significantly reduced levels of funding to continue to maintain and develop such content?

The report describes the need for an “early and honest appraisal of which projects are likely to require .. support post-launch“:

  • Digital content, requiring just “maintenance”: These may not require ongoing growth, but certainly do require a clear exit plan to ensure that the content will be smoothly deposited and integrated into some other site, database, or repository. The issue of ongoing investment does not disappear; it just becomes the concern of the larger platform on which this piece of content now lives.
  • Digital resources, requiring ongoing growth and investment: These require early sustainability planning, including identifying institutional or other partners and careful consideration of the full range of costs and activities needed to keep the resource vibrant.

The Sustainability Health Check Tool provides a paper-based checklist for those with responsibilities for managing digital content. The tool covers a number of areas including ongoing support; audience, usage and impact assessment; and preservation issues.

A series of video clips have been produced to accompany the launch of this report. It was particularly interesting to hear the comment from Prof David Price, Vice-Provost (Research) at UCL:

We’re not just worried about things disappearing but about things never appearing! They are hosted all over the place, and not all the projects have a sustainable plan.

This video clip is available on YouTube and embedded below.



Posted in preservation | Tagged: | 1 Comment »

Why I’m Now Embedding ORCID Metadata in PDFs

Posted by Brian Kelly on 28 January 2013

“Every PDF needs a title”

The day after publishing a post on Reflections on the Discussion on the Quality of Embedded Metadata in PDFs I received a tweet from @community which alerted me to a blog post on SEO Action for PDF files on the Adobe blog. The post describes an extension for use in Acrobat X Pro which automates the setting of the properties of the PDF file in accordance with guidelines which can enhance the discoverability of PDF files by Google. The guidelines, which had been published back in August 2009, were based on experiments which demonstrated improvements in Google’s indexing of PDF files. The article’s main conclusion was that “Every PDF needs a title“:

In terms of PDF files, the blue underlined text in Google’s search results comes from one of two places. First, Google looks in the “Title” document information field. If it finds nothing, Google’s indexer tries to guess the document’s title by scanning the text on the first few pages. This usually doesn’t work, producing incorrect and improperly formatted results.

In addition to this advice, the article also suggested use of other metadata fields including author, subjects and keywords.

Metadata For Peer-Reviewed Papers

Although I ensure that I provide the correct title for my peer-reviewed papers when I create them in MS Word, I was unsure whether I had included the names of the co-authors or made use of other metadata fields.

On Friday 25 January 2013 I decided to update the metadata for one of my papers, “Developing A Holistic Approach For E-Learning Accessibility”, which was the first paper that Lawrie Phipps, Elaine Swift and I wrote, back in 2004.

I added a number of tags to the paper and used the Comments field to provide the abstract. In addition the publication details were added to the Status field.

Whilst updating the metadata it occurred to me that it would be useful to include the ORCID IDs for the authors, as these will be less volatile than the authors’ email addresses (one of the co-authors was based at the University of Bath when the paper was published but subsequently moved to Nottingham Trent University).

In addition to the resource discovery metadata for the paper I also remembered that I should ensure that images in the paper contained appropriate alt text so that image descriptions are available to those who may make use of a screen reader. Fortunately we had done this for the paper, but I have to admit that this isn’t necessarily done for all of my research papers.

Having updated the metadata for the paper and embedded images I then created the PDF from MS Word. I noticed that the Save As PDF option in MS Word enabled a number of options to be specified, including Save As ISO-19005 (PDF/A).

As described in Wikipedia, PDF/A is “an ISO-standardized version of the Portable Document Format (PDF) specialized for the digital preservation of electronic documents“. The article goes on to explain that “PDF/A differs from PDF by omitting features ill-suited to long-term archiving, such as font linking (as opposed to font embedding)“.

Since the digital preservation of peer-reviewed publications is important I ensured that I saved the paper in PDF/A format, using the Save As option illustrated.

Approaches to Metadata Embedded in PDFs

What practices should be used when providing metadata in the original authoring tool (MS Word, in my case) so that it will then be available in the PDF version of the paper? Here’s a summary of the approaches I have used:

Title: The title of the paper

Tags: My preferred tags about the content and my organisation.

Comments: The abstract of the paper, normally taken from the abstract provided in the paper.

Author: First Name Surname (ORCID: ORCID ID) e.g. Brian Kelly (ORCID: 0000-0001-5875-8744)

The Title field is obvious. The tags reflect keywords which I feel will enhance access to the document (I choose fewer than five). I am using the Comments field to host the abstract for the paper. Finally, the Author field contains the full name followed by ORCID: ORCID ID in brackets. I feel that this is a pragmatic approach to ensuring that the significant information which will be indexed by Google is found in the metadata fields which are available through my authoring tool (MS Word).
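As an illustration of why I feel the risk of confusing indexing tools is manageable, the Author-field convention is regular enough that a tool could separate the name from the ORCID ID with a simple pattern, and could even verify the ID’s ISO 7064 MOD 11-2 check digit. A sketch (the function names are my own invention, not any existing tool’s API):

```python
import re

# Matches the "First Name Surname (ORCID: ORCID ID)" convention.
AUTHOR_RE = re.compile(r"^(?P<name>.+?)\s*\(ORCID:\s*(?P<orcid>[0-9X-]+)\)$")

def orcid_check_digit(base_digits):
    """ISO 7064 MOD 11-2 check digit for the first 15 digits of an ORCID ID."""
    total = 0
    for d in base_digits:
        total = (total + int(d)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def parse_author(field):
    """Split an Author field into (name, orcid, check_digit_valid), or None."""
    m = AUTHOR_RE.match(field)
    if m is None:
        return None
    name, orcid = m.group("name"), m.group("orcid")
    digits = orcid.replace("-", "")
    valid = len(digits) == 16 and orcid_check_digit(digits[:15]) == digits[15]
    return name, orcid, valid

print(parse_author("Brian Kelly (ORCID: 0000-0001-5875-8744)"))
# → ('Brian Kelly', '0000-0001-5875-8744', True)
```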

But could this cause problems? Might Google think my name is Mr Orcid or Mr 0000-0001-5875-8744? Might other indexing and aggregation tools have problems because I am misusing the semantics of these metadata fields? My feeling is that Google will be capable of understanding the content and that it is better to have such quality metadata (which I have chosen) than no metadata. But are other researchers embedding ORCID IDs in PDFs? In order to answer this question I used Google’s advanced search capability to search for “ORCID” in PDF resources across a number of domains, as summarised below.

Domain       Results   Date         Current Results
All          3,840     28 Jan 2013  Try it
.ac.uk         109     28 Jan 2013  Try it
bath.ac.uk       0     28 Jan 2013  Try it

These numbers are low – and when you realise that the results include PDFs which merely contain the string “ORCID” in the text of the pages (as illustrated), it seems clear that there is little evidence that ORCID IDs are being embedded in PDFs yet.
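For anyone wishing to repeat the searches in the table, the query strings are straightforward to construct: the quoted term restricted to PDF files with Google’s filetype: operator, optionally limited to a domain with the site: operator. A small sketch:

```python
def orcid_pdf_query(domain=None):
    """Build a Google advanced-search query for "ORCID" in PDF files."""
    query = '"ORCID" filetype:pdf'
    if domain:
        query += " site:" + domain
    return query

for domain in (None, "ac.uk", "bath.ac.uk"):
    print(orcid_pdf_query(domain))
# → "ORCID" filetype:pdf
# → "ORCID" filetype:pdf site:ac.uk
# → "ORCID" filetype:pdf site:bath.ac.uk
```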

So before I embed ORCID IDs in my other papers I would welcome feedback on this proposal. Is it desirable to include the ORCID IDs of authors in the PDF versions of papers? If so, is the approach I have taken to be recommended to others? Or might it be desirable to provide richer structured metadata in PDF files, using the XMP (Extensible Metadata Platform) standard? But if this is felt to be desirable, how would it fit into the workflow, given that it appears difficult to persuade authors to provide metadata for their papers in any case?
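To give a flavour of what richer structured metadata might look like, here is a minimal (and incomplete) sketch of an XMP packet expressing the title and author as Dublin Core properties. Whether an ORCID ID belongs inside dc:creator, as here, or in a dedicated property is precisely the sort of question on which I would welcome feedback:

```python
import xml.etree.ElementTree as ET

xmp = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Developing A Holistic Approach For E-Learning Accessibility</rdf:li>
        </rdf:Alt>
      </dc:title>
      <dc:creator>
        <rdf:Seq>
          <rdf:li>Brian Kelly (ORCID: 0000-0001-5875-8744)</rdf:li>
        </rdf:Seq>
      </dc:creator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

# Confirm the packet is at least well-formed XML.
root = ET.fromstring(xmp)
print(root.tag)  # → {adobe:ns:meta/}xmpmeta
```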



Posted in Identifiers, Repositories | Tagged: | 6 Comments »

Twitter Announces Vine. But How Could Higher Education Use 6-second long Videos?

Posted by Brian Kelly on 25 January 2013

Sharing Brief Video Clips on Twitter

Yesterday Twitter announced Vine: A new way to share video. As described in a TechCrunch article, “[Vine] integrates with Twitter in the same way that Instagram does, except that Vine never turned off permissions randomly, meaning that Vine videos can be embedded directly in tweets, showing up in followers’ streams“. An article in the Guardian explains how “Vine clips automatically play when embedded in tweets, although their sound is turned off by default. The clips also play within Twitter’s official mobile app. Users can add locations to their clips – the app draws on Foursquare’s places database for that – with three options for sharing: Vine, Twitter and/or Facebook.” The Guardian article instantly attracted comments on how Vine might be (mis)used:

  • Sexting app
  • Advert app
  • oh no it’s the video equivalent of gifs, twitter is gonna become as annoying as tumblr is with these.

although others provided more thoughtful responses:

As with everything, it’s all about how you leverage the technology. 
Yes, for the most part, this app will feature videos of no importance whatsoever, but there will, as always, be some gems in the dirt.

Leaving that aside, you have to remember that with Twitter, many people end up forming a close circle of people they meet physically in the real world – so Twitter augments that. 
I don’t give a damn about someone I’ve never mets photo of their dog on twitter, but I do care if a friend of mine posts a picture of their dog.

The same applies to tweeting – to most people, the “Did xyz run in xyz area this morning, totally knackered” is completely meaningless and banal. But to this persons friends, it’s likely to promote conversation when they next meet. “Saw your tweet Dave, how was the run down at xyz? Did a run there recently” …

So, before you instantly dismiss tech such as this, perhaps give it a *little* more thought?

I would agree that we should give a little more thought to the implications of new technologies, especially their potential in higher education.

Initial Experiments

Earlier today I installed the Vine app on my iPod Touch and recorded a number of video clips. In my first clip I asked what could be said in 6 seconds (partly to get a feel for such a brief period). In my second video clip I said “E=MC2 and DNA is a double helix” to illustrate how important scientific concepts could be described using the Vine app. By then I had gained some familiarity with the app. In my third post I described what I liked about the app: being able to stop and start reshooting by simply removing my finger from the screen. My fourth post described what I didn’t like – the lack of support for the iPod Touch’s forward-facing camera.

I then started to write this post – and discovered that I couldn’t find the URL for the video clips I had created and uploaded to Vine. I can view the videos using the Vine app and people who follow me on Vine will see the videos in their Vine timeline but it seems as though they are not available via a Web interface; this was confirmed by Giles Turnbull, one of my Twitter followers who is also experimenting with Vine: “only way to find out the URL of your Vine post is to share it somewhere. if you choose not to share, or forget, you can’t find it on the web“.

I therefore created another clip which is available online. However there does not appear to be a Web interface to my Vine profile, so I can’t access my clips via a Web browser in order to change access rights, delete videos, manage Vine followers, etc.

Perhaps it is unfair to be too critical of the limitations of the initial release of the app: these shortcomings may be remedied in a subsequent release. However I thought I would summarise my initial experiments for others who may wish to evaluate the app. And rather than describe possible use cases for 6-second long video clips in higher education, I’d welcome suggestions. If you’d rather not describe possible uses, perhaps you may wish to complete the poll on whether you think Vine has a role to play in higher education.



Posted in Social Networking, Twitter | Tagged: , | 4 Comments »

Don’t Leave Instagram (or Facebook, Google Drive, …) Until You’ve Considered the Implications

Posted by Brian Kelly on 17 January 2013

New Year: An Opportunity to Delete Social Media Accounts!

A few days ago I received the following email from Instagram:

As we announced in December, we have updated our Terms of Service and Privacy Policy. These policies also now take into account the feedback we received from the Instagram Community. We’re emailing you to remind you that, as we announced last month, these updated policies will be in effect as of January 19th, 2013. 

That’s right, as of Saturday 19th January 2013, the new terms and conditions come into operation.

Did you delete your Instagram account before Christmas, once you saw the tweets and the blog posts about how Instagram intended to sell the photos you have taken of your loved ones? Perhaps you made a new year’s resolution to cancel accounts on services for which you don’t pay, on the grounds that “you’re the product“. Or maybe you have taken the opportunity to delete accounts which you simply don’t use: perhaps Google+ appeared promising when it was launched but it hasn’t found a place in your regular workflow.

Are You Making An Informed Decision?

Is your decision based on a correct understanding of the appropriate policies? Are you aware of the possible risks in deleting social media accounts?

Back in April 2012 a post which asked Have You Got Your Free Google Drive, Skydrive & Dropbox Accounts? was written in response to a tweet from @sydlawrence which said:

Holy crap. Google owns everything on google drive. Tell me a business that will use it… cl.ly/1W2h1A163p0W2A … 

which linked to the following screenshot of the Google Drive terms and conditions:

The screenshot quite clearly states that “You retain ownership of any intellectual property that you hold in that content. In short, what belongs to you stays yours“. It’s therefore not surprising that the image was subsequently deleted – but not before the post was retweeted 1,109 times and favourited by 115 Twitter users!

This provides a good example of how an incorrect summary (whether through a mistake or malicious intent) of the terms and conditions of a service can be easily repeated and, through Twitter’s power in viral communications, lead to such misinformation being widely accepted as the truth.

The situation with Instagram is not as clear-cut, since the company have admitted their failings:

it became clear that we failed to fulfill what I consider one of our most important responsibilities – to communicate our intentions clearly 

and explained how, in the light of user feedback (emphasis provided in original):

we are reverting … to the original version that has been in effect since we launched the service in October 2010

Instagram now echoes Google in providing an unambiguous statement regarding ownership of content uploaded to the service:

Instagram has no intention of selling your photos, and we never did. We don’t own your photos – you do.

So if you deleted your Instagram account because you had been led to believe that you were losing ownership of your content, or that your content could be sold without your permission, then you made this decision based on incorrect assumptions!

Further Thoughts on Deletion of Social Media Accounts

“If you’re not paying for something, you’re not the customer; you’re the product”

Back in November 2010 a post on the LifeHacker blog gave the background to the statement If You’re Not Paying for It; You’re the Product:

This particular quote comes from a discussion on MetaFilter, regarding the massive changes at the social aggregation news site Digg earlier this year. MetaFilter user blue_beetle accurately observed that “if you’re not paying for something, you’re not the customer; you’re the product being sold”. This sentiment doesn’t just apply to unhappy Digg users but to a significant portion of the online experience and many real life interactions.

I’ve commented previously on the flaws in this argument: I didn’t pay for my education as a child – does this mean that I’m simply a product of the capitalist system which will seek to exploit me as a worker and provide free health care so my productivity is maximised? Similarly I don’t pay to watch ITV; in this case the adverts are the TV companies’ key services which I am encouraged to consume, with the TV programmes filling the gaps between the advertising breaks.

In reality many social media services seek to monetise ‘attention data’ in order to make a profit, as well as to cover the costs of providing the services. Like many people, although by no means everyone, I am prepared to accept this environment and have not chosen to purchase the premium account which many social media companies provide for those who wish to avoid seeing advertising material.

I am not alone in my views on the phrase. In December 2012 the Powazek blog published a post entitled I’m Not The Product, But I Play One On The Internet which described how:

But the more the line is repeated, the more it gets on my nerves. It has a stoner-like quality to it (“Have you ever looked at your hands? I mean really looked at your hands?”). It reminds me of McLuhan’s “the medium is the message,” a phrase that is seemingly deep but collapses into pointlessness the moment you think about using it in any practical way. 

The post concludes:

we should all stop saying, “if you’re not paying for the product, you are the product,” because it doesn’t really mean anything

There will be legitimate reasons why you may choose not to use a service because you are unhappy with its terms and conditions – but such decisions should be informed ones, and not made simply because you aren’t paying for the service.

Social Media Accounts Which Aren’t Being Used

But beyond the issue of the terms and conditions, should you delete an account because it is little used? Although this would appear to be a sensible decision there is a need to consider the associated risks.

Back in January 2011 a post on Evidence of Personal Usage Of Social Web Services described the long gestation period for services such as Twitter. As I concluded, “in the case of Twitter it was only after two years of first using the service that it became embedded in my working practices” – there was a need to (a) have a critical mass of Twitter followers with whom I could engage; (b) have more effective tools than the Twitter Web client I used initially; and (c) have a compelling use case which convinced me of the value of the service (this turned out to be use of Twitter at a conference, when I was away from the office for a period and meeting new people).

I would admit that I have not yet found a compelling use case for Google+. But I will keep the account, partly because it is used to authenticate myself with other Google services. In addition, I would not wish to miss out on the occasional use I do make of Google+, or to have to rebuild a Google+ community if I deleted the account and subsequently found uses for the service.

Similarly my Facebook account provides an address book for friends and colleagues and a means of keeping in touch beyond annual Christmas cards. In addition, as I suggested in a post which asked What Could Facebook’s New Search System Offer Researchers?, recent Facebook developments, such as the Facebook Graph Search, may provide new opportunities which could be of value to me. Stephen Downes on the OLDaily blog has commented that:

A graph search makes sense, and would eventually provide better results than Google, but it really depends on people being engaged enough with Facebook to generate useful data, and that is far from clear. More from E-Commerce Times, Social Media Today, BBC News, Mashable, Brian Kelly, ClickZ, Technology Review, Ben Werdmuller, Wired News.

I agree that it is unclear whether Facebook will have sufficient momentum to provide a useful service; for me, this is also true of Google+. However I have judged the risks of continuing to use the services to be low, whereas the loss of my networks on these services would make them difficult and time-consuming to regenerate if the services did turn out to be useful.

I have summarised the decisions I have made and the rationale behind the decisions. Have you chosen to delete any social media accounts? Or have you considered deleting accounts and decided not to? I’d welcome your thoughts.

PS: A tweet from @digisim reminded me that I had intended to add that one reason for subscribing to social media services which aren’t used is to claim your username. I have claimed briankelly on the identi.ca service in case that service (touted as an open alternative to Twitter) ever takes off. However, as I have only posted four times since July 2008 and have only 12 followers, it seems unlikely that it will.



Posted in Legal, Social Web | 19 Comments »

What Could Facebook’s New Search System Offer Researchers?

Posted by Brian Kelly on 16 January 2013

Facebook’s Graph Search Beta Targets Google

Yesterday my Twitter stream was full of tweets about Facebook’s announcement that they were Introducing Graph Search Beta – and this morning the headline Facebook’s Search for Supremacy featured on the front page of the Metro newspaper.

The significance of this announcement can be gauged by the BBC news headline Facebook’s Graph targets Google, in which Rory Cellan-Jones, the BBC’s technology correspondent, describes how his initial scepticism may have been misplaced: “If [Facebook’s] Graph Search more closely resembles what Bing describes, then users will be able to stay on Facebook, earning the company huge advertising revenues as they search for goods and services“.

A TechCrunch article which asks “What Can You Search For On Facebook Graph Search?” has focussed on the social aspects of this development (dating, finding places to eat and drink, etc.). But what could Facebook’s new search system offer researchers?

What Does The Evidence Tell Us?

Importance of Evidence

Although people may be tempted to be instinctively dismissive of any Facebook developments, involvement in the work of the Jisc Observatory, as described in a paper on “What Next for Libraries? Making Sense of the Future” (available in PDF and MS Word formats), has led to a greater emphasis on evidence-gathering. In addition the Jisc Inform article which announced “A Bright Future for Independent Jisc in 2013” described how development work will place a greater emphasis on the needs of institutions. There will therefore be a need to gather evidence on how Facebook is being used across UK higher and further education institutions in order to understand whether Facebook developments can enhance the uses made of Facebook to support institutional activities.

Institutional Use of Facebook

Facebook ‘Likes’ Across Russell Group Universities

Back in November 2007 a post on UK Universities On Facebook provided early evidence of use of Facebook by early adopters, when only about 76 universities had a Facebook presence. A year later a post on Revisiting UK University Pages On Facebook started to keep a record of Facebook usage by the early institutional adopters. More recently a post on Over One Million ‘Likes’ of Facebook Pages for the 24 Russell Group Universities provided an indication of the scale of use of Facebook across a selection of UK universities.

This suggests that the enhanced search techniques announced yesterday may be relevant for those involved in university marketing activities, although there may be some interesting privacy issues to be addressed.

But beyond use of Facebook by students, what about its potential to support researchers?

Use of Facebook by Researchers

As described in a post on The Sixth Anniversary of the UK Web Focus Blog, Facebook is “in third place behind Search Engines and Twitter in referring traffic to this blog” (as illustrated). This suggests that Facebook may have a role to play in supporting dissemination activities for bloggers. But does Facebook have any relevance for enhancing the dissemination of research papers, beyond the indirect dissemination which research blogs may provide?

A year ago a post entitled Facebook and Twitter as Infrastructure for Dissemination of Research Papers (and More) described the SpringerLink mobile app.

SpringerLink app

Earlier today I used the app to search for papers on ‘Web Accessibility’. As illustrated, a relevant paper can be shared across my professional networks using Twitter or Facebook, as well as with selected individuals using email.

As I described in the blog post, “the SpringerLink app suggests that Facebook and Twitter may be becoming part of the dissemination infrastructure for research papers, especially on mobile devices“. But is there any evidence that researchers are using Facebook, in particular, to facilitate access to research papers?

Back in October 2012 a series of guest blog posts was published during Open Access Week 2012 in order to share the experiences of a number of institutional repository managers. In the posts on SEO Analysis of WRAP, the Warwick University Repository by Yvonne Budden, University of Warwick and on SEO Analysis of LSE Research Online by Natalia Madjarevic, LSE there was no evidence that Facebook was a significant driver of traffic to the two repositories, according to the MajesticSEO tool used to carry out the analyses. This was echoed by William Nixon in his post on SEO Analysis of Enlighten, the University of Glasgow Institutional Repository. William described how:

Social media sites such as Facebook and Twitter don’t appear in these initial results, it may be because the volume is insufficient to be ranked here or there may be breach of service issues. Google Analytics now provides some social media tools and we have been identifying our most popular papers from Facebook and Twitter.

Reading William’s post on the Enlighten blog it seems:

Looking at the data for the past year the following papers have had significant numbers of referrals from Facebook:

van Dommelen, P., Gómez Bellard, C., and Pérez Jordà, G. (2010)Produzione agraria nella Sardegna punica fra cereali e vino. In: Milanese, M., Ruggeri, P., Vismara, C. and Zucca, R. (eds.) L’Africa Romana. I Luoghi e le Forme dei Mestieri e della Produzione nelle Province Africane (Atti del XVIII Convegno di Studio, Olbia, 11-14 Dicembre 2008). Series: L’Africa Romana (18). Carocci, Rome, Italy, pp. 1187-1202. ISBN 9788843054916. http://eprints.gla.ac.uk/48143/

Cockshott, W.P., and Zachriah, D. (2012) Arguments for Socialism.Amazon. ISBN B006S2LW6U. http://eprints.gla.ac.uk/58987/

So at this stage it would appear that there is little evidence that Facebook has a significant role to play in enhancing access to papers hosted in institutional repositories. But are the experiences from these three institutional repositories typical across the sector? Might the early adopters, such as P. van Dommelen and W. P. Cockshott and their co-authors, be gaining advantages in enhancing access to their papers? And, finally, might the announcement of Facebook’s Graph Search prove of relevance to those with an interest in enhancing the discoverability of research papers?
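
Repository managers who would like to gather such evidence themselves, but who do not use Google Analytics or MajesticSEO, could make a first approximation directly from their web-server logs. The sketch below is illustrative only: it assumes the Apache/nginx ‘combined’ log format, and the sample log lines and paths are invented for the example.

```python
import re
from collections import Counter

# The referrer is the second-to-last quoted field in the 'combined' log
# format; this regex is a simplification which ignores escaped quotes.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)

def facebook_referrals(log_lines):
    """Count requests per path whose referrer is a facebook.com page."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match and "facebook.com" in match.group("referrer"):
            counts[match.group("path")] += 1
    return counts

# Invented log lines for illustration:
sample = [
    '1.2.3.4 - - [31/Jan/2013:10:00:00 +0000] "GET /48143/ HTTP/1.1" '
    '200 1234 "http://www.facebook.com/" "Mozilla/5.0"',
    '1.2.3.5 - - [31/Jan/2013:10:01:00 +0000] "GET /48143/ HTTP/1.1" '
    '200 1234 "http://www.google.com/" "Mozilla/5.0"',
]
print(facebook_referrals(sample))  # Counter({'/48143/': 1})
```

A tally such as this would at least indicate whether Facebook referrals are negligible or worth monitoring with more sophisticated tools.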

I’ve asked questions, rather than suggested answers, in this post. In part that is because the potential relevance of Facebook’s Graph Search will be based on the use of Facebook, rather than advocacy or critique of use of Facebook in a scholarly context. I’d therefore welcome comments from repository managers, in particular, on evidence of Facebook as a driver of traffic (whether large or small) to institutional repositories. For those who may not wish to leave a comment I’ve created two polls: one on the amount of traffic provided by Facebook and the other on interest in understanding the potential of use of Facebook’s Graph Search in a repository context.

Finally, if you’d like to know more about Facebook’s Graph Search, the following links may be of interest:


View Twitter conversation from: [Topsy] | View Twitter statistics from: [Bit.ly]

Posted in Facebook | Tagged: | 2 Comments »

A Tribute to Aaron Swartz: Let’s Make #pdftribute Trend

Posted by Brian Kelly on 13 January 2013

I’m sure many readers of this blog will have heard the news of the untimely death of Aaron Swartz. As described on the BBC News Web site:

Aaron Swartz, a celebrated internet freedom activist and early developer of the website Reddit, has died at 26.

The activist and programmer took his life in his New York apartment, a relative and the state medical examiner said. His body was found on Friday.

A sad day, especially for those who share Aaron Swartz’s commitment to openness and admire his contribution to the development of tools, services and standards, such as RSS, which have helped to make resources openly accessible on a global basis.

Earlier today I came across a tweet which encouraged academics to show their support for Aaron’s work:

Please share: Academics posting their papers online in tribute to Aaron Swartz using hashtag #pdftribute.

Storify summary of #pdftribute tweets

I would like to endorse this proposal. I have created a Storify summary of the #pdftribute tweets, which contains over 500 posts since the call was made just over three hours ago.

Although we have seen that initial tweet being widely retweeted, as @neuroconscience (Micah Allen) has suggested:

Folks as exciting as #pdftribute is we need less links talking about it and more actual paper posting.

But what could be said in 140 characters?

Within my Twitter stream I have already seen tweets from those involved in supporting their institutional repository including @SarahNicholas:

Cardiff academics! Post your articles to @CardiffOrca #openaccess #pdftribute

and @glamlaflib (Sue House):

Glamorgan academics can deposit their articles & papers here (if you retained the copyright) http://dspace1.isd.glam.ac.uk/dspace/ #pdftribute

I have also seen @openscience endorsing @jambina’s reminder of the role which can be played by librarians:

Librarians: always friends in #openaccess #openscience MT @jambina: librarians can help you free your work. we are on your side #pdftribute

Meanwhile @MrGunn describes services which can be used:

@opendna @venturejessica @Aine Mendeley can push into to local repository via Symplectics Elements, other routes can be made with Open API.

Of course many researchers are demonstrating their commitment to providing open access to their research papers:

Others, such as @mlterpstra (ML Terpstra) make the case for open data policies:

#public funded #academia should have a #opendata policy for their scientific papers #Aaron #pdftribute. Lets call it #AaronsLaw? @birgittaj

whilst others provide a more political view:

@MarietjeD66 @mikebutcher Let this be the start of the end of the ridiculous copyright laws. #pdftribute #AaronSwarz

Would you like to join in by giving your views or ensuring that your Twitter community is aware of how you have made your research papers openly available?

Note archives of the #pdftribute tweets are available at http://pdftribute.net and http://twubs.com/pdftribute


View Twitter conversation from: [Topsy] | View Twitter statistics from: [TweetReach] – [Bit.ly]

Posted in openness, Repositories | Tagged: | 2 Comments »

Reflections on the Discussion on the Quality of Embedded Metadata in PDFs

Posted by Brian Kelly on 11 January 2013

The Quality of Metadata Embedded in PDFs

Embedded metadata in PDFs

The recent post on Embedded Metadata in PDFs Hosted in Institutional Repositories: An Inside-Out & Outside-In View generated a fair amount of discussion, with ~17 comments on the post itself but, perhaps more significantly, a more interactive discussion on Twitter, with relevant contributions being made by @mrnick, @neilstewart, @rmounce, @carusb, @pj_webster, @emmatonkin, @MikeTaylor and @wrap_ed, and with other Twitter users sharing links to the posts with their communities.

Whilst some people may still feel that discussions should take place on one centralised system (e.g. a mailing list), in reality this is an unrealistic expectation. In the real world, discussions based on ideas which may have originated online will be dispersed across offices and common rooms in institutions around the world, to say nothing of other discussions which may take place in pubs and coffee rooms or whilst travelling. Conversations about interesting ideas will be distributed; we have to accept that. However it can be helpful to aggregate valuable comments which may be fragmented across a variety of communication channels. Since I felt that the Twitter discussions about the post were particularly interesting I have created a Storify summary entitled The Quality of Embedded Metadata in PDFs (Jan 2013). Note that this complements the Topsy summary, which lists the tweets which contain links to the blog post.

Note that in the comments on the blog post Nick Sheppard suggested that a forthcoming UK RepNet event might provide an opportunity to discuss the issues which were raised in more depth:

I wonder if some of these issues might be relevant within the context of the UK RepNet project which is holding a meeting in London on 21st Jan – http://www.rsp.ac.uk/events/supporting-and-enhancing-your-repository/

I will therefore provide a summary of the main issues which were discussed on the blog and on Twitter.

The Context

The initial post was written in response to a post by Ross Mounce in which he asked PDF metadata – why so poor? and a follow-up post a week later on PDF metadata: different tool, same story. Ross’s post was based on an analysis of the metadata embedded in PDFs hosted by scholarly publishers. Ross’s second post succinctly summarised his work:

So a week ago, I investigated publisher-produced Version of Record PDFs with pdfinfo and the results were very disappointing. Lots of missing metadata was found and one could not reliably identify most of these PDFs from metadata alone, let alone extract particular fields of interest.

I wondered whether PDFs hosted in institutional repositories also suffered from poor quality or missing embedded metadata. I examined some papers I had deposited in the University of Bath repository and found that metadata which was contained in the original PDF file I uploaded to the repository was missing from the PDF which users can download. I surmised that the metadata had been lost in the workflow when a cover sheet was added to the paper.

My post referenced a post by Lorcan Dempsey entitled Discovery vs discoverability … in which he explored the idea of the “inside-out and outside-in library“. This seemed very relevant to this scenario as both Ross and I were concerned primarily with the implications of missing metadata for systems which may be used outside of the repository context: in Ross’s case this related to text mining of large collections of PDFs, whereas my interest focussed on reuse of PDFs in other repositories.

The Discussion

Embedded metadata in PDFs

The initial comment on the blog post by Ingmar Koch illustrated how embedded PDF metadata can be (mis-)used by external systems. Ingmar described how “the company that designed the document templates for most of the government agencies added a title and author in the template-file. The result is that thousands of online government documents (.pdf and .doc) are titled “at opinio facillime sumitur” and are written bij M. Hes.” This example provides a vivid illustration of how metadata embedded in PDFs is being used by Google. However this example might also be used to demonstrate the poor quality of embedded metadata.

In light of such examples Neil Stewart therefore asked “does it matter if the rare and patchy instances of author-created metadata gets over-written or otherwise distorted?” since “the structured metadata provided at Eprint/DSpace/other repository software record level does the job here (as opposed to metadata embedded within the PDF itself).

But surely we cannot argue that since some resources may contain poor quality metadata we should delete all metadata! I would argue that there is a need to educate authors on the importance of appropriate metadata, which includes showing how such metadata can be used by services outside of the host institution. Neil recognises the validity of this point when he acknowledged that “not every service will use OAI-PMH or web crawling, some might parse the objects themselves“.

The discussion then moved on to Twitter and initially addressed the relevance of cover sheets, since these appear to cause problems in workflows which take place outside of the institutional repository.

Ross Mounce asked:

why do IRs need 2 slap on cover page anyway? Perhaps they should just embed additional provenance metadata @briankelly @mrnick @neilstewart

Neil Stewart provided one use case for cover sheets:

@rmounce @briankelly @mrnick viewed as a way of advertising provenance (proper citation), branding as from home inst but agreed!

However Ross re-iterated his criticisms of cover sheets:

Cover-pages from a user-POV r just plain annoying. If provenance must be visibly embedded why not overlay? @neilstewart @briankelly @mrnick

Others, such as Chris Rusbridge, agreed with this view:

@mrnick @ukcorr @rmounce @briankelly @stevehit I agree with Ross that it’s BAD practice, from my POV

The discussion then moved on to problems which may occur when a paper is downloaded and re-deposited, with Nick Sheppard providing a good example of how PDFs may end up containing multiple cover sheets if they are taken from one repository and deposited (by, for example, a co-author) in another repository:

@neilstewart Um, can also lead to cover page disasters like this (scroll down) eprints.port.ac.uk/2278/1/A… @rmounce @briankelly

I then highlighted a paper by my colleague Emma Tonkin which showed that problems with poor quality metadata went beyond the individual examples provided on Twitter:

@carusb @mrnick @rmounce My colleague @emmatonkin analysed PDF metadata a few years ago: opus.bath.ac.uk/24958/

The paper (PDF format) described how:

Many repositories … have developed or identified a means of adding a cover sheet to each document within the repository. This has potential for positive impact, for example, as a means of clearly indicating the provenance of an item (Puplett, 2008). As can be seen in Fig. 7, Google Scholar does not necessarily recognise the cover sheet for what it is, and this has negative implications for effective indexing and retrieval.

and went on to conclude:

However, the addition of a cover sheet has caused a number of issues beyond those that are usually encountered with the PDF format (ie. font problems, file corruption, etc). This limits the ability for automated processes to make use of this information, and could therefore be said on the level of automated indexing and other software access (such as conversion) to be a retrograde step. If this becomes common practice it may be necessary to review both the assumptions under which automated systems are developed, and perhaps the rationale that lead us to make use of cover sheets in this context.

Conclusions

The paper on Supporting PDF accessibility evaluation: early results from the FixRep project was written in 2010 by my colleagues Emma Tonkin and Andy Hewitt and presented at the 2nd Qualitative and Quantitative Methods in Libraries International Conference (QQML2010). The concluding sentence in the paper highlighted work which needs to be addressed:

it may be necessary to review both the assumptions under which automated systems are developed, and perhaps the rationale that lead us to make use of cover sheets in this context

The paper identified the benefits of cover sheets but also the problems they can cause for automated activities which may take place outside of the institutional repository environment.

But should repository managers and developers necessarily devote resources to addressing potential problems which may arise downstream of the repository environment? In a comment on Ross Mounce’s blog the point was made that publishers will require a sound business case:

“Why would publishers add metadata? Because their customers – libraries, governments, research funders (in the case of Open Access PDFs ) should demand it.” I’m not seeing a compelling business case here. High-quality metadata would be nice, but can anybody argue that their research is being hampered by a lack of such metadata? Could someone working in publishing make a case to their boss that adding such metadata would generate more revenue, web traffic, manuscript submissions (insert whatever metric matters)?

In the context of institutional repositories perhaps the approach to be taken would be to ensure that embedded metadata is preserved, and that the training and advice provided by repository support staff ensures that authors are made aware of the ways in which embedded metadata can be used, even if such reuse takes place outside of the institutional repository.

The discussion also highlighted the need for enhanced workflow practices for merging cover pages with the original content, and for enabling users (and automated tools) to access the original source paper in addition to the version containing provenance information designed for consumption by users.

Do any institutional repositories currently provide solutions to these requirements? In addition, I am interested in how many institutional repositories provide cover pages and whether those that do use a repository plugin, other automated technologies or manual processes. Two polls on these questions are embedded in this post but if the situation is more complex than can be summarised in the polls, feel free to add a comment.

Footnote (added 12 January 2013): A tweet from @community alerted me to the blog post on SEO Action for PDF files on the Adobe blog. This describes an “Action” for use in Acrobat X Pro that will automate setting the properties of the PDF file in accordance with guidelines which can enhance the discoverability of PDF files by Google.


View Twitter conversation from: [Topsy]  |  View Twitter statistics from: [TweetReach] – [bit.ly]

Posted in Repositories | 5 Comments »

Why Every Researcher Should Sign Up For Their ORCID ID

Posted by Brian Kelly on 9 January 2013

JISC news item about ORCID

I was pleased to see the news item published by the Jisc earlier today which announced UK specialists welcome launch of ORCID as tool to identify researchers.

The news item describes how:

Jisc joins organisations from across the UK higher education network to welcome the launch of the Open Researcher and Contributor Identifier (ORCID).

and goes on to describe the benefits which ORCID can provide:

There are more academic articles being published than ever before and more authors working together. In order to be able to identify an author correctly a unique identifier is needed that can then link to each author’s publications. ORCID provides this link and if widely used would:

  • Ensure researchers get credit for their own work
  • Ensure researchers and learners looking for information will be able to find academic papers more accurately
  • Enable better management of researcher publication records, making it easier for them to create CVs, reduce form filling and improve reporting to funders
  • Create a means of linking information between institutions and systems internationally
  • Enable researchers to keep track of their own work with funders, publishers and institutions around the world.

It also provides researchers with their own ORCID. Researchers are able to control how much information it holds about them and who that is shared with. The adoption of ORCID is a solution to the current challenges of being able to search for work accurately. By researchers volunteering to adopt its usage it could improve discoverability and accurate referencing.

As described in a post which explained Why You Should Do More Than Simply Claiming Your ORCID ID I feel it is important that researchers claim their ORCID ID (I will use two words as I suspect that this will be less ambiguous than ‘claiming an ORCID‘). The post gave the reasons why I feel that researchers should do more than simply claim their ORCID ID: they should go on to include their ORCID IDs, together with the ORCID IDs of their co-authors, in references to their papers. The reason I gave for doing this was to minimise the risks of losing connections with co-authors, who may have changed their affiliation and thus no longer have their original email address and institutional Web presence.

In light of the recent Announcement: UKOLN – Looking Ahead which described how the Jisc “will only provide core funding to the UKOLN Innovation Support Centre, up to July 2013 but not beyond” there will clearly be a need for myself and my colleagues to minimise the risks of losing the connections with our research outputs. Since the first bullet point of the benefits which ORCID can provide is to:

Ensure researchers get credit for their own work

it would appear that claiming an ORCID ID should be a priority for researchers whose position in their host institution is uncertain. But doesn’t this apply to everyone? From one perspective this might be relevant in light of funding uncertainties in the sector which are compounded by last month’s announcement of the “Huge Drop in Students Starting University“. But beyond the current economic situation, every researcher will, at some stage, leave their host institution (whether through taking up a new post elsewhere, retirement, redundancy or death in service).

It would appear that every researcher who wishes to ensure that they get credit for their own work, and that such credit can be managed when they leave their current institution, should benefit from claiming an ORCID ID. As described in the post, claiming an ORCID ID “is a painless exercise, taking about 30 seconds to complete” so this is something which all researchers should be able to do.
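
For developers building systems which store or exchange researcher identifiers, it is worth noting that an ORCID ID can also be checked programmatically: it is a 16-character identifier (displayed as four hyphenated groups) whose final character is a check digit computed with the ISO 7064 mod 11-2 algorithm, with ‘X’ representing the value 10. A minimal validation sketch (the function names are my own):

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 mod 11-2 check character for the first
    15 digits of an ORCID ID ('X' represents the value 10)."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid):
    """Validate a hyphenated ORCID ID such as 0000-0002-1825-0097."""
    digits = orcid.replace("-", "")
    if len(digits) != 16 or not digits[:15].isdigit():
        return False
    return orcid_check_digit(digits[:15]) == digits[15]

print(is_valid_orcid("0000-0002-1825-0097"))  # True
```

A check such as this catches mistyped IDs before they are stored in a reference list or repository record.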

In the Jisc news item Neil Jacobs, programme director, Jisc commented: “We recognise that this is only the start and that work needs to be done to implement ORCID in the UK. However, we have a solid beginning and we look forward to working with our partners across the sector to build on it.

As is clear from the ORCID Knowledge base many suggestions have been made on ways in which the service can be enhanced. But the simplest action lies in the hands of the individual researchers: sign up for an ORCID ID!


View Twitter conversation from: [Topsy]  |  View Twitter statistics from: [TweetReach]

Posted in Identifiers | Tagged: | 4 Comments »

Embedded Metadata in PDFs Hosted in Institutional Repositories: An Inside-Out & Outside-In View

Posted by Brian Kelly on 4 January 2013

PDF Metadata – Why Is it So Poor?

Metadata in PDF source

PDF metadata – why so poor? asked Ross Mounce in a blog post published on New Year’s Eve.

In the post Ross expressed surprise that although “with published MP3 files of audio you get rather good metadata … the results from a little preliminary survey of academic publisher PDF metadata” were poor: “Out of the 70 PDFs I’ve published (meta)data on over at Figshare, only 8 of them had Keywords metadata embedded in them“.

This made me wonder about the quality of the metadata for papers I have uploaded to Opus, the University of Bath repository.

I looked at a paper on A Challenge to Web Accessibility Metrics and Guidelines: Putting People and Processes First which is available in Opus in PDF and MS Word formats.

I first used Adobe Acrobat in order to display the metadata for the original source PDF file, prior to uploading to the repository. As can be seen from the accompanying screen shot the metadata included the title, the author details (with the email address for one of the authors) and two keywords.
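
Readers without access to Acrobat can make a similar, if rougher, check with a few lines of Python. The sketch below simply scans the raw PDF bytes for literal-string entries in the document Info dictionary; since metadata may also be stored in XMP packets or compressed object streams, an empty result is not conclusive, and a dedicated tool such as pdfinfo remains more reliable. The filename in the usage example is hypothetical:

```python
import re

# Rough check of the PDF Info dictionary: scan the raw bytes for
# literal-string entries such as /Title (...). Metadata may instead live
# in XMP packets or compressed object streams, so an empty result here
# is not conclusive.
INFO_KEYS = ("Title", "Author", "Keywords")

def peek_pdf_info(data):
    """Return any Info-dictionary literal strings found in raw PDF bytes."""
    found = {}
    for key in INFO_KEYS:
        match = re.search(rb"/" + key.encode() + rb"\s*\(([^)]*)\)", data)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found

# Usage (hypothetical filename):
# with open("accessibility-paper.pdf", "rb") as f:
#     print(peek_pdf_info(f.read()))
```

Running such a check over the original upload and the repository download would make a loss of metadata, as described below, immediately visible.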

Metadata for repository copy of paper

However looking at the display for the PDF downloaded from the repository we find that no metadata is available!

This PDF differs from the original source in that a cover page is added dynamically by the repository in order to provide appropriate institutional branding. It would appear that in the creation of the new PDF resource, the original metadata is lost.

Metadata for MS Word master

Looking at the metadata in the original source document – an MS Word file – we can see how the authors’ names were subsequently concatenated into a single field. We can also see that although the title of the paper was given correctly, poor keywords had been included, which did not reflect the keywords included in the paper itself (Web accessibility, disabled people, policy, user experience, social inclusion, guidelines, development lifecycle, procurement).

I suspect that I am not alone in not spending much time ensuring that appropriate metadata is embedded in the master source of a peer-reviewed paper. I have also previously not considered how such metadata might be lost in the workflow processes when uploading to an institutional repository: after all, surely the important metadata is added when the paper is deposited into the repository?

Ross’s blog post made me check the embedded metadata – and I discovered that the correct metadata is still included in the MS Word file which was uploaded to the repository along with the PDF copy.

Does the loss of the metadata embedded in the PDF matter? After all, surely people will use the search facilities provided in the repository in order to find papers of interest?

But people will not necessarily visit a repository to find papers of interest. A post which described A Survey of Use of Researcher Profiling Services Across the 24 Russell Group Universities showed that on 1 August 2012 there were over 18,000 users of ResearchGate in the 24 Russell Group universities and judging by the messages along the lines of “28 of your colleagues from University of Bath have joined ResearchGate in the last month. Why not follow them today?” which I am currently receiving, use of this service is growing.

researchgate-papers-abstract

As can be seen from the screenshot of my ResearchGate profile, the service provides access to PDF copies of my papers. I normally simply provide a link to the PDF hosted in the repository but the example illustrated contains a copy of the original PDF which was uploaded to the service by one of the co-authors.

In the case of most of my papers it is clear from the thumbnail of the PDF that the paper contains the coversheet provided by the repository.

Researchgate Paper (hosted in Opus)

Discussion

We can see that the PDF copy of a paper hosted in a repository should not be regarded as a final destination; rather the PDF may be surfaced in other environments.

It will therefore be important to ensure that workflow processes do not degrade the quality of the PDF. It will also be important to ensure that authors are made aware of how embedded metadata may be used by services beyond the institutional repository. But to what extent do repository managers feel they have a responsibility to advise on practices which will enhance the discoverability of content on services hosted outside the institution?

Taylor Francis

In a paper which asked “Can LinkedIn and Academia.edu Enhance Access to Open Repositories?” Jenny Delasalle and I commented on how “commercial publishers are encouraging authors to use social media to drive traffic to papers hosted on publishers’ web sites” and provided examples of such approaches from Taylor and Francis, Springer, Sage and Oxford Journals. As an example, Taylor and Francis describe how they are “committed to promoting and increasing the visibility of your article and would like to work with you to promote your paper to potential readers” and go on to document services which can help achieve this goal.

In a blog post which discussed the ideas described in the paper I described how we had failed to find significant evidence of similar approaches being employed by repository managers:

It was interesting that in Jenny’s research she found that a number of commercial publishers encourage their authors to use services such as LinkedIn and Academia.edu to link to their papers hosted behind the publishers paywalls – and yet we are not seeing institutional views of the benefits of coordinated use of such services by their researchers. Institutional repository managers, research support staff and librarians could be prompting their institutions to make the most of these externally provided services, to enhance the visibility of their researchers’ work in institutional repositories.

But that paper was limited to use of third-party services to provide access routes to research papers. What of the bigger picture in which institutional work flow processes should be designed to enhance discoverability?

The ‘inside-out and outside-in library’

On Wednesday in a post entitled Discovery vs discoverability … Lorcan Dempsey explored the idea of the “inside-out and outside-in library“. In the post Lorcan described how:

Throughout much of their existence, libraries have managed an outside-in range of resources: they have acquired books, journals, databases, and other materials from external sources and provided discovery systems for their local constituency over what they own or license.

However in a digital and network world, there have been two major changes, which shift the focus towards inside-out:

First access and discovery have now scaled to the level of the network: they are web scale. If I want to know if a particular book exists I may look in Google Book Search or in Amazon, or in a social reading site, in a library aggregation like Worldcat, and so on. … Secondly the institution is also a producer of a range of information resources: digitized images or special collections, learning and research materials, research data, administrative records (website, prospectuses, etc.), faculty expertise and profile data, and so on.

Lorcan goes on to describe the challenge facing libraries:

How effectively to disclose this material is of growing interest across libraries or across the institutions of which the library is a part. This presents an inside-out challenge, as here the library wants the material to be discovered by their own constituency but usually also by a general web population.

I would suggest that institutional repositories could usefully adopt the approach taken by Taylor and Francis:

 “[The institution is] committed to promoting and increasing the visibility of your article and would like to work with you to promote your paper to potential readers

But rather than simply encouraging researchers to add links from popular services such as LinkedIn and ResearchGate to papers deposited in the repository, might the institutional goal be better served by encouraging researchers to make the content of their papers available in such third-party services (subject to copyright considerations) – with the institutional repository providing both a destination and a component in a workflow, with papers being surfaced in services such as ResearchGate, as I have illustrated above?

If such an approach were to be embraced there would be a need to ensure that embedded metadata was not corrupted through repository workflow processes. If, however, the repository is regarded as the sole access point, there would be little motivation to address such limitations in the work flow.

Or to put it another way, repository managers will have a need to manage content hosted within the institution, including management to support the use of the content by services they have no control over.

To a certain extent, this has already been accepted: repositories were designed to have “cool URIs” which can help resources to be discovered by Google. I am suggesting that there is a need to observe usage patterns which indicate emerging ways in which users are finding content. The growing numbers of email alerts from ResearchGate suggest that it may be a service to monitor – with Ross Mounce’s recent post on the quality of metadata embedded in PDFs suggesting one area in which there will be a need to revisit existing workflow processes.

PS. Ross Mounce described “a little preliminary survey of academic publisher PDF metadata” and has published the data on Figshare. Has anyone harvested the metadata embedded in PDFs hosted on repositories and published the findings?
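To make the idea of such a harvest concrete, here is a minimal sketch in Python of pulling simple entries out of a PDF's document-information dictionary. The sample PDF fragment and the `harvest_metadata` helper are hypothetical illustrations, not part of Ross Mounce's survey; a real harvest over repository holdings would use a proper PDF library (such as pypdf) rather than a regular expression, and would also need to handle hex strings, Unicode encodings and XMP packets.

```python
import re

# A minimal (hypothetical) PDF fragment: a real harvest would read the
# bytes of each PDF held in the repository.
pdf_bytes = b"""%PDF-1.4
1 0 obj
<< /Title (Does He Take Sugar?) /Author (Kelly, B.) /Creator (Word) >>
endobj
%%EOF"""

def harvest_metadata(data: bytes) -> dict:
    """Extract simple /Key (value) entries from a PDF's Info dictionary.

    This is only a sketch: it copes with plain literal strings and
    nothing else, which is enough to spot whether a workflow has
    stripped or mangled the embedded title and author.
    """
    fields = {}
    for key, value in re.findall(rb"/(\w+)\s*\(([^)]*)\)", data):
        fields[key.decode()] = value.decode("latin-1")
    return fields

meta = harvest_metadata(pdf_bytes)
print(meta["Title"])   # → Does He Take Sugar?
```

Run over a batch of repository PDFs, even a crude check like this would show how often the embedded title matches the repository record – the kind of evidence the survey of publisher PDFs provided.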


View Twitter conversation from: [Topsy]

Posted in Repositories, Web2.0 | 21 Comments »

Using Social Media to Publish/Share Ideas/Opinions Which Have Not Been Peer Reviewed

Posted by Brian Kelly on 3 January 2013

In The Bell, Listening to Fat Man Swings

Fat Man Swings at The Bell (I responded to a tweet during the break)

Last night I was in The Bell in Bath listening to Fat Man Swings when I noticed someone had mentioned me in a tweet:

@NSRiazat no but briankelly may be able to help

The message related to a discussion on the #phdchat Tweetchat during which Nasima Riazat (@NSRiazat) asked:

Has anyone used social media to publish/share ideas/opinions which have not been peer reviewed prior to sharing? #phdchat

According to her Twitter biography Nasima Riazat is “#PhDchat moderator. PhD research expertise in capacity building, distributed leadership, leadership sciences, developing middle leaders – Open University UK“. Her question was therefore very relevant for those who participate in the #phdchat discussions, which I have commented on previously.

The question, and its timing, may well horrify those who do not ‘get’ Twitter and are worried about being inundated with tweets at every hour of the day and having to respond outside working hours. However, established Twitter users will understand that Twitter provides a steady stream of content which you can dip into when it suits you, and that @ messages can often be ignored. On this occasion I felt the question was of interest, so I responded during the break to say I would address it. The interaction, incidentally, which included taking and posting a photo of the band, probably took less than a minute.

Publishing and Sharing Ideas Which Have Not Been Peer Reviewed

Back in October, during Open Access Week, I gave a series of talks on Open Practices for the Connected Researcher at the universities of Exeter, Salford and Bath in which I described the benefits which social media could provide for researchers. The talks were based on personal experiences of using social media to support my peer-reviewed papers, especially in the area of Web accessibility. I described how social media could be used to develop one's professional network (with the example of how I met Sarah Lewthwaite (@slewth) on Twitter and subsequently collaborated on a paper which won an award at an international conference). I also described how services such as Twitter and Slideshare could be used by one's co-authors during a conference presentation in order to maximise the number of views of the paper and accompanying slides by those who have a particular interest in the conference – those who may subsequently cite the paper in their own research publications or take actions based on the ideas described in the paper.

But although social media has proven value in developing one’s professional network and enhancing access to research publications, the question which was raised addressed a different scenario: Has anyone used social media to publish/share ideas/opinions which have not been peer reviewed prior to sharing?

I suspect the answer to this question will be influenced by the area of research together with personal approaches towards openness and the culture within one’s research group or host institution.

In my case my areas of research are based on the Web (Web accessibility, the Social Web, Web preservation, Web standards and institutional repositories). My organisation (and our funders) has always been supportive of open access to research outputs. In addition I have sought to embrace open practices in my work. I should add that I do not feel that others should necessarily adopt similar approaches; as I described in a post on The Social Web and the Belbin Model, my preferred roles of ‘plant’ and ‘resource investigator’ in the Belbin model are well-aligned with use of social media services such as blogs. I am therefore comfortable with the notion of exposing one's ideas to public view at an early stage, with the intention that flaws will be identified early and the value of the ideas will be enhanced by contributions from others.

For me, the ideas published in a blog post (or even a tweet) can subsequently be developed and used in a peer-reviewed paper. As an example, in September 2012 I wrote a brief post which asked “John hit the ball”: Should Simple Language Be Mandatory for Web Accessibility? After the post had been published I came across a tweet from @techczech (Dominik Lukes) which commented:

Should Simple Language Be Mandatory for Web Accessibility? http://ow.ly/dOV4T < Bad idea for #a11y – ignorant of basic #linguistic facts

I looked at Dominik’s Twitter biography (“Education and technology specialist, linguist, feminist, enemy of prescriptivism, metaphor hacker, educator, (ex)podcaster, Drupal/Wordpress web builder, Czech.“) and followed the link to his blog, where I read his post on “Why didn’t anyone tell me about this?”: What every learning technologist should know about accessible documents #ALTC2012. I realised that we had similar interests, so I decided to follow him on Twitter, and we subsequently had an interesting phone conversation on Web accessibility and language issues.

I subsequently submitted a brief paper on this topic with Alastair McNaught, JISC TechDis, to the W3C WAI’s online symposium on “Easy to Read” (e2r) language in Web Pages/Applications. As described in a post on ‘Does He Take Sugar?’: The Risks of Standardising Easy-to-read Language, the paper was not accepted. However, since we were not restricted to the 1,000-word limit imposed by the organisers of the online symposium, Alastair and I expanded on our original ideas, which were further developed through the contribution provided by Dominik. Our article entitled ‘Does He Take Sugar?’: The Risks of Standardising Easy-to-read Language was published in the Ariadne ejournal just before Christmas.

Although the article was not peer-reviewed, we have subsequently realised that the ideas described in it could provide a new insight into our previous work in developing a framework for making use of accessibility guidelines such as WCAG. We are currently discussing how we can build on these new insights.

To summarise, a brief blog post was commented on in a tweet. This led to an exchange of tweets, a phone call, a joint Skype call and a joint article – with an understanding that we will look for opportunities for further collaboration. Without the blog post and without the tweet, this would not have happened!


View Twitter conversation from: [Topsy] – [bit.ly]

Posted in Accessibility, Social Web, Twitter | Leave a Comment »

Signals from Institutions: The University of Edinburgh’s Strategic Goals, Targets and KPIs

Posted by Brian Kelly on 2 January 2013

The University of Edinburgh Strategic Plan 2012-2016

As described in a paper on What Next for Libraries? Making Sense of the Future the JISC Observatory “provides horizon-scanning of technological developments which may be of relevance to the UK’s higher and further education sectors“. The paper, available in MS Word and PDF formats, describes the systematic processes for the scanning, sense-making and synthesis activities to support this work. The paper focuses on the processes for observing technical developments. However there is also a need to observe signals of institutional interests in IT developments, especially in light of the recent announcement of Jisc’s objective to “address a number of specific priorities for universities and colleges through the development of resources, tools and supported infrastructure“.

Edinburgh University's strategic goals

Strategic plans published by institutions can provide a valuable starting point in helping to identify areas of institutional interest. For example, Lorcan Dempsey recently drew attention to the strategic goals which have been identified by the University of Edinburgh:

mm.. U Edinburgh strategy targets include improving citation score in the THE World Uni Rankings. docs.sasg.ed.ac.uk/gasp/strategic…

The document, The University of Edinburgh Strategic Plan 2012-2016 (which is available in PDF format), is interesting not so much for the way it identifies strategic goals and the key enablers needed to ensure the goals are attained, but for the list of specific KPIs (Key Performance Indicators) and the associated targets.

Of particular interest is the strategic goal of excellence in research for which the KPI is listed as “Russell Group market share of research income (spend)“. The corresponding targets are:

  • Increase our average number of PhD students per member of academic staff to at least 2.5
  • Increase our score (relative to the highest scoring institution) for the citations-based measure in the THE World University Rankings to at least 94/100

The strategic goal of excellence in innovation states that the KPIs are “Knowledge exchange metrics: number of disclosures, patents, licences and new company formation“. The targets for this goal are:

  • Achieve at least 200 public policy impacts per annum
  • Increase our economic impact, measured by GVA, by at least 8%

The Importance of Metrics

It is interesting to see how the University of Edinburgh has clearly defined targets which are based on measurable criteria: “Increase our average number of PhD students per member of academic staff to at least 2.5“; “Increase our score … for the citations-based measure in the THE World University Rankings to at least 94/100“; “Achieve at least 200 public policy impacts per annum“; “Increase our economic impact, measured by GVA, by at least 8%“; “Increase the proportion of our building condition at grades A and B on a year-on-year basis, aiming for at least 90% by 2020“; “Increase our total income per staff FTE year-on-year, aiming for an increase of at least 10% in real terms“; “Increase the level of overall satisfaction expressed in responses to the NSS, PTES and PRES student surveys to at least 88%“; “Increase the number of our students who have achieved the Edinburgh Award to at least 500“; “Create at least 800 new opportunities for our students to gain an international experience as part of their Edinburgh degree“; “Increase our headcount of non-EU international students by at least 2,000“; “Increase our research grant income from EU and other overseas sources so that we enter the Russell Group upper quartile“; “Increase our number of masters students on programmes established through our Global Academies by at least 500“; “reduce absolute CO2 emissions by 29% by 2020, against a 2007 baseline (interim target of 20% savings by 2015)“ and “Increase our number of PhD students on programmes jointly awarded with international partners by at least 50%“ (emphasis added).

The importance of metrics in the context of learning is being addressed by CETIS, with the CETIS Analytics Series being announced by Sheila MacNeill on 23 November 2012 and a follow-up post the next week addressing Legal, Risk and Ethical Aspects of Analytics in Education. The following week Sheila provided a broader perspective in a post on Analytics for Understanding Research, with the series of posts concluding with one on Institutional Readiness for Analytics – practice and policy.

Prior to CETIS’s work in this area, the importance of metrics had been identified by the JISC in 2010, when it asked UKOLN to facilitate the Evidence, Impact, Metrics activity. A series of reports on this work was published just over a year ago. As described in the document on Why the Need for this Work?:

There is a need for publicly-funded organisations, such as higher education institutions, to provide evidence of the value of the services they provide. Such accountability has always been required, but at a time of economic concerns the need to gather, analyse and publicise evidence of such value is even more pressing.

Unlike commercial organisations it is not normally possible to make use of financial evidence (e.g. profits, turnover, etc) in public sector organisations. There is therefore a need to develop other approaches which can support evidence-based accounts of the value of our services.

A series of three workshops was held between November 2010 and July 2011. It was interesting to reflect on how, at the initial workshop, there was a feeling that an emphasis on metrics could be counter-productive in failing to appreciate the complexities of the work being carried out in the higher education sector. However, the feedback from the second workshop included an awareness of the need for “More strategic consideration of gathering evidence (both for our own purposes and those of projects we work with/evaluate)“. The work concluded by highlighting the importance of metric-based approaches for projects:

Why should I bother with metrics?
Metrics can provide quantitative evidence of the value of aspects of project work. Metrics which indicate the success of a project can be useful in promoting the value of the work. Metrics can also be useful in helping to identify failures and limitations which may help to inform decisions on continued work in the area addressed by the metrics.

What are the benefits for funders?
In addition to providing supporting evidence of the benefits of successful projects funders can also benefit by obtaining quantitative evidence from a range of projects which can be used to help identify emerging patterns of usage.

What are the benefits for projects?
Metrics can inform project development work by helping to identify deviations from expected behaviours of usage patterns and inform decision-making processes.

What are the risks in using metrics?
Metrics only give a partial understanding and need to be interpreted carefully. Metrics could lead to the publication of league tables, with the risk that projects seek to maximise their metrics rather than treating metrics as a proxy indicator of value.

It will be interesting to see whether other institutions emulate the University of Edinburgh in stating specific targets in their institutional strategic plans – and how pressures on staff within the institutions to achieve the targets affect operational practices.

Is anyone aware of other institutions which are taking similar approaches?


View Twitter conversation from: [Topsy]

Posted in Evidence, General | Tagged: | 1 Comment »