What Can We Learn From Download Statistics for Institutional Repositories?
Posted by Brian Kelly on 6 July 2011
Gathering Quantitative Evidence
I am involved in work looking at ways in which evidence-based approaches can make use of metrics in order to understand best practices and demonstrate impact. A series of surveys has been carried out which sought to gather quantitative evidence of the use of a variety of services and, by publishing the findings on this blog, to encourage discussion of the approaches.
This work complements the report on Splashes and Ripples: Synthesizing the Evidence on the Impacts of Digital Resources, carried out by the Oxford Internet Institute and described on their blog, which focussed on “synthesizing the evidence available under the JISC digitisation and eContent programmes to better understand the patterns of usage of digitised collections in research and teaching”.
Although my work has avoided addressing the complexities of metrics for research, a recent survey entitled A Pilot Survey of the Numbers of Full-Text Items in Institutional Repositories sought to profile the institutional repositories hosted by Russell Group Universities. The aim was to gain a better understanding of patterns of usage related to deposits of full-text items, which would appear to be of importance if a repository is to have a role to play in the digital preservation of research papers.
Surveying the numbers of downloads of papers from a repository is clearly a flawed approach if one is attempting to determine the quality, impact and value of research. But are there other insights to be gained from examining download statistics for an institutional repository? This latest survey, which is being carried out a few days before a workshop on “Metrics and Social Web Services: Quantitative Evidence for their Use and Impact”, will seek to understand whether new insights can be gained from a lightweight survey of the most popular downloads from the University of Bath’s Opus institutional repository.
Survey of Downloads
The University of Bath’s institutional repository, which I’ll refer to by the name “Opus”, is, like many UK University repositories, based on the ePrints software. A stats module, IRStats, seems to be provided as standard with ePrints although, as discussed in a previous post, the data which is gathered can be configured by the repository manager.
Opus has had a total of 136,347 downloads since its launch in 2005. Looking at the histogram of monthly downloads we can see slow growth for five months after the launch and then a plateau. Zooming in on the graph we can see growth in the numbers of downloads taking place in October/November 2009 and 2010 – and we might reasonably expect a similar pattern to be repeated when the next academic year begins.
But who are the authors of the most downloaded papers, and might we be able to discover any techniques which can help to ensure that papers are downloaded?
Looking at the top ten downloaded items we find that the conference proceedings of the 11th International Conference on Non-conventional Materials and Technologies, NOCMAT 2009, are in top place with 28,449 downloads, an order of magnitude more than the item in second place.
The next most popular item is The use of QR codes in Education: A getting started guide for academics (2,514 downloads) by Andy Ramsden, former head of the e-Learning Unit, who used to work in the office down the corridor from me. Andy has two other papers in the top ten, related to his e-learning interests in QR codes (1,161 downloads) and Twitter (805 downloads). I am in third place, with my paper on Library 2.0: balancing the risks and benefits to maximise the dividends having 1,419 downloads. The other most popular downloaded papers seem to be PhD theses, with the exception of my UKOLN colleague Alex Ball, whose project report on Review of the State of the Art of the Digital Curation of Research Data is in tenth place (with 745 downloads).
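The gap between the first and second items can be checked with a few lines of arithmetic. The following is only a rough sketch using the figures quoted in this post; the variable names are my own, and the share calculation assumes that the repository total and the per-item counts were measured on the same basis, which I have not verified.

```python
import math

# Figures quoted in this post (not fetched from IRStats)
total_downloads = 136_347   # all Opus downloads since launch
top_item = 28_449           # NOCMAT 2009 conference proceedings
second_item = 2_514         # QR codes getting started guide

# How many times more often was the top item downloaded than the second?
ratio = top_item / second_item

# Expressed as orders of magnitude (1.0 means exactly ten times)
orders = math.log10(ratio)

# Share of all repository downloads taken by the top item
# (assumes both counts use the same counting rules)
share = top_item / total_downloads

print(f"Top item vs second item: {ratio:.1f}x ({orders:.2f} orders of magnitude)")
print(f"Top item share of all downloads: {share:.1%}")
```

On these numbers the proceedings were downloaded roughly eleven times as often as the second item and account for about a fifth of all downloads, so a single item can noticeably skew the repository-wide totals.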
Is there a pattern emerging, I wonder, or are these just one-off examples? It would be interesting to see what the evidence from a wider profile of downloads may indicate. Looking at the top ten authors' pages we find A. Ramsden has had 7,760 items downloaded; B. Kelly (6,758); A. D. Brown (3,323); A. Ball (2,267); S. J. Culley (2,250); S. Deneulin (1,900); S. Abdullah (1,535); E. Dekoninck (1,469); E. W. Elias (1,469); J. Millar (1,439) and L. Jordan (1,161). [Note that the items do not seem to total correctly in all cases so I will omit the links until I've tried to resolve this].
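For anyone wishing to repeat this kind of lightweight survey, the author figures can be tabulated as follows. This is a minimal sketch: the numbers are copied by hand from the list above rather than read from IRStats, the percentages are shares of the listed authors only (not of the repository total), and given the caveat about totals they should be treated as indicative.

```python
# Per-author download totals as quoted above (hand-copied, not from IRStats)
author_downloads = {
    "A. Ramsden": 7_760,
    "B. Kelly": 6_758,
    "A. D. Brown": 3_323,
    "A. Ball": 2_267,
    "S. J. Culley": 2_250,
    "S. Deneulin": 1_900,
    "S. Abdullah": 1_535,
    "E. Dekoninck": 1_469,
    "E. W. Elias": 1_469,
    "J. Millar": 1_439,
    "L. Jordan": 1_161,
}

# Sort from most to least downloaded
ranked = sorted(author_downloads.items(), key=lambda pair: pair[1], reverse=True)

# Total across the listed authors only; co-authored papers may be
# counted more than once, so this is not a repository-wide figure
total_listed = sum(author_downloads.values())

for position, (name, count) in enumerate(ranked, start=1):
    print(f"{position:>2}. {name:<14} {count:>6,}  {count / total_listed:6.1%}")
```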
As mentioned previously, it is important to note that download numbers are not a measure of quality – it would probably be timely now to point out that the numbers of readers of the News of the World demonstrate that quite clearly! However, if we also acknowledge that researchers do have a responsibility to get their message across, then researchers will (should) have an interest in maximising the numbers of (appropriate) readers of their papers – and it is important, I feel, to highlight the need to engage with appropriate readers.
From the survey it seems that the authors who have papers in the top ten institutional downloads are also successful in having their other papers downloaded in significant numbers. Perhaps having an office on level 5 of the Wessex House building may be a reason for such popularity of the papers! On the other hand, it may be that the three of us who shared the same corridor discussed dissemination strategies or perhaps, and more likely, we are simply working in an area (related to digital libraries) in which potential readers of our papers are more likely to access digital repositories.