UK Web Focus (Brian Kelly)

Innovation and best practices for the Web

Getting Into The Top Ten For Your Institutional Repository

Posted by Brian Kelly on 10 Jun 2010

Statistics on Downloads for the University of Bath Institutional Repository

The University of Bath is currently testing the IR Stats package in Opus, the University’s institutional repository. Using the Web interface to the package I ran a search for the top ten downloads over the past year. The results are shown below and, as you can see, a paper on “Library 2.0: balancing the risks and benefits to maximise the dividends” by myself, Paul Bevan, Richard Akerman, Jo Alcock and Josie Fraser is in second place! You’ll have to scroll on beneath the image to discover the secrets of how to ensure that your research paper gets into the top ten for your institutional repository :-)

Top ten downloads from Opus repository in past year

Seeking An Explanation

On 11 August 2009 I wrote a blog post in which I described how my Paper on “Library 2.0: Balancing the Risks and Benefits to Maximise the Dividends” [had been] Published in Program.

Now looking at the blog statistics for visits to the post I discover that there have been a total of 735 views (with 162 on the day of publication).

Since the blog post linked directly to the details of the paper provided in the institutional repository, I believe that many of the visits to the blog post resulted in downloads of the paper from the repository. If so, it was having a blog and writing a timely post about the paper which resulted in the paper being the second most downloaded paper last year.

Do I have any further evidence to back up this assertion? It would have been interesting to see if a tweet about the post had generated traffic to the article but, having looked at the archive of my tweets in BackUpMyTweets, it seems I didn’t use Twitter on the day the post was published. It also seems that a URL for the post hadn’t been minted previously, so unfortunately there are no statistics to examine.

However looking at the download statistics over the past year for my other items in the repository this particular item stands out for its popularity – and so I will assert that the timely blog post linking to the repository item generated over thirty times the normal annual traffic to one of my papers.

Search engine traffic to my items in the Opus repository

Looking at the search engine statistics for all of my items over the period I discover that 80% of the traffic is not delivered by a search engine (the red quadrant in the pie chart).

Referrer traffic to my items in the Opus repository

The display of referring traffic to my items confirms that search engines aren’t significant in providing traffic (20%) and that the repository search itself delivers only 10% of the traffic. Rather it is external Web sites (i.e. my blog, I believe) which deliver 39% of the traffic, with 31% of the traffic having no referrer information (I have found this is often traffic from Twitter clients, but in this case it may be traffic coming from RSS readers used to view the post).


Of course the large number of downloads is no indication of the quality of the paper.  And it might be that the paper was downloaded by an automated agent (perhaps someone was retrieving papers on Library 2.0 and the harvester repeatedly downloaded this paper).  Or, alternatively, maybe the statistics package is producing incorrect results.

But, unless I come across alternative evidence, I will regard the popularity of this item as an indication that blog posts can have a significant impact on the traffic to items in an institutional repository.  Note that I am not saying that blogs are the only significant factor – my UKOLN colleague Alex Ball and Andy Ramsden, head of the e-learning team (both of whom work on the same corridor as me) also figure in the top ten downloads. In their case I think embedding links to their Opus items in external Web sites helps to drive traffic.

However, especially for those working in areas in which there are significant numbers of blog readers, having a blog and using it effectively may provide the researcher with an advantage in raising awareness of their research.

Would you agree?

16 Responses to “Getting Into The Top Ten For Your Institutional Repository”

  1. Kara said

    It would be interesting to compare the downloads from Opus, with downloads of publications accessed from your pages. Is that possible?
    Also if anyone reading this can point to some documentation on IRStats (ie. specifically what the stats are measuring), that would be helpful.

    • The stats in the headline image are individual full text downloads of a record.

      Any file from any document in the record counts as a download (even just a jpg in an HTML file)

      Downloads of a record by an IP are only counted as 1 per day no matter how many files (or repeats) the IP makes on that record.

      Search engines are removed using the logic from the awstats package.

      • Ahh – so if I had deposited a HTML version of the paper I would have received additional hits for every embedded image (or other embedded objects). That might explain the popularity of the number one download, which is a HTML resource with several logos on the home page.

      • Christopher Gutteridge said

        no. the first hit on any file from any (non crawler/search engine) IP means one download for that day. All after that are ignored until the next day.
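The counting rule described in this thread (one download per IP, per record, per day, with crawler traffic excluded) can be illustrated with a short sketch. This is not the IRStats code: the function name, the tuple layout and the `is_crawler` flag are assumptions made for illustration, and the real package identifies search engines using logic taken from awstats rather than a ready-made flag.

```python
from datetime import date


def count_downloads(requests):
    """Count at most one download per (IP, record, day), skipping crawlers.

    `requests` is a hypothetical list of (ip, record_id, day, is_crawler)
    tuples; it is not the format IRStats actually reads.
    """
    seen = set()   # (ip, record_id, day) combinations already counted
    counts = {}    # record_id -> number of counted downloads
    for ip, record_id, day, is_crawler in requests:
        if is_crawler:
            continue  # crawler and search engine hits are dropped
        key = (ip, record_id, day)
        if key in seen:
            continue  # repeat hit from the same IP on the same day
        seen.add(key)
        counts[record_id] = counts.get(record_id, 0) + 1
    return counts


sample_requests = [
    ("10.0.0.1", "paper-a", date(2010, 6, 1), False),  # counted
    ("10.0.0.1", "paper-a", date(2010, 6, 1), False),  # same IP, same day: ignored
    ("10.0.0.1", "paper-a", date(2010, 6, 2), False),  # next day: counted
    ("10.0.0.2", "paper-a", date(2010, 6, 1), False),  # different IP: counted
    ("10.0.0.3", "paper-a", date(2010, 6, 1), True),   # crawler: ignored
    ("10.0.0.1", "paper-b", date(2010, 6, 1), False),  # different record: counted
]
download_counts = count_downloads(sample_requests)
print(download_counts)  # {'paper-a': 3, 'paper-b': 1}
```

Under this rule an HTML deposit with several embedded images earns no extra downloads: every file fetched by the same IP on the same day falls under the same key.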

  2. Hi Kara
    As you know the paper is also available (in MS Word and PDF formats) on the UKOLN Web site.

    Looking at the Web stats I find that this year there have been 11 downloads of the PDF and 8 of the MS Word formats; in 2009 there were 33 downloads of the PDF and 58 downloads of the MS Word formats.

    So there have been 44 downloads of the PDF and 66 of the MS Word format or 110 downloads of the paper in total.

    Having the paper in two locations is fragmenting the statistics, but reflects my workflow practices which pre-date the Bath institutional repository. (I’ll need to revisit these practices in future).

    Note that the blog post only linked to the item in the repository and not the copy on the UKOLN Web site.

  3. Ben Toth said

    Hello Brian, can I make a couple of points?

    1 I fully agree that relevant blog posts and tweets will drive traffic towards content. I saw some research yesterday which makes this point about the changing balance of traffic to sites from search engines and social media.

    2 If my limited experience reflects the generality of university repositories I’d say they have some way to go yet in exposing their content. I needed the full text of a paper yesterday. Google found the reference fine, with links to UWE publications list, Pubmed, Sage publications and British Library Direct. The UWE reference had no links, PubMed had no abstract, Sage and BL wanted a lot of money for access. Through Google I found that UWE has an institutional repository; searched that and found a full text pre-print in both pdf and Word. It took 45 minutes and some persistence to get to full text, and I imagine that most people, including academics, would have either given up or paid up. This disconnect is not acceptable in 2010, especially as it could be easily remedied. It breaks 4 out of 5 of Ranganathan’s laws, so surely it’s incumbent on the library profession to make it easier to connect content and readers? Sorry to sound grumpy, but there is a disappointing #fail here in my view.

    end of rant Ben

  4. Kara said

    Ben – how would you easily remedy this issue then?

  5. Thanks for this, Brian, it’s an interesting point – and thanks to the people who have explained the mechanisms.

    This is bound to happen more and more as search engines give higher priority to social media results (as you’ve previously mentioned). However I think you’re right that the topic of the paper will influence this. A paper on such a subject and/or by someone working in an area related to Web 2.0 and social media will have the advantages of Twitter, blogs, etc., but, also, the kind of conferences it may be presented at, or papers which reference it, will themselves be more likely to be online. A quick look at a couple of other universities’ repositories reveals that e-learning-related papers invariably come very high up in the popularity listings, just because those who are interested know to look for them.


  6. Ben Toth said

    Hello Kara – it’s pretty obvious isn’t it?

    1 Talk to Google et al about improving the visibility of IR content, including pubmed central
    2 Ensure that personal and departmental publication lists are linked to repositories

    May I just use an example from your repository to illustrate the mess?

    A paper by Lyttleton and Fitch 1978

    – is listed in OPUS (, but isn’t full text which seems odd. The OPUS reference points to the full text at Springer, for which there is a charge of $34
    – a Google search ( a couple of hits in Opus, but these aren’t directly to the Opus entry for the paper
    – a few more minutes on Google finds the full text free….18..223L

    I have found this sort of pattern again and again recently, and it seems such a shame that IRs do not seem to be wired into the Internet properly as yet.

    • Kara said

      Hi Ben
      I don’t disagree that there’s work to be done to improve the visibility of repositories, but I have questions about whether ‘talking to Google’ falls into the ‘easy’ category.
      There’s been a lot of discussion about this over recent years – Les Carr and Paul Walk have both mentioned Google and IRs specifically on their blogs. Brian’s point that self-promotion, and social sites might aid discovery is well-made, although I’m not sure whether this is a job for librarians? Possibly?

      wrt your example, when we know about full text online, we link to it – ie. in PMC or to Brian’s webpage. It’s unusual that a 1978 paper has free fulltext online – well done for finding it. But we don’t have time to Google for minutes for each record we put in Opus for alternative versions – perhaps this will improve after the OR10 developer challenge.

      Here, we do feed researcher profiles with Opus publication lists – see for example.

      • Hi Kara. Just to clarify, I am suggesting that librarians may have a role to play in educating new researchers on the potential benefits of cultivating personal networks and using social media in order to ensure that their research publications reach audiences that otherwise may be missed. I’m not suggesting that librarians should do this work themselves.

  7. Aaron Tay said

    The problem with this “secret” is that for it to work, you need to know another secret. Namely, how do you create a very high-profile blog that is drawing a ton of readers? :)

    • Hi Aaron – I agree. As AJ Cann pointed out yesterday the challenge is to “Teach people to build and curate effective personal networks”. And I would ask “who should have responsibility for doing this? Should this be a role for librarians?”

      • casey said

        sure librarians will do that :)

      • Alan’s point is nice and succinct – no less a secret of success online than offline. Be nice to think that this is a role for the librarian-of-the-future (the “informaticians” mentioned at the Library Of The Future conference) – rather than personal lifestyle gurus and purveyors of SEO snake oil. But I suppose it will still be a role for the particular subset of HE librarians who take an interest in providing generic info literacy/digital skills training, and are up to speed with whatever’s new-ish in town. I know some that are; but probably more that aren’t. Maybe it’s an emerging role for lib-friendly tech-types like ourselves to bridge the gap, or train the trainers? Then again, maybe before long this kind of thing will just be second nature to attention-seeking Facebook-savvy post-netgenners?!

