UK Web Focus

Innovation and best practices for the Web

Archive for October 25th, 2012

SEO Analysis of Enlighten, the University of Glasgow Institutional Repository

Posted by Brian Kelly on 25 October 2012

Background

In the third and final guest post published during Open Access Week, William Nixon, Head of the Digital Library Team at the University of Glasgow Library and the Service Development Manager of Enlighten, the University of Glasgow’s institutional repository service, gives his findings on the use of the MajesticSEO tool to analyse the Enlighten repository.


SEO Analysis of Enlighten, University of Glasgow

This post takes an in-depth look at a search engine optimisation (SEO) analysis of Enlighten, the institutional repository of the University of Glasgow. This builds on an initial pilot survey of institutional repositories provided by Russell Group universities described in the post on MajesticSEO Analysis of Russell Group University Repositories.

Background

University of Glasgow

Founded in 1451, the University of Glasgow is the fourth oldest university in the English-speaking world. Today we are a broad-based, research-intensive institution with a global reach, ranked in the top 1% of the world’s universities. The University is a member of the Russell Group of leading UK research universities. Our annual research grants and contracts income totals more than £128m, which puts us in the UK’s top 10 earners for research. Glasgow has more than 23,000 undergraduate and postgraduate students and 6,000 staff.

Enlighten

We have been working with repositories since 2001 (our first work was part of the JISC funded FAIR Programme) and we now have two main repositories, Enlighten for research papers (and the focus of this post) and a second for our Glasgow Theses.

Today we consider Enlighten to be an “embedded repository”, that is, one which has “been integrated with other institutional services and processes such as research management, library and learning services” [JISC Call, 10/2010]. We have done this in various ways including:

  • Enabling sign-on with institutional ID (GUID)
  • Managing author identities
  • Linking publications to funder data from Research System
  • Feeding institutional research profile pages

As an embedded repository, Enlighten supports a range of activities. These include our original Open Access aim of making as many of our research outputs freely available as possible, but the repository also acts as a publications database and supports the university’s submission to REF2014.

University Publications Policy

The University’s Publications Policy, introduced to Senate in June 2008, has two key objectives:

  • to raise the profile of the university’s research
  • to help us to manage research publications.

The policy (it is a mandate but we tend not to use that term) asks that staff:

  • deposit a copy of their paper (where copyright permits)
  • provide details of the publication
  • ensure the University is in the address for correspondence (important for citation counts and database searches)

Enlighten: Size and Usage

Size and coverage

In mid-October 2012 Enlighten had 4,700 full text items covering a range of item types including journal articles, conference proceedings, books, reports and compositions. Enlighten has over 53,000 records, and the Enlighten Team works with staff across all four Colleges to ensure our publications coverage is as comprehensive as possible.

Usage

We monitor Enlighten’s usage primarily via Google Analytics for overall access (including number of visitors, page views, referrals and keywords) and the EPrints IRStats package for downloads. Daily and monthly download statistics are provided in records for items with full text, and we provide an overall listing of download stats for the last one-month and 12-month periods.

Looking at Google Analytics for 1 Jan 12 – 30 Sep 12 (to tie in with this October snapshot) and for the previous period, we had 201,839 Unique Visitors up to 30 Sep 12 compared to 196,988 in 2011.

In the last year we have seen an increase in the number of referrals, and our search traffic is now around 62%. In 2012, 250,733 people visited this site: 62.82% was Search Traffic (94% of that from Google) with 157,503 Visits, and 28.07% was Referral Traffic with 70,392 Visits.

In 2011, 232,480 people visited this site: 69.97% of that was Search Traffic with 162,665 Visits, and 18.98% came from referrals with 44,128 Visits.
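As a quick cross-check, the percentages quoted above can be recomputed from the visit counts. The following is only a minimal sketch in Python: the `share` helper and the `traffic` dictionary are illustrative names introduced here, not part of Google Analytics, and the numbers are simply those quoted in the text.

```python
def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Visit counts as quoted in the text above (hypothetical structure, real figures).
traffic = {
    2012: {"total": 250_733, "search": 157_503, "referral": 70_392},
    2011: {"total": 232_480, "search": 162_665, "referral": 44_128},
}

for year, visits in traffic.items():
    search_pct = share(visits["search"], visits["total"])
    referral_pct = share(visits["referral"], visits["total"])
    print(f"{year}: search {search_pct:.2f}%, referral {referral_pct:.2f}%")

# Relative change in referral visits between the two years, as a percentage.
referral_growth = share(
    traffic[2012]["referral"] - traffic[2011]["referral"],
    traffic[2011]["referral"],
)
print(f"Referral visits changed by {referral_growth:.1f}%")
```

Working from the raw counts rather than the rounded percentages also makes the year-on-year change in referral visits easy to derive.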

Expectations

Our experience with Google Analytics has shown that much of our traffic still comes from search engines, predominantly Google. It has been interesting, however, to note the increase in referral traffic, in particular from our local *.gla.ac.uk domain. This rise has coincided with the rollout of staff publication pages, which are populated from Enlighten and provide links to the records held in Enlighten.

After *.gla.ac.uk domain referrals our most popular external referrals come from:

  • Mendeley
  • Wikipedia
  • Google Scholar

We expected that these would feature most prominently in the Majestic results, along with Google itself.

Majestic SEO Survey Results

The data for this survey was generated on the 22nd October 2012 using the ‘fresh index’; current data can be found on the Majestic SEO site with a free account. We do own the eprints.gla.ac.uk domain but haven’t added the code to create a free report. The summary for the site is given below, showing 632 referring domains and 5,099 external backlinks. Interestingly, it seems our repository is sufficiently mature for Majestic to also provide details for the last five years.

Since we were looking at eprints.gla.ac.uk rather than *.gla.ac.uk, we anticipated that our local referrals wouldn’t feature in this report. As an aside, a focus on just gla.ac.uk showed nearly 411,000 backlinks and over 42,000 referring domains.



Figure 1.  Majestic SEO Summary for eprints.gla.ac.uk

This includes 619 educational backlinks and 54 educational referring domains. This shows a drop in the number of referring domains since Brian’s original post in August, which showed 680 and a breakdown of the Top Five Domains (and number of links) as:

  • blogspot.com: 5,880
  • wordpress.com: 5,087
  • wikipedia.org: 322
  • bbc.co.uk: 178
  • cnn.com: 135

These demonstrate a very strong showing for blog sites, news and Wikipedia.


Figure 2. Top 5 Backlinks

Referring domains was a challenge! We couldn’t replicate the same Matched Links data which Warwick and the LSE have used. Our default Referring Domains report is ordered by Backlinks (other options, including matches, are available), but none of our Site Explorer – Ref Domains options seemed to be able to replicate this. We didn’t use Create Report.

These Referring Domains, ordered by Backlinks, point us to full text content held in Enlighten from sites which it’s unlikely we would have readily identified.

Figure 3a: Referring Domains by Backlinks


Figure 3b: Referring Domains by Matches (albeit by 1)

This report shows wikipedia.org at number one with the blog sites holding spots 2 and 3 and then Bibsonomy (social bookmark and publication sharing system) and Mendeley at 4 and 5.

An alternative view of the Referring Domains report by Referring Domain shows the major blog services and Wikipedia in the top 3, with two UK universities Southampton and Aberdeen (featuring again) in positions 4 and 5.

The final report is a ranked list of Pages, downloaded as a CSV file and then re-ordered by ReferringExtBackLinks.

URL                                                   ReferringExtBackLinks  CitationFlow  TrustFlow
http://eprints.gla.ac.uk                              584                    36            28
http://eprints.gla.ac.uk/58987/1/58987.pdf            198                    18            15
http://eprints.gla.ac.uk/2081/1/languagepictland.pdf  77                     10            9
http://eprints.gla.ac.uk/562                          70                     24            2
http://eprints.gla.ac.uk/431                          69                     23            2
http://eprints.gla.ac.uk/225/01/Thomas[1].pdf         61                     0             0

Table 1: Top 5 pages, sorted by Backlinks

These pages are:

  • Enlighten home page
  • PDF for “Arguments For Socialism”
  • PDF for “Language in Pictland”
  • A chronology of the Scythian antiquities of Eurasia based on new archaeological and C-14 data [Full text record]
  • Some problems in the study of the chronology of the ancient nomadic cultures in Eurasia (9th – 3rd centuries BC) [Full text record]
  • PDF for “87Sr/86Sr chemostratigraphy of Neoproterozoic Dalradian limestones of Scotland and Ireland: constraints on depositional ages and time scales” [Full text record]
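The re-ordering step used to produce Table 1 (export the Pages report as CSV, then sort by ReferringExtBackLinks) can be sketched in a few lines of Python. This is an illustration only: it assumes a comma-delimited export whose column names match the Table 1 header, the `top_pages` function and `SAMPLE_CSV` string are hypothetical names, and the three sample rows are copied from Table 1 rather than taken from a real Majestic SEO download.

```python
import csv
import io

# Sample rows copied from Table 1, deliberately out of order.
# Assumption: a real export is comma-delimited with these column names.
SAMPLE_CSV = """URL,ReferringExtBackLinks,CitationFlow,TrustFlow
http://eprints.gla.ac.uk/562,70,24,2
http://eprints.gla.ac.uk,584,36,28
http://eprints.gla.ac.uk/58987/1/58987.pdf,198,18,15
"""


def top_pages(csv_text, n=5, key="ReferringExtBackLinks"):
    """Return the n rows with the highest value in the given numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # CSV values are read as strings, so convert to int before comparing.
    rows.sort(key=lambda row: int(row[key]), reverse=True)
    return rows[:n]


for row in top_pages(SAMPLE_CSV):
    print(row["URL"], row["ReferringExtBackLinks"])
```

Converting the counts to integers before sorting matters: sorted as text, "70" would rank above "584".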

Summary

Focusing in more detail on the results in Figure 2, the top 5 backlinks, 4 out of the 5 are from Wikipedia; the first two are to the same paper but from different Wikipedia entries. It’s interesting to see that our third ranked backlink is the ROARmap registry.

Looking at the top 5 pages ranked by backlinks, none of the PDFs or the records which have PDFs currently appear in our IRStats generated list of most downloaded papers in the last 12 months. It is possible, however, in this pilot sampling to draw a correlation between ranking and the availability of full text rather than merely a metadata record.

Discussion

While this initial work has focused on the Top 5, extending this to at least the Top 10 would be useful for further comparison. It was interesting to see that sites such as Mendeley appeared in variations of our Referring Domains report; this correlates with our Google Analytics reports, which indicate that they are a growing source of referrals.

Looking at Figure 3a, a Google search on the first referring domain (by backlinks) reveals that the number one Ref Domain, scientificcommons.org, has 136,000 results on Google for “eprints.gla.ac.uk”; salero.info didn’t match at all and abdn.ac.uk had 5 results.

Social media sites such as Facebook and Twitter don’t appear in these initial results. This may be because the volume is insufficient to be ranked here, or there may be breach of service issues. Google Analytics now provides some social media tools and we have been identifying our most popular papers from Facebook and Twitter.

This has been an interesting, challenging and thought-provoking exercise, with the opportunity to look at the results and experiences of Warwick and the LSE who, like us, use Google Analytics to provide measures of traffic and usage.

The overall results from this work provide some interesting counterpoints and data to the results which we get from both Google Analytics and IRStats. These will need further analysis as we explore how Majestic SEO could be part of the repository altmetrics toolbox and how we can leverage its data to enhance access to our research.


About the Author

William Nixon is the Head of the Digital Library Team at the University of Glasgow Library. He is also the Service Development Manager of Enlighten, the University of Glasgow’s institutional repository service (http://eprints.gla.ac.uk). He has been working with repositories over the last decade and was the Project Manager (Service Development) for the JISC funded DAEDALUS Project that set up repositories at Glasgow using both EPrints and DSpace. William is now involved with the ongoing development of services for Enlighten and support for Open Access at Glasgow. Through JISC funded projects including Enrich and Enquire he has worked to embed the repository into University systems. This work includes links to the research system for funder data and the re-use of publications data in the University’s web pages. He was part of the University’s team which provided publications data for the UK’s Research Excellence Framework (REF) Bibliometrics Pilot. William is now involved in supporting the University of Glasgow’s submission to the REF2014 national research assessment exercise. Enlighten is a key component of this exercise, enabling staff to select and provide further details on their research outputs.
