MajesticSEO Analysis of Russell Group University Repositories
Posted by Brian Kelly on 29 Aug 2012
Investigation of SEO Rankings of Institutional Repositories
There is a need “to investigate whether links [from popular social media services] are responsible for enhancing SEO rankings of resources hosted in institutional repositories” concluded the paper by myself and Jenny Delasalle which asked “Can LinkedIn and Academia.edu Enhance Access to Open Repositories?”.
The importance of SEO rankings for surfacing content hosted in institutional repositories can be gauged from the responses to the query I asked on the JISC-Repositories JISCMail list: “Does anyone have any statistics on the proportion of traffic which arrives at institutional repositories from Google?”. I asked a similar question on Twitter and found that mature research repositories seem to get between 50% and 80% of their traffic from Google. This aligns with the findings reported by Les Carr for the University of Southampton back in 2006: “the majority of repository use, if I can equate eprint downloads with repository use, is due to external web search engines (64%)”. Indeed, since it has been reported that direct downloads of PDFs hosted in repositories may not be recorded unless Google Analytics has been configured appropriately, such figures may be an underestimate!
In light of the importance of Google in supporting repositories in their mission of making research papers easily accessible to others it will be useful to gain a better understanding of the factors which contribute to supporting the discoverability of the content hosted in institutional repositories.
The survey described in this post reports on summary SEO findings for the 24 Russell Group universities. The aims of the survey are to provide a benchmark for comparisons with surveys which may be carried out in the future, to attempt to identify any interesting usage patterns which may help to enhance the effectiveness of institutional repositories and to identify the highest ranked domains which provide links to institutional repositories.
Survey Using MajesticSEO
The data was collected on 27-28 August 2012 using the MajesticSEO service. Note that the current findings can be obtained by following the link in the final column; these can be viewed if you have signed up to the free service.
Ref. No. | Institutional Repository Details | Referring Domains | External Backlinks | Educational Backlinks | Educational Domains | Top Five Domains & Numbers of Links | View Results |
1 | Institution: University of Birmingham. Repository used: eprint Repository | 116 | 499 | 146 | 16 | blogspot.com: 6,424; wordpress.com: 4,658; wikipedia.com: 200; bbc.co.uk: 82; sourceforge.net: 67 | [Link] |
2 | Institution: University of Bristol. Repository used: ROSE | 159 | 691 | 144 | 21 | wordpress.com: 7,871; blogspot.com: 6,692; wikipedia.org: 273; bbc.co.uk: 98; guardian.co.uk: 89 | [Link] |
3 | Institution: University of Cambridge. Repository used: Dspace @ Cambridge | 86 | 7,339 | 283 | 97 | blogspot.com: 33,276; wordpress.com: 17,241; wikipedia.org: 1,771; google.com: 449; bbc.co.uk: 442 | [Link] |
4 | Institution: Cardiff University. Repository used: ORCA | 22 | 58 | 9 | 4 | wordpress.com: 1,874; blogspot.com: 883; typepad.com: 250; bbc.co.uk: 85; guardian.com: 60 | [Link] |
5 | Institution: University of Durham. Repository used: DRO | 297 | 1,281 | 27 | 12 | wordpress.com: 5,430; blogspot.com: 3,020; wikipedia.org: 145; bbc.co.uk: 76; ask.com: 45 | [Link] |
6 | Institution: University of Edinburgh. Repository used: ERA | 747 | 3,943 | 247 | 71 | blogspot.com: 14,380; wordpress.com: 9,845; wikipedia.org: 470; google.com: 401; bbc.co.uk: 296 | [Link] |
7 | Institution: University of Exeter. Repository used: ERIC. Note: Repository sub-domain not used. See footnote 2. | 198 | 958 | 175 | 18 | wordpress.com: 1,125; blogspot.com: 1,115; bbc.co.uk: 45; wikipedia.org: 43; ning.com: 42 | [Link] |
8 | Institution: University of Glasgow. Repository used: Enlighten | 680 | 4,868 | 423 | 62 | blogspot.com: 5,880; wordpress.com: 5,087; wikipedia.org: 322; bbc.co.uk: 178; cnn.com: 135 | [Link] |
9 | Institution: Imperial College. Repository used: Spiral | 139 | 702 | 329 | 11 | wordpress.com: 3,363; blogspot.com: 1,883; bbc.co.uk: 121; guardian.co.uk: 119; wikipedia.org: 65 | [Link] |
10 | Institution: King’s College London. Repository used: Department of Computer Science E-Repository | 14 | 37 | – | – | blogspot.com: 2,552; wordpress.com: 2,275; bbc.co.uk: 169; wikipedia.org: 160; reddit.com: 139 | [Link] |
11 | Institution: University of Leeds. Repository used: White Rose Research Online | 700 | 4,847 | 1,354 | 2 | blogspot.com: 44; wordpress.com: 23; wikipedia.org: 13; google.com: 8; ox.ac.uk: 5 | [Link] |
12 | Institution: University of Liverpool. Repository used: Research Archive | 66 | 297 | 147 | 8 | blogspot.com: 4,057; wordpress.com: 2,461; wikipedia.org: 97; bbc.co.uk: 55; google.com: 53 | [Link] |
13 | Institution: LSE. Repository used: LSE Research Online | 1,365 | 9,771 | 549 | 80 | wordpress.com: 14,449; blogspot.com: 11,550; wikipedia.org: 343; google.com: 262; flickr.com: 244 | [Link] |
14 | Institution: University of Manchester. Repository used: eScholar. Note: Repository sub-domain not used. See footnote 3. | (5) | (29) | – | – | [Link] | |
15 | Institution: Newcastle University. Repository used: Newcastle Eprints | 30 | 215 | 85 | 5 | blogspot.com: 6,425; wordpress.com: 3,929; wikipedia.org: 221; bbc.co.uk: 116; ask.com: 87 | [Link] |
16 | Institution: University of Nottingham. Repository used: Nottingham Eprints | 359 | 1,594 | 328 | 57 | blogspot.com: 5,410; wordpress.com: 3,856; wikipedia.org: 148; google.com: 77; guardian.co.uk: 66 | [Link] |
17 | Institution: University of Oxford. Repository used: ORA | 299 | 1,116 | 94 | 35 | blogspot.com: 42,008; wordpress.com: 39,798; wikipedia.org: 1,437; ask.com: 548; bbc.co.uk: 504 | [Link] |
18 | Institution: Queen Mary, University of London. Repository used: QMRO | 27 | 449 | 350 | 6 | wordpress.com: 4,722; blogspot.com: 1,221; orange.fr: 259; wikipedia.org: 219; ask.com: 89 | [Link] |
19 | Institution: Queen’s University Belfast. Repository used: Queen’s Papers on Europeanisation & ConWEB. Note: Repository sub-domain not used. See footnote 4. | (9) | (14) | – | – | [Link] | |
20 | Institution: University of Sheffield. Repository used: DCS Publications Archive. Note: Repository sub-domain not used. See footnote 5. Note: The University of Sheffield also uses the White Rose repository which is also used by Leeds and York. See the Leeds entry for the statistics. | (2) | (3) | – | – | [Link] | |
21 | Institution: University of Southampton. Repository used: eprints.soton | 1,329 | 46,176 | 33,524 | 123 | blogspot.com: 4,384; wordpress.com: 2,568; wikipedia.org: 264; bbc.co.uk: 138; microsoft.com: 89 | [Link] |
22 | Institution: University College London. Repository used: UCL Discovery | 335 | 13,978 | 492 | 24 | blogspot.com: 16,009; wordpress.com: 15,633; wikipedia.org: 860; bbc.co.uk: 406; youtube.com: 250 | [Link] |
23 | Institution: University of Warwick. Repository used: WRAP | 433 | 2,476 | 278 | 20 | blogspot.com: 9,412; wordpress.com: 7,601; google.com: 217; wikipedia.org: 179; reddit.com: 122 | [Link] |
24 | Institution: University of York. Repository used: YODL. Note: Repository sub-domain not used. See footnote 6. Note: The University of Sheffield also uses the White Rose repository which is also used by Leeds and York. See the Leeds entry for the statistics. | (3) | (5) | – | – | [Link] | |
Range | 14 – 1,369 | 37 – 46,176 | 9 – 33,524 | 2 – 123 |
1. The list of repositories is taken from OpenDOAR.
2. The ERIC repository at the University of Exeter is hosted at https://eric.exeter.ac.uk/repository/. Since the repository home page is a redirect from https://eric.exeter.ac.uk/, it was possible to analyse the SEO rankings and get appropriate results.
3. The eScholar repository at the University of Manchester is hosted at http://www.manchester.ac.uk/escholar/. Figures for this home page are given, but since the domains with incoming links may refer to pages hosted on the manchester.ac.uk domain, the figures for the top linking domains are not given, in order to avoid skewing the findings.
4. The Queen’s University Belfast repository is hosted at http://www.qub.ac.uk/schools/SchoolofPoliticsInternationalStudiesandPhilosophy/Research/PaperSeries/ConWEBPapers/. Figures which are available for this home page are given, but since the domains with incoming links may refer to pages hosted on the qub.ac.uk domain, the figures for the top linking domains are not given, in order to avoid skewing the findings.
5. The DCS repository at the University of Sheffield is hosted at http://www.shef.ac.uk/dcs/research/publications. Figures which are available for this home page are given, but since the domains with incoming links may refer to pages hosted on the shef.ac.uk domain, the figures for the top linking domains are not given, in order to avoid skewing the findings.
6. The YODL repository of the University of York is hosted at http://dlib.york.ac.uk/yodl/app/home/index. Figures which are available for this home page are given, but since the domains with incoming links may refer to pages hosted on the dlib.york.ac.uk domain, the figures for the top linking domains are not given, in order to avoid skewing the findings.
Table 2 gives the total number of links to the high-ranking domains which are listed in the survey, together with the Alexa ranking for these domains. Note Google.com has the highest Alexa ranking and is listed at number 1. Figure 1 shows the significance of links from blog platforms compared with the other most highly-ranked domains.
No. | Domains | No. of links | Alexa Ranking |
1 | Blogspot | 176,625 | 5 |
2 | WordPress | 153,809 | 21 |
3 | Wikipedia | 7,230 | 8 |
4 | BBC | 2,811 | 36 |
5 | Google | 1,447 | 1 |
6 | Ask | 769 | 46 |
7 | YouTube | 460 | 3 |
8 | Guardian | 334 | 187 |
9 | Reddit | 261 | 143 |
10 | Orange.fr | 259 | 259 |
11 | Typepad | 250 | 212 |
12 | CNN | 135 | 43 |
13 | Microsoft | 89 | 26 |
14 | Sourceforge | 67 | 139 |
15 | Ning | 42 | 256 |
16 | Oxford University | 5 | 6,764 |
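The totals in Table 2 appear to be sums of the per-repository counts in the "Top Five Domains" column of Table 1. As a rough sketch of that aggregation, the Python below parses four of the Table 1 cells (Durham, Newcastle, Oxford and Queen Mary) and adds up the counts per domain. The names `TOP_FIVE_CELLS`, `parse_cell` and `total_links` are made up for this illustration, and only a subset of the rows is included, so only domains which appear solely in these rows (such as ask.com and orange.fr) will match the Table 2 totals.

```python
import re
from collections import Counter

# "Top Five Domains" cells copied from Table 1 for four repositories.
# This is a subset chosen for illustration, not the full survey.
TOP_FIVE_CELLS = {
    "Durham": "wordpress.com: 5,430 blogspot.com: 3,020 wikipedia.org: 145 bbc.co.uk: 76 ask.com: 45",
    "Newcastle": "blogspot.com: 6,425 wordpress.com: 3,929 wikipedia.org: 221 bbc.co.uk: 116 ask.com: 87",
    "Oxford": "blogspot.com: 42,008 wordpress.com: 39,798 wikipedia.org: 1,437 ask.com: 548 bbc.co.uk: 504",
    "Queen Mary": "wordpress.com: 4,722 blogspot.com: 1,221 orange.fr: 259 wikipedia.org: 219 ask.com: 89",
}

# Matches pairs such as "bbc.co.uk: 1,437".
DOMAIN_COUNT = re.compile(r"([a-z0-9.-]+):\s*([\d,]+)")


def parse_cell(cell):
    """Turn one table cell into a {domain: link count} dictionary."""
    return {
        domain: int(count.replace(",", ""))
        for domain, count in DOMAIN_COUNT.findall(cell)
    }


def total_links(cells):
    """Sum the link counts per domain across several table cells."""
    totals = Counter()
    for cell in cells:
        totals.update(parse_cell(cell))
    return totals


totals = total_links(TOP_FIVE_CELLS.values())
print(totals["ask.com"])    # 769, the figure given for Ask in Table 2
print(totals["orange.fr"])  # 259, the figure given for Orange.fr in Table 2
```

Note that a full aggregation would also need to decide whether variants such as guardian.com and guardian.co.uk should be counted as the same domain.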
Discussion
In a previous post I suggested that since LinkedIn.com is so widely used across Russell Group Universities, encouraging researchers to provide links to their papers hosted in their institutional repository would enhance the visibility of papers to Google, especially since LinkedIn has such a high Alexa ranking (it currently is listed at number 13 in the global ranking order).
However, LinkedIn does not appear to have a significant presence in the findings provided by MajesticSEO (although the free version lists only the top five domains).
Based on the information obtained in the survey it would appear that two blog platforms, WordPress.com and Blogspot.com, are primarily responsible for driving traffic to institutional repositories, having both high Alexa rankings together with large numbers of links to the repositories.
Following these two platforms, but a long way behind, we find Wikipedia and the BBC and then, perhaps somewhat confusingly, Google itself (perhaps links from Google Scholar). The presence of media sites such as the BBC, CNN and the Guardian suggests that researchers (or their media advisers) are doing a good job in ensuring that these organisations provide links to original research papers when stories about university research are being covered in the media.
But perhaps the most noticeable finding is that only one University Web site – Oxford’s – is included in the list of the top 5 domains across all of the Russell Group Universities. The low Alexa ranking (6,764) for the Oxford University Web site in comparison with the other sites listed (which have an Alexa ranking ranging from 1 to 259) suggests that links from university Web sites, even prestigious universities such as Oxford, will not have a significant impact on Google search results. It should also be noted that links from the University of Oxford Web site will not provide SEO benefits to the University of Oxford’s repository, which is hosted in the same domain (ox.ac.uk).
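To put a rough number on how far the two blog platforms dominate, the short Python sketch below works out their share of all the links listed in Table 2. The counts are copied from Table 2; the names `TABLE_2_LINKS`, `BLOG_PLATFORMS` and `share_of_links` are invented for this example. Two labels are assumptions: rows 5 and 9 have no name in the table as shown, and are taken here to be Google (which the text says is ranked number 1 by Alexa) and Reddit (whose counts in Table 1 add up to 261).

```python
# Link counts copied from Table 2.
TABLE_2_LINKS = {
    "Blogspot": 176_625,
    "WordPress": 153_809,
    "Wikipedia": 7_230,
    "BBC": 2_811,
    "Google": 1_447,  # row 5: name assumed (Alexa ranking of 1)
    "Ask": 769,
    "YouTube": 460,
    "Guardian": 334,
    "Reddit": 261,  # row 9: name assumed (reddit.com counts in Table 1 sum to 261)
    "Orange.fr": 259,
    "Typepad": 250,
    "CNN": 135,
    "Microsoft": 89,
    "Sourceforge": 67,
    "Ning": 42,
    "Oxford University": 5,
}

# The two blog platforms singled out in the discussion.
BLOG_PLATFORMS = ("Blogspot", "WordPress")


def share_of_links(links, names):
    """Return the fraction of all listed links which come from the named domains."""
    total = sum(links.values())
    return sum(links[name] for name in names) / total


blog_share = share_of_links(TABLE_2_LINKS, BLOG_PLATFORMS)
print(f"Blog platforms account for {blog_share:.1%} of the links listed in Table 2")
```

On these figures the two platforms account for roughly 96% of the listed links, although this only covers domains which appeared in a top five list, so it says nothing about links from domains outside those lists.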
Limitations of this Survey
It should be noted that these conclusions are based on just one SEO tool and only a small selection of the findings are available. A more comprehensive survey would make use of the licensed version of the service, and make use of other SEO tools to compare the findings.
In addition, Google do not publish the algorithms on which their search results are ranked, so there can be no guarantee that the findings provided by SEO tools will relate directly to users’ experiences of using Google.
In order to relate these findings to the ways users access resources hosted on a repository there will be a need to examine usage statistics for repositories. It would be interesting to see if the downloads for the most popular items show any correlation with links from the services listed above.
Survey Paradata: The findings given in Table 1 were collected on 27-28 August 2012 using the free version of MajesticSEO. The Alexa rankings listed in Table 2 were obtained from the Alexa survey and collected on 28 August 2012. Where the findings from MajesticSEO were incomplete, due to the repository not being hosted on the root of a repository sub-domain, this information was recorded and any data collected was not included in further analysis.
Dixon Jones said
Hi Brian, I am a director here at MajesticSEO. Thanks for using our data for this research. If we can help with a follow up then I would be very happy to help. Track me down @Dixon_Jones on Twitter. In particular, looking at the comparative Trust Flow metrics would be an interesting comparison, since trust is a weighted metric that passes through multiple link iterations, meaning that large numbers of low quality links are less influential on the resulting 0-100 scores. By contrast, Citation Flow is a “Link heavy” metric… so urls or sites with high trust and low citation flow MIGHT indicate narrow centres of excellence.
Brian Kelly (UK Web Focus) said
Thanks for the comment. I’m now following you on Twitter (I’m @briankelly).
Funnily enough, initially I did include the Trust Flow and Citation Flow values in the table. However, since the FAQ didn’t really provide a great explanation of what these terms mean and how they are obtained, I decided to omit this information so that the post was focussed on the figures which are easily understood.
On further reflection I wonder if a significant number of links to the repositories are from link farms which are hosted on Blogspot or WordPress. I guess it would be useful to be able to filter out links which are felt to be from untrustworthy sources. Is that possible?
dsalo said
I don’t really know how to interpret these numbers! Are they correlated with amount of content, amount of full-text (or other non-metadata-only) content, breadth or depth of subject matter, what?
Brian Kelly (UK Web Focus) said
Hi Dorothea
I’m working with a small number of repository managers who will be in a position to provide such contextual information. I prefer working in an open fashion, providing evidence as the work progresses, so that flaws in the methodology can be spotted at an early stage.
MajesticSEO Analysis of Russell Group University Repositories | OER Tech | Scoop.it said
[…] Not about OERs but similar questions are of interest. Investigation of SEO Rankings of Institutional Repositories There is a need “to investigate whether links [from popular social media services] are responsible for enhancing SEO rankings of re… "The survey reports that two blogging platforms appear to be primarily responsible for providing the high ranking which may drive traffic to repositories." The two blogging platforms in question are WordPress.com and Blogger. One question, I suppose, is how much traffic would the publications get if they were exposed directly through these platforms compared to being in the repository. […]
Notes on Jorum’s 2012 Summer of Enhancements: SEO and OER JISC CETIS MASHe said
[…] Traffic from Google searches varies from repository to repository but ranges between 50-80% are not uncommon [Ref 2] […]
Ann said
thanks
Analysis of Google Search Traffic Patterns to Russell Group University Web Sites « UK Web Focus said
[…] what patterns of usage for searches for university Web sites do we find? In a recent survey of the search engine rankings, it was observed that only one institutional Web site (at the University of Oxford) was featured in […]
SEO Analysis of WRAP, the Warwick University Repository « UK Web Focus said
[…] post published in August 2012 on an MajesticSEO Analysis of Russell Group University Repositories highlighted the importance of search engine optimisation (SEO) for enhancing access to research […]
Essential Marketer (@SEOBirmingham) said
It would be very useful to benchmark the results and to analyse the new link flow metrics that have been added since that point.
This would give another dimension to the outcome and give a little more weight to the question of which link metrics added value.