UK Web Focus (Brian Kelly)

Innovation and best practices for the Web


Archive for the ‘search’ Category

Why You Should Do More Than Simply Claiming Your ORCID ID

Posted by Brian Kelly on 19 Nov 2012

Background to ORCID

Last week the SpotOn London 2012 conference (#solo12) included a session entitled ORCID – Why Do We Need a Unique Researcher ID? As described in the abstract for the session:

Open Researcher & Contributor ID (ORCID) provides a persistent digital identifier that distinguishes you from every other researcher. Through integration with key research workflows and other identifiers, ORCID supports automated linkages between you and your professional activities, ensuring that your work is recognized. The ORCID service launched in October 2012 and in this hands-on workshop we will demonstrate the different tools that already use the ORCID identifier, from manuscript submission to altmetrics for your publications. The focus will be on working with these tools so that at the end of the workshop you will have registered for your personal ORCID (if you didn’t have one already), started creating your ORCID record, and explored cool ways to use your ORCID to connect your research back to you. Wide usage and adoption of a researcher naming standard is a key component of effective research communication. Such a standard is fundamental to improving data quality and system interoperability, and ultimately will reduce the amount of time individuals spend maintaining their professional record—freeing time for research itself.

As described in a recent post on Observing Growth In Popularity of ORCID: An SEO Analysis, we can already observe take-up of ORCID since its launch last month.

Claiming an ORCID ID

Shortly after the launch I claimed my ORCID ID: 0000-0001-5875-8744. As suggested on the ORCID home page this is a painless exercise, taking about 30 seconds to complete.

I then added additional information, including details of my research papers. Citation information for my papers was added automatically once I had associated my ORCID ID with my Scopus account. I then had to change the visibility of these items individually from Private to Public so that the records were included in the public display of my ORCID profile.

The final thing I did was to add links to my key Web resources, including the UKOLN Web site, my UK Web Focus blog and my LinkedIn profile.

If you are a researcher and have published peer-reviewed papers I would recommend claiming your ORCID ID. But beyond investing 30 seconds in claiming the ID, I would also suggest that you associate your ORCID ID with your papers and then make them public (note it has been suggested that the display should be public by default). I would also recommend that your ORCID record provide links so that others can find out more about you and your research activities, including your current contact details.
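As an aside, once the items on a record are made public they can also be retrieved programmatically, which is one reason the public/private setting matters. The sketch below is a minimal illustration in Python using the requests library; the endpoint, API version and JSON field names are assumptions based on ORCID's public API rather than anything documented in this post, so check them against the current API documentation before relying on them.

```python
import requests

ORCID_ID = "0000-0001-5875-8744"  # the ORCID ID claimed above

# Assumption: the public ORCID API (pub.orcid.org, v3.0) with JSON content
# negotiation; the field names below follow that API and may need checking.
url = f"https://pub.orcid.org/v3.0/{ORCID_ID}/record"
response = requests.get(url, headers={"Accept": "application/json"})
response.raise_for_status()
record = response.json()

# Print the public name and the titles of any works made public on the record.
name = record["person"]["name"]
print(name["given-names"]["value"], name["family-name"]["value"])
for group in record["activities-summary"]["works"]["group"]:
    work = group["work-summary"][0]
    print("-", work["title"]["title"]["value"])
```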

Using An ORCID Record

Maintaining Links as Author Affiliations Change

I would suggest, however, that researchers should do more than simply claim their ORCID ID. I recently realised that I was in danger of losing contact with people with whom I have co-authored papers since writing my first peer-reviewed paper back in 1999. This has always been a danger in light of the turnover in affiliations for those working as researchers, and it will become even more relevant in light of cutbacks in higher education.

I have therefore started to make contact with co-authors and have invited them to claim their ORCID IDs. I will include this information in the citation records which I maintain. As an example, the papers tab on this blog contains details of papers I have published and includes links to further information for each of the papers.

I have recently begun updating the citation details with links to the ORCID IDs of my co-authors as I am notified of them.

An example for the paper on A Challenge to Web Accessibility Metrics and Guidelines: Putting People and Processes First is illustrated, for which ORCID IDs for three of the four authors are available.

In this case the co-authors are still based at the same institution. However for a paper on Developing Countries; Developing Experiences: Approaches to Accessibility for the Real World, written by three of the same four authors, Sarah Lewthwaite was at the time based at the University of Nottingham. The page containing the citation information has Sarah’s institutional details from when the paper was published (and the paper itself will give the email address for that institution, which will no longer work). However the ORCID ID will continue to be valid, and can be updated with any new organisational details and email address.

Supporting Resource Discovery

Since claiming my ORCID ID I have found that a Google search for ‘Brian Kelly ORCID’ includes my ORCID record in the first page of results, as illustrated. And whilst finding the page probably reflects a personalised view of my Google search results, it did occur to me that a search for ‘researcher’s name ORCID’ may become a quick way of finding research publications for an individual. Since my initial experiments tended to find results related to the orchid flower, I realised that use of ‘ORCID ID’ may provide a useful disambiguation term. I have therefore decided to use this structure in my Web resources, even if pedants point out the redundancy in use of ‘ID’, since ORCID stands for Open Researcher & Contributor ID. After all, we talk about the Sahara Desert even though Sahara means desert.

If a search for ‘name ORCID ID’ becomes a means of helping to find details for a researcher’s publication record might it also be useful for finding the papers themselves?

As illustrated, a Google search for ‘A Challenge to Web Accessibility Metrics and Guidelines: Putting People and Processes First’ finds the item in the institutional repository, an article posted on this blog and, in third place, the information provided in my ORCID record.

Although it should again be mentioned that these findings may be skewed by Google personalisation features (I was logged into Google when carrying out the search and used the PC in my office), the point to be made is that content held in ORCID will be found by Google.

In addition, the visibility of the ORCID Web site is likely to be enhanced as more people link to ORCID from their Web sites, especially high-ranking Web sites. This may mean that the early adopters who claim an ORCID ID in its early stages of development will gain benefits through their peers finding their published research papers – something likely to be of particular importance within the UK higher education sector in the run-up to REF 2014.

Why would you not claim your ORCID ID? Why would you not make use of your ORCID record as I have suggested? And if any of my co-authors read this post, feel free to get in touch and let me have details of your ORCID ID.


View Twitter conversation from: [Topsy]

Posted in Identifiers, search | Tagged: | 8 Comments »

Analysis of Google Search Traffic Patterns to Russell Group University Web Sites

Posted by Brian Kelly on 1 Oct 2012

Background

How can we ensure that the wide range of information provided on university Web sites can be easily found? One answer is quite simple: ensure that such resources are easily found using Google. After all, when people are looking for resources on the Web they will probably use Google.

But what patterns of usage do we find for searches for university Web sites? In a recent survey of search engine rankings, it was observed that only one institutional Web site (at the University of Oxford) was featured in the list of Web sites which have a high ranking and which can help drive traffic to the institutional repository. It was also noticed that this Web site had a significantly lower Alexa ranking (6,187) than the other 15 Web sites listed, such as WordPress.com, Blogspot.com and YouTube.com, which had Alexa rankings ranging from 1 to 256.

In order to gain a better understanding of how Google may rank search results for resources hosted on university Web sites, the findings of a survey are published below. The survey provides graphs of recent search engine traffic and summarises the range of values found for the global and UK Alexa rankings and the Alexa ‘reputation’ scores across this sector.

About Alexa

From Wikipedia we learn that:

Alexa Internet, Inc. is a California-based subsidiary company of Amazon.com that is known for its toolbar and website. Once installed, the Alexa toolbar collects data on browsing behavior and transmits it to the website, where it is stored and analyzed, forming the basis for the company’s web traffic reporting. Alexa provides traffic data, global rankings and other information on thousands of websites, and claims that 6 million people visit its website monthly.

The article goes on to describe how:

Alexa ranks sites based on tracking information of users of its Alexa Toolbar for Internet Explorer and Firefox and from their extension for Chrome. 

This means that the Alexa findings should be treated with caution:

the webpages viewed are only ranked amongst users who have these sidebars installed, and may be biased if a specific audience subgroup is reluctant to do this. Also, the ranking is based on three-month data

Despite such limitations, the Alexa service can prove useful in helping those involved in providing large-scale Web sites gain a better understanding of the discoverability of their Web site. The Alexa Web site describes how “Alexa is the leading provider of free, global web metrics. Search Alexa to discover the most successful sites on the web by keyword, category, or country“.

In light of the popularity of the service and the fact that, despite being a commercial service, it provides open metrics, it is being used in this survey as part of an ongoing process which aims to provide a better understanding of the discoverability of resources on institutional Web sites.

Survey Using Alexa

The following definitions of the information provided by Alexa were obtained from the Alexa Web site:

The Global Alexa Traffic Rank is “An estimate of the site’s popularity. The rank is calculated using a combination of average daily visitors to the site and pageviews on the site over the past 3 months. The site with the highest combination of visitors and pageviews is ranked.”

The GB Alexa Traffic Rank is “An estimate of the site’s popularity in a specific country. The rank by country is calculated using a combination of average daily visitors to the site and pageviews on the site from users from that country over the past month. The site with the highest combination of visitors and pageviews is ranked #1 in that country.”

The Reputation is based on the number of inbound links to the site: “The number of links to the site from sites visited by users in the Alexa traffic panel. Links that were not seen by users in the Alexa traffic panel are not counted. Multiple links from the same site are only counted once.”

The graph showing traffic from search engines gives the percentage of site visits from search engines.

The average traffic is based on the traffic over the last 30 days.

The data was collected on 20 September 2012 using the Alexa service. Note that the current findings can be obtained by following the link in the final column.

The graphs for the traffic from search engines contain a snapshot taken on 20 September 2012 together with the live findings provided by the Alexa service. The range of findings for the Alexa rank and reputation is provided beneath the table.

Table 1: Alexa Findings for Russell Group University Web Sites

(The original table also contained a snapshot graph of traffic from search engines captured on 20 September 2012 in column 2 and a live graph of the current Alexa findings in column 3; only the average figures and the links to the current findings are reproduced below.)

Institution | Traffic from search engines (average, 18 Aug – 17 Sep 2012) | View results
University of Birmingham | 19.7% | [Link]
University of Bristol | 22.8% | [Link]
University of Cambridge | 24.0% | [Link]
Cardiff University | 26.1% | [Link]
University of Durham | 25.1% | [Link]
University of Edinburgh | 26.9% | [Link]
University of Exeter | 26.7% | [Link]
University of Glasgow | – | [Link]
Imperial College | 31.0% | [Link]
King’s College London | 19.9% | [Link]
University of Leeds | 31.7% | [Link]
University of Liverpool | 22.5% | [Link]
LSE | 22.5% | [Link]
University of Manchester | 25.8% | [Link]
Newcastle University | 15.7% | [Link]
University of Nottingham | 20.5% | [Link]
University of Oxford | 26.8% | [Link]
Queen Mary, University of London | 20.2% | [Link]
Queen’s University Belfast | 14.1% | [Link]
University of Sheffield | 17.4% | [Link]
University of Southampton | 21.9% | [Link]
UCL | 26.7% | [Link]
University of Warwick | 29.6% | [Link]
University of York | 23.5% | [Link]

Survey Paradata

This survey was carried out using the Alexa service on Thursday 20 September. The Chrome browser running on a Windows 7 platform was used. The domain name used in the survey was taken from the domain name provided on the Russell Group University Web site. The snapshot of the traffic shown in column 2 was captured on 20 September. Column 3 gives a live update of the findings from the Alexa service. Note that if the live update fails to work in the future this column will be deleted.
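For anyone wishing to repeat this kind of data collection programmatically rather than through the Alexa Web interface, a sketch along the following lines may be a useful starting point. Note the heavy caveats: it relies on an unofficial XML endpoint (data.alexa.com) which Alexa has provided, and both the endpoint and the element and attribute names used below are assumptions on my part which should be verified (the official route is the Alexa Web Information Service).

```python
import requests
import xml.etree.ElementTree as ET

# Assumption: Alexa's unofficial XML endpoint (data.alexa.com) and the
# POPULARITY/COUNTRY element and attribute names; verify before use, or
# use the official Alexa Web Information Service instead.
def alexa_ranks(domain):
    resp = requests.get("http://data.alexa.com/data",
                        params={"cli": "10", "dat": "s", "url": domain})
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    popularity = root.find(".//POPULARITY")        # global traffic rank
    country = root.find(".//COUNTRY[@CODE='GB']")  # UK traffic rank
    return (popularity.get("TEXT") if popularity is not None else None,
            country.get("RANK") if country is not None else None)

for domain in ("bham.ac.uk", "bristol.ac.uk", "cam.ac.uk"):
    print(domain, alexa_ranks(domain))
```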

Summary

The Russell Group university Web sites have global Alexa rankings ranging from 6,318 to 75,000 and UK Alexa rankings ranging from 748 to 6,110. In comparison, in the global rankings Facebook is ranked at number 1, YouTube at 3, Wikipedia at 6, Twitter at 8, Blogspot at 11, WordPress.com at 22, and the BBC at 59.

The Russell Group university Web sites have “reputation” scores ranging from 4,183 to 43,917, which are based on the number of domains with links to the sites which have been followed in the past month. Although the algorithms used by Google to determine the ranking of search results are a closely-kept secret (and are liable to change to prevent misuse), the number of linking domains, together with the ranking of those domains, is used by Google in its algorithms for ranking search results. According to the survey, Google delivered between 14% and 31% of the traffic to the Web sites during August-September 2012.
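For anyone wishing to re-check the range quoted above, the search engine traffic percentages listed in Table 1 can be summarised in a few lines of Python (the figures below are simply those from the table; Glasgow is omitted as no figure was available when the snapshot was taken):

```python
# Percentages of traffic delivered by search engines, from Table 1.
search_traffic = {
    "Birmingham": 19.7, "Bristol": 22.8, "Cambridge": 24.0, "Cardiff": 26.1,
    "Durham": 25.1, "Edinburgh": 26.9, "Exeter": 26.7, "Imperial": 31.0,
    "King's College London": 19.9, "Leeds": 31.7, "Liverpool": 22.5,
    "LSE": 22.5, "Manchester": 25.8, "Newcastle": 15.7, "Nottingham": 20.5,
    "Oxford": 26.8, "Queen Mary": 20.2, "Queen's Belfast": 14.1,
    "Sheffield": 17.4, "Southampton": 21.9, "UCL": 26.7, "Warwick": 29.6,
    "York": 23.5,
}  # Glasgow omitted: no figure was available

values = sorted(search_traffic.values())
print(f"min: {values[0]}%  max: {values[-1]}%")
print(f"mean: {sum(values) / len(values):.1f}%")
```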

Caveat

In addition to the limitations of the Alexa data summarised above, it should be noted that we should not expect institutions to seek to maximise any of the Alexa rankings purely for their own sake. We would not expect university Web sites to be as popular as global social media services. Similarly it would be unreasonable to expect the findings to be used in a league table. However universities may well be exploring SEO approaches, and perhaps commissioning SEO consultants to advise them. This post therefore aims to provide a factual summary of findings provided by a service which may be used for in-house analysis or by third parties who have been commissioned to advise on SEO strategies for enhancing access to institutional resources.

Discussion

This survey was published in September since we might expect traffic to recover from a lull during the summer vacation and increase as students prepare to arrive at university. It will be interesting to see how the pattern changes over time and, since the final column of the table above links to the current Alexa findings, it should be easy to compare the current patterns across the Russell Group universities.

This initial survey has been carried out in order to provide a benchmark for further work in this area and to invite feedback. Further work is planned which will explore in more detail the Web sites which drive search engine traffic to institutional Web sites, in order to identify strategies which might be used to enhance such traffic.

It should be noted that this data has been published in an open fashion so that the methodology can be validated and so that the wider community can benefit from the findings, from open discussion about the approaches taken to the data collection, and from discussion of how such evidence might inform plans for enhancing the discoverability of content hosted on institutional Web sites. Feedback on these approaches would be appreciated.

Twitter conversation from: [Topsy]

Posted in Evidence, search | 1 Comment »

Google Search Results for Russell Group Universities Highlight Importance of Freebase

Posted by Brian Kelly on 24 Sep 2012

About This Post

This post summarises the findings of a survey of the Google search engine results for Russell Group universities. The post provides access to the findings which were obtained recently, with live links which enable the current findings to be viewed. The post explains how additional content, beyond the standard search results snippet, is obtained and discusses ways in which Web managers can manage such information.

The following sections are included in this post:

  • The Importance of Google Search
  • Findings of a Survey of Search Results for Russell Group Universities
  • Summary
  • Discussion
  • Thoughts on Freebase

The Importance of Google Search

An important part of my work in supporting those who manage institutional Web services is evidence-gathering. The aim is to help identify approaches which can inform practice for enhancing the effectiveness of institutional Web services.

This post summarises the findings for Google searches for institutional Web sites. Google plays an important role in helping users find content on institutional Web sites. But Google nowadays not only acts as a search engine, it also provides navigational aids to key parts of an institutional Web site and hosts content about the institution.

An example of a typical search for a university is shown below; in this case a search for London School of Economics. As can be seen, the results contain navigational elements (known as ‘sitelinks‘); a search box (which enables the user to search the institutional Web site directly); a Google map; a summary from Wikipedia and additional factual content, provided by Google.

Findings of a Survey of Search Results for Russell Group Universities

Are the search results similar across all institutions? And if there are significant differences, should institutions be taking action to ensure that additional information is being provided or even removed?

In order to provide answers to such questions a search for each of the 24 Russell Group universities was carried out on 17 September 2012. The findings are given in the table shown below. Note that the table is in alphabetical order. Column 2 gives the name of the institution and the search term used; column 3 gives the sitelinks provided; column 4 states whether a search box was embedded in the results; column 5 states whether a Google Map for the institution was provided; column 6 lists the titles of the factual content provided; column 7 provides a link to the Wikipedia entry if this was provided; and column 8 provides a link to the search findings, so that up-to-date findings can be viewed (which may differ from those collected when the survey was carried out).

Table 1: Google Search Findings for Russell Group Universities

(Columns: Ref. no. | Institution / search term | Sitelinks (main search results, on left of Google results page) | Search box? | Google Map? | Factual information categories from Google (on right of Google results) | Wikipedia content | View results)

1 | University of Birmingham | Course finder – Jobs; Postgraduate study at … – Contact us; Accommodation – Schools and Departments | No | Yes | At a glance; Transit; More reviews | – | [Search]
2 | University of Bristol | Undergraduate Prospectus – Faculties and Schools; Jobs – Study; Contacting people – International students | No | Yes | Motto; Address; Enrollment; Phone; Mascot; Hours | [Link] | [Search]
3 | University of Cambridge | Job Opportunities – Contact us; Undergraduate – Visitors; Hermes Webmail Service – Staff & Students | Yes | Yes | Motto; Address; Color; Phone; Enrollment; Hours | [Link] | [Search]
4 | Cardiff University | For… Current Students – International students; Job Opportunities – For… Staff; Prospective Students – Contact Us | Yes | Yes | Address; Phone; Enrollment; Colors | [Link] | [Search]
5 | University of Durham | Undergraduate – Visit us; Postgraduate Study – Student Gateway; Staff Gateway – Courses | Yes | Yes | Address; Phone; Colors; Enrollment; Founded | [Link] | [Search]
6 | University of Edinburgh | Jobs.ed.ac.uk – Staff and students; Studying at Edinburgh – Research; Schools & departments – Summer courses | Yes | Yes | Address; Acceptance rate; Phone; Enrollment; Founded; Colors | [Link] | [Search]
7 | University of Exeter | Undergraduate study – Contact us; Postgraduate study – Visiting us; Working here – Studying | Yes | Yes | Address; Enrollment; Phone; Colors | [Link] | [Search]
8 | University of Glasgow | Undergraduate degree … – MyGlasgow for students; Postgraduate taught degree … – Information for current students; Jobs at Glasgow – Courses | Yes | Yes | Address; Phone; Acceptance rate; Enrollment; Founded; Colors | [Link] | [Search]
9 | Imperial College | Postgraduate Prospectus – My Imperial; Courses – Employment; Faculties & Departments – Prospective Students | Yes | Yes | Motto; Address; Phone; Acceptance rate; Enrollment; Colors | [Link] | [Search]
10 | King’s College London | Postgraduate Study – Job opportunities; Undergraduate Study – Florence Nightingale School of …; Department of Informatics – School of Medicine | Yes | Yes | Address; Phone; Mascot; Enrollment; Founded; Colors | [Link] | [Search]
11 | University of Leeds | Undergraduate – University jobs; Postgraduate – School of Mathematics; Portal – Coursefinder | Yes | Yes | Address; Phone; Enrollment; Founded; Colors | [Link] | [Search]
12 | University of Liverpool | Students – Job vacancies; Postgraduate – Online degrees; Undergraduate – Departments and services | Yes | Yes | Address; Phone; Enrollment; Acceptance rate; Founded | [Link] | [Search]
13 | London School of Economics | Impact of Social Sciences – Department of Economics; Undergraduate – Library; Graduate – LSE for You | Yes | Yes | Address; Phone; Enrollment; Mascot; Founded; Colors | [Link] | [Search]
14 | University of Manchester | Postgraduate – Courses; Undergraduate – Contact us; Job opportunities – John Rylands Library | Yes | Yes | Enrollment; Founded; Colors | [Link] | [Search]
15 | Newcastle University | Undergraduate Study – Postgraduate Study; Student Homepage – Contact Us; Vacancies – Examinations | Yes | Yes | Address; Phone; Enrollment; Founded; Colors | [Link] | [Search]
16 | University of Nottingham | Undergraduate Prospectus – Open days; Postgraduate Study at the … – Visiting us; Jobs – Academic Departments A to Z | Yes | Yes | Address; Phone; Enrollment; Founded; Colors | [Link] | [Search]
17 | University of Oxford | Jobs and Vacancies – Online and distance courses; Undergraduate admissions – Colleges; Graduate Admissions – Maps and Directions | Yes | No | Acceptance rate; Color; Enrollment | [Link] | [Search]
18 | Queen Mary, University of London | – | No | Yes | Address; Phone; Enrollment; Colors | [Link] | [Search]
19 | Queen’s University Belfast | Course Finder – Queen’s Online; Postgraduate Students – Job Opportunities at Queen’s; Schools & Departments – The Library | Yes | Yes | Address; Phone; Enrollment; Founded | [Link] | [Search]
20 | University of Sheffield | MUSE – Postgraduates; Jobs – Courses and Prospectuses; Undergraduates – Departments | Yes | No | Enrollment; Founded; Colors | [Link] | [Search]
21 | University of Southampton | Undergraduate study – University contacts; Postgraduate study – International students; Faculties – Medicine | No | Yes | Address; Enrollment; Founded | [Link] | [Search]
22 | University College London | Prospective Students – Research; Philosophy – About UCL; Economics – Teaching and Learning Portal | Yes | No | Enrollment; Founder; Founded; Colors | [Link] | [Search]
23 | University of Warwick | University Intranet – Undergraduate Study; Postgraduate Study – Visiting the University; Current Vacancies – Open Days | Yes | Yes | Address; Phone; Enrollment; Acceptance rate; Founded; Colors | [Link] | [Search]
24 | University of York | Jobs – Postgraduate study; Undergraduate study – Staff home; Student home – Departments | No | Yes | Address; Enrollment; Hours; Phone; Founded; Colors | [Link] | [Search]

Note: This information was collected on 17 September 2012 and checked on 18 September 2012. It should also be noted that since Google search results can be personalised based on a variety of factors (previous searches, the client used to search, etc.) others carrying out the same search may get different results.

Summary

We can see that 21 Russell Group University Web sites have a Google Map; 19 have a search interface on Google. The following table summarises the areas of factual information provided. The table is listed in order of the numbers of entries for each category. Note that the American spellings for ‘enrollment‘ and ‘color‘ are used in the Google results.

Table 2: Summary of the Categories Found

Ref. no. | Type | Number
1 | Enrollment | 23
2 | Address | 19
3 | Color(s) | 19
4 | Phone | 18
5 | Founded | 16
6 | Acceptance rate | 6
7 | Mascot | 3
8 | Motto | 3
9 | Hours | 2
10 | Founder | 1
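A tally such as Table 2 can be derived directly from the per-institution category lists given in Table 1. The sketch below illustrates the approach; for brevity only the first few institutions are included, so the counts it prints cover just a subset of the data summarised above.

```python
from collections import Counter

# Factual information categories per institution, as listed in Table 1
# (only the first few institutions are shown here for brevity).
categories = {
    "University of Birmingham": "At a glance; Transit; More reviews",
    "University of Bristol": "Motto; Address; Enrollment; Phone; Mascot; Hours",
    "University of Cambridge": "Motto; Address; Color; Phone; Enrollment; Hours",
    "Cardiff University": "Address; Phone; Enrollment; Colors",
}

tally = Counter()
for institution, cats in categories.items():
    tally.update(c.strip() for c in cats.split(";"))

for category, count in tally.most_common():
    print(f"{count:3d}  {category}")
```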

In addition the search results also included information on Ratings and Google reviews (15 Russell Group university Web sites have a Google rating and 17 have a Google review). The numbers of Google reviews ranged from 1 to 208. Note that this information may well be susceptible to the ‘Trip Advisor Syndrome’ in which people have vested interests in giving either very high or very low scores.

Discussion

Sitelinks

The navigational elements are referred to as ‘sitelinks’ by Google. As described on the Google Webmaster Tools Web site:

sitelinks, are meant to help users navigate your site. Our systems analyze the link structure of your site to find shortcuts that will save users time and allow them to quickly find the information they’re looking for

The creation of sitelinks is an automated process. However, as described on the Google Webmaster Tools Web site, if a sitelink URL is felt to be inappropriate or incorrect, a Webmaster who has authenticated ownership of the Web site with the Google Webmaster Tools can demote up to 100 such links.

It should also be noted that during the final checking of the findings, carried out on 21 September 2012, it was found that the sitelinks for the University of Exeter had changed over a period of five days. The initial set of six sitelinks, listed above, was: Undergraduate study – Contact us; Postgraduate study – Visiting us; Working here – Studying. The more recent list is: Undergraduate study – Working here; Postgraduate study – Contact us; International Summer School – Studying.

Google Content

Although I suspect the findings for location maps won’t be a significant issue for universities (unlike, say, for small businesses), it is the factual content provided by Google which seems to be of most interest. The display of such factual information is a recent development. On 16 May 2012 a post on the GigaOM blog announced Google shakes up search with new Wikipedia-like feature, which described how “the search giant is carving out a chunk of the site for “Knowledge Graph”, a tool that offers an encyclopedia-like package in response to a user’s query“. I highlighted the importance of the announcement in a post entitled Google Launches Knowledge Graph and, as Martin Hawksey commented, “As Freebase uses Wikipedia as its main data source having information in there is important but it’s in Freebase that structure is added to individual entities to make the knowledge graph“.

This factual information appeared to be the most interesting aspect of the survey. A summary of the Freebase service is given below, together with a discussion of the implications for management of content hosted in Freebase.

Thoughts on Freebase

It was back in 2007 that I first became aware of Freebase. As I described in a report on the WWW2007 conference, Freebase is “an open Web 2.0 database, which has been exciting many Web developers recently“, with a more detailed summary being provided in Denny Vrandecic’s blog posting. However since then I have tended to focus my attention on the importance of Wikipedia and have not been following developments with Freebase, apart from the announcement in 2010 of the sale of Freebase to Google.

Looking at the Freebase entry for the University of Oxford it seems there are close links between Freebase and Wikipedia. As shown in the screen image, the textual description for the University of Oxford is taken from the Wikipedia entry. Just like Wikipedia it is possible to edit the content (see the orange Edit This Topic button in the accompanying screen shot) which allows anyone with a Freebase account to update the information.

As with Wikipedia, Freebase provides a history of edits to entries. Looking at the edits to the University of Oxford entry we can see many edits have been made. However most of these related to the assignment of the entry to particular categories e.g. Education (Education Commons). It was initially unclear to me how easy it would be to detect incorrect updates to the entry, whether made by mistake or maliciously.

In order to understand the processes for updating entries in Freebase, and with the permission of Rob Mitchell, the University of Exeter Web Manager, I updated the Enrollment figure for his institution from 15,720 (in 2006) to 18,542 (in 2011). The updating process was simple to use and the new data was immediately made available in the University of Exeter Freebase entry. Rob will be monitoring the Google search results in order to see how long it takes before the update appears there. We might reasonably expect (indeed hope) that there will be a manual process for verifying the accuracy of updates made to Freebase entries.
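For completeness, it is worth noting that Freebase also provides an API (MQL, the Metaweb Query Language) through which such values can be read programmatically, which may be of interest to anyone wanting to monitor this information. The sketch below is purely illustrative: the topic ID and the property path for the enrolment figure are assumptions on my part and would need to be checked against the Freebase schema, and an API key may be required for Google's hosted service.

```python
import json
import requests

# Illustrative MQL read query. NOTE: the topic id and the enrolment property
# below are assumptions and should be checked against the Freebase schema;
# Google's hosted Freebase API may also require an API key.
query = [{
    "id": "/en/university_of_exeter",  # hypothetical topic id
    "name": None,
    "/education/educational_institution/total_enrollment": None,  # assumed property
}]

resp = requests.get(
    "https://www.googleapis.com/freebase/v1/mqlread",
    params={"query": json.dumps(query)},
)
resp.raise_for_status()
print(resp.json().get("result"))
```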

It does seem to me that those involved in university marketing activities, or those with responsibilities for managing a university’s online presence, may wish to take responsibility for managing the information provided on Freebase. Is the management of factual information about institutions hosted on Freebase something which institutions are currently doing? If so, is this limited to annual updates of enrollment figures, etc. or is new information being provided?

Twitter conversation from: [Topsy]

Posted in Evidence, search | Tagged: | 1 Comment »

How I Learnt That “Google Scholar Has New Updates”

Posted by Brian Kelly on 10 Aug 2012

“Google Scholar Has New Updates For You”

Yesterday, while visiting Google Scholar, I noticed an alert informing me that there were 10 new notifications for me (see image; note that as I have now viewed the updates, the alert which was displayed in the top right is no longer shown).

I’d not seen this alert before so I followed the link and discovered a set of recommended papers based on my citations. The second recommended paper in this list seemed particularly interesting: a paper on How Well Do Ontario Library Web Sites Meet New Accessibility Requirements?

I viewed the paper (available in PDF and HTML formats), which described a recent accessibility audit of library web sites in Ontario and found that, despite legal requirements for web sites to conform with WCAG 2.0 guidelines, “an average of 14.75 accessibility problems were found per web page“.

Back in 2002 I published An Accessibility Analysis of UK University Entry Points which found that only 3 University home pages out of 163 conformed with WCAG 1.0 AA guidelines. Two years later a follow-up survey was published which reported that 9 out of 161 home pages conformed with WCAG 1.0 AA guidelines. Since I was well aware of the importance which University Web managers placed on addressing Web accessibility issues, especially since the Special Educational Needs and Disability Act (SENDA) accessibility legislation was enacted in 2002, I regarded this as evidence of the limitations of the WCAG guidelines. Around this time our first peer-reviewed paper on Web accessibility, Developing A Holistic Approach For E-Learning Accessibility, was published. In 2005 a paper on Forcing Standardization or Accommodating Diversity? A Framework for Applying the WCAG in the Real World documented the limitations of the WCAG guidelines and the WAI model. A series of accessibility papers followed, with the most recent, A Challenge to Web Accessibility Metrics and Guidelines: Putting People and Processes First, describing how:

This paper argues that web accessibility is not an intrinsic characteristic of a digital resource but is determined by complex political, social and other contextual factors, as well as technical aspects which are the focus of WAI standardisation activities. It can therefore be inappropriate to develop legislation or focus on metrics only associated with properties of the resource.

It was therefore disheartening to read the paper on Ontario Library Web sites concluding:

Since none of the library web sites examined in this study currently conform to WCAG 2.0, many changes will need to be made before sites can meet the new legal requirements for accessibility. Web accessibility guidelines and standards will need to be incorporated and integrated into the vocabulary, thinking, and processes of web content creators to successfully achieve WCAG 2.0 conformance. Complying with new web accessibility standards will involve a significant change in web development processes.

However the good news is that Google Scholar Updates correctly identified a paper of interest to me.

Learning More About Google Scholar Updates

This morning I spotted a tweet from Glyn Moody which stated:

Moody’s Microblog Daily Digest 120809 – http://bit.ly/QLn6Xe yesterday’s tweets as a single Web page

Since I know that Glyn uses his Twitter account to post links to resources which are likely to be of interest to me (especially those related to a variety of open practices) I followed the link to Glyn’s most recent tweets. There I spotted a timely tweet:

Wow – Google Scholar “Updates” a big step forward in sifting through the scientific literature – http://bit.ly/MAPqvZ nice

This provided a link to a blog post by Jonathan Eisen, Professor at UC Davis who described his reaction when encountering this new service from Google:

Wow. Completely awesome if it works well. So, well, let’s see if it works well. For me the system recommends the following

Jonathan Eisen went on to share his experiences in identifying the value of the recommendations. After concluding that the first recommendation was of little interest, like me he then looked at another suggestion:

paper number 2 seems a bit closer to my heart: REGEN: Ancestral Genome Reconstruction for Bacteria. And bonus – it is freely available. And so, well, I read over it. And it is definitely related to what I do and I probably would not have seen it without this notification. Cool.

Reflections

From a post entitled Scholar Updates: Making New Connections posted on the Google Scholar blog it seems that this new service was only released two days ago, on Wednesday 8 August. The post describes how:

We analyze your articles (as identified in your Scholar profile), scan the entire web looking for new articles relevant to your research, and then show you the most relevant articles when you visit Scholar. We determine relevance using a statistical model that incorporates what your work is about, the citation graph between articles, the fact that interests can change over time, and the authors you work with and cite. You don’t need to configure updates or enter any queries. We’ll notify you about new updates by displaying a preview on the homepage and highlighting a bell icon on search results pages.

It therefore seems that researchers can gain value by ensuring that they have a Google Scholar account containing information about their research publications, which Google’s sophisticated search algorithms can use to suggest other relevant papers. It is also interesting to note that last week’s Survey of Use of Researcher Profiling Services Across the 24 Russell Group Universities reported that 5,115 users at Russell Group universities have claimed a Google Scholar account, ranging from 77 at the University of Exeter to 580 at UCL.
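As an aside, the "what your work is about" component of such a model can be illustrated with a toy content-based recommender which scores candidate papers against the titles in a researcher's profile. This is emphatically not Google's algorithm (which, as described above, also uses the citation graph, co-authors and changes in interest over time); it is simply a small sketch, using TF-IDF and cosine similarity, of how textual similarity to a profile might be computed. The paper titles below are placeholders.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# My own paper titles form the 'profile'; candidates are newly found papers.
# A real experiment would use abstracts or full text rather than titles alone.
my_papers = [
    "An accessibility analysis of UK university entry points",
    "Developing a holistic approach for e-learning accessibility",
    "A challenge to web accessibility metrics and guidelines",
]
candidates = [
    "How well do Ontario library web sites meet new accessibility requirements?",
    "REGEN: ancestral genome reconstruction for bacteria",
]

vectoriser = TfidfVectorizer(stop_words="english")
matrix = vectoriser.fit_transform(my_papers + candidates)
profile = np.asarray(matrix[: len(my_papers)].mean(axis=0))  # centroid of my papers

scores = cosine_similarity(profile, matrix[len(my_papers):])[0]
for title, score in sorted(zip(candidates, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {title}")
```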

In addition to the value of Google Scholar Updates, it also occurred to me how valuable the links to resources provided by Glyn Moody in his tweets could be, if they were more easily accessed than via the daily updates posted on his blog.

Aaron Tay is another person I follow who also provides valuable links to resources using his Twitter account. Back in February 2012, in a post entitled My Trusted Social Librarian, I described how I had set up a Twitter list containing just @aarontay. I used this list with the Smartr app to view the content of the links which Aaron tweeted. However Smartr is no longer available. In addition, such access to Aaron’s links required every individual user to install Smartr or a similar app. Wouldn’t it be useful if there were a web-based aggregation providing a summary of the links which a Twitter user has tweeted? As I described last week, this is what RebelMouse provides. Even better, Aaron also uses RebelMouse. And, as can be seen, 19 hours ago Aaron tweeted a link to the blog post about Google Scholar Updates:

RT @figshare: Wow – Google Scholar “Updates” a big step forward in sifting through the scientific literature:http://vsb.li/m2p1bC by @p …

To conclude, if you use your Twitter account for sharing links, consider using a service such as RebelMouse to make it easier for others to see the content of the links you’ve shared.


Twitter conversation from Topsy: [View]

Posted in search, Web2.0 | Tagged: | 1 Comment »

Google Launches Knowledge Graph

Posted by Brian Kelly on 9 Aug 2012

 

In May 2012 Google announced the launch of Knowledge Graph, a database of more than 500 million real-world people, places and things with 3.5 billion attributes and connections among them. On 8 August it was reported that Google are rolling out the Knowledge Graph globally. The official Google Blog announced that “starting today [Wednesday 8 August], you’ll see Knowledge Graph results across every English-speaking country in the world. If you’re in Australia and search for [chiefs], you’ll get the rugby team—its players, results and history“.

The blog post explains that:

We’ll also use this intelligence to help you find the right result more quickly when your search may have different meanings. For example, if you search for [rio], you might be interested in the Brazilian city, the recent animated movie or the casino in Vegas. Thanks to the Knowledge Graph, we can now give you these different suggestions of real-world entities in the search box as you type:

and goes on to describe how:

the best answer to your question is not always a single entity, but a list or group of connected things. It’s quite challenging to pull these lists automatically from the web. But we’re now beginning to do just that. So when you search for [california lighthouses], [hurricanes in 2008] or [famous female astronomers], we’ll show you a list of these things across the top of the page. And by combining our Knowledge Graph with the collective wisdom of the web, we can even provide more subjective lists like [best action movies of the 2000s] or [things to do in paris]. If you click on an item, you can then explore the result more deeply on the web.

In addition Google have announced a limited trial of a service for searching email and will shortly be rolling out their voice search facility, currently available on Android devices, to iPhones and iPads – clearly responding to Apple’s Siri service.

Although such developments will clearly be of interest to general web users, in an educational context I am particularly interested in the implications of the Knowledge Graph for finding research papers, research data, etc. Google’s blog post entitled “Introducing the Knowledge Graph: things, not strings” described how the Knowledge Graph “currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. And it’s tuned based on what people search for, and what we find out on the web.” This will include research items, including items held in institutional repositories, and Google may be in a position to exploit the relationships between such items, such as citations.

This does seem to be a very interesting development. A video summary which describes how to explore lists and collections with Google search is available on YouTube and is embedded below.


Twitter conversation from Topsy: [View]

Posted in search | 4 Comments »

Link Strategies for UK Universities

Posted by Brian Kelly on 29 May 2012

The Commercial Sector is Using Link Optimisation Techniques

We are all aware of the importance of institutional Web sites. Over the past 15 years UKOLN’s Institutional Web Management Workshop (IWMW) series has provided many opportunities to share experiences and best practices across a range of areas. But I’m not aware of any sessions held during that time which have explicitly addressed linking strategies. This occurred to me following a tweet from Martin Hawksey which provided a link to an:

eyeopening summary of Linklove Boston bit.ly/HFeNaA Wondering how many inst. using these techniques?

The resource described a series of LinkLove and SearchLove events which have been held in the UK and US. A summary of the Linklove London conference, provided by Hannah Smith of Distilled.net, the company which organised the event, is available. She highlighted the key suggestions from the plenary speakers, who covered the following topics:

  • Content Strategy vs Link Building
  • Making Outreach Effective
  • Social Media & Links… a Love Story
  • Link Building Like Michael Winner, or Getting Golden Links
  • Building Targets, Relationships and Links
  • Putting the Love Back into Links
  • Tips, Tricks & Secrets from the Trenches
  • The Critchlow Hierarchy of Needs

The list of attendees at the London event shows that this was very much focussed on the commercial sector, and it might be tempting to dismiss link building strategies as the unacceptable face of Web site development, especially when you come across some of the comments made by the speakers, such as:

There’s a £60 fine for driving in a bus lane in the UK, however Michael Winner doesn’t see it as a fine – he sees it as an investment in getting where he wants to go quickly

But I feel that the higher and further education sector should be willing to learn from others about ways of maximising access to their online content, resources and services, and that the laudable desire to do this in an ethical way should not preclude institutions from developing ‘white hat’ rather than ‘black hat’ SEO strategies, to use the terminology described in Wikipedia.

Linking Strategies for the Higher Education Sector

I have already addressed linking strategies in the context of research papers in a previous post. In brief, I suggested that in light of the popularity of LinkedIn, for which there seem to be over 100,000 users affiliated with the 20 Russell Group universities, and the high Google ranking which this service enjoys, it would appear beneficial in raising the ranking of one’s institutional repository if those responsible for providing advice on research dissemination strategies were to encourage researchers to provide links to copies of their papers held in the institutional repository. Such approaches should not only help to raise the visibility of the repository itself to search engines, but will also benefit the individual researcher, who should therefore be motivated to provide the appropriate links. In addition, providing access to one’s research publications in a popular environment can also benefit the many users of that service. Such an approach can clearly be seen as a white hat link building strategy.

But what about enhancing the visibility of online resources in other areas? Beyond the interests of researchers, a post which provided an Analysis of Incoming Links to Russell Group University Home Pages showed that Wikipedia, Twitter, LinkedIn, YouTube, Flickr, Microsoft and Google are the most highly-ranked Web sites with inbound links to Russell Group Universities.

Should universities be seeking to maximise links from such popular sites, which may enhance their discoverability by Google users? But how might this be done, and what are the ethics associated with such strategies? Perhaps this would be an interesting discussion to have at next month’s IWMW 2012 event. In the meantime, would anyone like to start the discussion on link strategies for universities? Or do we simply leave such activities to the commercial sector?

Posted in search | Leave a Comment »

Enhancing Access to Researchers’ Papers: How Librarians and Use of Social Media Can Help

Posted by Brian Kelly on 26 Mar 2012

Tomorrow I’m giving a talk on “Enhancing Access to Researchers’ Papers: How Librarians and Use of Social Media Can Help” at a meeting of subject librarians at the University of Bath.

The talk is based on work which I’ve recently described on this blog including the post on How Researchers Can Use Inbound Linking Strategies to Enhance Access to Their Papers.

The talk will also address ideas described in a follow-up post on Profiling Staff and Researcher Use of Cloud Services Across Russell Group Universities in which I suggested that, in addition to encouraging researchers to make their research publications available in their institutional repository, they should also be providing metadata and links to the papers from popular third-party services, such as LinkedIn, Academia.edu, Microsoft Academic Search and Google Scholar Citations, which are provided particularly for use by researchers and academic staff.

The talk will highlight work in progress in making use of SEO analysis tools, including Linkdiagnosis.com and Majesticseo.com, in order to investigate which are the highest SEO-ranking sites linking to the University of Bath’s Opus repository. The initial findings from Linkdiagnosis.com suggest that wikipedia.org, wordpress.com, academic.research.microsoft.com and msn.com are the web sites with the highest SEO rankings which have links to the Opus repository. These four web sites all have an SEO Domain Authority score of 100, where this score “is a 100 point predictive score of the domain’s ranking potential in the search engines“.

The talk then goes on to suggest, as explained in a post on My Trusted Social Librarian, that in addition to encouraging researchers to use such services, librarians may also help to support researchers by acting as social librarians and favouriting (or liking or +1’ing) useful resources, since such actions can be seen in services such as Google.

The slides are available on Slideshare and embedded below.

I would welcome feedback.

Posted in library2.0, search | 3 Comments »

An SEO Analysis of UK University Web Sites

Posted by Brian Kelly on 8 Feb 2012

Why Benchmark SEO For UK University Web Sites?

The recent JISC Grant funding 18/11: OER rapid innovation describes (in the PDF document) how this call is based on a conceptualisation of “open educational resources as a component of a wider field of ‘open academic practice’, encompassing the many ways in which higher education is engaging and sharing with wider online culture“. The paper goes on to remind bidders that “Effective Search Engine Optimisation is key to open educational resources providing benefits of discoverability, reach reputation and marketing“.

The JISC Call will be funding further developments of OER resources. But how easy will it be to find such resources, in light of the popularity of Google for finding resources? Or to put it another way, how Google-friendly are UK University Web sites? Are there examples of best practices which could be applied elsewhere in order to provide benefits across the UK higher education sector? And are there weaknesses which, if known about, could be addressed?

Recently an SEO Analysis of UK Russell Group University Home Pages Using Blekko was published on this blog which was followed by an Analysis of Incoming Links to Russell Group University Home Pages which also made use of Blekko. These surveys were carried out across the 20 Russell Group universities which describe themselves as being the “20 leading UK universities which are committed to maintaining the very best research, an outstanding teaching and learning experience and unrivalled links with business and the public sector“.

Having evaluated use of the tool across this sample (and spotted possible problem areas where university web sites may have multiple domain name and entry point variants), the next step was to make use of the tool across all UK university web sites.

The Survey

The survey began on 27 January 2012 using the Blekko search engine. A list of UK university web sites has been created within Blekko which automatically lists Blekko’s SEO rankings for the web sites. This data was added to a Google spreadsheet which was used to create the accompanying histogram.

It should be noted that the list of UK universities should not be regarded as definitive. There may be institutions included which should not be regarded as a UK university, and a small number of institutions may have been omitted from the analysis. The accompanying spreadsheet may be updated in light of feedback received.
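As an aside, the accompanying histogram can be reproduced from the collected rankings with a few lines of Python. The sketch below assumes the Blekko rank values have been exported from the Google spreadsheet to a CSV file with 'institution' and 'rank' columns; the file name and column names are hypothetical.

```python
import csv
import matplotlib.pyplot as plt

# Hypothetical export of the Google spreadsheet: one row per institution,
# with 'institution' and 'rank' columns holding the Blekko SEO rank.
ranks = []
with open("uk-university-blekko-ranks.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row["rank"].strip():          # skip institutions with no rank value
            ranks.append(float(row["rank"].replace(",", "")))

plt.hist(ranks, bins=20)
plt.xlabel("Blekko SEO rank")
plt.ylabel("Number of institutions")
plt.title("Blekko SEO rankings for UK university web sites (27 Jan 2012)")
plt.tight_layout()
plt.savefig("blekko-rank-histogram.png")
```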

Discussion

What can we learn from this data? The rankings for the top five institutions are given in Table 1. It might be useful to explore the reasons why these five web sites are so highly ranked. [Note: the Russell Group universities were inadvertently omitted from the list when this post was published. Details have been added and Table 1 has been updated.]

Table 1: Top Five SEO Rankings according to Blekko

Ref. no. | Institution | Rank (on 27 Jan 2012) | Current Blekko ranking
1 | UCL | 1,433.67 | View
2 | University of Liverpool | 1,286.85 | View
3 | University of Leeds | 1,284.97 | View
4 | Durham University | 1,277.32 | View
5 | University of York | 1,246.03 | View

The embarrassing aspect of such comparisons lies in describing the web sites which are poorly ranked. Table 2 lists the SEO ranking figures for the lowest-ranked institutional web sites. In addition to the table, a screenshot of the table is also included, taken at 6 pm on Monday 6 February 2012 (note that the date and time shown in the image is the date the entry was added to the table).

Table 2: Bottom Five SEO Rankings according to Blekko

Ref. no. | Institution | Rank (on 27 Jan 2012) | Current Blekko ranking
1 | Trinity Saint David | – | View
2 | Cardiff Metropolitan University | – | View
3 | UWL (University of West London) | – | View
4 | UCP Marjon | 28.91 | View
5 | De Montfort University | 31.58 | View

It should be noted that two of the three web sites for which no rank value was available are in Wales. This may suggest that technical decisions taken to provide bilingual web sites might adversely affect search engine rankings. Since such factors only affect Welsh institutions, it would seem that these institutions have a vested interest in identifying and implementing best practices for such web sites.

I must admit that I was surprised when I noticed a large institution such as De Montfort University listed in Table 2, with a Rank of 31.58. Viewing the detailed entry I found that a host rank value of 507.9 was given – very different from the rank of 31.58 which is listed in the table of all institutions.

Can We Trust the Findings?

Further investigation revealed additional discrepancies between the entries in the overall list of UK universities and the detailed entries. In the process of creating listings for use with the Blekko service, listings for the UK Russell Group universities (as well as the 1994 Group universities) were created.

Table 3 gives the Blekko Rank value together with the Host Rank value which is provided in the detailed entry for the web site. In addition the accompanying screenshot provides additional evidence of the findings as captured at 7 pm on 6 February 2012.

Table 3: Top Five SEO Rankings for Russell Group Universities

Ref. no. | Institution | Rank (on 6 Feb 2012) | Host Rank (on 6 Feb 2012)
1 | UCL | 1,433.67 | 1,607.6
2 | University of Liverpool | 1,286.80 | 1,260.3
3 | University of Leeds | 1,284.97 | 1,141.8
4 | LSE | 1,224.59 | 1,201.1
5 | University of Nottingham | 1,138.99 | 1,382.9

Table 4 gives this information for the five Russell Group universities with the lowest SEO ranking values.

Table 4: Bottom Five SEO Rankings for Russell Group Universities

Ref. no. | Institution | Rank (on 6 Feb 2012) | Host Rank (on 6 Feb 2012)
1 | University of Birmingham | 80.60 | 205.4
2 | University of Sheffield | 395.04 | 529.7
3 | Imperial College | 514.22 | 476.8
4 | University of Manchester | 610.86 | 694.2
5 | Cardiff University | 692.08 | 752.9

From these two tables we can see that there are some disparities in the ranking order depending on which rank value is used, but the numbers do not seem to be significantly different.

The Limitations of Closed Research

Initially I had envisaged that this post would help to identify examples of good and bad practices which could be shared across the sector since, as described in the JISC call quoted above, “Effective Search Engine Optimisation is key to open educational resources providing benefits of discoverability, reach reputation and marketing“. However it seems that gathering evidence of best practices is not necessarily easy, with the tools and techniques used for gathering evidence appearing to provide ambiguous or misleading findings.

This post illustrates the dangers of research which makes use of closed systems: we do not know the assumptions which the analytic tools are making, whether there are limitations in these assumptions, or whether there are bugs in the implementation of the underlying algorithms.

These are reasons why open research approaches should be used where possible. As described in Wikipedia, “Open research is research conducted in the spirit of free and open source software” which provides “clear accounts of the methodology freely available via the internet, along with any data or results extracted or derived from them“. The Blekko service initially appeared to support such open research practices, since the web site states that “blekko doesn’t believe in keeping secrets“. However it subsequently became apparent that although Blekko may publish information about the SEO rankings for web sites, it does not describe how these rankings are determined.

It seems, as illustrated by a post which recently asked “How visible are UK universities in social media terms? A comparison of 20 Russell Group universities suggests that many large universities are just getting started“, that open research is not yet the norm in the analysis of web sites. The post describes:

Recent research by Horst Joepen from Searchmetrics [which] derives a ‘social media visibility’ score for 20 Russell Group universities, looking across their presence on Facebook, Twitter, Linked-in, Google+ and other media.

The econsultancy blog describes how:

The visibility score we use here is based on the total number of links a web domain has scored on the six social sites, Facebook, Twitter, LinkedIn, Google+, Delicious and StumbleUpon, while accounting for different weightings we give to links on individual social sites.

Image from eConsultancy blog

But what are these different weightings? And how valid is it to simply take this score and divide it by the size of the institution (based on the number of staff and students) in order to produce the chart which, as illustrated above, puts LSE as the clear leader?
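
To illustrate how sensitive such a league table is to undisclosed parameters, here is a deliberately artificial Python sketch of a weighted ‘visibility’ score divided by institution size. The weights, link counts and headcounts are invented placeholders, since Searchmetrics has not published its actual weightings, and that is precisely the problem.

```python
# Hypothetical illustration only: the weights, link counts and headcounts
# below are invented placeholders, not Searchmetrics' (undisclosed) values.
# The point is that the resulting 'visibility per head' league table depends
# entirely on parameters which outsiders cannot inspect.
weights = {"facebook": 1.0, "twitter": 0.8, "linkedin": 0.6,
           "googleplus": 0.6, "delicious": 0.3, "stumbleupon": 0.3}

institutions = {
    # institution: (weekly links per network, staff plus student headcount)
    "Example University A": ({"facebook": 12000, "twitter": 9000, "linkedin": 2000,
                              "googleplus": 800, "delicious": 150, "stumbleupon": 90}, 18000),
    "Example University B": ({"facebook": 30000, "twitter": 15000, "linkedin": 5000,
                              "googleplus": 2000, "delicious": 400, "stumbleupon": 250}, 45000),
}

for name, (links, headcount) in institutions.items():
    visibility = sum(weights[site] * count for site, count in links.items())
    print(f"{name}: visibility = {visibility:,.0f}, per head = {visibility / headcount:.2f}")
```

Change the weights even modestly and the ‘per head’ ordering can flip, which is why the undisclosed weightings matter when a single chart is used to declare a clear leader.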

It should be noted that this work is based on the analysis of:

roughly 207,900 links every week related to content on the websites of the Russell Group universities posted on Twitter, Facebook (likes, comments and shares), Linkedin, Google+ and social bookmarking sites StumbleUpon and Delicious. 

and is therefore not directly related to the SEO analysis addressed in this post. The work is referenced in order to reiterate the point about the dangers of closed research.

However the LSE Impact of Social Sciences blog, which hosted the post about this study, made the point that:

The LSE Impacts blog approach is that some data (no doubt with limitations) are better than none at all. 

I would agree with this view: it can be useful to gather, analyse and visualise such data and to provide stories which interpret the findings. The Blekko analysis, for example, seems to suggest that UCL and the Universities of Liverpool, Leeds, Durham and York have implemented strategies which make their web sites highly visible to search engines, whereas Trinity Saint David, Cardiff Metropolitan University and the University of West London seem to have taken technical decisions which may act as barriers to search engines. The eConsultancy analysis, meanwhile, suggests that LSE’s approach to the use of social media services is particularly successful. But are such interpretations valid?

Unanswered Questions

Questions which need to be answered are:

  • How valid are the assumptions which are made which underpin the analysis?
  • How robust are the data collection and analysis services?
  • Are the findings corroborated by related surveys? (such as the survey of Facebook ‘likes’ for Russell Group universities described in a post which asked Is It Time To Ditch Facebook, When There’s Half a Million Fans Across Russell Group Universities?)
  • What relevance do the findings have to the related business purposes of the institutions?
  • What actions should institutions be taking in light of the findings and the answers to the first three questions?

What do you think?


Paradata: As described in a post on Paradata for Online Surveys, blog posts which contain live links to data will include a summary of the survey environment in order to help ensure that survey findings are reproducible, with potentially misleading information being highlighted.

This survey was initially carried out over a period of a few weeks in January and February 2012 using Chrome on a Windows 7 PC and Safari on an Apple Macintosh. The survey was carried out using the Blekko web-based tool. A request was made to Blekko for more detailed information about their ranking scores and their harvesting strategies, but the reply simply repeated the limited information which is available on the Blekko web site. Inconsistencies in the findings were noted and this information was submitted to the Blekko support email address (and also via an online support form on the web site). However no response has been received.

The information on the survey of visibility of Russell Group universities on social media sites was based on posts published on the LSE Impact of Social Sciences and eConsultancy blogs.

Footnote: The findings for the Russell Group universities were omitted from the list of all UK universities when this post was initially published. The data has now been added and Table 1 and the associated histogram have been updated.


Twitter conversation from Topsy: [View]

Posted in Evidence, search | 10 Comments »

Trends For University Web Site Search Engines

Posted by Brian Kelly on 15 Dec 2010

Surveys of Search Engines Used on UK University Web Sites

What search engines are Universities using on their Web sites? This was a question we sought to answer about ten years ago, with the intention of identifying trends and providing evidence which could be used to inform the development of best practices.

Search engines used across UK Universities in 1999

An analysis of the first survey findings was published in Ariadne in September 1999. As can be seen from the accompanying pie chart, a significant number (59 of 160 institutions, or 37%) of University Web sites did not provide a search function. Of those that did, the three most widely used search engines were ht://Dig (25 sites, 15.6%), Excite (19 sites, 11.9%) and a Microsoft indexing tool (12 sites, 7.7%).

Perhaps the most interesting observation to be made is the diversity of tools which were being used back then. In addition to the tools I’ve mentioned, universities were also using Harvest, Ultraseek, SWISH, Webinator, Netscape, WWWWais and Freefind, together with an even larger number of tools which were in use at just a single institution.

The survey was repeated every six months for a number of years. A year after the initial findings had been published there had been a growth in use of the open source ht://Dig application (from 25 to 44 institutions) and a decrease in the number of institutions which did not provide a search function (down from 59 to 37).

This survey, published in July 2001, was also interesting as it provided evidence of a new search tool which was starting to be used: Google, which was then in use at the following six institutions: Glasgow School of Arts – Lampeter – Leeds – Manchester Business School – Nottingham – St Mark and St John.

Two years later the survey showed that ht://Dig was still popular, with a slight increase to use across 54 institutions. However this time the second most popular tool was Google, which was being used at 21 institutions. Interestingly, it was noted that a small number of institutions were providing access to multiple search engines, such as ht://Dig and Google. It was probably around this time that the discussion began as to whether one should use an externally-hosted solution, given concerns regarding the sustainability of the provider, the loss of administrative control and the use of a proprietary solution when open source solutions – particularly ht://Dig – were being widely used across the sector.

These surveys stopped in 2003. However two years later Lucy Anscombe of Thames Valley University carried out a similar survey in order to inform decision-making at her host institution. Lucy was willing to share this information with others in the sector, and so the data has been hosted on the UKOLN Web site, thus providing our most recent survey of search engine usage across UK Universities.

This time we find that Google is the leading provider across the sector, being used at 44 of the 109 institutions which were surveyed. That figure increases if the five institutions which were using the Google Search Appliance are included in the total.

What’s Being Used Today?

A survey of the search engines used on Russell Group university Web sites was carried out recently. The results are given below.

Ref. No.  Institution                  Search Engine             Search
1         University of Birmingham     Google Search Appliance   Search University of Birmingham for “Search Engine”
2         University of Bristol        ht://Dig                  Search University of Bristol for “Search Engine”
3         University of Cambridge      Ultraseek                 Search University of Cambridge for “Search Engine”
4         Cardiff University           Google Custom Search      Search Cardiff University for “Search Engine”
5         University of Edinburgh      Google Custom Search      Search University of Edinburgh for “Search Engine”
6         University of Glasgow        Google Custom Search (?)  Search University of Glasgow for “Search Engine”
7         Imperial College             Google                    Search Imperial College for “Search Engine”
8         King’s College London        Google                    Search KCL for “Search Engine”
9         University of Leeds          Google Search Appliance   Search University of Leeds for “Search Engine”
10        University of Liverpool      Google                    Search University of Liverpool for “Search Engine”
11        LSE                          Funnelback                Search LSE for “Search Engine”
12        University of Manchester     Google                    Search University of Manchester for “Search Engine”
13        Newcastle University         Google Search Appliance   Search Newcastle University for “Search Engine”
14        University of Nottingham     Google Search Appliance   Search University of Nottingham for “Search Engine”
15        University of Oxford         Google Search Appliance   Search University of Oxford for “Search Engine”
16        Queen’s University Belfast   Google Search Appliance   Search Queen’s University Belfast for “Search Engine”
17        University of Sheffield      Google Search Appliance   Search University of Sheffield for “Search Engine”
18        University of Southampton    Sharepoint                Search University of Southampton for “Search Engine”
19        University College London    Google Search Appliance   Search University College London for “Search Engine”
20        University of Warwick        Sitebuilder               Search University of Warwick for “Search Engine”

In brief, 15 Russell Group institutions (75%) use a Google solution to provide their main institutional Web site search facility, with no other search engine being used more than once.
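
For transparency, the 75% figure can be derived directly from the table above with a few lines of Python, grouping the Google Search Appliance, Google Custom Search and public Google search together as Google-based solutions (Glasgow’s tentative entry is counted as Google Custom Search):

```python
from collections import Counter

# Search engines as listed in the table above (Russell Group survey, Dec 2010).
engines = [
    "Google Search Appliance", "ht://Dig", "Ultraseek", "Google Custom Search",
    "Google Custom Search", "Google Custom Search", "Google", "Google",
    "Google Search Appliance", "Google", "Funnelback", "Google",
    "Google Search Appliance", "Google Search Appliance", "Google Search Appliance",
    "Google Search Appliance", "Google Search Appliance", "Sharepoint",
    "Google Search Appliance", "Sitebuilder",
]

counts = Counter(engines)
google_based = sum(n for engine, n in counts.items() if engine.startswith("Google"))
print(counts)
print(f"Google-based: {google_based} of {len(engines)} = {google_based / len(engines):.0%}")
```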

Note that Google provide a number of solutions including the Google Search Appliance, the Google Mini and the public Google search. Mike Nolan pointed out to me that “you can customise with API or XSLT to make [Google search results] look different”, so I have only named a specific solution if this has been given on the Web site or I have been provided with additional information (I can update the table if I receive further details).
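
As a present-day illustration of the ‘customise with API’ route Mike mentions (rather than a description of any particular institution’s setup), the sketch below calls Google’s Custom Search JSON API and renders the results in a plain house style. The API key and search engine ID are placeholders, and the engine itself would need to be configured to cover the institutional Web site.

```python
import json
import urllib.parse
import urllib.request

# Illustrative only: the API key and search engine ID are placeholders.
# A Custom Search engine (identified by 'cx') would be configured to cover
# the institutional web site before this query would return anything useful.
API_KEY = "YOUR_API_KEY"        # placeholder
ENGINE_ID = "YOUR_ENGINE_ID"    # placeholder

def search(query, num=5):
    """Return (title, link, snippet) tuples for the top results."""
    params = urllib.parse.urlencode(
        {"key": API_KEY, "cx": ENGINE_ID, "q": query, "num": num})
    url = f"https://www.googleapis.com/customsearch/v1?{params}"
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    return [(item["title"], item["link"], item.get("snippet", ""))
            for item in data.get("items", [])]

# Render the results in whatever presentation the institution requires,
# rather than accepting Google's default look and feel.
for title, link, snippet in search("search engine"):
    print(f"{title}\n  {link}\n  {snippet}\n")
```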

Discussion

Over ten years ago there was a wide diversity of search engine solutions in use across the sector. The discussions at the time tended to focus on the use of open source solutions, with the argument occasionally being made that since ht://Dig was open source there was no need to look any further. There was also a suggestion that the open source Search Maestro solution, developed at Charles University and deployed at the University of Dundee, could have an important role to play in the sector.

However in today’s environment a Google search solution appears to be regarded as the safe option, a view corroborated by a survey carried out by Mike Nolan in December 2008. The potential of Google Custom Search will have been enhanced by the announcement, two days ago, of developments to its metadata search capabilities.

There has, however, been some recent discussion on the web-support JISCMail list about software alternatives to the Google Search Appliance. Another discussion, on the website-info-mgt JISCMail list, has shown some interest in the Funnelback software. But, interestingly, open source solutions have not been mentioned in the discussions.

We might conclude that, in the case of Web site search engines, after ten years of the ‘bazaar’ the sector has moved to Google’s cathedral. What, I wonder, might be the lessons to be learnt from the evidence of the solutions which are used across the sector? Might it be that the HE sector has moved towards the cost-effectiveness of Google’s free offerings or the richness of the licensed Google Search Appliance or Google Mini? And might this be used to demonstrate that the HE sector has been successful in identifying and deploying cost-effective search solutions?

Posted in Evidence, search, Web2.0 | 13 Comments »

Should We “Leave Search To Google?”

Posted by Brian Kelly on 21 Apr 2008

When I chaired the session on Search at the Museums and the Web 2008 conference, the discussion, as I described in a recent post, turned to lightweight approaches to federated searching. During the session I received a Twitter comment on my feedback channel (intermingled with the football scores!) asking “is it more useful to develop compelling browse interfaces & leave search to Google?” The response at the time seemed to be that although Google might have a role to play in the future, its role at present is limited (in a museums’ context) due to the complexities of typical collections management Web interfaces: the valuable data is part of the ‘deep Web’ which search engines such as Google find difficult to index.

But just a few days ago, via a comment made by Nate Solas on his blog post about the Search session, I discovered that Google have announced their intention to index the deep Web:

This experiment is part of Google’s broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines. The terms Deep Web, Hidden Web, or Invisible Web have been used collectively to refer to such content that has so far been invisible to search engine users. By crawling using HTML forms (and abiding by robots.txt), we are able to lead search engine users to documents that would otherwise not be easily found in search engines, and provide webmasters and users alike with a better and more comprehensive search experience.
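
Google’s form-crawling implementation is, of course, not public, but the “abiding by robots.txt” part of that announcement is the piece a site owner can control. The minimal Python sketch below, using the standard library’s robots.txt parser with a made-up museum site and URL, shows the kind of check a polite crawler would make before requesting the results page behind a search form:

```python
from urllib.robotparser import RobotFileParser

# Illustrative only: the site, rules and URL below are made up. A real crawler
# would fetch https://<site>/robots.txt; here the rules are parsed in place so
# the sketch runs without network access.
robots = RobotFileParser()
robots.parse("""
User-agent: *
Disallow: /admin/
Allow: /collections/
""".splitlines())

candidate = "https://museum.example.org/collections/search?query=roman+coins"
if robots.can_fetch("ExampleCrawler", candidate):
    print("Allowed: the results page behind the form may be crawled and indexed.")
else:
    print("Disallowed: robots.txt excludes this path, so the content stays hidden.")
```

In other words, collections which are currently part of the ‘deep Web’ only become indexable via this route if robots.txt permits it, so institutions retain a simple mechanism for opting out.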

Mia Ridge has commented on the implications of this announcement:

You’re probably already well indexed if you have a browsable interface that leads to every single one of your collection records and images and whatever; but if you’ve got any content that was hidden behind a search form (and I know we have some in older sites), this could give it much greater visibility.

In light of Google’s announcement it is timely, I would think, to revisit the question “Is it more useful to develop compelling browse interfaces & leave search to Google?” Imagine the quality of services we could provide if we redirected resources away from replicating search algorithms which have already been developed (“standing on the shoulders of giants”).

And let’s remember (a) the evidence which suggests that users prefer simple search interfaces and (b) the costs of attempting to compete with Google in the search area – let’s not forget that, despite their riches, Microsoft haven’t been able to compete successfully. Is it likely that search technologies developed with tax-payers’ money will succeed where Microsoft have failed?

PS I should probably add that I’m not the first to suggest this idea. The OpenDOAR team, in particular, have deployed a search interface using Google across institutional repository services. Many congratulations to the team at the University of Nottingham for evaluating this lightweight approach.

Posted in search | Tagged: | 27 Comments »

A New Search Interface for HERO

Posted by Brian Kelly on 26 Sep 2007

I have been reading the September issue of the HERO Headlines magazine, which provides “the latest news from HERO Ltd, the company behind the UK’s official online gateway to higher education and research opportunities“.

An article in the magazine describes the release of a search tool which can be added to the Internet Explorer and Firefox browsers to enable the HERO.ac.uk Web site to be searched directly from the browser, without first having to go to the HERO Web site. Use of this search facility to search for articles about UKOLN is shown in the accompanying screenshot.

Search for 'UKOLN' on Hero Web site

At one stage there was a tendency in various Web development circles to argue that browser-specific enhancements should be avoided, as they don’t necessarily provide universal solutions (in this case, users of the Opera browser may feel disenfranchised). I don’t go along with this argument – I feel that this provides a richer and easier-to-use solution for many users, whilst still allowing users of more specialist browsers (or older versions of Internet Explorer or Firefox) to search the Web site in the traditional way.
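
The article doesn’t say how the HERO plugin was built, but the usual mechanism at the time, supported by both Internet Explorer 7 and Firefox 2, was an OpenSearch description document linked from the site’s pages. The Python sketch below generates such a document; note that the search URL template is an invented placeholder rather than HERO’s actual search address.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: the HERO article does not say how its plugin was built.
# IE7 and Firefox 2 both discover search providers via an OpenSearch
# description document; the template URL below is an invented placeholder.
NS = "http://a9.com/-/spec/opensearch/1.1/"
ET.register_namespace("", NS)

doc = ET.Element(f"{{{NS}}}OpenSearchDescription")
ET.SubElement(doc, f"{{{NS}}}ShortName").text = "HERO search"
ET.SubElement(doc, f"{{{NS}}}Description").text = "Search the HERO.ac.uk Web site"
url = ET.SubElement(doc, f"{{{NS}}}Url")
url.set("type", "text/html")
url.set("template", "http://www.hero.ac.uk/search?q={searchTerms}")  # placeholder

# The document would be saved on the web server and advertised from each page
# with <link rel="search" type="application/opensearchdescription+xml" ...>.
print(ET.tostring(doc, encoding="unicode"))
```

Any institution could advertise its own Web site search in the same way with a similarly small amount of effort.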

Congratulations to HERO for this development. Now how many institutions are configuring their browsers with similar search interfaces for their institutional Web site, I wonder?

Posted in browser, search | 6 Comments »