UK Web Focus (Brian Kelly)

Innovation and best practices for the Web

Thoughts on Google Scholar Citations

Posted by Brian Kelly on 22 Nov 2011

Citation Analysis Services

I recently wrote a post entitled “Will the Real Scott Wilson Please Stand Up, Please Stand Up” in which I described my initial experiences with the Microsoft Academic Search service.  I have to admit that I was impressed by the user interface and how, for example, it depicted links with my co-authors.

Revisiting Microsoft Academic Search

The main limitation with the Microsoft Academic Search service was, I felt, the accuracy of the data and the need to get author buy-in in order that authors could claim their own papers and remove papers incorrectly attributed to them.  The information it has about me, for example, suggests that I have published 56 papers, including one dating back to 1979. In fact it should know about 30 of my papers, the earliest of which was published in 1994.

Several weeks ago I edited my publications list to remove papers written by other Brian Kellys.  These edits have been accepted and when I sign in I get confirmation of the 38 papers I have confirmed authorship of and the 18 which have been removed from the list. However the wiki-style approach to editing the content means that edits have to be confirmed and this does not appear to have happened.  I therefore appear to be claiming more publications than is the case and, possibly, the citation statistics (G-Index=11 and H-Index=6) for my papers may be inaccurately calculated.

Google Scholar Citations

Whenever I come across a new service which appears to provide value I am also interested in seeing if there are alternative offerings. In part this is to ensure that I don’t find myself being locked into a single vendor. But in addition it can also help to see how other providers address the same area. As the Microsoft Academic Search service is based on harvesting metadata about papers hosted on institutional repositories, publishers’ Web sites and similar resources, we should expect to see similar competing services.  I was therefore pleased when I received an email last week which announced that the Google Scholar Citations service, which I had signed up to during the beta testing, had been opened as a public service.

A post was published on the Google Scholar blog on Wednesday 16 November 2011 entitled “Google Scholar Citations Open To All” which described how:

You can quickly identify which articles are yours, by selecting one or more groups of articles that are computed statistically. Then, we collect citations to your articles, graph them over time, and compute your citation metrics – the widely used h-index; the i-10 index, which is simply the number of articles with at least ten citations; and, of course, the total number of citations to your articles. Each metric is computed over all citations and also over citations in articles published in the last five years.

My Google Scholar Citations page is illustrated below. In comparison with my Microsoft Academic Search page this page appears somewhat limited in its functionality. It also has much less social connectivity, with links to only six of my co-authors who have registered for the service.

In addition to differences in the user interface and the social connections, Google Scholar Citations also differs in the papers it has analysed and the corresponding citation indices, giving an H-index of 11 (in comparison with Microsoft Academic Search’s H-index of 6). Google Citations also provides an I10-Index score of 12 whereas Microsoft Academic Search provides a G-Index score of 11.
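Since the two services report different indices, it may help to see how each is derived from a list of per-paper citation counts. The following is a minimal sketch only: the h-index and i10-index follow the definitions quoted from the Google Scholar blog above, and the g-index follows its usual definition (the largest g such that the g most-cited papers have at least g² citations between them). The function names and the sample citation counts are invented for illustration; neither Google nor Microsoft has published the code they actually use.

```python
from typing import List


def h_index(citations: List[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations: List[int]) -> int:
    """Number of papers with at least ten citations."""
    return sum(1 for count in citations if count >= 10)


def g_index(citations: List[int]) -> int:
    """Largest g such that the g most-cited papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for ten papers (not taken from any real profile).
example = [28, 15, 12, 10, 9, 4, 3, 1, 0, 0]
print(h_index(example), i10_index(example), g_index(example))  # prints: 5 4 9
```

Because the g-index rewards a few highly cited papers while the i10-index simply counts papers over a threshold, a G-Index from one service and an I10-Index from the other are not directly comparable, even when the numbers happen to be close.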

Google Scholar Citations’ analysis of the papers indexed by Google Scholar seems to be based on a more accurate representation of my papers, possibly because I verified my papers some time ago.  Google Scholar also includes a number of popular articles I wrote which haven’t been deposited in the University of Bath repository and therefore don’t seem to have been indexed by Microsoft Academic Search, such as the Ariadne article on “An accessibility analysis of UK university entry points” for which there have been 28 citations. But in addition a paper on “Using networked technologies to support conferences” delivered at the EUNIS 2005 conference, which has been deposited in the University of Bath repository, has been indexed by Google Scholar but not by Microsoft Academic Search.

Whilst investigating Google Citations I came across a tweet from Les Carr which provided a link to his Google Citations page, illustrated below. This brought to my attention the 2006 paper on “Earlier web usage statistics as predictors of later citation impact”, which will be worth reading in light of Social Web developments since it was published.


In order to make some further comparisons between the coverage and citation analyses of Google Citations and Microsoft Academic Search I’ve summarised details for Les Carr together with the co-authors of my papers who have registered with Google Scholar Citations in the following table.

| Name | Microsoft Academic Search (MAS) | Google Citations (GC) | Nos. of publications (MAS) | Nos. of publications (GC) | Nos. of citations (MAS) | Nos. of citations (GC) | G-Index (MAS) | I10-Index (GC) | H-Index (MAS) | H-Index (GC) |
|---|---|---|---|---|---|---|---|---|---|---|
| Brian Kelly | Link | Link | 56 | 83 | 153 | 498 | 11 | 12 | 6 | 11 |
| David Sloan | Link | Link | 42 | 67 | 204 | 615 | 13 | 12 | 7 | 12 |
| Jane Seale | Link | Link | 6 | 85 | 49 | 714 | 6 | 14 | 4 | 12 |
| Helen Petrie | Link | Link | 106 | 172 | 569 | 1,397 | 22 | 34 | 15 | 18 |
| Lorcan Dempsey | Link | Link | 10 | 110 | 29 | 1,139 | 5 | 30 | 1 | 19 |
| Alastair Dunning | Link | Link | 3 | 13 | 8 | 29 | 2 | 1 | 2 | 3 |
| Les Carr | Link | Link | 169 | 206 | 1,158 | 1,558 | 28 | 42 | 17 | 21 |

It should be noted that:

  • The Microsoft Academic Search entry for Jane Seale has her affiliation listed as the University of Southampton. She is now based at the University of Plymouth so her citation statistics may be split across two entries.
  • There are two Microsoft Academic Search entries for Lorcan Dempsey: entry 1  and entry 2.
  • There are two Microsoft Academic Search entries for Alastair Dunning: entry 1  and entry 2.
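The difference in coverage is easier to see as ratios. The following sketch divides each author's Google Citations figures by the corresponding Microsoft Academic Search figures, using only the numbers from the table above. The names `TABLE` and `coverage_ratios` are made up for this example, and the ratios for Jane Seale, Lorcan Dempsey and Alastair Dunning should be treated with caution, since (as noted above) their Microsoft Academic Search statistics may be split across more than one entry.

```python
from typing import Dict, Tuple

# Figures copied from the table above, per author:
# (MAS publications, GC publications, MAS citations, GC citations)
TABLE: Dict[str, Tuple[int, int, int, int]] = {
    "Brian Kelly": (56, 83, 153, 498),
    "David Sloan": (42, 67, 204, 615),
    "Jane Seale": (6, 85, 49, 714),
    "Helen Petrie": (106, 172, 569, 1397),
    "Lorcan Dempsey": (10, 110, 29, 1139),
    "Alastair Dunning": (3, 13, 8, 29),
    "Les Carr": (169, 206, 1158, 1558),
}


def coverage_ratios(row: Tuple[int, int, int, int]) -> Tuple[float, float]:
    """Return (GC/MAS publication ratio, GC/MAS citation ratio) for one author."""
    mas_pubs, gc_pubs, mas_cites, gc_cites = row
    return gc_pubs / mas_pubs, gc_cites / mas_cites


for name, row in TABLE.items():
    pub_ratio, cite_ratio = coverage_ratios(row)
    print(f"{name:<17} publications x{pub_ratio:.1f}   citations x{cite_ratio:.1f}")
```

For every author in the table, Google Citations reports more publications and more citations than Microsoft Academic Search, although the size of the gap varies widely from one author to the next.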

Discussion

I’m pleased that Google have provided an alternative to Microsoft for providing details of citations for research publications (there are similar services, of course, but I thought it would be worth focusing this post on a newly released service and provide comparisons with a service I described recently).

Microsoft Academic Search seems to have taken an approach of indexing as many research papers as it can find, associating the papers with authors and institutions. The Microsoft Academic Search entry point currently states that it provides access to “6,684,802 publications and 18,831,151 authors, 5,472 updated last week”.  Papers are automatically assigned to organisations, with the details for the University of Bath providing the following information: Publications: 29,331; Citation Count: 131,732; H-Index: 96 and 1,638 authors. In addition papers may also be assigned to departments with the details for Bath/UKOLN providing the following information: Publications: 262; Citation Count: 932; H-Index: 15 and 245 authors.

The problem with such automated processing is that the data can be flawed.  In contrast, Google Scholar Citations requires users to opt in before their papers are assigned to their Google account.  This means, for example, that Google Scholar Citations currently has details for only 18 authors from the University of Bath.

It seems to me that, rather than the functionality of the services I’ve described, the main challenge will be getting buy-in from the authors whose papers have been indexed.  They will be both a significant user community for such services as well as possibly having responsibility for cleaning up the data.

Some questions which came to mind when I was looking at these services:

  • What is being indexed?  The Microsoft Academic Search service seems to have indexed primarily my peer-reviewed papers which I have deposited in the University institutional repository and from publishers’ databases. The Google Scholar Citations service, in contrast, seems to have also included papers from the UKOLN Web site which I wouldn’t have classed as ‘papers’.  I have removed papers which don’t fit in with my view of what should be included, but I appreciate that such definitions are likely to be very subjective.
  • Motivation to manage one’s content. What is the motivation to manage one’s content?  Since the automated harvesting and assignment of papers is liable to lead to errors, there will be a need for the data to be cleansed.  But what are the motivating factors for authors to do this?
  • Barriers to the management of one’s content.  Although authors may have motivating factors, such as ensuring that popular services provide an accurate view of their research publications, there may also be barriers to updating one’s data.  This might include the user interfaces provided by the services, the turnaround time for changes to be approved and the requirements for a Windows Live ID (in the case of Microsoft Academic Search) or a Google ID (in the case of Google Scholar Citations).

I recently came across a tweet from Guus van Brekkel (@digcmd) who described:

How Google Scholar Citations passes the competition left and right at WoW! Wouter on the Web bit.ly/uw8ppc

The tweet introduced me to the WoW!ter blog, written  by Wouter Gerritsma, subject librarian and bibliometrician at Wageningen UR Library. In the post Wouter gave his thoughts on the service:

 Google Scholar Citations really excels at finding publications you completely forgot about. 

and went on to make comparisons with other alternatives:

Google Scholar easily beats ResearcherID since it updates automatically and Scopus ID because you can make your list with citations publically available. To make your publication list openly available is really recommended to all scientists, it helps your personal branding.

although he admitted that:

there are disadvantages to Google Scholar as well. The most serious at this moment all kind of ghost citations.

Wouter concluded:

Google Scholar is only about five years old. Give them another five years and they will have changed the market for abstracting and indexing database totally. If only 20 percent of all scientists make their publication lists correct (also editing of the references which can be done to improve the mistakes Google has made) even without making them publically available, Google sits on a treasure trove of high quality metadata. Really interesting to see how this story will develop.

Perhaps the risk of failing to engage with the service and update the information which Google has will turn out to be the motivating factor for updating the content.  I’ve updated my content and started to email my co-authors so that they are listed. Have you updated your papers?  And if not, I’d be interested to know the reasons why not.



14 Responses to “Thoughts on Google Scholar Citations”

  1. […] outros: https://ukwebfocus.wordpress.com/2011/11/22/thoughts-on-google-scholar-citations/; […]

  2. Andy Mitchell said

    I just had a look at it and searched for my name. What came back included 5 papers accredited to me but were nothing to do with me. To my knowledge I’ve only had 4 papers published (more if you include posters for conferences and presentations about research), all concerning or relating to the ‘student voice’, so I was somewhat surprised to learn I had also written about Northern Irish cattle populations.

    Of the four that I was involved in, only one was included on the list. Therefore, I would suggest that the site be used bearing in mind that it is still rather flawed.

  3. Manolis Mavrikis said

    Isn’t it an issue that citations do not exclude self-citations?

  4. […] Thoughts on Google Scholar Citations […]

  5. […] effectively in the sciences and social sciences than in the humanities (see Brian Kelly’s Thoughts on Google Scholar Citations for a detailed […]

  6. I’ve had a similar situation to Andy, but then two papers that had nothing to do with me – and didn’t have my name on it, though a very similar one – just vanished when I next checked.

    The list for mine still has several papers on it that I wouldn’t consider papers or articles. It’s a very, very, broad definition that Google uses (similar, sort of, to the ridiculously broad definition of ‘book’ that the Kobo uses in its ‘million free ebooks’ offer).

    It is a useful service in some ways, though. I was a bit stunned, and initially scared, to find a report I’d completely forgotten about years ago has been cited 314 times. It’s fun quickly finding where, and how, it’s been cited.

    Also, unexpectedly, it’s been more useful than linkedin or facebook (especially) for finding several academics from eLib days (mid nineties) who lost touch with some time ago. Another person search facility, this time weighted to people who have written academic-y articles. If we do a 20th anniversary reunion in 2005, scholar citation could be a useful tool in pulling a crowd together.

  7. […] [11] Thoughts on Google Scholar Citations. https://ukwebfocus.wordpress.com/2011/11/22/thoughts-on-google-scholar-citations/ […]

  8. […] few weeks ago I came across a referrer link to a blog post from a post entitled “Google Scholar Citations y la emergencia de nuevos actores en la […]

  9. […] a meaning for number of positive votes it the ratings are poor, as they are for a post on are Thoughts on Google Scholar Citations does this indicate criticism of the post itself (poorly written and flawed arguments) or a […]

  10. […] be noted that, as described in posts on What I Like and Don’t Like About IamResearcher.com, Thoughts on Google Scholar Citations and Will the Real Scott Wilson Please Stand Up, Please Stand Up services such as LinkedIn, […]

  11. […] This initially appeared to be an anomaly. However I subsequently realised that a post giving Thoughts on Google Scholar Citations published a few days after Google’s announcement that Google Scholar Citations Open To […]

  12. James Neville Thompson. said

    As a retired scientist (and Google scholar whatever that is), I use Google to track continuation of my previous work. However, I was shocked today when Google found several of my research papers attributed to another scientist with a similar name in something described as Microsoft Academic Search. Actually I recognised this fine scientist (working in a completely different field thousands of miles away) from years of reading our adjacent records in Citation Index which was published by people who knew what they were doing. I assume that the Academic Search is lazy and based only on names and does not include the institution, for example, as part of identification. The data & charts are obviously misleading and are thus worse than useless concerning scientists with common names. Self citations are not an issue; papers are published to convey information to active scientists not satisfy statisticians. I have often found the publication of instalments unavoidable. I have also published findings in books ( from conferences) and again in less or greater detail in a journal. Both are needed by scientists active in the field. As a reviewer I would turn down a duplicate paper but I have never seen a true clone in a scientific journal.

  13. […] and were used by different communities. I went on to describe how researchers could find value in claiming a Google Scholar profile and providing access to their research publications using services such as Academia.edu and […]

  14. […] in this blog post) and Microsoft Academic Search have also started to offer identifiers (see this blog post). There may be privacy issues, for example in Google and Microsoft publicly surfacing information […]
