Recognising, Appreciating, Measuring and Evaluating the Impact of Open Science
Posted by Brian Kelly (UK Web Focus) on 6 September 2011
The #SOLO11 Conference
We are now starting to see various posts on the event being published. One of the first reports was written by Alexander Gerber and published on the Germany-based Scienceblogs service. Alexander began his brief post by saying:
My sobering conclusion after two days of ScienceOnline London: The technologies are ready for take-off, the early-adopter-scientists are eager to kickstart the engine, but the runway to widespread usage of interactive technologies in science is still blocked by the debris of the traditional academic system. This system needs to be adapted to the new media paradigms, before web 2.0 / 3.0 can have a significant impact on both research and outreach.
and went on to list three central questions which he feels need to be answered:
- How can we recognise, appreciate, measure and evaluate the impact of outreach and open science in funding and evaluation practice?
- Which new forms of citation need to be installed for that?
- How can we create a reward system that goes way beyond peer-reviewed citations?
I’d like to address certain aspects of the first question, in particular ways in which one might measure and evaluate the use of social media to support such outreach activities, since this issue was discussed during a workshop session on Online Communication Tools at which I spoke. However, I would first like to give some thoughts on the opening plenary talk at the event.
Plenary Talk on Open Science
For me the highlight of SOLO11 was the opening plenary talk on “Open Science” which was given by Michael Nielsen, a “writer; open scientist; geek; quantum physicist; writing a book about networked science“.
A number of blog posts about the event have already been listed in the Science Online wiki. I found Ian Mulvany’s thoughts on the Science Online London Keynote talk particularly helpful in reminding me of the key aspects of the talk.
Michael told the audience that he didn’t intend to repeat the potential benefits of open science; rather he would look at some examples of failures in open science approaches and then look in other disciplines to see if there were parallels and strategies which could be used in the science domain.
The example given described the use of open notebook science in which a readership of ~100 in a highly technical area had been established, but there was little active participation from others. The author, Tobias J. Osborne, was putting in a significant amount of effort but was failing to gain value from this work.
Michael gave an example of how a significant change can be made in a short period of time and bring significant benefits: the switch to driving on the right-hand side of the road in Sweden at 5am on Sunday, 3 September 1967.
However, although this example was successful and brought benefits (such as reduced costs), there are many other examples in which the potential benefits of Collective Action fail to be delivered, often because some potential beneficiaries choose to ‘freeload’ on the work of others.
We can learn from examples of successes in other areas, ranging from the establishment of trade unions and well-established practices for managing water supply in villages through to the growth of the ArXiv archive and of the Facebook social networking service. Successful approaches include:
Starting small: For example, the ArXiv service’s success was due to its focus on a small subject area. Similarly, Facebook was initially available only to students at Harvard University, before expanding first to other Ivy League universities, then to other higher education institutions, and finally to everyone.
Monitoring and sanctions: Michael concluded by describing how there was a need to monitor use and, if needed, to be able to apply sanctions.
The concept is that there is some action where, if everyone changed, it would be better for everyone, but you need everyone to change at the same time. There are incentives for people not to participate: changing involves some cost to the individual, but if the individual does not change they still get the benefit from everyone else changing. This is the same kind of problem that we have with the move to open data.
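This collective-action logic can be made concrete with a toy payoff model. All of the numbers and names below are illustrative assumptions of mine, not figures from the talk:

```python
# Toy sketch of the collective-action problem: the numbers are purely
# illustrative assumptions, chosen only to show the incentive structure.

SWITCH_COST = 2.0           # one-off cost an individual pays to change practice
BENEFIT_PER_ADOPTER = 0.1   # benefit each person gains per *other* adopter

def payoff(i_switch: bool, other_adopters: int) -> float:
    """Payoff to one individual, given how many others have already switched."""
    benefit = BENEFIT_PER_ADOPTER * other_adopters
    return benefit - (SWITCH_COST if i_switch else 0.0)

# With 50 other adopters, freeloading beats participating for the individual...
print(payoff(False, 50))  # 5.0 -> the freeloader keeps the benefit
print(payoff(True, 50))   # 3.0 -> the adopter pays the cost
# ...yet everyone is better off if all 100 people switch than if none do:
print(payoff(True, 99))   # 7.9 (everyone switches)
print(payoff(False, 0))   # 0.0 (nobody switches)
```

Whatever the actual numbers, the shape is the same: the individually rational choice (not switching) leaves the group worse off, which is why coordination mechanisms such as monitoring and sanctions matter.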
In brief, therefore, Michael felt that those who believe open science can provide benefits tend to be too ambitious: there is a need to start with small achievable aims and to broaden the scope using strategies which have proven successful in other areas.
Analytics for Use of Social Media
The second day of the SOLO11 event provided a series of workshop sessions. I attended one which was billed as Scholarly HTML but in fact provided an introduction to blogging on WordPress :-( However, a workshop session on Online Communication Tools, which provided an introduction to Twitter, Google+, etc. in the morning, moved on in the afternoon sessions to:
… cover all angles from how to practically use the tools most beneficially in an institutional or academic environment, to how to measure their impact via statistics and online “kudos” tools
Alan Cann, one of the facilitators of the session, invited me to speak as he had attended a one-day workshop on “Metrics and Social Web Services: Quantitative Evidence for their Use and Impact” which I organised recently. I used the slides from a talk on “Surveying Our Landscape From Top to Bottom” which reviewed various analyses of the use of social media services by individuals and institutions, including tools such as Klout, PeerIndex and Twitalyzer.
Alan Cann also spoke in the session and in his presentation pointed out the statistical limitations of such services – concerns similar to those raised by Tony Hirst in a talk he gave at the “Metrics and Social Web Services: Quantitative Evidence for their Use and Impact” event.
Tony’s slides, which are available on Slideshare, illustrated the dangers of misusing statistics, including graphs of visibly different datasets which can all be, misleadingly, reduced to the same straight-line fit.
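The classic illustration of this danger is Anscombe’s quartet: datasets with near-identical summary statistics but very different shapes. A minimal sketch of the same point, using two made-up datasets of my own which share exactly the same least-squares line:

```python
# Two visibly different datasets that share exactly the same least-squares
# line (y = 2x): the fitted line alone tells you nothing about the shape.
# The data are invented for illustration, in the spirit of Anscombe's quartet.

def linfit(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [1, 2, 3, 4, 5]
straight = [2, 4, 6, 8, 10]   # points exactly on the line y = 2x
curved   = [4, 3, 4, 7, 12]   # a parabola-shaped cloud

print(linfit(xs, straight))   # (2.0, 0.0)
print(linfit(xs, curved))     # (2.0, 0.0) -- same "trend", different data
```

A single fitted line (or a single influence score) collapses away exactly the information that distinguishes the two.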
Tony went on to describe Goodhart’s Law which states that:
once a social or economic indicator or other surrogate measure is made a target for the purpose of conducting social or economic policy, then it will lose the information content that would qualify it to play such a role.
and Campbell’s Law:
The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Lies, Damned Lies and Social Media Analytics?
Might we therefore conclude that social media analytics tools such as Klout, PeerIndex and Twitalyzer have no role to play in, for example, “measuring and evaluating the impact of outreach and open science”? Not only is the way in which PeerIndex aggregates its scores for authority, activity and audience into a single value statistically flawed, but, if such services are used for decision-making purposes, we will see users gaming the system.
Whilst this is true, I also feel that there are dangers in trying to develop a perfect way of measuring such impact – and it was clear from the workshop that there is an acceptance of the need for such measurements.
There are many other examples of measurement approaches which we generally accept despite their underlying flaws. The university system, for example, classifies its successful students as first, two-one, two-two or third class degree holders. But despite the limitations of such assessment, its importance is accepted.
We might also wish to consider how such measuring schemes are used. The approaches taken by Klout and PeerIndex have parallels with Google’s ranking algorithms – and these can likewise be gamed. But organisations are prepared to invest in ways of gaining high Google rankings, since this provides business benefits through web sites being more easily found in Google searches.
We are starting to hear of examples of Klout and Peerindex statistics being used in recruitment, with a recent article published in the New York Times inviting readers to:
IMAGINE a world in which we are assigned a number that indicates how influential we are. This number would help determine whether you receive a job, a hotel-room upgrade or free samples at the supermarket. If your influence score is low, you don’t get the promotion, the suite or the complimentary cookies.
I suspect that marketing departments will use such statistics and that people working in marketing and outreach activities will start to quote personal social media analytics scores in their CVs. Note that, as can be seen from the image showing my PeerIndex scores, such tools can be used in a variety of ways: it is clear that you wouldn’t employ me based on the diagram if you were looking for someone with demonstrable experience of outreach work using Twitter in the field of medicine (my areas tend to focus on technology, sport and politics).
I therefore feel that we should treat social media analytics with care and use them in conjunction with qualitative evidence of value. But to disregard such tools completely whilst waiting for the perfect solution to appear would fall into the trap which Michael Nielsen warned against: seeking broad acceptance of a universally applicable solution from the outset.
I’d welcome your thoughts.