UK Web Focus

Innovation and best practices for the Web

Archive for December 11th, 2011

Responding to the Forthcoming Demise of TwapperKeeper

Posted by Brian Kelly on 11 December 2011

Twapper Keeper Archive Service to be Shut Down

On 8th December 2011 the following announcement was made on the Twapper Keeper Web site:

Transition update
Twapper Keeper’s archiving is now available in HootSuite! As a result, we will be shutting down Twapper Keeper. Existing archives will be kept running until Jan 6, 2012, after which you will not be able to access your archives anymore.

Twapper Keeper has been widely used within the UK’s higher education sector, especially for archiving tweets containing event hashtags at events aimed at the developer, researcher and library sectors.

The popularity of the service has helped to demonstrate the importance of Twitter archiving, something which was not necessarily widely appreciated a few years ago. But in light of, for example, the recent news item on the JISC Web site which announced that “Social media ‘not to blame’ for inciting rioters” and went on to describe how:

A study of 2.4 million Twitter messages from the time of the riots has found that politicians and other commentators were wrong to claim the website played an important role in inciting and organising the disturbances.  

we can see that the importance of Twitter archiving for a variety of purposes is now more widely understood. However, it seems that Twapper Keeper will not be providing a long term repository of tweets. This does not necessarily mean that tweets will be lost since, as described in an article on Tweet Eternal: Pros and Cons of the Library of Congress Twitter Archive published in Time on 8 December 2011, “Thanks to a deal between Twitter and the United States Library of Congress, every public tweet sent on the social messaging service since its creation will become part of the Library of Congress’ digital archive, available to researchers and historians as an example of contemporary life and culture”. However, as highlighted in Nature in an article on Social science: Open up online research: “Social media hold[s] a treasure trove of information [but] the secretive methods of ethics review boards are hindering their analysis, says Alexander Halavais”.

Since it is unclear when and if the Library of Congress archives will be made publicly available, people and organisations which have made use of Twapper Keeper may wish to migrate the content of these archives. This post describes approaches for migrating existing data, ways of identifying which archives may need to be preserved and ways of identifying key stakeholders who may need to make such decisions.

Migration of Existing Archives

Tools

Since creators and users of Twapper Keeper archives have less than a month to migrate their content, this post outlines ways in which the archives can be managed; a discussion of the implications of the closure of the service will follow at a later date.

Martin Hawksey has published a post on his MASHe blog which describes how you can Free the tweets! Export TwapperKeeper archives using Google Spreadsheet. Martin’s post also links to a post entitled LIBREAS.Library Grab your TwapperKeeper Archive before Shutdown! which describes a technique which can be used by those familiar with R code. Tony Hirst on the OUseful.info blog has also listed a technical solution based on R code in his post on Rescuing Twapperkeeper Archives Before They Vanish.

If you are not familiar with Google Spreadsheets or with software for accessing Twitter archives, note that you can also use a Web browser to view archives of interest (having ensured that all items are displayed and not just the default 10 items). You can then view the HTML source and save the file so that you have an HTML representation of the tweets which you can manage locally. In addition, you can save an RSS representation of the tweets, which provides a more structured format that should be more amenable to subsequent processing, if you wish to do this. Examples of this approach can be seen from the copies of the IWMW10 and IWMW11 archives.
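
As an illustration of the subsequent processing mentioned above, here is a minimal Python sketch which reads a saved RSS file and converts the items to CSV. It assumes the saved file is a conventional RSS 2.0 feed in which each tweet is an item with title, link and pubDate elements; the actual feed layout has not been checked here, and the sample feed, the function names and the column order are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a saved feed (assumed RSS 2.0 layout).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example hashtag archive</title>
    <item>
      <title>exampleuser: Looking forward to the workshop #example</title>
      <link>http://twitter.com/exampleuser/status/1</link>
      <pubDate>Mon, 12 Jul 2010 09:00:00 +0000</pubDate>
    </item>
    <item>
      <title>anotheruser: Slides are now online #example</title>
      <link>http://twitter.com/anotheruser/status/2</link>
      <pubDate>Mon, 12 Jul 2010 09:05:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(rss_text):
    """Return one dict per <item>, using empty strings for missing elements."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter("item"):
        rows.append({
            "pubDate": (item.findtext("pubDate") or "").strip(),
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return rows


def rows_to_csv(rows):
    """Serialise the parsed items as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["pubDate", "title", "link"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(parse_rss_items(SAMPLE_RSS)))
```

For a file saved from the browser, the text could be read with open(path, encoding="utf-8").read() and passed to parse_rss_items; if the real feed uses different element names the field list would need to be adjusted.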

Selection Criteria

In addition to being aware of the tools which can be used, there will also be a need to decide which archives may still be of relevance and to identify who may need to take responsibility for migrating the content to an appropriate location. Tony Hirst, in his post on Rescuing Twapperkeeper Archives Before They Vanish, has suggested that “one approach might be to look to see what archives have been viewed using @andypowe11’s Summarizr”. However, although the Summarizr home page lists recently viewed Summarizr summaries of Twapper Keeper archives, it is not clear if a comprehensive list is available and, even if such a list could be made available, how this would inform decisions on the selection of archives to be migrated.

An alternative approach is to look at the TwapperKeeper archives which have been created by particular Twitter IDs. We can see, for example, that Tony Hirst (@psychemedia) has created 27 archives. Similarly, using Twapper Keeper’s search facility I find that I have created a total of 62 Twapper Keeper archives. Perhaps the initial stage in identifying archives to be migrated is for active Twapper Keeper users to identify the archives they have created, and then to decide which archives are to be migrated, where the new archives are to be hosted and what to do about archives which will not be migrated, which might include informing key stakeholders.

Case Study

Rather than attempting to keep a copy of all of the Twapper Keeper archives I have created, in this post I will provide a summary of the archives I created and document the decisions I have taken regarding migration of the content and the reasons for these decisions.

Migrated to UKOLN Web site: The IWMW2009, IWMW10 and IWMW11 archives, which will be made publicly available, together with the UKOLN and Ariadne_Mag archives, which will be stored locally in case we decide at a later date to analyse the tweets.

Key stakeholders informed: A number of archives may be of interest to organisations such as JISC, CILIP, ALT, UCISA and CETIS. These organisations will be notified of the archives which I have created and will be informed of the techniques described in this post if they wish to migrate the content.

Archives of personal interest: Archives of personal tweets and personal interests have not been migrated.

Other archives: Other archives include archives for broad subject areas (e.g. #a11y, #phdchat), for which a general tweet about the forthcoming demise of the Twapper Keeper archive will be made, and archives for events and areas of interest in which I had a short-term interest and wished to be able to view the tweets but in which I have no longer-term interest.

A summary of the Twitter archives and the decisions I have made is given below. Please note that:

  • The data given in the table was collected on 9 December 2011.
  • The decisions given in the table may be changed at a later date.
  • Twapper Keeper archives for other areas relevant to myself and UKOLN colleagues may also have been created. The #IWMW09 archive, for example, will be migrated and decisions about other archives will be made shortly.

Archive Type | Name | Description | # of Tweets | Create Date | Comment
#Hashtag | #a11y | Accessibility (a11y) | 96,491 | 04-25-10 | #a11y community to be informed.
#Hashtag | #a11yhack | DevCSI hack day | 329 | 06-21-11 | One-off DevCSI event. Report has been published.
#Hashtag | #accbc | CETIS/BSI Accessibility SIG meeting | 396 | 02-28-11 | One-off DevCSI event. CETIS SIG coordinator to be notified.
#Hashtag | #altc2009 | The ALTC 2009 conference | 4,754 | 08-28-09 | Large annual event. Report has been published. Event organisers to be notified.
#Hashtag | #altc2012 | The ALT-C 2012 conference (Association for Learning Technology) | 104 | 09-12-11 | Created for next year’s event. Content not migrated.
#Hashtag | #altmetrics | New approaches for developing metrics for scholarly research | 1,393 | 01-15-11 | #altmetrics community to be informed.
#Hashtag | #Ariadne | The Ariadne hashtag – which may be used for UKOLN’s Ariadne ejournal. | 42,102 | 09-21-10 | Content not migrated due to multiple uses of tag.
Keyword | Ariadne | Archive of tweets containing the string ‘Ariadne’ | 79,991 | 09-21-10 | Content not migrated due to multiple uses of keyword.
@Person | ariadne_ukoln | Tweets about the Ariadne web magazine. | 2,792 | 05-28-10 | Content to be migrated to UKOLN.
#Hashtag | #Bathcr | The University of Bath’s Connected Researcher activity. | 296 | 04-14-11 | #Bathcr community to be informed.
#Hashtag | #brdidc11 | Symposium on Data Attribution and Citation Practices and Standards, August 22-23 2011, Berkeley | 51 | 08-22-11 | Content not migrated.
@Person | briankelly | Tweets about Brian Kelly | 9,952 | 03-19-10 | Content not migrated as alternative backup available.
#Hashtag | #CETIS | The CETIS service, based at the University of Bolton. | 9,561 | 09-24-10 | CETIS colleagues to be informed.
#Hashtag | #CILIP | CILIP, the Chartered Institute of Library and Information Professionals. | 14,356 | 09-24-10 | CILIP colleagues to be informed.
#Hashtag | #CILIP1 | Campaign on future of CILIP organisation based on CILIP’s 1-minute messages. | 357 | 06-13-10 | Content not migrated.
#Hashtag | #CSR | Comprehensive Spending Review | 0 | 10-15-10 | Content not migrated.
#Hashtag | #dataprato | Invitational workshop to identify & agree areas for joined-up international action in research data management. | 128 | 04-11-11 | Content not migrated.
#Hashtag | #digdeath | The conference on Death and Dying in a Digital Age held in Bath, UK | 72 | 06-25-11 | Content not migrated.
#Hashtag | #eduwebconf | The eduwebconf conference | 33 | 11-07-11 | Content not migrated.
#Hashtag | #falt09 | ALTC Fringe | 219 | 08-28-09 | Content not migrated.
#Hashtag | #fbdevlove | The Facebook developers hack day | 1,297 | 03-26-11 | Content not migrated.
#Hashtag | #fpw11 | The Future of the Past of the Web conference, British Library, London on 7 October 2011. | 755 | 09-22-11 | Event organisers to be notified.
#Hashtag | #heweb10 | Tag for the HighEdWeb 2010 conference | 8,812 | 09-28-10 | Content not migrated.
#Hashtag | #heweb11 | The HighEdWeb 2011 conference, 23-26 October 2011 | 11,505 | 10-23-11 | Content not migrated.
#Hashtag | #ILI2011 | Internet Librarian International 2011 conference held in London on 27-28 Oct 2011. | 3,067 | 10-27-11 | ILI organisers to be notified. Report has been published.
#Hashtag | #ili2012 | Tweets for the Internet Librarian International (ILI) 2012 conference | 3 | 10-29-11 | Created for next year’s event. Content not migrated.
#Hashtag | #ipres10 | Tweets for the iPres10 conference, Vienna, 19-24 Sept 2010. | 5 | 08-27-10 | Content not migrated.
#Hashtag | #ipres2010 | Archive for the IPres 2010 conference to be held in Vienna on 19-25 Sept 2010. | 1,424 | 08-27-10 | Content not migrated.
#Hashtag | #ISKB | A holder for the ISKB | 27 | 09-17-11 | Content not migrated.
#Hashtag | #iwmw12 | UKOLN’s Institutional Web Management Workshop (IWMW) 2012 event | 2 | 10-29-11 | Created for next year’s event. Content not migrated.
@Person | iwmwlive | IWMW live blogging account | 3,744 | 04-30-10 | Content to be migrated.
#Hashtag | #jisc10 | JISC 2010 conference | 2,065 | 04-02-10 | Event organisers to be notified.
#Hashtag | #jiscHTML5 | JISC HTML5 Case study work | 18 | 11-18-11 | Content not migrated.
#Hashtag | #jiscpowr | Archive of tweets related to the JISC PoWR project provided by UKOLN and ULCC | 13 | 07-09-10 | Content not migrated.
#Hashtag | #jiscpowrguide | Archive of tweets about the Guide to Web Preservation published by the JISC-funded PoWR project and launched on 12 July 2010. | 2 | 07-09-10 | Content not migrated.
#Hashtag | #JISCPP | The JISC-Funded Patients Participate project. | 0 | 05-25-11 | Content not migrated.
#Hashtag | #ldow2010 | Linked Data on the Web 2010 conference | 530 | 04-25-10 | Content not migrated.
#Hashtag | #loveHE | Times Higher Education campaign to support Higher Education in UK. | 20,719 | 06-12-10 | Content not migrated.
#Hashtag | #mdforum | UKOLN’s Metadata Forum | 1,746 | 12-10-10 | Content to be migrated.
#Hashtag | #morris | Tweets about Morris dancing | 183,338 | 10-16-10 | Content not migrated.
#Hashtag | #OAweek | Open Access week | 4,603 | 10-19-11 | Content not migrated.
#Hashtag | #online11 | The Online Information 2011 conference held in London on 29 November - 1 December | 3,915 | 11-29-11 | Content not migrated.
#Hashtag | #oxsmc09 | socialmediaconference | 1,063 | 09-18-09 | Content not migrated.
#Hashtag | #PhD | Tweets for researchers using the #PhD tag | 161,215 | 09-24-10 | Content not migrated.
#Hashtag | #s113 | Workshop session at ALTC 2009. | 1,417 | 09-03-09 | Content not migrated.
#Hashtag | #scl2010 | Scholarly Communication Landscape (SCL): Opportunities and challenges symposium, held at Manchester Conference Centre on 30 November 2010. | 0 | 12-02-10 | Content not migrated.
#Hashtag | #SHB11 | Security and Human Behavior conference | 1,117 | 06-18-11 | Content not migrated.
#Hashtag | #SLG2011 | CILIP School Librarian Group conference. | 283 | 04-03-11 | Content not migrated.
#Hashtag | #thatlondon | People (Northerners?) talking about going to “that London” | 1,781 | 07-09-11 | Content not migrated.
#Hashtag | #ucassm | Social Media Marketing Conference organised by UCAS. | 225 | 10-18-10 | Content not migrated.
#Hashtag | #ucsoc12 | UCISA SSG (Support Services Group) event. | 5 | 09-05-11 | Content not migrated.
#Hashtag | #udgamp10 | What Can We Learn From Amplified Events seminar, given by Brian Kelly, UKOLN at the University of Girona | 395 | 09-01-10 | Content migrated.
#Hashtag | #ukmw09 | UKMuseumsandtheWeb | 750 | 12-05-09 | Content not migrated.
Keyword | ukoln | Tweets about UKOLN | 3,385 | 03-19-10 | Content to be migrated.
#Hashtag | #ukolneim | UKOLN’s Evidence, Impact, Metric work | 523 | 11-05-10 | Content to be migrated.
#Hashtag | #UKOLNseminar | UKOLN seminars | 69 | 04-01-11 | Content to be migrated.
#Hashtag | #UniofBath | Tweets about the University of Bath | 1,798 | 06-15-11 | Content not migrated.
#Hashtag | #UniWeek | The UK’s Universities Week campaign. | 1,767 | 06-15-11 | Content not migrated.
#Hashtag | #Virtualfutures | The Virtual Futures conference | 2,216 | 06-18-11 | Content not migrated.
#Hashtag | #w3ctrack | W3C Track at WWW 2010 conference | 205 | 04-30-10 | Content not migrated.
#Hashtag | #W3CUKI | W3C UK and Ireland Office | 266 | 04-18-11 | Content not migrated.
#Hashtag | #ww2010 | Misspelling of WWW2010 hashtag | 904 | 04-29-10 | Content not migrated.
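
For anyone wishing to produce a similar summary of their own archives, the following Python sketch shows one possible way of tallying decisions of the kind recorded in the table above. It is only an illustration: the handful of rows is copied from the table, but the bucket names and the keyword matching in classify are my own assumptions and not part of any Twapper Keeper tool.

```python
from collections import defaultdict

# (name, number of tweets, comment) for a few rows copied from the table above.
ROWS = [
    ("#a11y", 96491, "#a11y community to be informed."),
    ("#altc2012", 104, "Created for next year’s event. Content not migrated."),
    ("ariadne_ukoln", 2792, "Content to be migrated to UKOLN."),
    ("#CETIS", 9561, "CETIS colleagues to be informed."),
    ("#mdforum", 1746, "Content to be migrated."),
    ("#morris", 183338, "Content not migrated."),
    ("ukoln", 3385, "Content to be migrated."),
]


def classify(comment):
    """Map a free-text comment to a bucket (bucket labels are hypothetical)."""
    text = comment.lower()
    # Check the negative phrase first, since it also contains "migrated".
    if "not migrated" in text:
        return "not migrated"
    if "migrated" in text:
        return "migrate"
    if "informed" in text or "notified" in text:
        return "inform stakeholders"
    return "other"


def summarise(rows):
    """Count archives and total tweets for each decision bucket."""
    totals = defaultdict(lambda: {"archives": 0, "tweets": 0})
    for _name, tweets, comment in rows:
        bucket = totals[classify(comment)]
        bucket["archives"] += 1
        bucket["tweets"] += tweets
    return dict(totals)


if __name__ == "__main__":
    for decision, counts in summarise(ROWS).items():
        print(decision, counts["archives"], counts["tweets"])
```

A tally like this gives a rough sense of how much content falls under each decision, which may help when deciding where to host migrated archives.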

I welcome suggestions on other tools and approaches which can be used for managing such archives and also approaches to selection and deletion criteria for Twitter archives.
