UK Web Focus

Innovation and best practices for the Web


Is There A Need For An Auto-Delete Service For Twitter?

Posted by Brian Kelly on 28 October 2010

Is Twitter “For The Moment”?

Last month’s post on opting-out of Twitter archives generated some discussion on Twitter from a handful of people who felt that tweets shouldn’t be archived at all. In the discussion (which I won’t cite!) there was a suggestion that tweets should auto-delete after a short period of time. Such an approach fits in with a view that Twitter is “for the moment“.

An auto-deletion approach has been taken by the #NoLoC service. This has been set up by those who are concerned that the US Government (actually the Library of Congress) is archiving tweets. Users who register with the #NoLoC service give it permission to delete tweets which contain a specified hashtag after 23 weeks – one week before tweets are archived by the Library of Congress.

Although this service has been set up for one particular context, it made me wonder whether the approach could be generalised. Could a service be developed which allowed users to specify a period after which their tweets would automatically be deleted, together with hashtags identifying the tweets to be deleted after this period?

Someone who normally uses Twitter for professional purposes but also tweets about football might tag such tweets with #footie and request that they be deleted after a few days. Or if you are going to a party or music festival you might specify that tweets tagged #party or #festival are deleted the following day.
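To make the idea concrete, here is a minimal sketch of how such a service might work, assuming it has been granted OAuth access to the user’s account. The hashtags, retention periods and the use of the tweepy library are illustrative assumptions on my part; any Twitter client library able to read a timeline and delete statuses would do.

```python
import datetime
import tweepy

# Hypothetical retention policy: hashtag -> maximum age before deletion.
# The tags and periods below are purely illustrative.
DELETION_POLICY = {
    "#footie": datetime.timedelta(days=3),
    "#party": datetime.timedelta(days=1),
    "#festival": datetime.timedelta(days=1),
}

def delete_expired_tweets(api):
    """Scan the authenticated user's recent tweets and delete any that
    carry a policy hashtag and are older than the permitted period."""
    now = datetime.datetime.utcnow()
    for status in tweepy.Cursor(api.user_timeline).items(200):
        created = status.created_at.replace(tzinfo=None)  # treat as UTC
        text = status.text.lower()
        for hashtag, max_age in DELETION_POLICY.items():
            if hashtag in text and now - created > max_age:
                api.destroy_status(status.id)
                break

if __name__ == "__main__":
    # The user grants these credentials to the (trusted) deletion service.
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    delete_expired_tweets(tweepy.API(auth))
```

A real service would presumably run such a job on a schedule for each registered user, rather than as a one-off script.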

This suggestion generalises the approach taken by the #NoLoC service, providing the flexibility for users to control both the time period and the hashtags. And unlike the Twitwipe service (which deletes all tweets from a user’s account) it gives users control over how and when their tweets are deleted.

However, although this approach (which would probably need to be provided by a trusted organisation, since you are granting another organisation the right to delete your tweets) would ensure that tweets are deleted from Twitter (and not archived by the Library of Congress, provided the deletion period is less than 24 weeks), it would not delete tweets which have already been archived by other services, such as Twapper Keeper and Google.

Thoughts On Approaches to Auto-Deletion of Tweets

If Twitter is an important part of the information landscape (which I feel it is) there will be a need to address issues such as privacy and content management at a more fundamental level. The view that archiving isn’t important or shouldn’t be done ignores the fact that it is being done, and on a significant scale, judging by the Twapper Keeper usage statistics published in our recent paper:

As of 1 July 2010 the Twapper Keeper archive contains 1,243 user archives, 1,263 keyword archives and 7,683 hashtag archives. There are a total of 321,351,085 tweets stored. The average rate at which tweets are ingested ranges from 50 to 3,000 per minute (around 180,000 per hour, or 4.32 million per day). Since Twitter itself processes about 65 million tweets per day the Twapper Keeper service is currently processing about 6-7% of the total public traffic.
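As a quick sanity check on those figures (using the quoted upper rate of 3,000 tweets per minute and the quoted total of 65 million tweets per day):

```python
per_minute = 3_000            # upper ingest rate quoted above
per_hour = per_minute * 60    # 180,000 tweets per hour
per_day = per_hour * 24       # 4,320,000 (~4.32 million) per day
twitter_per_day = 65_000_000  # Twitter's total public traffic per day
print(per_hour, per_day, round(100 * per_day / twitter_per_day, 1))
# -> 180000 4320000 6.6  (i.e. roughly 6-7% of public traffic)
```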

But how might a distributed environment be developed which respects Twitter users’ rights to delete their tweets both from Twitter and from conforming Twitter archiving services?

Would it be possible for a Twitter API to enable tweets deleted by an authenticated user from a Twitter archive to also be deleted from Twitter? And could tweets which have been deleted in Twitter (perhaps via such a remote request) then be deleted from other archives?
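Twitter’s streaming API does already push status deletion notices to consumers, which well-behaved clients are expected to honour, so the building blocks are not far off. The sketch below shows what the receiving end at a conforming archive might look like; the notice format, the archive’s database schema and the delivery mechanism are all assumptions for illustration, not an existing API.

```python
import sqlite3

def handle_deletion_notice(db, notice):
    """Remove a deleted tweet from a local archive.

    `notice` is assumed to look like {"tweet_id": 123, "user_id": 456},
    delivered and authenticated by whatever push mechanism the platform
    provides (e.g. a streaming delete event or a signed callback).
    """
    db.execute(
        "DELETE FROM archived_tweets WHERE tweet_id = ? AND user_id = ?",
        (notice["tweet_id"], notice["user_id"]),
    )
    db.commit()

# Example: an archive holding one stored tweet receives a deletion notice.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE archived_tweets (tweet_id INTEGER, user_id INTEGER, text TEXT)")
db.execute("INSERT INTO archived_tweets VALUES (123, 456, 'an expired tweet')")
handle_deletion_notice(db, {"tweet_id": 123, "user_id": 456})
```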

Of course this would not stop people from capturing tweets in other ways, nor would it force Twitter archiving tools to respect such a protocol. But this is also the case with the robots exclusion protocol: robot software from search engines which respect the protocol will not index files which have been excluded in a robots.txt file in the root of a Web server. Such excluded files are still openly available, and robots which don’t respect the protocol can still index them. In practice, however, most users rely on trusted search engines which have implemented this widely accepted standard.
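By way of comparison, this is how a compliant crawler checks the robots exclusion protocol using Python’s standard library (example.org and the paths are placeholders):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.org/robots.txt")  # placeholder site
rp.read()

# A compliant crawler simply skips anything the file disallows;
# a non-compliant crawler could still fetch and index it.
if rp.can_fetch("MyCrawler/1.0", "https://example.org/private/report.html"):
    print("allowed to index")
else:
    print("excluded by robots.txt - a well-behaved robot stays away")
```

A tweet-deletion protocol would rely on exactly the same kind of voluntary compliance from archiving services.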

Is such an approach technically possible today? And, if not, would it be possible if Twitter provided appropriate APIs?

Or Maybe We Should Simply Accept Twitter’s Openness

Of course there might be an argument that such developments are pointless – tweets will be treated as public property, so we simply need to accept this. Or perhaps an alternative to Twitter could be used by those who still have concerns. What would be needed is a walled garden which made it difficult for content to be accessed by other applications, with permissions allowing various levels of access control, such as access by friends, friends of friends, etc.

Hmm, I wonder if Facebook could be the answer :-) More seriously, perhaps we will find that different services are used by people in different ways – I have read how people may use Twitter for open discussions in a work context and Facebook for closed discussions with friends and family. Perhaps rather than overload Twitter with complex content management mechanisms we should simply accept that Twitter is an open environment, with the risks and benefits which openness provides.
