UK Web Focus (Brian Kelly)

Innovation and best practices for the Web

Issues In Crowd-sourced Twitter Captioning of Videos

Posted by Brian Kelly on 23 Mar 2010

Crowd-sourced Twitter Captioning of Videos

Back in March 2009 Tony Hirst wrote a post on his OUseful blog entitled Twitter Powered Subtitles for Conference Audio/Videos on Youtube, in which he provided a proof-of-concept showing how you could take time-stamped Twitter posts and synchronise them with a YouTube video to provide Twitter captions for videos.

Although people liked the idea, several commented that the process was too difficult. So two weeks later Tony wrote a post on Easier Twitter Powered Subtitles for Youtube Movies.

Moving forward to February this year, we find a blog post written by Martin Hawksey of RSC Scotland North and East which describes Martin’s service for Twitter powered subtitles for BBC iPlayer. And yesterday Martin used his software to provide Gordon Brown’s Building Britain’s Digital Future announcement with twitter subtitles. Great stuff. What this does is provide cost-effective crowd-sourced captioning (which provides accessibility benefits) as well as helping to contextualise tweets, which may otherwise lose their meaning when accessed from a Twitter archive which is decoupled from the talk.

Have a look at the video – and if you’ve not yet listened to Gordon Brown’s announcements I’d recommend that you do so.
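The basic idea of synchronising time-stamped tweets with a video can be sketched in a few lines of Python. This is only an illustration of the principle and not the code used by Tony or Martin: the function names, the five-second display time, the video start time and the sample tweet are all assumptions made up for the example. Each tweet's posting time is converted into an offset from the start of the recording and written out in the common SubRip (.srt) caption format.

```python
from datetime import datetime, timedelta


def srt_time(offset: timedelta) -> str:
    """Format an offset from the start of the video as HH:MM:SS,mmm."""
    total_ms = offset // timedelta(milliseconds=1)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{ms:03}"


def tweets_to_srt(tweets, video_start, display_seconds=5):
    """Turn (posted_at, author, text) tuples into SubRip caption text.

    video_start is the wall-clock time at which the recording began
    (an assumption: in practice this has to be found or estimated).
    Tweets posted before the recording began are ignored.
    """
    in_range = sorted(
        (t for t in tweets if t[0] >= video_start), key=lambda t: t[0]
    )
    blocks = []
    for index, (posted_at, author, text) in enumerate(in_range, start=1):
        start = posted_at - video_start
        end = start + timedelta(seconds=display_seconds)
        blocks.append(
            f"{index}\n{srt_time(start)} --> {srt_time(end)}\n@{author}: {text}\n"
        )
    return "\n".join(blocks)


# Invented sample data, for illustration only.
example_start = datetime(2010, 3, 22, 9, 0, 0)
example_tweets = [
    (datetime(2010, 3, 22, 9, 3, 0), "example_user", "Talk is starting #bbdf"),
]
print(tweets_to_srt(example_tweets, example_start))
```

The difficult part in practice is probably not the conversion but knowing when the recording actually started, as the captions will drift if that time is wrong.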

Issues About Reusing Twitter Posts

My recent post on The “Building Britain’s Digital Future” Announcement summarised Gordon Brown’s talk based on tweets from @hadleybeeman. I was slightly worried about the ethics of doing this, partly in light of the responses to my post last year on What Are the #jiscbid Evaluators Thinking?, which cited a couple of tweets. In response to that post my colleague Paul Walk pointed out that Anything you quote from Twitter is always out of context and raised the issue of “courtesy and good practice” when citing tweets. Paul’s post generated a lot of interest, with 27 comments being made. Of particular relevance, I felt, was a comment Paul made: “Beyond the need for absolute privacy for some communications it’s a grey area of overlapping contexts & tacit trust”.

I agree that this is a grey area and that there is a need for what Paul described as a “sophisticated sense of proprietary in these matters”.

I was prepared to cite Hadley’s tweets as I judged these to have been made for the public good. I also made a judgement call not to cite tweets (from others) which I felt to be trivial or which might not accurately reflect the views of the person who posted them. And it seems that Hadley appreciated the approach I took, subsequently saying “I think that once my tweets are up, they’re cite-able published material. I’d like credit, but they live on their own!”.

So we can make a judgement call on how we cite and reuse tweets, without having to go to the extremes of regarding all tweets either as public property which is fair game or as personal remarks which should never be cited.

But what happens if a Twitter stream is embedded in another environment, such as Martin’s Twitter captions of Gordon Brown’s talk? And what if Nick Poole’s tweet posted at 09:03, which is captured on the opening frame, instead of saying “Gordon Brown getting started on Building Britain’s Digital Future now. Anyone there doing reportage via Twitter? #bbdf” had said “Listening to Gordon Brown – but slightly hungover after too much to drink last night #bbdf”?

My Thoughts

My view is that we need to acknowledge that tweets which are published in an open space are always likely to be reused by others, possibly in ways that we might not always be happy with. “Caveat twitterer” might be our motto. But we might also find, as Hadley did, that the reuse of our tweets can be beneficial, and the accessibility benefits of crowd-sourced tweets might be a particular benefit to be aware of.

Perhaps we should start to regard tweets which contain an event hashtag as being particularly likely to be reused.

And maybe there is a need for more sophisticated tools for aggregating such tweets. Would it be possible for a video captioning service to allow a preferred Twitter user to be used for the captions (perhaps an official event Twitterer, as UKOLN used at last year’s IWMW 2009 event)? And would it be possible to delete inappropriate tweets from a stream used for captioning? After all, as Martin Poulter has recently pointed out on his Ancient Geeks blog in a post on The dark side of aggregating tags, the Conservative Party’s experiment in social media fell foul of (presumably left-of-centre) geeks who embedded inappropriate content, markup and scripts in a feed which was automatically displayed on a Conservative Party Web site. Let’s not repeat that mistake.
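To make those questions a little more concrete, here is a minimal Python sketch of what such a filter might look like. It is an assumption-laden illustration and not a description of any existing captioning service: the moderate function, the dictionary keys and the account names are all hypothetical. It keeps only tweets from a preferred set of users (if one is given), drops any tweets a moderator has flagged for removal, and escapes markup so that embedded HTML or scripts are displayed as plain text and not executed.

```python
import html


def moderate(tweets, preferred_users=None, removed_ids=frozenset()):
    """Select and sanitise tweets for use as captions.

    tweets: list of dicts with hypothetical keys "id", "user" and "text".
    preferred_users: optional set of lower-case user names to keep;
        if empty or None, tweets from all users are kept.
    removed_ids: ids of tweets a moderator has chosen to exclude.
    """
    selected = []
    for tweet in tweets:
        if tweet["id"] in removed_ids:
            continue
        if preferred_users and tweet["user"].lower() not in preferred_users:
            continue
        # Escape <, > and & (and quotes) so markup cannot be injected.
        selected.append({**tweet, "text": html.escape(tweet["text"])})
    return selected


# Invented sample data, for illustration only.
stream = [
    {"id": 1, "user": "official_event_account", "text": "Talk starting now #event"},
    {"id": 2, "user": "someone_else", "text": "<script>alert(1)</script> #event"},
    {"id": 3, "user": "official_event_account", "text": "Posted in error #event"},
]
captions = moderate(
    stream, preferred_users={"official_event_account"}, removed_ids={3}
)
print(captions)
```

Escaping on output addresses only the markup and scripts problem; deciding which tweets are inappropriate still needs a human moderator or an agreed policy.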

4 Responses to “Issues In Crowd-sourced Twitter Captioning of Videos”

  1. Social comments and analytics for this post…

    This post was mentioned on Twitter by briankelly: Issues In Crowd-sourced Twitter Captioning of Videos post on use of Twitter captions of PM’s #bbdf video: http://bit.ly/9sXQIM

  2. Tony Hirst said

    Martin and I have chatted about the service, and I think there are several obvious ways forward for it:
    – provide hooks into twitter archive apis to obtain tweets;
    – provide a moderation form that will allow a moderator to select which tweets to include in the caption stream, or remove captions on request from Twitter users whose content appears in the stream;
    – provide a caption powered search tool that lets you search deep into a video using the twitter captions. We could probably get a patent if we said we’d invented a search tool that searches a user generated caption feed and then returns time stamps from 10s or so before the time of the captions returned from the search query (oops – that’s presumably not patentable now because it’s in the open, rather than because it’s bleedin’ obvious!)

    If any devs want to volunteer, or any funders want to underwrite a week or two of dev and designer time to develop this further, I think Martin and I would both love to chat to you:-)

  3. Crowdsourcing Captioning and Meaning of Accessibility…

    Crowd-sourcing Twitter Captioning of Videos: Brian Kelly has an interesting discussion going on in Issues In Crowd-sourced Twitter Captioning of Videos. He refers to an intriguing idea I had not heard about from Tony Hirst at OUsef…

  4. One of my frustrations when putting together the Gordon Brown’s Building Britain’s Digital Future announcement with twitter subtitles example was that the video was embedded within a custom flash player, which meant I couldn’t overlay subtitles within the player; the data wasn’t linked. It turns out HTML5 holds some interesting possibilities, particularly in how easy it is to enable users to select from different twitter streams. So you could have your official UKOLN tweets as well as offering the user other perspectives (selected experts, public timeline), with the user being able to switch mid timeline to a different feed. For a demo of this see Twitter subtitles on Vimeo using HTML5

    (An advantage of using the Vimeo service is that there is a lot less restriction on clip duration – ideal for conferences/lecture capture :-). You won’t be surprised, however, that there are compatibility issues with video codecs :-(

    As Tony says there is a lot more research to be done in this area.
