UK Web Focus

Reflections on the Web and Web 2.0

Archive for the ‘preservation’ Category

Are You Able?

Posted by Brian Kelly (UK Web Focus) on 17 February 2009

There were two invited keynote speakers who travelled from Europe to speak at the OzeWAI 2009 conference. As well as my talk (which I described recently), Dr. Eva M. Méndez (an Associate Professor in the Library and Information Science Department at the Universidad Carlos III de Madrid, and not the American actor!) gave a talk entitled “I say accessibility when I want to say availability: misunderstandings of the accessibility in the other part of the world (EU and Spain)”.

Eva’s research focuses on metadata and web standards, digital information systems and services, accessibility and the Semantic Web. She has also served as an independent expert in the evaluation and review of European projects since 2006, both for the eContentPlus program and the ICT (Information and Communication Technologies) program. Her talk was informed by her knowledge of the inner workings of such EU-funded development programmes.

Her talk explored the ways in which well-meaning policies may be agreed within the EU, yet be misinterpreted or misunderstood and fail to be implemented, even by the EU itself.

I don’t have access to Eva’s slides, so I will give my own interpretation of Eva’s talk.

We might expect the EU to support the development of a networked environment across EU countries across a range of areas. These areas might include:

Available: Have resources been digitised? Are they available via the Web?

Reusable: Are the resources available for use by others? Or are they trapped within a Web environment which makes reuse by others difficult?

Findable: Can the resources be easily found? Have SEO techniques been applied to allow the resource to be indexed by search engines such as Google?

Exploitable: Are the resources available for others to reuse through, for example, use of Creative Commons licences?

Usable: Are the resources available in a usable environment?

Accessible: Are the resources accessible to people with disabilities?

Preservable: Can the resources be preserved for use by future generations?

Since the acronym ARFEUAP isn’t particularly memorable (and ARE-U-API would be too contrived) we might describe this as the Able approach to digitisation. But there is one additional concept which I feel also needs to be included:

Feasible: Are the policies which are proposed (or perhaps mandated) feasible (or achievable)? We might ask are they actually possible (can we make all resources universally accessible to all?)  and can they be achieved with available budgets and with the standards and technologies which are currently available?

There is, of course, a question which tends to be forgotten: is the proposed service of interest to people and will it be used?

The worrying aspect of Eva’s talk was that the EU don’t appear to be asking such questions – or even using the same vocabulary. We need to have the bigger picture in order to address tensions between these different areas and the question (and power struggles) of how we prioritise achieving best practices – for example, should we be digitising resources, even if we can’t make them accessible; should we regard access by people with disabilities as being of greater importance than ensuring the resources can be preserved? And let’s not fudge the issue by suggesting that each is equally important and all can be achieved by use of open standards. That simply isn’t the case – and if you doubt this, ask managers of institutional repositories. They will probably say that they are addressing the available, reusable, findable, preservable and, perhaps, exploitable issues, but I suspect that they would also admit that many of the PDFs in their repositories are not accessible.

Posted in Accessibility, preservation, standards | 3 Comments »

Disappearing Resources On Institutional Web Sites

Posted by Brian Kelly (UK Web Focus) on 16 December 2008

I recently received the publisher’s proofs of an accessibility paper which will be published in the new year. The reviewers spotted a number of broken links in the references. Some of them were links to previous papers I had published, and the errors were introduced by the publisher (which I confirmed by checking the details of the paper which I submitted). But for a couple of other references the pages did seem to have disappeared. I contacted Stuart Smith, one of the co-authors, and asked him if he knew anything about the references he had supplied which seemed to have disappeared.

Stuart told me that a new e-learning team in his institution had rebuilt the e-learning Web site, resulting, it seems, in the loss of existing resources. Stuart wrote a blog post about this incident entitled “Mummy I lost my MP3!”. Stuart felt that “My MP3 problem shows to me that the argument that the ‘cloud’ is too unstable doesn’t hold water … because institutional systems are open to the same criticisms”. Stuart concluded that “My solution to my MP3 problem will probably lie in the ‘cloud’ I’ll find a suitable archiving host that I like and also keep a backup offline (like I should have done in the first place) and if that host disappears at least I will know about it”.

I’m sure Stuart isn’t alone. How many resources do you think will have disappeared following the establishment of new Web teams or the release of new software? Maybe institutional repositories will have a role to play, as they try to address the persistent identifier problem by at least decoupling the address of the resource from the technology used to access the resource. But repositories won’t be used to manage all resources on an institutional Web site, will they?

Since our institutions don’t seem to have yet cracked the problem of management of resources across changes in policies, staff and technologies, is Stuart right, I wonder, in regarding ‘the cloud’ (e.g. services such as the Internet Archive, perhaps) as the place (or one of the places) to deposit resources for safe-keeping? Or perhaps the question is whether such services may be more reliable than the institutional Web site. After all, if your own institution misplaces your resources, you can’t sue them, can you?

Posted in Web2.0, preservation | 1 Comment »

The Final JISC PoWR Workshop

Posted by Brian Kelly (UK Web Focus) on 29 August 2008

The final workshop organised by the JISC-funded Preservation of Web Resources (PoWR) project will take place at the University of Manchester on Friday 12th September 2008.

Now you may think that preservation is a pretty dull topic, compared with the exciting developments that are taking place in a Web 2.0 environment. And if that’s what you think, then you’re not alone. As Alison Wildish, head of Web Services at the University of Bath, described on the Web Services team blog:

We were asked by our colleagues at UKOLN (who organised the event) to deliver a brief talk detailing our approach to preserving web resources at the University. Our initial reaction was that we had little to say. Lizzie’s remit lies with the paper records and I am responsible for managing our website – ensuring it meets the needs of our users. Neither of us felt web preservation was something we had expertise in nor the time (and for me the inclination) to fully explore this.

And you can even listen to Alison and Lizzie Richmond (University of Bath records manager, archivist and FOI coordinator) expand on this by viewing the Slidecast of the talk they gave at the first JISC PoWR workshop:

If you listen to the end of the Slidecast you’ll hear Alison and Lizzie describing how they discovered in the course of the discussions reasons why Web preservation is a topic which needs to be treated seriously.

But how should one go about Web preservation? What should one preserve? What should one discard? What are the implications of use of Web 2.0 on preservation policies? Whose responsibility is this? What are the costs associated with preservation? And what are the costs and associated risks of not developing and implementing a preservation policy for your Web resources? And how does one ensure that an institutional preservation policy is sustainable and embedded within the institution?

These are some of the topics which have been raised on the JISC PoWR blog and will be discussed at the workshop. But hurry up and book your place, as the deadline for bookings is Friday 5th September. And note that the workshop is free to attend for members of the higher and further education community.

And finally I should point out that the case study given by Alison Wildish and Lizzie Richmond has been saved from being trapped in the non-interoperable world of the past, accessible only to Doctor Who (and even then only on a good day), by recording the talk, synching the recording with the slides and hosting this on Slideshare. You see, preservation can be enhanced through use of Web 2.0 services. Digital preservation can be cool – even though, arguably, it may kill the odd polar bear :-)

Posted in Web2.0, preservation | Leave a Comment »

Fahrenheit 451

Posted by Brian Kelly (UK Web Focus) on 15 August 2008

I recently attended the JISC’s Innovation Forum. One of the most interesting of the plenary talks was given by HEFCE’s John Selby. In his talk John praised the work of the JISC and the JISC Services, but went on to warn of troubled financial times ahead for the educational sector. The glory days of the past 10 years are over, he predicted.

This was probably not unexpected. What did surprise me, however, were the figures John quoted, which put the carbon cost of IT to the environment on a par with that of flying – both at 2%.

This generated much debate at the forum and, later on, at the conference meal and in the bar. Although people questioned the accuracy of these figures, and wanted to know how they were obtained, there was an awareness that the carbon cost of IT is an issue which the IT sector needs to address. I should add that I subsequently came across details of a forthcoming Government Goes Green conference in which Malcolm Wicks, Energy Minister, BERR was quoted as saying that:

ICT is now responsible for around 2% of global CO2 emissions. The public sector, with annual IT spending of £14bn, has an important role to play in reducing this two percent. An increased focus on sustainable procurement and efficient use of IT products are two key areas that it needs to work on and I am very pleased to see a conference dedicated on this.

At the JISC Innovation Forum dinner I found myself sitting next to colleagues from the Digital Curation Centre (DCC). I suggested, partly in jest, that although there was a clear need for continued development of networked services which are popular with the users, we had to ask ourselves whether the costs of preserving digital resources could be justified. If, as we learnt from Alison Wildish’s recent presentation at the first JISC PoWR workshop, those involved in Web development activities tend to focus on the pressing needs of their user communities and find it difficult to justify diverting scarce resources to preserving resources which are no longer of significant interest to the institution, why don’t we stop pushing the notion of digital preservation? Not only would this allow the development community to focus its efforts on responding to pressing user needs, but removing archived files from hard disk drives could also result in significant savings in energy.

This approach would then both help the users and help save the planet :-)

As I’ve said, this was intended as a joke, over our conference meal. But we realised that there may be benefits for the digital preservation community in making such suggestions. After all, preservation is widely considered as worthy but dull. If digital preservation was regarded as something radical, might it have a greater appeal to developers? Could those involved in digital preservation work – harvesting old Web sites and even implementing OAIS models – find themselves repositioned as members of an underground radical movement, secretly preserving digital artefacts for a society which regards such activities as unacceptable? Fahrenheit 451 for the 21st century, perhaps.

[Image: Save a Polar Bear campaign poster]

The following day when I suggested this, I was told that there have been discussions about strategies for digital preservation which acknowledge that there are environmental factors which need to be addressed. It seems that there have been proposals that such preservation activities should be based in places such as Greenland and Alaska where the low temperatures may reduce the need for consuming energy to keep the disk drives running at acceptable temperatures.

Now scientists may point out that running large scale server farms in locations near glaciers and the ice cap may increase the rate at which they melt. But the ideas which were bounced around at the event did make me wonder whether centralisation of networked services (e.g. running applications hosted by Google or Yahoo or running our applications on Amazon’s S3 and EC2 servers) would be more beneficial to the environment than all of our institutions running our own local servers.

And perhaps such discussion might be useful in a teaching context. Does data curation, for example, conflict with environmental protection? If so, should we forget it? Or could this approach result in deletion of the very data that could save the planet?

What do you think?

And if you’d like to take part in a viral marketing campaign which seeks to make digital preservation interesting by suggesting that it might be responsible for global warming, feel free to make use of the poster which has been produced. And note that a Creative Commons zero licence (currently in beta) has been assigned to this resource, so you don’t need to cite the original source. Let’s be part of an underground movement :-)

Posted in preservation | 14 Comments »

Places Still Available on “Preservation of Web Resources” Workshop

Posted by Brian Kelly (UK Web Focus) on 17 June 2008

I’ve previously mentioned the JISC Preservation of Web Resources (JISC-PoWR) project which is being provided by UKOLN and ULCC. The project has established a blog and will be running its first workshop, entitled Preservation of Web Resources: Making a Start, on Friday 27th June 2008 at Senate House, London.

The workshop is aimed at staff in the higher and further education sector with responsibilities for the preservation of institutional Web resources. The workshop will introduce the concept of Web preservation, and discuss the technological, institutional and legal challenges the preservation of Web resources presents. One aspect of Web site preservation might be keeping a history of changes to your institution’s home page. Do you have a digital record of the changes? And do you have a record of why significant changes were made and when? I have been working with colleagues in the University of Bath on ways in which we might address this particular issue. The following video clip, which is available on YouTube, illustrates some of the issues (although if the display is too small you might prefer to view the original resource):

There are still a number of places available on the workshop – which is free to attend for those in the higher and further education sector. But please sign up promptly if you are interested. The timetable is given below:

10:00 – 10:30 Registration and coffee

10:30 – 12:45 Morning Sessions:

  • Presentation: Preservation of Web Resources Part I
  • Breakout session: What are the Barriers to Web Resource Preservation?
  • Presentation: Challenges for Web Resource Preservation
  • Presentation: Legal issues

12:45 – 13:45 Lunch

13:45 – 16:00 Afternoon Sessions:

  • Presentation: Bath University Case Study
  • Breakout session: Preservation Scenarios
  • Presentation: Preservation of Web Resources Part II

16:00 End

Posted in preservation | Leave a Comment »

The SearchMe Visual Service

Posted by Brian Kelly (UK Web Focus) on 13 June 2008

A recent Tweet from Tony Hirst alerted me to the Searchme Visual Search service. An example of use of this service, searching for “UKWebFocus”, is illustrated below.

The Searchmevisual.com Service

As the name suggests, this service provides a visually-oriented approach to searching and, rather than attempting to describe this service, I suggest you try it.

I suspect that an initial response from some information professionals would be to highlight the limitations of such an interface, pointing out the difficulties of more advanced searching. However I feel that this would be to overlook the potential of this type of interface to provide browsing functionality. And this, indeed, was the use case made by Tony Hirst:

@briankelly would like a wayback machine browser for home pages over time. http://beta.searchme.com would look neat? Any libraries for it?

I met Tony at the recent CRIG DRY (Don’t Repeat Yourself) Metadata Barcamp held at the University of Bath. Over lunch I mentioned UKOLN’s JISC-PoWR (Preservation of Web Resources) project and described my interest in ways of exploiting content held in the Internet Archive’s WayBack Machine. I suggested that a generic screen-scraping interface to the service would be useful – and when I returned to the Barcamp later that afternoon Tony demonstrated the first version of the software :-) And the following day Tony had started to explore ways of providing a richer user interface to such data. A browse interface such as that used by Searchme Visual Search could potentially provide a very engaging way of visualising the changes to an organisation’s home page, I would think. And wouldn’t it be great if this could be demonstrated at the JISC-PoWR’s opening workshop on 27 June 2008? Has anyone come across any tools which could do this?
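As a rough sketch of the kind of generic interface discussed above, the following Python builds a query for the Internet Archive's CDX listing service and reduces the response to one capture per year, which could then feed a visual browser of home page changes. This is an assumption-laden illustration and not the software Tony demonstrated: it queries the CDX listing rather than screen-scraping, and the function names, the example host and the sample rows are all hypothetical.

```python
import json
from urllib.parse import urlencode

# Listing endpoint of the Wayback Machine's CDX service.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url, from_year, to_year):
    """Return a URL asking for successful captures of page_url, as JSON."""
    params = {
        "url": page_url,
        "output": "json",
        "from": str(from_year),
        "to": str(to_year),
        "fl": "timestamp,original",   # only the fields used below
        "filter": "statuscode:200",   # skip captures of error pages
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def first_capture_per_year(cdx_json_text):
    """Map each year to the replay URL of its earliest capture.

    The JSON output is a list of rows whose first row names the fields.
    """
    rows = json.loads(cdx_json_text)
    if not rows:
        return {}
    header, captures = rows[0], rows[1:]
    ts = header.index("timestamp")
    orig = header.index("original")
    per_year = {}
    for row in sorted(captures, key=lambda r: r[ts]):
        year = row[ts][:4]
        replay = "http://web.archive.org/web/%s/%s" % (row[ts], row[orig])
        per_year.setdefault(year, replay)  # keep only the earliest per year
    return per_year


# Hypothetical response for an imaginary institution's home page.
sample = json.dumps([
    ["timestamp", "original"],
    ["20020926120000", "http://www.example.ac.uk/"],
    ["20020101000000", "http://www.example.ac.uk/"],
    ["20030201000000", "http://www.example.ac.uk/"],
])
for year, link in first_capture_per_year(sample).items():
    print(year, link)
```

Fetching the URL returned by build_cdx_query and passing the response text to first_capture_per_year would give a small timeline of replay links; how best to display these is a separate question.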

Posted in Web2.0, preservation | 4 Comments »

Preservation of Web Resources: Making a Start

Posted by Brian Kelly (UK Web Focus) on 4 June 2008

My colleague Marieke Guy together with the JISC-PoWR project partners at ULCC have announced details of a workshop on “Preservation of Web Resources: Making a Start” – this one-day workshop will take place on Friday 27th June 2008 at the Senate House Library, University of London.

The JISC-PoWR project runs until the end of September 2008 and will run three workshops which will aim to identify best practices for preserving Web sites. The key deliverable of the project will be a handbook which will document the challenges to be addressed in Web site preservation in a number of areas which will include key institutional Web services (e.g. the prospectus), project Web sites (which have clear termination dates) and, a particular challenge for the project, the preservation issues associated with use of Web 2.0 services.

The first workshop will be free to attend (although there will be a penalty for no-shows), with the second workshop being held as part of the IWMW 2008 event at the University of Aberdeen on 23rd July.

Please sign up now if you would like to attend. And if you can’t make it but have an interest in the preservation of Web resources, why not subscribe to the JISC-PoWR blog – and, rather than being a passive reader, join in the discussions. Topics we’d be interested in hearing about include (a) how institutions are currently addressing the preservation of key institutional Web-based services (such as the prospectus); (b) the approaches you may be taking to short-term project Web sites (whether JISC-funded or institutionally-funded) and (c) your views on the preservation of data and services provided by externally-hosted Web 2.0 services.

Posted in Events, preservation | Leave a Comment »

Preserving The Past Can Help The Future

Posted by Brian Kelly (UK Web Focus) on 21 May 2008

Many of the posts featured in this blog describe innovative tools and applications which aim to provide a more effective work or study environment for users. However there can be a danger that an emphasis on new and innovative services can mean a failure to manage legacy services which can result in a loss of our experiences, history and culture.

This can be particularly true in the Web environment. I first became aware of the scale of the problem when I monitored the Web sites which had been set up for projects funded by the EU’s Telematics For Libraries programme. As I described in an article on WebWatching Telematics For Libraries Project Web Sites published in the Exploit Interactive e-journal in October 2000, of the 65 projects which had Web sites, a total of 23 of the Web sites had disappeared when I carried out the survey. And a recent check shows that at least 39 of the Web sites have gone. Our digital history, the associated learning and the investment (from EU taxpayers) is being lost!
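To put those survey figures in proportion, here is a small Python calculation of the share of the 65 project Web sites that had gone at each check. The function name and the labels are my own; the counts are the ones quoted above.

```python
def loss_rate(sites_gone, sites_surveyed):
    """Percentage of surveyed Web sites that have disappeared, to 1 d.p."""
    if sites_surveyed <= 0:
        raise ValueError("sites_surveyed must be positive")
    return round(100.0 * sites_gone / sites_surveyed, 1)


SITES_SURVEYED = 65  # projects which had Web sites

checks = [
    ("original survey", 23),
    ("recent check", 39),  # "at least 39", so this is a lower bound
]

for label, gone in checks:
    print("%s: %d of %d gone (%.1f%%)"
          % (label, gone, SITES_SURVEYED, loss_rate(gone, SITES_SURVEYED)))
# original survey: 23 of 65 gone (35.4%)
# recent check: 39 of 65 gone (60.0%)
```

In other words, roughly a third of the sites had gone at the time of the original survey, and at least three in five have gone now.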

Or is it? Is this assertion just being alarmist? Might not the information have been migrated to a more manageable environment? And perhaps some of the projects are now available, possibly under new names, as sustainable services?

There’s a clear need for these issues to be addressed and for advice to be provided – both to organisations responsible for managing their own Web services and to funding bodies which commission development work which will involve the development of Web sites.

JISC have recognised the need to provide such advice. They recently issued an ITT on “The Preservation of Web Resources Workshops and Handbook” and I’m pleased to report that a joint bid by UKOLN and ULCC was successful. The project, which had its launch meeting on 1 May 2008, will run three workshops which will aim to gain a better understanding of the challenges to be faced in Web site preservation, identify examples of best practices and provide a set of recommendations to policy makers, content providers and developers. This will be documented in a handbook which should be available after September 2008.

Although the project is only funded for 5 months it will seek to provide advice not only on conventional institutional Web sites, but also on use of third party Web 2.0 services – the potential benefits of such services are well-understood, but there needs to be a better understanding of the risks associated with their use and how institutions should assess such risks and use such assessments to inform policy.

[Image: JISC PoWR Blog]

The project team members themselves are using a variety of Web 2.0 tools to support their work. As well as communications technologies (beyond email) to support the work of the distributed team members, a blog is also being used to disseminate information about the project and to solicit feedback and encourage discussion and debate. The JISC-PoWR (Preservation of Web Resources) blog (illustrated) is hosted on the JISC Involve blog service.

The team would like to welcome those with an interest in Web site preservation to join the blog and contribute to the discussions.

Posted in preservation | 1 Comment »

Disappearing Public Sector Web Sites

Posted by Brian Kelly (UK Web Focus) on 31 March 2008

I recently used the Intute service to see what records it held about UKOLN’s activities. I found a record about the ‘Crossroads West Midlands’ service, for which UKOLN provided technical advice on the design of the collection description database:

This is the website of ‘Crossroads West Midlands’, a Resource funded project that is working to develop online access to the collections of libraries, museums and archives in the West Midlands (including universities and local authorities as well as private institutions). The Crossroads website is currently a prototype, testing a database built upon the RSLP collection level description database, covering the collections relating to the potteries industry of North Staffordshire.

The record provides additional information about the service which reminded me about the meetings I attended several years ago about this project. I was interested to see what the Crossroads West Midlands service now looks like, so I followed the link to the http://www.crossroads-wm.org.uk/ address – and, rather than a service providing access to a database of cultural heritage resources in the West Midlands, I found a page full of links to services such as golf, gambling, estate agents, motor insurance, etc.

[Image: Crossroads West Midlands Web Site]

Clearly at some point the domain name for the original service had lapsed and was purchased by a company which used it to host advertisements and links to companies which would be willing to advertise in this way (or possibly companies wishing to enhance their search engine ranking may have procured the services of a Search Engine Optimisation service and might not be aware of the approaches taken).

I was interested in the history of the Web site. Using the Internet Archive I discovered that the Web site was first archived on 26 September 2002. At this point the information in the archive contained details about the project. The service itself was first launched around February 2003. And the service disappeared, to be replaced by an advertisement site, at some point between December 2005 and April 2006.

What happened? Did project funding run out? Did key staff leave? Or was there a blunder, with nobody receiving the email requesting renewal of the domain name?

Whatever the reason, the Crossroads West Midlands service has disappeared from sight. Is this inevitable? Well, back in 1999 I was the project manager for the Exploit Interactive e-journal – an EU-funded project which ran until 2000. Once the funding had finished we had to decide what would happen with the domain name. We agreed to continue paying for the domain for at least 3 years after the project funding had ceased and would try to keep the domain for a period of 10 years. This policy was informed by a survey I carried out of project Web sites funded by the EU’s Telematics for Libraries programme. As I described in an article published in Exploit Interactive in October 2000, 23 Web sites had disappeared of the 103 projects funded.

We are seeing a disappearance of cultural resources and EU-funded projects from the digital environment. And this may well get worse, if the UK Government’s policy of centralising its Web sites, which will result in 551 Web sites being closed down, is not managed properly. Will we, for example, find that the Drugdrive Web site at http://www.drugdrive.com/ suddenly becomes a site used for selling drugs?

What is to be done? The good news is that the Government does seem to be handling its redirects properly – the Drugdrive Web site, for example, is redirected to http://www.drugdrive.com/
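One simple way of checking whether a retired address is being handled like this is to request it without following redirects and look at the status code and Location header that come back. The Python below is only a sketch under my own assumptions: the function names and category labels are hypothetical, it handles plain http addresses only, and a lapsed domain would show up as a connection error rather than a tidy result.

```python
import http.client
from urllib.parse import urlsplit

PERMANENT = {301, 308}        # "this address has moved for good"
TEMPORARY = {302, 303, 307}   # "look elsewhere for now"


def classify_response(status, location):
    """Summarise how an old address answers a request."""
    if status in PERMANENT and location:
        return "permanent redirect"
    if status in TEMPORARY and location:
        return "temporary redirect"
    if 200 <= status < 300:
        return "still served"
    if status in (404, 410):
        return "gone"
    return "other"


def check_old_address(url, timeout=10):
    """Send a HEAD request to an http:// URL without following redirects.

    http.client never follows redirects itself, so the status seen here
    is the one the old address actually returns. Raises OSError if the
    host cannot be reached (for example, if the domain has lapsed).
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return classify_response(response.status,
                                 response.getheader("Location"))
    finally:
        conn.close()


# Offline examples of the classification (no network access needed).
print(classify_response(301, "http://new.example.gov.uk/"))  # permanent redirect
print(classify_response(404, None))                          # gone
```

A "permanent redirect" result for each retired site is what one would hope to see; "still served" on a domain the organisation no longer controls is the worrying case, and would need a look at the content itself.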

Well done, the UK Government. But what about the rest of us? Are we managing the closure of Web sites? And are we assessing the risks of failing to do this? After all, if the domain name of a government Web site on protection of children from dangers on the Internet became available and was bought by a pornography site, we could well see a government minister being forced to resign.

Posted in preservation | 3 Comments »