Link Checking For Old Web Sites
Posted by Brian Kelly (UK Web Focus) on 4 January 2011
Web sites rot. Over time, increasing numbers of links to external resources will break, and functionality provided within the Web site itself may also stop working. This can be a problem if Web sites are still being used but are no longer maintained. But what should be done?
From 1999-2000 UKOLN was a member of the EU-funded EXPLOIT project and provided the Exploit Interactive Web magazine. This was followed, from 2000-2003, by the Cultivate Interactive Web magazine. Since the funding ceased a link check of the Web sites has been carried out annually, with the findings published and summaries of any problems documented. Only internal links are checked. The surveys helped us to identify and fix a number of problems which occurred when the Web site was migrated from a Windows NT service to an Apache server running on a Unix box. We have also observed a small number of broken links to third party Web site usage services, as illustrated below.
Running the annual link check and documenting the findings takes about 10 minutes. The Exploit Interactive and Cultivate Interactive Web sites are technically quite simple, with little integration with third party services. However, as Web sites increasingly make use of content and services provided by third parties there are dangers that such dependencies will cause problems. So perhaps auditing of such services, including project Web sites which are no longer being funded, will become increasingly important.
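For readers who want to try something similar, a check restricted to internal links could be sketched roughly as follows. This is not the tool used for the Exploit Interactive and Cultivate Interactive surveys; the names `LinkCollector`, `internal_links` and `status_of` are made up for illustration, and it uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def internal_links(page_url, html):
    """Return absolute, fragment-free links on the same host as page_url."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for raw in collector.links:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, raw))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return sorted(found)


def status_of(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar problems.
        return None
```

Calling `internal_links` on each page and then `status_of` on each result gives a list of broken internal links; external links and `mailto:` addresses are deliberately skipped. Some servers do not answer `HEAD` requests properly, so a real check might need to fall back to `GET`.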
Alternatively you could argue that after a period of time such Web sites should be deleted. We recommended to the EU that project Web sites should be expected to continue to be hosted for at least three years after the funding had expired. We also suggested that this should be a minimum and that organisations should try to continue to host such Web sites for ten years after the funding has finished. Since the final issue of the Exploit Interactive ejournal was published in October 2000 we have achieved that goal. Should we now delete the Web site? Doing so might save ten minutes a year in checking that the Web site is still functioning, but would mean that articles on a number of EU-funded projects would be lost, including the following which were published in the final issue:
- ELVIL 2000: Ingrid Cartwell and Magnus Enzell introduce the prototype for the ELVIL 2000 Project, an Academic Portal for European Law and Politics.
- EQUINOX: Following on from an earlier article in Exploit Interactive, Monica Brinkley provides an update on the EQUINOX project, a Library Performance Measurement and Quality Management System.
- ILSES: Meinhard Moschner and Repke de Vries describe the development of a specialised networked digital library which integrates publication retrieval and survey data extraction.
- LIBECON 2000: David Fuegi, John Sumsion and Phillip Ramsdale discuss the LIBECON2000 Project and its Millennium Report.
- TECUP: Paul Greenwood and Martina Lange-Rein on TECUP, a meta project which analyses practical mechanisms for rights acquisition for the distribution, archiving and use of electronic products.
- VERITY: Alexandra Papazoglou gives a final report on Project Verity: Virtual and Electronic Resources for Information skills Training for Young people.
I can’t help but feel that the Web site should continue to be hosted. But what should the general policy be for project Web sites? What are others doing for project Web sites whose funding may have ceased ten years ago or five years ago or even more recently?
Note: Coincidentally, after publishing this post I received an email containing details of the uptime for the Exploit Interactive and Cultivate Interactive Web sites. I receive an automated email if the Web sites are not available and also receive weekly reports on the server availability, as illustrated below. Another approach to consider for legacy Web sites?
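As a rough illustration of what such monitoring involves (the monitoring service I use is not shown here, and the names `Probe`, `weekly_report` and `sites_needing_alert` are hypothetical), the two kinds of message could be derived from a log of periodic availability probes along these lines:

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One availability check: which site was probed and whether it responded."""
    site: str
    ok: bool


def weekly_report(probes):
    """Return {site: percentage of successful probes}, rounded to 2 places."""
    totals = {}
    for probe in probes:
        up, count = totals.get(probe.site, (0, 0))
        totals[probe.site] = (up + (1 if probe.ok else 0), count + 1)
    return {
        site: round(100.0 * up / count, 2)
        for site, (up, count) in totals.items()
    }


def sites_needing_alert(probes):
    """Return the sites whose most recent probe failed, in sorted order."""
    latest = {}
    for probe in probes:
        # Later probes overwrite earlier ones, leaving the latest result.
        latest[probe.site] = probe.ok
    return sorted(site for site, ok in latest.items() if not ok)
```

The probes are assumed to be listed in chronological order. The result of `weekly_report` would feed the weekly summary, and anything returned by `sites_needing_alert` would trigger an immediate email.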