Monitoring Web Server Usage Across A Community
Posted by Brian Kelly on 19 June 2007
How many public Web servers are there at your University? And how have the numbers changed over the past 5 years? Are you running more servers, as the range of services you provide grows, or have the numbers of servers decreased due to rationalisation in order to avoid duplication of effort across the institution?
I published an article on A Survey Of Numbers of UK University Web Servers in June 2000, with a follow-up article on An Update Of A Survey Of The Numbers of UK University Web Servers which was published in March 2002.
The survey was carried out using the online Netcraft service, with a wildcard (for example, *.ox.ac.uk) used to obtain the number of Web servers in a domain, in this case that of the University of Oxford. This process was repeated manually for all (~160) UK HEIs. A histogram for the results of the 2002 survey is illustrated.
How have things changed in the past 5 years? It would be possible to repeat the manual survey – as can be seen, the online Netcraft survey service is still available.
However, in a Web 2.0 environment in which many lightweight Web-based tools are available, it would not be sensible to repeat the manual methodology. It strikes me that the Netcraft results page is well-suited to screen-scraping: immediately after the “Results for *.ox.ac.uk” text is a line which says “Found 356 sites”. So while this interface remains, the data can be programmatically extracted, stored and displayed, possibly in a graphical format.
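To give a flavour of how little code the extraction step might need, here is a minimal Python sketch. It assumes the results page still contains a line worded like the “Found 356 sites” example above; the function name and the regular expression are my own illustration, not anything provided by Netcraft, and fetching the page itself is deliberately left out.

```python
import re
from typing import Optional

# Assumed wording of the results line, based on the "Found 356 sites" example.
# If the page layout changes, this pattern will need to change too.
FOUND_PATTERN = re.compile(r"Found\s+([\d,]+)\s+sites?", re.IGNORECASE)


def extract_site_count(page_text: str) -> Optional[int]:
    """Return the number from the first 'Found N sites' line, or None if absent."""
    match = FOUND_PATTERN.search(page_text)
    if match is None:
        return None
    # Allow for thousands separators such as "1,024".
    return int(match.group(1).replace(",", ""))


# Stand-in for the text of a results page (not fetched from the live service).
sample = "Results for *.ox.ac.uk\nFound 356 sites\n"
print(extract_site_count(sample))
```

Returning None rather than raising an error when the line is missing makes it easier to spot, in a daily run, that the interface has changed.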
The Dapper application could, perhaps, be used for this purpose. After all, as I’ve described previously, Dapper has been used to create Blotter, which scrapes Technorati ranking data on a daily basis, stores this data and displays the trends graphically.
But rather than doing this myself, I’d like to suggest that this might be a suitable example for the IWMW 2007 Innovation Competition: it should be lightweight and user-focussed (providing data which can detect trends across the community). It could be possible to provide an interface for a user to supply their own domain name, although another approach might be to take the domain names for the community (or perhaps a regional subset of the community) and display variations across the community. That, I think, would be cool (and ‘coolness’ is one of the criteria for the competition).
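As a rough idea of what displaying variations across the community could look like, here is a small Python sketch that turns a set of per-domain counts into a text bar chart. The function name is hypothetical, and apart from the 356 figure for ox.ac.uk quoted above, the domains and numbers are made-up placeholders rather than survey results.

```python
from typing import Dict, List


def text_bar_chart(counts: Dict[str, int], width: int = 40) -> List[str]:
    """Return one line per domain, largest count first, with a scaled bar of '#'."""
    if not counts:
        return []
    largest = max(counts.values())
    label_width = max(len(domain) for domain in counts)
    lines = []
    for domain, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        # Scale each bar relative to the largest count in the set.
        bar_len = round(count / largest * width) if largest > 0 else 0
        lines.append(f"{domain.ljust(label_width)} {'#' * bar_len} {count}")
    return lines


# Placeholder data: only the ox.ac.uk figure comes from the post;
# the other two entries are invented purely for illustration.
example_counts = {
    "ox.ac.uk": 356,
    "example-one.ac.uk": 120,
    "example-two.ac.uk": 45,
}
for line in text_bar_chart(example_counts):
    print(line)
```

A competition entry would presumably want something prettier than rows of hash characters, but the same dictionary of counts could feed whatever charting tool is to hand.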