When I chaired the session on Search at the Museums and the Web 2008 conference, the discussion, as I described in a recent post, turned to lightweight approaches to federated searching. During the session I received a Twitter comment on my feedback channel (intermingled with the football scores!) asking “is it more useful to develop compelling browse interfaces & leave search to Google?” The response at the time seemed to be that although Google might have a role to play in the future, its role at present is limited (in a museum context) due to the complexities of typical collections management Web interfaces: the valuable data is part of the ‘deep Web’, which search engines such as Google find difficult to index.
But just a few days ago, via a comment made by Nate Solas on his blog post about the Search session, I discovered that Google have announced their intention to index the deep Web:
This experiment is part of Google’s broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines. The terms Deep Web, Hidden Web, or Invisible Web have been used collectively to refer to such content that has so far been invisible to search engine users. By crawling using HTML forms (and abiding by robots.txt), we are able to lead search engine users to documents that would otherwise not be easily found in search engines, and provide webmasters and users alike with a better and more comprehensive search experience.
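To make the “abiding by robots.txt” point concrete, here is a minimal sketch (in Python, and emphatically not Google’s actual crawler code) of how a form-crawling robot might check whether a record page reached via a query string may be fetched. The museum URL and the robots.txt rules are invented purely for illustration.

```python
# A rough sketch of the robots.txt check described in Google's announcement,
# using the standard library robot parser. Not a real crawler.
from urllib import robotparser

# Hypothetical robots.txt for a museum site (illustrative only)
sample_robots_txt = """
User-agent: *
Disallow: /admin/
Allow: /collections/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(sample_robots_txt)

# A record page normally reached by submitting an HTML search form
record_url = "http://museum.example.org/collections/search?object_id=1234"

if rp.can_fetch("*", record_url):
    print("A crawler honouring robots.txt may fetch:", record_url)
else:
    print("Disallowed by robots.txt:", record_url)
```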
Mia Ridge has commented on the implications of this announcement:
You’re probably already well indexed if you have a browsable interface that leads to every single one of your collection records and images and whatever; but if you’ve got any content that was hidden behind a search form (and I know we have some in older sites), this could give it much greater visibility.
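For sites in that position, the usual remedy is to expose every record at its own crawlable URL, for example via a Sitemap-style listing. The sketch below, which assumes hypothetical record identifiers and a hypothetical URL pattern, shows roughly what such a machine-readable list of record URLs looks like.

```python
# A minimal sketch of generating a Sitemap-style list of collection record
# URLs, so records are reachable by crawlers rather than hidden behind a
# search form. Identifiers and the base URL are placeholders.
record_ids = [101, 102, 103]
base = "http://museum.example.org/collections/record/"

urls = "\n".join(
    f"  <url><loc>{base}{record_id}</loc></url>" for record_id in record_ids
)
sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    f"{urls}\n"
    "</urlset>"
)
print(sitemap)
```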
In light of Google’s announcement it is timely, I would think, to revisit the question “Is it more useful to develop compelling browse interfaces & leave search to Google?” Imagine the quality of services we could provide if we redirected resources away from replicating search algorithms which have already been developed (“standing on the shoulders of giants”).
And let’s remember (a) the evidence which suggests that users prefer simple search interfaces and (b) the costs of attempting to compete with Google in the search area – let’s not forget that, despite their riches, Microsoft haven’t been able to compete successfully. Is it likely that search technologies developed with tax-payers’ money will succeed where Microsoft have failed?
PS I should probably add that I’m not the first to suggest this idea. The OpenDOAR team, in particular, have deployed a search interface using Google across institutional repository services. Many congratulations to the team at the University of Nottingham for evaluating this lightweight approach.
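For anyone wanting to experiment with this kind of lightweight approach, the simplest variant is to hand the user’s query straight to Google, restricted to one’s own domain with the site: operator. A rough sketch, with a made-up repository domain, follows; the OpenDOAR service itself may well do something more sophisticated.

```python
# A minimal sketch of a "leave search to Google" query: build a Google search
# URL restricted to a single (hypothetical) repository domain.
from urllib.parse import urlencode

def site_restricted_search_url(query: str, domain: str) -> str:
    """Build a Google search URL limited to one site via the site: operator."""
    return "https://www.google.com/search?" + urlencode(
        {"q": f"{query} site:{domain}"}
    )

print(site_restricted_search_url("open access", "repository.example.ac.uk"))
```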