Should We “Leave Search To Google?”

21 Apr 2008

When I chaired the session on Search at the Museums and the Web 2008 conference, the discussion, as I described in a recent post, turned to lightweight approaches to federated searching. During the session I received a Twitter comment on my feedback channel (intermingled with the football scores!) asking “is it more useful to develop compelling browse interfaces & leave search to Google?” The response at the time seemed to be that although Google might have a role to play in the future, its role at present is limited (in a museums context) due to the complexities of typical collections management Web interfaces: the valuable data is part of the ‘deep Web’ which search engines such as Google find difficult to index.

But just a few days ago, via a comment made by Nate Solas on his blog post about the Search session, I discovered that Google have announced their intention to index the deep Web:

This experiment is part of Google’s broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines. The terms Deep Web, Hidden Web, or Invisible Web have been used collectively to refer to such content that has so far been invisible to search engine users. By crawling using HTML forms (and abiding by robots.txt), we are able to lead search engine users to documents that would otherwise not be easily found in search engines, and provide webmasters and users alike with a better and more comprehensive search experience.

Mia Ridge has commented on the implications of this announcement:

You’re probably already well indexed if you have a browsable interface that leads to every single one of your collection records and images and whatever; but if you’ve got any content that was hidden behind a search form (and I know we have some in older sites), this could give it much greater visibility.
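As a rough illustration of the robots.txt point in Google’s announcement, here is a minimal sketch (my own, not anything Google has published) of how a well-behaved crawler might check whether a search-form URL may be fetched. It uses Python’s standard urllib.robotparser module; the robots.txt rules and the museum.example URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary museum site (not a real policy):
# everything is crawlable except URLs under /search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def may_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A URL produced by submitting a search form, and a plain browsable record page.
form_url = "http://museum.example/search?q=vase"
record_url = "http://museum.example/collection/record/42"

print(may_crawl(ROBOTS_TXT, form_url))
print(may_crawl(ROBOTS_TXT, record_url))
```

With rules like these the form results stay out of bounds while the browsable record pages remain open to indexing, so a site owner keeps control over what a form-submitting crawler is allowed to reach.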

In light of Google’s announcement it is timely, I would think, to revisit the question “is it more useful to develop compelling browse interfaces & leave search to Google?” Imagine the quality of services we could provide if we redirected resources away from replicating search algorithms which have already been developed (“standing on the shoulders of giants”).

And let’s remember (a) the evidence which suggests that users prefer simple search interfaces and (b) the costs of attempting to compete with Google in the search area – let’s not forget that, despite their riches, Microsoft haven’t been able to compete successfully. Is it likely that search technologies funded by taxpayers’ money will succeed where Microsoft have failed?

PS I should probably add that I’m not the first to suggest this idea. The OpenDOAR team, in particular, have deployed a search interface using Google across institutional repository services. Many congratulations to the team at the University of Nottingham for evaluating this lightweight approach.

The Search Session At MW 2008

14 Apr 2008

On the final day of the Museums and the Web 2008 conference (Saturday!) I chaired a session on Search. There were only two papers presented at this session – and as the session was scheduled to last from 11.00-12.30, both of the speakers were happy for the session to provide an opportunity for general discussion after the papers had been presented.

Terry Makewell’s paper was entitled “The National Museums Online Learning Project Federated Collections Search: Searching Across Museum And Gallery Collections In An Integrated Fashion”. As described in a blog post by Nate Solas, the paper described the approaches to federated search being taken by 9 partner organisations in the UK. The two search technologies described were OAI-PMH and OpenSearch – and a decision was made to use OpenSearch, due to its simplicity, the short timescales and the limited technical expertise and resources available to some of the partners.
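To give a feel for why OpenSearch counts as the simple option, here is a minimal sketch of the core idea: each partner publishes a URL template containing a {searchTerms} placeholder, and a federated search client fills in the same query for every partner. This is my own illustration, not the project’s code; the partner names and URLs are invented, and the sketch ignores prefixed (namespaced) parameters and the description document itself.

```python
import re
from urllib.parse import quote_plus

# Hypothetical URL templates for two imaginary partners (not real endpoints).
PARTNER_TEMPLATES = {
    "museum-a": "http://museum-a.example/search?q={searchTerms}&page={startPage?}",
    "gallery-b": "http://gallery-b.example/opensearch?terms={searchTerms}",
}


def fill_template(template: str, search_terms: str, **params: object) -> str:
    """Fill an OpenSearch-style URL template with URL-encoded values.

    Placeholders look like {name}; a trailing '?' marks an optional
    parameter, which is replaced by an empty string when no value is given.
    """
    values = {"searchTerms": search_terms, **params}

    def substitute(match: re.Match) -> str:
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote_plus(str(values[name]))
        if optional:
            return ""
        raise KeyError(f"missing required template parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


def federated_urls(search_terms: str) -> dict:
    """Build one search URL per partner for the same query."""
    return {
        partner: fill_template(template, search_terms)
        for partner, template in PARTNER_TEMPLATES.items()
    }


for partner, url in federated_urls("roman coins").items():
    print(partner, url)
```

A real client would then fetch each URL and merge the returned feeds, but even this fragment shows how little a partner has to agree to: a URL pattern and a results feed.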

Following Terry’s talk Johan Møhlenfeldt Jensen, Museum of Copenhagen, Denmark presented a paper on “Approaches To Presentation Of Cultural Heritage Information In The ALM-Area In Denmark And Scandinavia”. This paper complemented Terry’s paper nicely, and highlighted some of the challenges posed by federated search including the differing cultures across the archives, libraries and museums domains and the differing cultures across the Scandinavian countries.

The discussions afterwards focussed on whether a simple approach to federated search would be sufficient. Mike Ellis asked Terry whether use of Google search technologies, such as Google Coop, had been considered. It seems it had, but was ruled out due to the complexities posed by use of session IDs on some of the collections. In a subsequent tweet on the Twitter back-channel Mike pointed out his experimentation with Google Coop across a number of museums – and this was briefly tested by the two speakers after the session had concluded (as an aside I should note that this was the only relevant tweet received during the session – however Terry and I were also interested in the football scores which I receive on my Twitter account, including the flurry of goals conceded by Derby County!).

The discussion on simplicity versus sophistication led to discussions on the user experience. Following a question on evidence of use of advanced search capabilities, data from an Australian example showed that a very low percentage of users (1%, I think) accessed an advanced search capability – and, indeed, most users submitted only a single search term! I pointed out that the importance of simple interfaces was likely to grow as use of mobile devices became more popular – a comment that was particularly pertinent to the MW 2008 conference, as the WiFi access problems conference delegates had experienced the previous day were apparently due to the large numbers of network users who were using an iPhone or Nokia N95.

There was a feeling, I think, that federated search may, in the future, be provided by mainstream commodity products – and, indeed, as collections management tools evolve and start to provide static URIs, the benefits of solutions such as Google Coop may become even more apparent.

Will there, I wonder, be a session on federated search at future MW conferences or will this area, like institutional search, be addressed by mainstream solutions?

Reflecting On Openness and the Semantic Web

12 Apr 2008

The printed copy of the proceedings of the Museums and the Web 2008 conference divides the papers into four sections: Institutions, User Participation, Web Space and Reflecting. The concluding section, on Reflecting, contains only two papers: one on Semantic Dissonance: Do We Need (And Do We Understand) The Semantic Web? by Ross Parry (University of Leicester), Nick Poole (The Collections Trust) and Jon Pratty (Culture 24) and my paper on What Does Openness Mean To The Museum Community?, co-authored by Mike Ellis (Eduserv) and Ross Gardler (JISC OSS Watch), which I’ve posted about recently.

It is pleasing that the two papers which reflect on the challenges and opportunities posed by recent Web developments have been written by a combination of researchers and practitioners based in the UK.

Ross Parry’s paper is based on a series of workshops funded by the AHRC which were held at various locations in the UK during 2006 and 2007. The paper describes discussions which have taken place recently in the UK in which it has been suggested that “museum data with good URIs, consistent metadata and simple tagging are seen to provide a vitally stable infrastructure on which to build”.

To this list I would add the importance of providing data which is free from restrictive licence conditions and which is exposed for reuse by other applications which can exploit the rich semantic data.

But stable URIs, consistent metadata, simple tagging, open data and machine interfaces – isn’t this what Web 2.0 is about? From one perspective, people may regard Web 2.0 as shorthand for referring to blog, wiki and RSS applications. But Tim O’Reilly’s original Web 2.0 diagram makes it clear that Web 2.0 is broader than this.

In a chapter entitled ‘“If it quacks like a duck…” – developments in search technologies’ in a recent Becta Research Report on Emerging Technologies for Learning Volume 3 (2008) (PDF version of chapter), my colleague Emma Tonkin argues that:

By “semantic”, Berners-Lee means nothing more than “machine processable”. The choice of nomenclature is a primary cause of confusion on both sides of the debate. It is unfortunate that the effort was not named “the machine processable web” instead.

I think Emma is right: the term Semantic Web has caused much confusion. But if the Semantic Web is really a machine processable Web in which clean URIs can help to provide programmatic access to structured data, then isn’t this very close to what Web 2.0 may be considered to be about?

And can you claim to be in favour of the Semantic Web if you are critical of the architectural aspects of Web 2.0? Or, to put it another way, isn’t engagement with Web 2.0 a needed stepping stone towards the Semantic Web? And won’t we find that those who come out with reasons for not engaging with Web 2.0, will come out with a similar set of reasons for not engaging with the Semantic Web?

What Does Openness Mean To Your Community?

9 Apr 2008

Mike Ellis (Eduserv), Ross Gardler (JISC OSS Watch) and I are the co-authors of a paper on “What Does Openness Mean To The Museum Community?” which has been accepted for the Museums and the Web 2008 conference. And I’m pleased that the response of David Bearman (conference co-chair) when he read the paper was that it should be discussed in a Professional Forum at the conference. Indeed David’s comment on the paper was “it sounds like it could be the most amazing session at MW this year” :-)

The paper suggests that openness can include open standards, open source, open APIs, open access and an open culture (i.e. a willingness to encourage user-generated content). But the paper also acknowledges that there is a downside to each of these aspects. Some of these concerns were raised by Nick Poole, Chief Executive of the MDA, in a thread on “The speculative aspect of using Web 2” on the MCG JISCMail list. Nick commented:

… ‘how can you be so naïve’? Low cost of entry? We were promised that with Open Source Software and it turned out to be no cheaper. Reaching audiences while we sleep? They told us Z39.50 and interoperability would solve that and we’re still not there. Content Management will make everyone a publisher? You just try and get a username and password out of the Council IT Admin.

I’m pleased that Nick raised such concerns. He’s right when he suggests that the potential benefits of both open source and open standards have been over-hyped. And, similarly, the benefits of Web 2.0 can also be exaggerated. But my response to the concerns raised by Nick is to argue that we need to develop more sophisticated ways of engaging with these aspects of openness. Just because policy makers appear to feel that simply mandating use of open standards and open source software will be sufficient to deliver their benefits doesn’t mean we are faced with the binary choice of accepting or rejecting such views. Rather we need to engage in discussions and debate on ways in which real benefits can be realised.

I’ve been involved in working collaboratively with others in developing models for exploiting the potential of open standards and open source software. At the Museums and the Web 2007 conference I presented a paper on Addressing The Limitations Of Open Standards, co-authored with my colleague Marieke Guy and Alastair Dunning (then of AHDS). These ideas were further developed and extended to include open source and open access in a paper on Openness in Higher Education: Open Source, Open Standards, Open Access co-authored by Scott Wilson (JISC CETIS) and Randy Metcalfe (then of JISC OSS Watch).

But there’s a need to build on these approaches and to develop approaches for exploiting other aspects of openness. And such approaches need to recognise the dangers and difficulties. But just because there are difficulties doesn’t mean we should reject openness – rather it means we need to continue having the debate, whether it’s on mailing lists such as the MCG list, on this blog or at the professional forum at the Museums and The Web 2008 conference. So I’ll ask here the questions we’ll be discussing in a few days’ time: what does openness mean to your community, what are the benefits it can provide, what are the difficulties which are likely to be faced and, most importantly, how do you feel such difficulties should be overcome?

Your feedback is warmly welcomed.

Micro-blogging At Events

8 Apr 2008


I can recall attending the UCISA 2004 conference and listening to a speaker describing the problems caused by providing free laser printing services to students. It seems students made heavy use of the service and this caused particular problems at the end of term: the print queues would be full, so students would resubmit jobs, compounding the problems.

But this is nothing new, I felt. I wanted to chat with my former manager at Loughborough University and ask him if we hadn’t addressed this problem back in the late 1980s. But he was near the front of the lecture theatre and I was near the back. Wouldn’t it be great, I thought, if we could exploit the WiFi networks which were starting to appear, and have such discussions during a talk – this could help to improve the quality of the questions, I felt.

Since then I have explored various ways of providing chat channels at events. At the Institutional Web Management Workshop 2005 held at the University of Manchester we made use of an IRC channel – on which the small number of IRC users heard about the 7/7 London bombings prior to the rest of the audience. The logs of the IRC chat make interesting reading from a historical perspective:

Jul 07 11:09:30 <SebastianRahtz>scary stuff with bombs. not impossible mchester next? ...
Jul 07 11:19:54 <AndrewSavory>Sebastian: Swindon and Brighton rail stations shut
Jul 07 11:19:59 <EmTonkin>oh
Jul 07 11:20:00 <AndrewSavory>all central london bus services stopped

Various chat tools were used at subsequent events, including Jabber and the Gabbly service. But since last year the term ‘micro-blogging’ has come into vogue and I’ve an interest in exploring the potential of Twitter in a conference setting, especially as I’ve been making regular use of Twitter for some time now.

Recent Experiments

My initial experiments took place when I attended the NDAP 2008 conference in Taiwan. However my use of Twitter (sometimes summarising individual slides) caused problems for my Twitter ‘followers’: some commented that their Twitter client was full of copies of my portrait photo when they logged on in the morning, and others found that having my tweets delivered to their mobile phones resulted in a continual stream of SMS alerts.

Following a suggestion from James Clay, I then tried the Jaiku service. I’d tried this before, but this time I installed a dedicated Jaiku client and, with some help from James, set up the #ndap2008 channel which was dedicated to the conference. However, despite its richness as a micro-blogging and aggregation tool, Jaiku hasn’t really taken off – and as the most important aspect of a social networking tool is the social network, I reluctantly decided that Jaiku wouldn’t be the tool to use.

The Social Dimension Of Micro-Blogging At Events

The fact that the number of posts (tweets) I sent on the first day of the NDAP 2008 conference irritated a couple of my Twitter followers is a good indicator of the social aspect of micro-blogging. And although I’ve concluded that it’s not the best tool for summarising individual points for a series of talks, I have found that it can provide social benefits. After the conference had finished and on my last night in Taipei I tweeted that I was about to head off for a meal. A few minutes later I received a phone call from Casey Bisson, a fellow speaker at the conference. He’d spotted my tweet and suggested we go out for a meal. Which we did, and found a German restaurant, where sausages and dark German beer made a refreshing change from the Chinese meals we’d been eating.

And then, a few minutes after arriving at my hotel in Montreal, I tweeted that I was about to go out for a meal. A few minutes later I received a series of suggestions for how I should spend my time in Montreal:

[Image: Twitter posts]

And a few minutes later another Twitterer pointed out a post on the conference forum aimed at “Beer Geeks in Montreal”:

[Image: Twitter posts]

From this I’ve learnt about the serendipitous benefits Twitter can provide. If I say where I am and what I’d like to do, people are willing to help :-) And this, of course, fits in nicely with the social aspect of conferences – it’s not all about listening to talks.

Micro-Blogs At The Museums and The Web Conference

These reflections are very relevant to the Museums and the Web 2008 conference I am currently attending. Mike Ellis (with whom I am running two sessions at the conference) is providing the technical infrastructure for aggregating blog posts, Flickr feeds, etc. related to the conference. Mike is currently finalising these technologies, which include an aggregation of posts on the home page and, something I’ve not seen before, a timeline of Twitter posts with the #mw2008 tag.

[Image: Twitter timeline]

It is really interesting to see how the use of networked technologies at events is evolving. Initially we were using self-contained instant messaging tools, but we’re now using tools, such as Twitter, which, when used in conjunction with RSS feeds and agreed tags (#mw2008 in this case), allow the content to be reused in a variety of different ways. I’m looking forward to seeing how this experiment works.
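To show why an agreed tag plus an RSS feed makes reuse so easy, here is a minimal sketch of picking out the tagged posts from a feed. It is only an illustration and is not how Mike’s aggregator works: the sample feed, its items and the example.org links are invented, and a real aggregator would fetch the feed over the network and cope with far messier data.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for a stream of micro-blog posts.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Sample micro-blog feed</title>
<item><title>Search session starting #mw2008</title><link>http://example.org/1</link></item>
<item><title>Derby concede again</title><link>http://example.org/2</link></item>
<item><title>Timeline looks great #MW2008</title><link>http://example.org/3</link></item>
</channel></rss>"""


def items_with_tag(feed_xml: str, tag: str) -> list:
    """Return (title, link) pairs for feed items whose title contains the tag.

    Matching is case-insensitive and on whole whitespace-separated words,
    so '#mw2008' matches '#MW2008' but not '#mw2008rocks'.
    """
    root = ET.fromstring(feed_xml)
    wanted = tag.lower()
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if wanted in title.lower().split():
            matches.append((title, link))
    return matches


for title, link in items_with_tag(SAMPLE_FEED, "#mw2008"):
    print(title, link)
```

Once the tagged items have been pulled out like this they can be dropped into a home page, a timeline or another feed, which is the kind of reuse a self-contained chat tool never allowed.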

Museums and the Web 2008 Conference

7 Apr 2008

It was over 19 months ago when Jennifer Trant invited me to join the programme committee for the Museums and the Web 2007 Conference. As my colleagues at UKOLN and I were looking to engage more with the museums sector, I welcomed this opportunity. And as I like to engage fully with such activities, I found myself at last year’s conference presenting one paper (on Addressing The Limitations Of Open Standards), running a professional forum with Professor Stephen Brown on Accessibility 2.0: A holistic and user-centred approach to Web accessibility and contributing to a paper by Mike Ellis on Web 2.0: How to stop thinking and start doing: Addressing organisational barriers. In addition I chaired a session at the conference. And while I was at the event I blogged about the conference.

Jennifer, together with David Bearman, has succeeded in getting her money’s worth out of me again this year :-) I’m in Montreal this week for this year’s Museums and the Web 2008 Conference. And this year I’ll be running a half-day Blogging workshop with Mike Ellis (the workshop, I’ve just noticed, is fully subscribed), running a professional forum, again with Mike Ellis, on What Does Openness Mean To The Museum Community? and again chairing a session, this year on Search – which is being held on Saturday morning!

It’s going to be a busy week, I can tell. And I seem to have left the snow behind in England, and am enjoying the sunshine here in Montreal :-)

