UK Web Focus (Brian Kelly)

Innovation and best practices for the Web

GCSEs Revisited

Posted by Brian Kelly on 21 Feb 2008

It is always pleasing when a blog post achieves its aim, and even more so when this happens so quickly. So it was good to read AJ Cann’s post in which he describes how he spent 3 minutes using the Google Custom Search Engine (GCSE) to provide an alternative to his institutional search engine. As he titled his post: “It was all Brian Kelly’s fault”!

Revisiting my original post, it would seem that there are a number of ways in which GCSE is being used:

In this latter case, AJ is clearly unhappy with the local search engine service (ht://Dig): “I can’t stand the inadequate institutional search tools I’ve been forced to use for a decade” – and decided it was worth spending “less than 30 seconds” to set up an alternative! And this approach reflects AJ’s interests in Personal Learning Environments (PLEs). He now has a Personal Search Engine.

Now, if setting up GCSE across a range of Web sites is so easy and can be done by individuals without the need for institutional commitment, in what other ways could the software be used?

As we’ve recently discussed institutional repositories and various people have aired their concerns on the approaches being taken, it seems to me that the GCSE could have a role to play in providing an alternative way of searching repositories.

And this approach has already been taken on the OpenDOAR Search Repository Contents service and the Search ROAR Content With Google service.

This approach fits in nicely with Rachel Heery’s comment that “I don’t really see that there is conflict between encouraging more content going into institutional repositories and ambitions to provide more Web 2.0 type services on top of aggregated IR content. Surely these things go together?”. We have the managed content in the repository and are providing users with a choice in the selection of a search interface.

It’s good to see that happening. But can’t we do even more? We could, for example, use the two ways of searching to gather evidence of users’ search preferences. And perhaps, rather than exposing new users of repositories to the rich functionality of the repository’s search interface, shouldn’t we acknowledge that many users will prefer the simplicity of a Google search, and provide the GCSE interface as a better-focussed alternative to the global Google search tool, with the option of pointing users in the direction of the richer service if they find that this search interface is not good enough?

This approach would have the added advantage of not requiring the expenses associated with in-house software development. Indeed, could it not be argued that public-sector organisations should have a responsibility to make use of relevant freely-available services, at least in prototyping or in providing a service for making comparisons, even if it isn’t envisaged that the service will be used in a final production role?

Of course the danger may be that the users decide that they are happy with Google. And we wouldn’t want that to happen, would we?

5 Responses to “GCSEs Revisited”

  1. Code Gorilla said

    I’ve a concern about using Google-like tools for cross-harvesting… and it comes down to duplications in the results.

    Let me run with an example here:

    Let us say that there is a piece of research done – a lovely piece of research into the benefits of social networking, demonstrating how students and migrant workers are able to maintain strong ties with their home world as well as interact with the social world of their current residence, and further showing that this has a positive benefit to the person themselves, as well as a ripple-effect cross-influencing both social environments.
    Let us further assume that this piece of research has over a dozen authors and/or contributors, from many different institutions.
    There are probably going to be several articles from this piece of research, and not a few conferences (one could assume)…. meaning that there are a number of items to deposit.

    So, how does this affect this discussion?
    Well: Each of the institutions (of each of the authors) will want a copy of each piece of work by their authors – after all, it was research done by their staff…. meaning there will be (“many” x “a number”) of copies available in TinternetLand.

    How do we, as repository service providers, help to de-duplicate that clutter? How do we, as repository developers, make it easier for our customers (the searching researcher) to find the material they need to do their work?

    How does Google Scholar do it?

  2. This question of de-duplication is an interesting one – and you’re right to raise it as an issue.

    And I’ve already personally come across such duplication of results for one of my papers: “Accessibility 2.0: People, Policies and Processes”,
    Kelly, B., Sloan, D., Brown, S., Seale, J., Petrie, H., Lauke, P. and Ball, S. WWW 2007 Banff, Canada, 7-11 May 2007.

    The paper is available on the UKOLN Web site, the University of Bath repository and one of the co-authors, Jane Seale, has deposited a copy on her institutional repository service.

    Googling for the paper will find multiple versions, together with discussions about the paper. Search across repositories should potentially remove such duplication, but not with the GCSE solution, as you suggest.

    But my point is that we should be carrying out the experiments using simple tools such as the GCSE in order to discover what users actually want and would be willing to use.

    And what would happen if Google deployed de-duplication tools?

    But you’ve raised an interesting point. Thanks.

  3. ajcann said

    I’ve previously used GCSE several times, probably the most successful being the search option at http://www.microbiologybytes.com/. This search function links the static content at microbiologybytes.com with the more dynamic and rapidly expanding content at http://microbiologybytes.wordpress.com/ and as such serves a very useful purpose. What was different about yesterday was:
    a) Giving up on an institutional function in favour of a free Web 2.0 type tool
    b) Recommending to my colleagues that they do the same by copying the customized GCSE code I posted!

    (BTW Brian, it literally took me 30 seconds to build the new search, which covers our main web site as well as our newer Plone installation, not 3 minutes!)

  4. GCSE is OK, but what about institutional pages protected in some way from web crawlers (e.g. intranet items)? That said, I can understand the frustration with institutional search engines (I’ve yet to come across an institution where people don’t complain about their search engine!).

    I think that we are going to see much larger investments in search at Universities over the next few years, as we start to get to grips with the local information environment and the possibilities of bringing together disparate information across the institution – I have a bit of a library centric view of this, but if you look at the distribution of information in documents across an institution, there is a wealth of ‘hidden’ information, that we can expose through search.

  5. […] clipped from ukwebfocus.wordpress.com […]
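
The de-duplication question raised in the comments above can be illustrated with a small sketch. This is not how Google, Google Scholar or any repository service actually does it; it is only a minimal illustration, assuming that copies of the same paper can be recognised by a crudely normalised title. The function names and the example URLs are hypothetical placeholders.

```python
import re
import unicodedata
from collections import defaultdict


def normalise_title(title: str) -> str:
    """Reduce a title to a crude comparison key (lowercase, alphanumerics only)."""
    text = unicodedata.normalize("NFKD", title).casefold()
    # Replace every run of non-alphanumeric characters with a single space.
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def group_duplicates(results):
    """Group (title, url) pairs whose normalised titles match."""
    groups = defaultdict(list)
    for title, url in results:
        groups[normalise_title(title)].append(url)
    return dict(groups)


# Hypothetical result list: the URLs are placeholders, not real locations.
results = [
    ("Accessibility 2.0: People, Policies and Processes", "https://repo-a.example/1"),
    ("Accessibility 2.0 - people, policies and processes", "https://repo-b.example/42"),
    ("GCSEs Revisited", "https://blog.example/gcse"),
]

grouped = group_duplicates(results)
for key, urls in grouped.items():
    print(f"{key}: {len(urls)} result(s)")
```

Title matching alone would wrongly merge different papers that share a title and would miss copies whose titles differ, so a real service would probably need further signals such as authors or identifiers.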