Trip Database Blog

Liberating the literature


search algorithm

Clickstream data and results reordering

Recently I’ve been discussing the potential for using our clickstream data (our earliest post on the subject dates from October 2013).  After a post earlier this year, ‘Ok, I admit it, I’m stuck’, I was contacted by two separate people, both of whom have been very generous with their time.  On Friday I met with one of them, who talked me through what they had found.

Before I share the results there are a few points to consider:

  • This really is early days and it needs some imagination to see how it would work on Trip.
  • The image below is one trial, simply to illustrate a point.  The results are not based on the full Trip index, just a very small sample.
  • The search uses very simple text matching on title words only.  So, as you will see in the image below, all the articles in the left-hand column have the search term – diet – in the title.
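
For anyone who wants to picture the title-only matching in the last point, here is a minimal sketch in Python. It is an illustration only, not the code used in the trial: the article records, their fields and the function name `title_match` are all assumptions made up for the example.

```python
import re

def title_match(articles, term):
    """Return the articles whose title contains `term` as a whole word.

    `articles` is a list of dicts with a "title" key (a hypothetical
    record shape chosen for this sketch). Matching is case-insensitive.
    """
    term = term.lower()
    matches = []
    for article in articles:
        # Split the title into lower-case words, dropping punctuation.
        words = re.findall(r"[a-z0-9]+", article["title"].lower())
        if term in words:
            matches.append(article)
    return matches

# Made-up titles, purely for illustration.
articles = [
    {"id": 1, "title": "Mediterranean diet and heart disease"},
    {"id": 2, "title": "Statins in primary prevention"},
    {"id": 3, "title": "Dietary fibre intake"},
]
print([a["id"] for a in title_match(articles, "diet")])
```

Note that a whole-word match like this skips "Dietary", which is exactly the kind of relevant article a plain keyword search can miss.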

So, what’s going on?

The left-hand side shows the results of this mock-up search.  Those on the right-hand side are the same results reordered using simple clickstream data.  The articles surrounded by the light blue colour have been boosted (so appear higher) because lots of people clicked on them.  Those surrounded by orange are arguably more interesting, as they don’t include the search term in the title!

What this signifies is that users of Trip, while searching the actual Trip, have clicked on the orange articles in the same search session as one of the articles on the left-hand side.  So, it’s telling us that the orange articles are related to the normal results, and they are being inserted into the results even though our test search did not match them, because they don’t have the word diet in the title.
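
The reordering described above can be sketched in a few lines of Python. This is a guess at the general idea, not the method used in the trial. The function name `rerank_with_clicks`, the session format (a list of clicked article IDs per search session), the `max_inserted` limit and the scoring rule (raw click counts for boosting, co-click counts for inserted articles) are all assumptions.

```python
from collections import Counter, defaultdict

def rerank_with_clicks(baseline_ids, sessions, max_inserted=2):
    """Reorder keyword results using click counts and same-session co-clicks.

    baseline_ids: article IDs returned by the plain keyword search, in order.
    sessions: list of sessions, each a list of clicked article IDs
              (a hypothetical log format assumed for this sketch).
    """
    clicks = Counter()               # sessions in which each article was clicked
    co_clicks = defaultdict(Counter) # co_clicks[a][b]: sessions with both a and b

    for session in sessions:
        unique = list(dict.fromkeys(session))  # de-duplicate, keep order
        clicks.update(unique)
        for a in unique:
            for b in unique:
                if a != b:
                    co_clicks[a][b] += 1

    # "Orange" candidates: articles outside the keyword results that were
    # clicked in the same session as at least one keyword result.
    baseline = set(baseline_ids)
    related = Counter()
    for doc in baseline_ids:
        for other, n in co_clicks[doc].items():
            if other not in baseline:
                related[other] += n
    inserted = [doc for doc, _ in related.most_common(max_inserted)]

    # Score keyword results by clicks ("light blue" boost) and inserted
    # articles by co-clicks, then sort. The sort is stable, so ties keep
    # their original order.
    score = {doc: clicks[doc] for doc in baseline_ids}
    score.update({doc: related[doc] for doc in inserted})
    return sorted(list(baseline_ids) + inserted, key=lambda doc: -score[doc])

# Made-up data: "c" is clicked a lot, and "x" is often clicked alongside it.
sessions = [["c", "x"], ["c", "x"], ["c"], ["b", "y"], ["a"]]
print(rerank_with_clicks(["a", "b", "c"], sessions, max_inserted=1))
```

In this toy run "c" moves to the top because it is clicked most, and "x" is pulled into the results even though the keyword search never matched it. A real version would need to handle far noisier data than this.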

Describing this in a blog post is slightly difficult, and I’m not sure I’ve explained it particularly well.  I suppose there are two take-homes:

  • Clickstream data, even using a small sample, can uncover some really useful articles that a standard keyword search might miss.
  • I am very excited by this, so have faith in that!

    New algorithm

    From now on I’ll be using the new algorithm for all my searches of TRIP. I was convinced by the latest question I answered, ‘Should someone with a history of proven ischaemic heart disease, and who abuses alcohol, be on a statin?’. The old TRIP found lots of useful material, but it took a bit of wading through. The new algorithm found all the material in the top ten results. I was deeply impressed (while trying to maintain some attempt at objectivity)!

    I wish I could roll out the improvements in the next week or two. Unfortunately, this will have to wait till mid/end of September. As we’re introducing a number of other changes, the web company is keen that we roll out all the changes in one go. We could overrule that, but it’d cost significantly more to do so!

    In the interim I’ll be in the luxurious position of having sole access to the best TRIP ever!
