Trip Database Blog

Liberating the literature

Category

improved search

Clinical area tagging of documents

Around 6 weeks ago I wrote the article Logging in to Trip, which provoked much negative comment. I have clearly not rushed to reply, but that’s because I wanted to think the issues through properly. In short, the reason for asking people to log in was so that we would know more about our users and could therefore improve the service we deliver. As I see it there are two connected issues of concern:

  • Logging in – it’s a pain for users.
  • Profiling users and altering the results accordingly.

One suggestion (from Paul, see comments in the previous post), in relation to the second point, is to focus on our ‘refine by clinical area’ feature. This is an already available system which tags documents by clinical area. So, an article titled ‘Cholesterol and the elderly’ would be tagged as cardiology and geriatrics. If a user did a search for, say, cholesterol, the above document would be returned in the results, alongside many others. But if they decided to refine by geriatrics, the above document (alongside others tagged with geriatrics) would be moved to the top of the results.
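To make the refine step concrete, here is a minimal sketch in Python. It is not Trip's actual code: the `Document` class, the `refine_by_area` function and every title other than ‘Cholesterol and the elderly’ are assumptions made up for illustration. It simply moves documents tagged with the chosen clinical area to the top and leaves the order otherwise unchanged.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical document record: a title plus its clinical-area tags.
    title: str
    areas: frozenset = frozenset()


def refine_by_area(results, area):
    """Move documents tagged with `area` to the top of the results.

    sorted() is stable, so documents keep their original relative
    order within the tagged group and within the untagged group.
    """
    # False sorts before True, so tagged documents (area in doc.areas) come first.
    return sorted(results, key=lambda doc: area not in doc.areas)


# Made-up search results for 'cholesterol', in their natural order.
results = [
    Document("Cholesterol screening in adults", frozenset({"cardiology"})),
    Document("Cholesterol and the elderly", frozenset({"cardiology", "geriatrics"})),
    Document("Dietary cholesterol overview", frozenset({"nutrition"})),
]

refined = refine_by_area(results, "geriatrics")
for doc in refined:
    print(doc.title)
```

The real feature may well score documents rather than simply partition them; the sketch only shows the "tagged documents move to the top" behaviour described above.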

Below are two images showing how it currently works:

The differences are clear and show the potential of the system. For me, it’s about helping users find the answers they need really quickly. An anaesthetist, interested in ‘awareness’, would find the results of the ‘normal’ Trip disappointing, but by selecting anaesthesiology all the results are relevant.

As mentioned above this system is available already and can be accessed as shown below:

It’s not particularly developed and we could definitely improve it:

  • Machine learning to improve the document tagging
  • Better user interface so it’s more apparent and more intuitive to use.

Ultimately, it might serve a similar role as profiling users in improving the search results.

Logging in to Trip

One change we introduced recently is the increased user ‘pressure’ to log in.  A few people have contacted me to raise this as an issue and it made me realise we’ve added a barrier to use of Trip but we’ve not communicated why.  So, here goes…

Ultimately it’s part of a longer-term strategy to improve Trip and this requires us to better understand our users (which requires the user to be logged in).

Some background: my partner’s dad was an eminent Professor of Anaesthetics (now retired). I showed him Trip, and he said he’d use it for a bit. He came back unimpressed! His interest was in awareness, and a search for awareness on Trip returns no articles on awareness under anaesthesia, which was what he was looking for.

While this is an extreme example, it does highlight the question: without knowing the user, how can we optimise the search results? Our system should have realised that the user was an anaesthetist and adjusted the results accordingly. We’re doing lots of work in this area and are making real strides. I blogged about this in March with the article The important breakthrough, which contained the following image:

As you can see from the results (in this experimental test system) we have detected that the example user is a dentist and adjusted the results accordingly. For an information retrieval ‘nerd’ (like myself) this is amazing. I can think of no other innovation Trip has introduced that comes close to improving the search results as much as this.

And there are loads more things we can do if we know the user. For instance, improved email alerts: better linking users with evidence that is likely to be interesting and useful, as opposed to our current crude efforts!

But for it to work we need to know the user, which requires logging in.

The important breakthrough

Trip has been operating for over 15 years and I can easily say we have arrived at the most significant breakthrough yet.  It is still in our ‘labs’ section and still has much work to do before being rolled out.  But, the path is clear and, finance aside, there is no reason why we can’t produce a significant increase in search performance.

In search a really important concept is intention.  So, when a user searches they may add 2-3 search terms but what are they thinking about when they use those terms?  For instance, and this is a true story, I showed Trip to a Professor of Anaesthesiology  and asked for his views on the site.  He came back saying that he was unimpressed!  The reason – his interest was in awareness (as in, when a person is under anaesthetic are they truly anaesthetised or may they be aware) and when you search Trip for awareness you get lots of results, mostly on things like the awareness of public health messages! Another example I use to illustrate the point is the search pain.  We return the same results whether the person is an oncologist or a rheumatologist – which to me is ridiculous – as the intention is likely to be significantly different.  But, to date, there has been no good solution.

The image below shows a breakthrough.

In the image above there are 4 sets of results for the same search antibiotics.  This is a test system and not based on the real Trip results.  However, on the left-hand side we have the normal/natural results for the search antibiotics in the test system.  In the top right set of results the natural results have been reordered based on the clickstream activity of the users of Trip, those who have not logged in (85%).  At the simplest level this promotes results that have been clicked on and relegates those that have not been clicked.  It really is more complex than that – but I hope you get the point!

But the bottom right is where the magic is. Even though it only accounts for 0.2% of the activity, we have reordered the results based on the clickthrough activity of dentists. There are a few erroneous results, but I’d like to think you can see the effect: dental articles are promoted.

So the effect is that, when we eventually roll out the system and know the user is a dentist, we can improve their results based on the previous activity of other dentists. In reality, this technique will work with any speciality and profession.
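For the curious, the simplest level of the idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the test system itself, which (as noted above) is more complex: the `rerank_by_clicks` function, the click-log format of `(profession, doc_id)` pairs and all the data are hypothetical.

```python
from collections import Counter


def rerank_by_clicks(results, click_log, profession=None):
    """Reorder `results` (doc ids in natural order) by click counts.

    click_log is an iterable of (profession, doc_id) pairs. If
    `profession` is given, only clicks from that profession count;
    otherwise every click counts.
    """
    counts = Counter(
        doc_id
        for prof, doc_id in click_log
        if profession is None or prof == profession
    )
    # Most-clicked first. sorted() is stable, so documents with equal
    # click counts (including zero) keep their natural order.
    return sorted(results, key=lambda doc_id: -counts[doc_id])


# Made-up natural ranking and click log for the search 'antibiotics'.
natural = ["a-general", "b-dental", "c-general", "d-dental"]
click_log = [
    ("gp", "a-general"), ("gp", "a-general"), ("gp", "a-general"),
    ("gp", "c-general"),
    ("dentist", "d-dental"), ("dentist", "d-dental"),
    ("dentist", "b-dental"),
]

all_users = rerank_by_clicks(natural, click_log)
dentists = rerank_by_clicks(natural, click_log, profession="dentist")
print(all_users)  # ['a-general', 'd-dental', 'b-dental', 'c-general']
print(dentists)   # ['d-dental', 'b-dental', 'a-general', 'c-general']
```

Even with only three dentist clicks in this toy log, the dental articles rise to the top for dentists, which is the effect described above. It also shows why a small amount of data makes the ordering fragile.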

There are a few issues. The paucity of data is the biggest, and we have two significant ways of tackling it:

  • When we roll out the new Trip we will – to a large extent – make login/registration obligatory.  This will mean we get lots more clickstream data which will make the results even better.
  • Machine learning.  We’ve already worked on machine learning and will bring these techniques to the system to enhance/complement the clickstream work.

Oh yes, we’ve even figured out a way to mitigate the effects of filter bubbles.

This really has been a good few weeks.
