Saturday, February 06, 2016

Answer engine is close to reality

In 2011 I posted:

"An answer engine.  For commercial reasons I have to be vague on this for now (something I'm not comfortable with) but it'll take shape over 2012."

I have revisited the idea many times in the last five years, always trying to solve the same problem: a search engine doesn't answer questions, it gives you 10-20 links to articles that may answer your question.  I borrowed the term 'answer engine' from Steve Wozniak (co-founder of Apple) who used it in relation to the release of Siri.  He said people want and need an answer engine.

Until recently the problem has been scalability - how to extract the answers from the great content held in Trip.  But, due to my involvement in various machine learning initiatives, the missing links have been found and we're busy developing an 'answer engine'.  Our initiative works on a number of connected problem areas:
  • From a set of search terms can you infer the likely question?
  • Can we find an answer to that likely question?
  • Given there will be potentially multiple answers in Trip, can we surface the best answer?
We have made great strides in all three areas.  So much so that we are currently working on the second iteration of our answer engine.  For the test it has two search boxes based on PICO.  The first box (P) represents the disease and the second box (I) is the intervention of interest.  To illustrate what it does: using the P of 'sore throat' and the I of 'antibiotics', the answer engine returned:

"Antibiotics appear to have no benefit in treating acute laryngitis. Erythromycin could reduce voice disturbance at one week and cough at two weeks when measured subjectively. We consider that these outcomes are not relevant in clinical practice. The implications for practice are that prescribing antibiotics should not be done in the first instance as they will not objectively improve symptoms."

The above is a great answer.  But we can't rely on a single test so we've been extensively testing it and the current version gets the following results:

  • Fail - 19%
  • Partial pass - 32%
  • Pass - 49%
So virtually a 50% success rate, which impressed me!  Better still, the biggest reason for the partial passes is our system not pulling through the answer (a relatively simple fix) and the biggest reason for the failures was the inability to exclude additional terms which confused the answer (again, this should be a relatively simple fix).

Our system should be great for a number of reasons:

  • It can integrate seamlessly with the existing Trip interface but also act as a standalone product/app!
  • It can easily be integrated with our multi-lingual systems so users will be able to search and obtain answers in languages such as French and German, with languages such as Spanish soon to come.  No English will be needed!
  • It will always be up to date, as the answers are based on all the evidence in Trip.
  • It is modular and will launch with a focus on therapeutics.  It will then expand to include medicines information (e.g. side-effects, interactions, contraindications) and from there I'd like to tackle clinical guidelines or lab tests (depending on resource availability).
  • A user can get an answer in less than 5 seconds, when previously they would have had to scan the first 10-20 results to see which result was most likely to answer their question.  Assuming they get the best article they still need to read/scan it for the answer.  So, considerably longer.
 This is brilliant, it really is...!

Thursday, January 28, 2016

Update on the new upgrade

We are hoping to get the latest upgrade out by the end of February.  There are a number of changes, including:
  • Homepage design has been changed radically and it looks lovely.
  • Results page has been tidied up and functional items grouped together.  This is not a radical change, more a subtle one.
  • We'll be adding three new types of data: regulatory guidance, ongoing systematic reviews and overdiagnosis/overtreatment.
  • As part of the analysis of the way users used the site we are promoting the clinical areas feature (refining results by areas such as cardiology, primary care) to make it more usable.
  • Drug information. The main reason people use Trip is to answer clinical questions to help support clinical practice.  One thing that we don't do is offer drug information support (e.g. dosage, contraindications, adverse events) but we will with the new design.  It'll help make Trip much more of a Point-of-Care tool.
  • Broken links improvement.  We have a really crude system for updating broken links, but by the end of February it'll be much improved.  While we'll not be able to guarantee no broken links, the issue will become less relevant over time!
  • Improved demarcation of free versus premium - currently premium members can find it difficult to know if they've actually got premium.  The new design should help with that!
  • We're going to be re-naming 'Trip Premium' to 'Trip Pro'!
Finally, we're also working on an answer engine, a mythical feature I've been talking about for years.  We are making real strides in this direction.  While it won't be ready to be fully integrated into the site by the end of February, it might well appear in the 'Labs' section of the site for people to try.

Saturday, January 09, 2016

Diabetes in 2015

We had a huge number of searches for diabetes in 2015, far too many for us to analyse easily.  But, I wanted to explore the topic - it's one of our most popular searches.

To keep things manageable I have analysed situations where a user has performed any search that contains the term diabetes and has then restricted the results to a specific clinical area, in this case cardiology, primary care or psychiatry.  The results are below.  It's fascinating to see the differences between the images as it gives an insight - perhaps nothing more - into the intentions of users.  For instance:
  • In the psychiatry screenshot the term mellitus is much more prominent
  • In primary care they like the word management.
  • For cardiology, not surprisingly hypertension is prominent, but then so is children (more so than primary care) - which I find more surprising

Searches for diabetes and restricted to cardiology

Searches for diabetes and restricted to primary care

Searches for diabetes and restricted to psychiatry

Friday, January 01, 2016

Looking back at 2015 and forward to 2016

I've already posted a breakdown of stats for the site (see 2015 in numbers).  The highlights for me are:

  • 3,700,000 - page views
  • 900,000 - individual sessions spent on the site
  • 650,000 - number of users of the site
  • 4,560,603 - number of minutes spent on the site
  • 350 - mentions in journal articles
  • 881,280 - number of times Trip helped improve patient care in 2015

Apart from our continued impact on health care related to search we made large strides towards financial security with the launch, in May, of the Freemium business model.  Uptake has exceeded our expectations with a significant increase in income.  However, we are far from secure, so any suggestions around additional income streams and/or sponsorship opportunities - I'm all ears!

The one slight mistake we made, in moving to Freemium, was to insist on users logging in to use Trip, which resulted in a significant user backlash.  We listened and removed that 'feature' and that resulted in a large increase in usage towards the end of the year.

We've been making some great progress in relation to our work around clickstream data.  For instance:
A new site was launched to help separate out the Trip related content (on this blog) from the content related to rapid/systematic reviews.  Talking of which, my post 'A critique of the Cochrane Collaboration' reached over 20,000 views last year.

As part of our work with the EU-funded KConnect project we introduced a very nicely integrated multi-lingual search function.  It's currently restricted to French, German and Czech but more languages will come online in 2016, including Spanish.

But what else can we expect in 2016?
  • A big announcement around our freemium business model in the next week or so.
  • A significant redesign of the site, hopefully ready by February.
  • A first draft of our answer engine concept which is looking really exciting.
  • Further work on rapid reviews including working on understanding important regulatory data and improvements to the Trip Rapid Review system.
  • The rollout of a new system to help cope with broken links.
  • Further development, related to our clickstream data, around analytics and insights e.g. A new analytic 'toy' and Searching for hypertension.
Running Trip is full of interest, lots of excitement and helping support care with high-quality evidence across the globe is the icing on the cake.  One thing I'd like to see this year, apart from continuing financial security, is to have a greater academic presence/impact.  I'm not even sure why, I think I'd just enjoy it (based on interactions over the last few years).

I will sign off now and wish you all a happy 2016.

2015 in numbers

Some statistics from 2015
  • 3,700,000 - page views, up slightly on 2014
  • 900,000 - individual sessions spent on the site
  • 18% - increase in users in November and December compared with 2014. I can only assume this was down to our removal of the requirement to sign-in to use Trip!
  • 650,000 - number of users of the site
  • 4,560,603 - number of minutes spent on the site
  • 76,010 - hours spent on the site
  • 178,000 - number of the most recently registered user.  This suggests the actual number of registered users is 142,000-160,000.  This represents an increase of nearly 40,000 last year
  • 350 - mentions in articles
  • 83% - users accessing the site via a desktop computer (mobile 11% and tablet 6%)
  • 46% - users accessing the site with Chrome, the most popular browser (IE 20% and Safari 16%)
  • 22% - users from the USA followed by UK 13%, Spain 8%, Canada 5% and Australia 4%

The above figures are based on users actively coming to the website and do not include an increasingly large number of people searching Trip via our webservice. So, if we assume an average of 2 searches per session and assume the webservice adds a further 20% on top of those searches (obviously the assumptions make accuracy problematic) that equates to 2,160,000 searches in 2015 (900,000 sessions x 2, plus 20%). From previous analysis we have estimated that 40.8% of searches help improve patient care.  Therefore, probably the most important statistic of the lot:

881,280 - number of times Trip helped improve patient care in 2015.
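
For anyone who wants to check the sums, here is a minimal sketch of the arithmetic.  It assumes the webservice searches are added as 20% on top of the website searches, which is the reading that reproduces the 2,160,000 total.  The function and variable names are made up purely for illustration.

```python
# Figures quoted in the post.
SESSIONS_2015 = 900_000        # individual sessions on the website
SEARCHES_PER_SESSION = 2       # assumed average searches per session
WEBSERVICE_UPLIFT = 0.20       # assumption: webservice adds 20% on top
SHARE_IMPROVING_CARE = 0.408   # estimated share of searches that help care


def estimate_searches(sessions, per_session, uplift):
    """Website searches plus the assumed webservice uplift."""
    website_searches = sessions * per_session
    return round(website_searches * (1 + uplift))


def estimate_care_improvements(searches, share):
    """Number of searches estimated to have helped improve patient care."""
    return round(searches * share)


total_searches = estimate_searches(
    SESSIONS_2015, SEARCHES_PER_SESSION, WEBSERVICE_UPLIFT
)
improvements = estimate_care_improvements(total_searches, SHARE_IMPROVING_CARE)

print(total_searches)  # 2160000
print(improvements)    # 881280
```

Change any of the assumptions at the top and the headline figure moves with it, which is a fair reminder of how rough the estimate is.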

Sunday, December 20, 2015

Rapid and/or systematic reviews - new post

Historically I have used this blog to write about Trip developments as well as expanding on my own work around rapid reviews.  This can sometimes be appropriate, for instance when we highlight a new tool on Trip to make reviews more rapid.  However, sometimes the articles have nothing to do with Trip and more to do with me (Jon Brassey) as an individual.  It was one of the reasons I started the Rapid Reviews website.

My intention is to post these articles on that site but I will continue to link to them if I feel they are substantial and offer value/benefit to Trip users.  One such article is 'Why do we do systematic reviews? Part 4', possibly my firmest critique of the current ways we undertake systematic reviews.  This is linked to the need for rapid reviews.

This is probably my last post this side of Christmas, so if you're ready - and that way inclined - have a wonderful time.

Saturday, December 05, 2015

Site redesign

Every now and then it's important to take a step back and look at how the site is performing.  I'm not talking about the search results/algorithm, more the user interface. I am broadly happy with the design (logo, colours etc) but I'm less convinced that the site is configured optimally to ensure users can easily find the information they need.

So, we're exploring that and making some headway.  Don't expect it to look radically different, but hopefully it'll be a delightful surprise.

It should be ready early 2016, but that depends on user testing of the new interface.

Feel free to make suggestions as to bits you don't like...!

Saturday, November 28, 2015

Searching for hypertension

I love clickstream data - the data we collect as people use the site.  We know - anonymously - the search terms used and articles viewed.  We have just developed the ability to 'mine' this data; nearly 100 million items of it!

Below are two images, both taken from searches for hypertension.  The first is for all the data we have from 2009 (tens of thousands of searches for hypertension) and the second is from 2015 only (still thousands).  What we've done is take these searches, remove the term hypertension and create a wordcloud of the remaining words.  These words are the additional terms users used when searching for hypertension.  So, if a user searched for hypertension pregnancy, we simply chart the pregnancy term.
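
The counting step behind a wordcloud like this can be sketched in a few lines.  This is not our production code, just an illustration: the function name `companion_terms` and the sample searches are invented, and it assumes searches are plain strings split on whitespace.

```python
from collections import Counter


def companion_terms(searches, topic):
    """Count the words that appear alongside `topic` in a list of searches.

    Searches that do not contain the topic are ignored, and the topic
    itself is dropped so only the additional terms are counted.
    """
    topic = topic.lower()
    counts = Counter()
    for query in searches:
        words = query.lower().split()
        if topic not in words:
            continue
        counts.update(word for word in words if word != topic)
    return counts


# Invented sample data, for illustration only.
sample_searches = [
    "hypertension pregnancy",
    "Hypertension aspirin",
    "pregnancy hypertension treatment",
    "asthma children",
]

term_counts = companion_terms(sample_searches, "hypertension")
print(term_counts.most_common())
# [('pregnancy', 2), ('aspirin', 1), ('treatment', 1)]
```

The resulting counts are what a wordcloud tool would size the words by: the more often a term is used alongside hypertension, the bigger it appears.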

Searches since 2009

Searches in 2015

They look fairly similar but from a cursory analysis five terms are more prominent in 2015, those being:
  • placebo
  • aspirin
  • patient
  • enalapril
  • obesity
Interesting for sure, but more than that?
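
For the curious, one way to make 'more prominent' less of an eyeball judgement is to compare each term's share of the searches in the two periods.  The sketch below is only one possible approach, not what we actually ran: the function name `more_prominent` and all the counts are made up for illustration.

```python
from collections import Counter


def more_prominent(all_years, single_year, top_n=5):
    """Return terms whose share of searches is higher in `single_year`
    than in `all_years`, ordered by the size of the gain."""
    total_all = sum(all_years.values())
    total_year = sum(single_year.values())
    if total_all == 0 or total_year == 0:
        return []

    gains = {}
    for term, count in single_year.items():
        share_year = count / total_year
        share_all = all_years.get(term, 0) / total_all
        gains[term] = share_year - share_all

    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    return [term for term, gain in ranked[:top_n] if gain > 0]


# Invented counts, for illustration only.
since_2009 = Counter({"treatment": 50, "pregnancy": 30, "aspirin": 10, "placebo": 10})
in_2015 = Counter({"treatment": 4, "pregnancy": 2, "aspirin": 2, "placebo": 2})

risers = more_prominent(since_2009, in_2015)
print(sorted(risers))  # ['aspirin', 'placebo']
```

Using shares rather than raw counts matters here because the 2015 data is much smaller than the data since 2009, so raw counts would always favour the longer period.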