Trip Database Blog

Liberating the literature

Marketing Trip

The Trip Database is just amazing. I love how it works and the features that it offers. But from my experience, it just doesn’t seem as though it is well-known or is getting the recognition from the scientific community that it deserves. What efforts are being done for marketing the Trip Database?

Isaac M. E. Dodd
MD Student at Howard University College of Medicine

The above is not an uncommon type of email.  Users find Trip, love it, suspect they found it almost by accident, notice that few of their colleagues know about it and feel that it should be more widely known.

One can rely on word of mouth, which works to an extent as we get hundreds of thousands of searches per month.  But to push on probably requires marketing!  Unfortunately, Trip’s marketing budget has historically been virtually zero.  I say virtually zero as I’m not sure if our various Twitter accounts count as marketing or not.

While marketing is not my strength I’m increasingly drawn to the need to do some!  The main aim is to raise awareness of Trip, which will hopefully lead to more subscriptions. Historically, if we had money I’d put it towards product development, not marketing.  But this is sort of self-defeating.  So, when confronted with something as vast as marketing – where does one start?

Do we:

  • Go down the social media route, embracing Twitter more (for instance)?
  • Try and use adverts?  Surely not as I doubt the engagement is there.
  • Work with 3rd parties in some mutually beneficial way? They get some product from Trip in return for raising Trip’s profile.
  • Write more papers about the findings of Trip in peer-reviewed journals?

There we go, my marketing thoughts – completely unsophisticated – in one go.  I can think of variants of the above but nothing much more than that.

Clearly we need some help.  So, with a finite budget, what brings the best return on investment?



Document summarisation

A complete stab in the dark, stimulated by Google’s release of their cutting edge TensorFlow product, is our adventure into document summarisation.  The work below does not use TensorFlow; we’re starting gently with something a little easier to implement!  But the general idea is you take long documents and summarise them into something shorter and easier to digest.  All the work below involves automated methods and the summarisation is pretty much instant.

I’ve long held the idea (see Article social networks, meaning and redundancy) of trying to make sense of document clusters and this work is another exploration of this area.  So, I took 5 articles from the UTI and cranberry cluster mentioned in the article above, focusing on the prevention of UTIs, and put them through our test system.  Below are the results for the 5 articles, with the title (with embedded URL to the actual abstract) and then the summary as generated by our system.

1) Cranberry juice fails to prevent recurrent urinary tract infection: results from a randomized placebo-controlled trial.
Summary: we conducted a double-blind, placebo-controlled trial of the effects of cranberry on risk of recurring uti among 319 college women presenting with an acute uti. conclusions.: among otherwise healthy college women with an acute uti, those drinking 8 oz of 27% cranberry juice twice daily did not experience a decrease in the 6-month incidence of a second uti, compared with those drinking a placebo.

2) Cranberry-Containing Products for Prevention of Urinary Tract Infections in Susceptible Populations: A Systematic Review and Meta-analysis of Randomized Controlled Trials
Summary: the aims of this study were to evaluate cranberry-containing products for the prevention of uti and to examine the factors influencing their effectiveness. medline, embase, and the cochrane central register of controlled trials were systemically searched from inception to november 2011 for randomized controlled trials that compared prevention of utis in users of cranberry-containing products vs placebo or nonplacebo controls.

3) A randomized clinical trial to evaluate the preventive effect of cranberry juice (UR65) for patients with recurrent urinary tract infection
Summary: the subjects drank 1 bottle (125 ml) of cranberry juice or the placebo beverage once daily, before going to sleep, for 24 weeks. in the group of females aged 50 years or more, there was a significant difference in the rate of relapse of uti between groups a and p (log-rank test; p = 0.0425).

4) Cranberries for preventing urinary tract infections
Summary: cranberry products significantly reduced the incidence of utis at 12 months (rr 0.65, 95% ci 0.46 to 0.90) compared with placebo/control. cranberry products were more effective reducing the incidence of utis in women with recurrent utis, than elderly men and women or people requiring catheterisation.

5) Recurrent urinary tract infection and urinary Escherichia coli in women ingesting cranberry juice daily: a randomized controlled trial.
Summary:  to compare the time to urinary tract infection (uti) and the rates of asymptomatic bacteriuria and urinary p-fimbriated escherichia coli during a 6-month period in women ingesting cranberry vs placebo juice daily. the proportion of women with p-fimbriated urinary e coli isolates during the intervention phase was 10 of 23 (43.5%) in the cranberry juice group and 8 of 10 (80.0%) in the placebo group (p=.07). minor adverse effects were reported by 24.2% of those in the cranberry juice group and 12.5% in the placebo group (p=.07).

Then, as a further twist I took the 5 summaries above and placed them through the summarisation software to give this ‘meta-summary’:

Among otherwise healthy college women with an acute uti, those drinking 8 oz of 27% cranberry juice twice daily did not experience a decrease in the 6-month incidence of a second uti, compared with those drinking a placebo. to compare the time to urinary tract infection (uti) and the rates of asymptomatic bacteriuria and urinary p-fimbriated escherichia coli during a 6-month period in women ingesting cranberry vs placebo juice daily.

Not sure it makes much sense!

So, as a further experiment I took all the conclusions from the 5 articles (as opposed to our summaries) and this is the result:

the potential protective effect we observed is consistent with previous studies and warrants confirmation in larger, well-powered studies of women with recurrent uti. there is some evidence that cranberry juice may decrease the number of symptomatic utis over a 12 month period, particularly for women with recurrent utis.

Perhaps it is just me, but this seems interesting.  Interesting as it’s not too bad and interesting because this could be fully automatic – we just need to automate the clustering and improve on the summarisation.  So, when we get Google’s cutting edge TensorFlow working (which uses deep learning) I’ll repeat the experiment and see where that takes us.
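The post does not say which method the test system uses. The outputs above look extractive (whole sentences lifted from the abstracts), so as a rough illustration here is a minimal frequency-based extractive summariser in Python. Everything in it (the function names, the tiny stopword list, the scoring rule) is an assumption made for the sketch, not a description of Trip's system.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "in", "to", "for", "with",
             "on", "at", "by", "was", "were", "is", "are", "that", "those",
             "did", "not", "than", "from", "we", "there", "this", "be", "as"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
            if w not in STOPWORDS]


def summarise(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

A 'meta-summary' in this sketch would simply be `summarise(" ".join(summaries))`, which also hints at why it can read oddly: the scorer only sees word overlap, not whether the selected sentences agree with each other.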

One small step along a winding and interesting journey!


Child health, autism and data analytics

Further to my post of yesterday (What do people look for on Trip?) I wanted to look in more depth at a topic, as much as to familiarise myself with what’s possible with our analytics.  Below is some analysis based on child health and subsequently exploring autism (the most common issue relating to child health).  NOTE: All data based on the most recent 4 weeks’ worth of data AND most users of Trip are health professionals!

Topics of interest

blog child tag cloud

Based on the titles of the top 50 articles that people have clicked we can explore what topics are of interest.

Autism time trend – showing how the use of the term changes over time

blog child autism timeline

Based on search terms used and plotted daily.  As we add in more historical data a weekly recording would smooth things out.  I added croup data for a comparison.
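To show what 'weekly recording' could mean in practice, here is a small Python sketch that rolls daily search counts up into ISO weeks. The function name and the input shape (a dict of date to count) are assumptions for illustration; the post does not describe how the analytics data is stored.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily_counts):
    """Sum daily search counts into ISO weeks.

    daily_counts: dict mapping datetime.date -> number of searches that day.
    Returns a dict mapping (iso_year, iso_week) -> total, in chronological order.
    """
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += count
    return dict(sorted(totals.items()))


# Hypothetical daily counts for one search term.
daily = {
    date(2016, 9, 5): 3,
    date(2016, 9, 6): 0,
    date(2016, 9, 7): 9,
    date(2016, 9, 12): 4,
}
weekly = weekly_totals(daily)
```

A spiky day (9 searches) next to an empty one (0) becomes a single weekly figure, which is the smoothing effect mentioned above.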

Autism drugs

  • acetaminophen (paracetamol)
  • aripiprazole
  • melatonin
  • mmr

Based on searches that included autism and a drug, revealing the top drugs searched for in relation to autism
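A list like the one above can be sketched as a simple co-occurrence count over search queries. In this Python sketch the function name, the sample queries and the drug vocabulary are all made up for illustration; it also only matches single-word terms, so synonyms such as acetaminophen and paracetamol would need merging separately.

```python
from collections import Counter


def top_cooccurring(queries, topic, vocabulary, n=4):
    """Count vocabulary terms appearing in queries that also mention `topic`."""
    counts = Counter()
    for query in queries:
        words = set(query.lower().split())
        if topic in words:
            # Only terms from the (hypothetical) drug vocabulary are counted.
            counts.update(words & vocabulary)
    return counts.most_common(n)


# Hypothetical search log and drug vocabulary.
queries = [
    "autism melatonin",
    "melatonin sleep autism",
    "autism mmr",
    "croup dexamethasone",
]
drugs = {"melatonin", "mmr", "dexamethasone"}
top_drugs = top_cooccurring(queries, "autism", drugs)
```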

Sources of information

blog child top publications

Based on the documents users clicked on.  We aggregate this on a ‘by publisher’ basis.

What do people look for on Trip?

Another output from the Horizon 2020 funded KConnect project, this time led by the Vienna University of Technology.  This new system allows us to see what people are looking at based on clinical area.  Below are the top results from three separate clinical areas (based on 2-3 weeks’ worth of data):


  • Flossing for the management of periodontal diseases and dental caries in adults
  • The efficacy of dental floss in addition to a toothbrush on plaque and parameters of gingival inflammation: a systematic review
  • The Efficacy of Brushing and Flossing Sequence on Control of Plaque and Gingival Inflammation.


  • Management of patients with stroke: rehabilitation, prevention and management of complications, and discharge planning
  • Blood pressure monitoring
  • Chronic Heart Failure – Diagnosis and Management

Mental Health

  • A systematic review of the clinical effectiveness and cost-effectiveness of sensory, psychological and behavioural interventions for managing agitation in older adults with dementia
  • Comorbidity of mental disorders and substance use
  • Evidence based guidelines for the pharmacological management of substance abuse, harmful use, addiction and comorbidity

This data is important as it indicates what clinicians are looking for; it indicates what clinicians’ uncertainties are.  Often people plan new research, reviews or educational products based on assumptions.  With this data it can be more evidence-based!

One graphic to finish with.  Take the data for cardiology (not just the top three) and transform it to a tag-cloud:

Cardiology tag cloud

Steps away from better search results

When users interact with Trip we capture what they’re doing – the search terms, articles clicked etc.  Previously I have shown how we can map this stored (clickstream) data.  Below is a map of articles relating to urinary tract infection (UTI):

UTI large map annotated

You can see, from the annotation, that similar articles cluster (bottom left is a cluster of articles on UTI and cranberry).  To better understand how we create these graphs see these two articles:

I’ve been working with this data for a while and uses keep appearing.  One that is very attractive is in improving search results.  For the sake of argument let’s say the articles in the image above (each shown as an individual node) are evenly spread across the top 2-3 pages of search results in Trip.

As soon as a user makes their first click they are telling us where they are, in relation to their interest/intention, in the map of articles (see below):

UTI large map for Sept 2016 blog

Using the above example, if a user clicks on an article in the bottom left of the image (in a cluster of articles on UTI and cranberry), the chances are they will be interested in other articles that are close by (1-2 ‘steps’ away).  This works on the same principle as normal maps – if you’re looking at a street map of New York and you’re looking at a particular road in, say, Brooklyn it’s likely that your immediate interest is in the area close to that road as opposed to, say, the Mission in San Francisco.

So, could we create a system that allows users to re-order results as soon as they click on their first result?  Could we do this dynamically (no clicking)?  The principle seems sensible but as with most of these things it’s how to operationalise it that’s the key…!
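One way the 'steps away' idea could be operationalised is sketched below in Python. It assumes, purely for illustration, that two articles are linked when they were clicked in the same session, and it re-orders results by breadth-first distance from the first click. The function names and sample data are hypothetical, and this is not a description of how Trip builds its maps.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(sessions):
    """Link two articles whenever they were clicked in the same session."""
    graph = defaultdict(set)
    for clicked in sessions:
        for a, b in combinations(set(clicked), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def steps_from(graph, start):
    """Breadth-first search: number of steps from `start` to each reachable article."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def reorder(results, graph, first_click):
    """Move articles closest to the first click to the top.

    Articles with no path to the clicked one keep their original relative
    order at the bottom (Python's sort is stable).
    """
    distance = steps_from(graph, first_click)
    return sorted(results, key=lambda article: distance.get(article, float("inf")))


# Hypothetical click sessions and an evenly mixed result list.
sessions = [
    ["cranberry_rct", "cranberry_review"],
    ["cranberry_review", "cranberry_cochrane"],
    ["antibiotics_a", "antibiotics_b"],
]
results = ["antibiotics_a", "cranberry_cochrane", "antibiotics_b",
           "cranberry_review", "cranberry_rct"]
reordered = reorder(results, build_graph(sessions), "cranberry_rct")
```

Here a click on one cranberry article pulls the rest of the cranberry cluster above the unrelated antibiotics articles. A real system would probably weight edges by how often articles are co-clicked rather than treating every link equally.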

Experiments in machine learning at Trip

At Trip we like to ‘muck around’ with new techniques to make the site even better.  Sometimes there is a clear reason and other times it’s just to explore these techniques to see what they can offer.  Currently we’re doing lots of work involving machine learning and recently we released our work on the automated assessment of bias in RCTs.  Here are a few other things we’re involved in:

Word2Vec: Completely speculative and I have no idea what the output will be (I believe that it looks for similarities and relationships between words/concepts).  This is working with Vienna University of Technology (TUW) as part of our Horizon 2020 funded KConnect project.  There is loads of hype around this technique so we thought it was too good an opportunity to not get involved.

Learning to Rank: Again with TUW this is a much more understandable technique.  It is a machine learning technique used to improve the search results.  It’s one of a number of algorithm tweaks we’re attempting and all will be thoroughly tested using interleaving or A/B testing (probably the former).

Document summarisation: Another speculative venture.  Yesterday I saw that Google have opened up something called TensorFlow to support document summarisation.  This is something I’ve been interested in for a while so I contacted my freelance machine learning contact and we agreed to give it a go (he did most of the work on our 5 minute systematic review system).  I’m not sure how document summarisation fits in with Trip but seeing outputs can only help me figure it out.

Hopefully we’ll start seeing results on all these projects before the end of the year.
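For readers unfamiliar with the interleaving mentioned under Learning to Rank: the idea is to merge the results of two ranking algorithms into one list, show it to the user, and credit each click to the algorithm that supplied the clicked result. Below is a minimal Python sketch of one common variant, team-draft interleaving. It is offered only as an illustration; the post does not say which interleaving method will be used, and the function names are made up.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=random):
    """Merge two rankings; return the merged list and which ranker supplied each item."""
    merged, team = [], {}
    picks = {"A": 0, "B": 0}
    rankings = {"A": ranking_a, "B": ranking_b}
    while len(merged) < length:
        # The ranker with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            side = "A" if picks["A"] < picks["B"] else "B"
        else:
            side = rng.choice(["A", "B"])
        candidate = next((d for d in rankings[side] if d not in team), None)
        if candidate is None:
            # This ranker has nothing new to offer; let the other one pick.
            side = "B" if side == "A" else "A"
            candidate = next((d for d in rankings[side] if d not in team), None)
            if candidate is None:
                break  # both rankings exhausted
        merged.append(candidate)
        team[candidate] = side
        picks[side] += 1
    return merged, team


def credit_clicks(clicked, team):
    """Count clicks per ranker; the ranker with more clicks wins this impression."""
    wins = {"A": 0, "B": 0}
    for doc in clicked:
        if doc in team:
            wins[team[doc]] += 1
    return wins


# Hypothetical rankings from the current algorithm (A) and a tweaked one (B).
merged, team = team_draft_interleave(["d1", "d2", "d3"], ["d2", "d4", "d1"],
                                     length=4, rng=random.Random(0))
```

Compared with A/B testing, every user sees results from both algorithms at once, which is one reason interleaving is often said to need less traffic to detect a preference.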

One important thing to point out (and something I relish) is Trip’s ability to get involved in these projects and get things moving quickly.  The document summarisation work was set up within 12 hours of seeing the announcement of TensorFlow being opened up (I’d never even heard of it before).  One can only imagine the bureaucratic steps a large organisation would need to go through to even start considering these ground-breaking initiatives.

Trip plays an important role in the health information retrieval ecosystem as we are so innovative.  Larger, better funded, members of the ecosystem observe and copy/adopt where we succeed. It’s classic diffusion of innovations.   I much prefer being at the front of the adoption curve!

Search suggestions

In our recent poll the feature most users wanted to see was a search suggestions function.  Well, we’ve delivered on that and it is freely available on Trip.

Search suggestions

In the image above you’ll see the search suggestions to the right of the search box. The user has done a simple search and we’ve made a number of suggestions to help the user formulate a more focused search.  Clicking on one of those suggestions, for example, ‘breast cancer’ results in a new search for ‘breast cancer’ and further search suggestions, the top ones being:

  • breast cancer screening
  • negative breast cancer
  • breast cancer therapy
  • breast cancer treatment
  • triple negative breast cancer
  • breast cancer metastatic
  • breast cancer risk
  • breast cancer radiotherapy

So, it’s a really simple system to get better search results.  In addition our system is available as you start typing your search in the search box.

The search suggestions system has been created as part of our involvement in the KConnect project (funded via the EU Horizon 2020 scheme).  The team at the Institute of Software Technology and Interactive Systems, Technische Universität Wien (Vienna University of Technology) have taken search suggestions from two sources:

  • PubMed – they have a system which we’ve used for a number of years (but restricted to a user typing in the search box).  This has never been satisfactory and always seemed a bit ‘dry’ – hence wanting to improve on it.
  • The Trip search logs.  Users search Trip thousands of times a day and we start to build up a picture of terms that go together.  We can mine this data to come up with potential search suggestions.

And, being evidence-based, we’re mixing the search suggestions and recording which get clicked.  So, will our users prefer PubMed or search log suggestions?  Either way, the results will help inform future developments of the system.  But, as it stands, the mix is already much better than the PubMed suggestions alone.
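As a rough idea of how search logs can yield suggestions, the Python sketch below proposes logged queries that contain everything the user typed plus something extra, most frequent first. The function name, the matching rule and the sample log are assumptions for illustration only; the actual KConnect approach is not described in this post.

```python
from collections import Counter


def suggest_from_logs(logged_queries, typed, limit=8):
    """Suggest longer logged queries that contain every word the user typed."""
    typed_words = set(typed.lower().split())
    counts = Counter()
    for query in logged_queries:
        normalised = " ".join(query.lower().split())
        words = set(normalised.split())
        if typed_words < words:  # strict subset: the suggestion must add something
            counts[normalised] += 1
    return [query for query, _ in counts.most_common(limit)]


# Hypothetical search log.
log = [
    "breast cancer screening",
    "Breast cancer screening",
    "triple negative breast cancer",
    "breast cancer",
    "lung cancer",
]
suggestions = suggest_from_logs(log, "breast cancer")
```

Recording which source each displayed suggestion came from, and which ones get clicked, is then enough to compare PubMed suggestions with log-derived ones.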

The one obvious improvement to make is the design – as it’s fairly poor.  But that will have to wait till we roll out our next new feature – mis-spelling (the second most wanted new feature requested in the poll).  This is near to being released and again has been created with the help of the team at the Institute of Software Technology and Interactive Systems as part of the KConnect project.  When that’s released we’ll get our designer involved to make it look seamless.

Trip, making search simple!

New feature: automated assessment of bias

I love it when we roll out new features and few have been as significant and innovative as this one.  Over the last few months I’ve been working with the wonderful team at RobotReviewer to introduce two major improvements to Trip.

Identification of RCTs.

Trip has featured a search results category called ‘Controlled trials’ for years.  To identify trials we used a filter to highlight trials in PubMed and imported them into Trip.  This used a series of keywords and was good at identifying trials but was also prone to identifying a number of other articles that were not trials.  In other words there were a number of false positives (i.e. noise) and we invariably missed a few trials as well.

RobotReviewer used machine learning to identify trials from Trip and it works brilliantly.  In internal tests our controlled trials category is about 97% accurate, which is amazing.  The total ‘count’ of trials has dropped by over 200,000, which means that many articles had been incorrectly identified as trials by the old filter.  So, when using the controlled trials filter you’re significantly more likely to just find trials and avoid the noise of incorrectly identified trials!
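To make 'false positives' and 'missed trials' concrete, here is a small Python sketch that runs a toy keyword filter over a few hand-labelled titles and reports accuracy, precision and recall. The keywords, titles and labels are invented for illustration; they are not Trip's old filter, RobotReviewer's model or the data behind the 97% figure.

```python
# Illustrative keywords only; the real filter's terms are not given in the post.
KEYWORDS = ("randomized", "randomised", "placebo", "double-blind")


def keyword_filter(title):
    """Flag a title as a trial if it contains any of the keywords."""
    title = title.lower()
    return any(keyword in title for keyword in KEYWORDS)


def evaluate(predicted, actual):
    """Compare predicted trial flags with hand-checked labels (both lists of bool)."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # trials correctly flagged
    fp = sum(p and not a for p, a in pairs)      # non-trials flagged (noise)
    fn = sum(a and not p for p, a in pairs)      # trials missed
    tn = sum(not p and not a for p, a in pairs)  # non-trials correctly ignored
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positives": fp,
    }


# Hypothetical hand-labelled examples: (title, is it really a trial?)
labelled = [
    ("A randomized placebo-controlled trial of cranberry juice", True),
    ("Why randomized trials matter: an editorial", False),
    ("Cranberry juice for uti: a controlled trial", True),
    ("Cohort study of uti in women", False),
]
predicted = [keyword_filter(title) for title, _ in labelled]
actual = [is_trial for _, is_trial in labelled]
report = evaluate(predicted, actual)
```

The editorial is a false positive (it mentions 'randomized' without being a trial) and the third title is a missed trial, the two failure modes described above.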

Automatic assessment of bias.

Last year the RobotReviewer team published RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials.  The paper concluded:

Risk of bias assessment may be automated with reasonable accuracy. Automatically identified text supporting bias assessment is of equal quality to the manually identified text in the CDSR. This technology could substantially reduce reviewer workload and expedite evidence syntheses.

In short their techniques pretty much matched human ability in assessing bias.  Now, in conjunction with Trip, they have extended their techniques to work on what Trip holds for its controlled trials: abstracts.  This comes with very little loss of accuracy and we have just released the feature (see their blog for more technical details).  The first image shows what to expect:


The ‘Estimate of bias…’ is clickable to reveal:


This is a significant moment for Trip and I’m delighted that we have this feature.  Assessment of bias is not most people’s idea of fun and if we can help reduce the barriers to using evidence – which we have with this feature – then everyone should be delighted.

Quick update

There are lots of things going on in the background and these will start to become visible over the next six months (some very soon).  To give a flavour of what we’re currently working on:

  • Using machine learning to better identify controlled trials for inclusion in Trip.
  • Using machine learning to assess for potential bias in controlled trials included in Trip.
  • The answer engine is slowly creeping towards a robust testing version.  Version 1 was available 4 months ago but we wanted to make it even better so we have re-built it from the bottom up.  I’m confident it’ll be worth the wait.
  • Search suggestions – an improved drop-down search suggestion feature PLUS a new post-search search suggestions feature.  So, if you conduct a search we’ll display a number of related searches that you might prefer to use to give a more focused search.
  • Improved search algorithm – this is a really exciting development and we’ll be using a number of cutting edge technologies to improve the search results.  These will all be tested to ensure we get the optimal balance.

And, when the above are all finished, we’ll move on to our Q&A community idea….!
