Trip Database Blog

Liberating the literature

Search safety net

The search safety net is a novel feature designed to improve searching by helping users avoid missing important papers.  I wanted to explain it – simply – but have failed on that score.  It’s important so I hope you can make sense of what I’ve written.  If you have any questions, my email is jon.brassey@tripdatabase.com.

After a search you will see a new ‘Search Safety Net’ button.

If you click that it’ll bring up a list of related search terms.  It does this by looking at the top 250 search results and analysing the search terms people have previously used when clicking on these results.  This works on the notion that a single document can be clicked on after numerous searches.  For instance, in the example above search terms might have been ‘prostate cancer screening’, ‘MRI screening’ etc.
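A minimal sketch of how related search terms could be derived, assuming a click log that records which search terms preceded each click. The log format, the document IDs and the function name are my own illustrations, not Trip's actual implementation.

```python
from collections import Counter

# Hypothetical click log: (search terms the user typed, document they then clicked).
click_log = [
    ("prostate cancer screening", "doc_a"),
    ("mri screening", "doc_a"),
    ("prostate cancer screening", "doc_b"),
    ("psa test", "doc_b"),
    ("hypertension", "doc_z"),
]

def related_search_terms(top_results, click_log, current_query, limit=10):
    """Rank earlier search terms that led to clicks on any of the top results."""
    top = set(top_results)
    counts = Counter(
        query
        for query, doc in click_log
        if doc in top and query != current_query
    )
    return [query for query, _ in counts.most_common(limit)]

# In Trip this list would hold the top 250 results; two are enough for the sketch.
top_results = ["doc_a", "doc_b"]
suggestions = related_search_terms(top_results, click_log, "psa screening")
print(suggestions)
```

Terms that led to clicks on several of the top results rise to the top of the list, while terms attached only to unrelated documents (here ‘hypertension’) never appear.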

The next section of the search safety net happens AFTER you’ve conducted your search, found a number of documents you like AND looked at them (or simply clicked the ‘check box’ to the left of the result).  If you click on the Search Safety Net button again you see three columns of results:

The first column is closely related articles, the second is other related articles and the third is the related search terms.  The latter column is similar to the description of related search terms above, but is based purely on the documents clicked (as opposed to the top 250 results).  However, to understand the process behind the other two columns you need to understand clickstream data.

Paper 1 ———- Paper 2 ———- Paper 3

In the above there are three papers (1-3).  A user, in the same session, clicks on Paper 1 and Paper 2, therefore we can make a link between the two.  Another user might click on Paper 2 and Paper 3, again making a link.  So, Paper 1 is connected to Paper 2 (a single step, using network language) while Paper 3 is two-steps away from Paper 1.  We have this data for all articles in Trip.
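The linking described here can be sketched in a few lines. This assumes each session is stored as the set of papers clicked in it, which is my assumption about the data, and the function names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical sessions: the set of papers each user clicked in one session.
sessions = [
    {"Paper 1", "Paper 2"},
    {"Paper 2", "Paper 3"},
]

def build_links(sessions):
    """Link every pair of papers that were clicked within the same session."""
    links = defaultdict(set)
    for clicked in sessions:
        for a, b in combinations(sorted(clicked), 2):
            links[a].add(b)
            links[b].add(a)
    return links

def steps_between(links, start, goal):
    """Breadth-first search: links on the shortest path, or None if unconnected."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        paper, distance = queue.popleft()
        if paper == goal:
            return distance
        for neighbour in links.get(paper, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None

links = build_links(sessions)
print(steps_between(links, "Paper 1", "Paper 2"))  # 1
print(steps_between(links, "Paper 1", "Paper 3"))  # 2
```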

Slightly simplifying things (!) the first column is the most popular related articles based on documents that are one step away from the documents clicked.  So, we look at all the articles clicked by the user and pull back all the documents that are one step away, displaying most ‘popular’ at the top.  The second column contains all the documents that are two steps away.  This is likely to find less focused results, but the occasional really interesting study that might have been missed.
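A rough sketch of how the first two columns could be produced. Treating ‘popular’ as the number of shared sessions is my assumption, and the paper IDs and weights below are made up for illustration.

```python
from collections import Counter

# Hypothetical co-click links: paper -> {linked paper: number of sessions sharing both}.
links = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "D": 3},
    "C": {"A": 2, "D": 1, "E": 4},
    "D": {"B": 3, "C": 1},
    "E": {"C": 4},
}

def safety_net_columns(links, clicked):
    """Return (one-step, two-step) related papers, most co-clicked first."""
    clicked = set(clicked)

    # Column 1: everything directly linked to a clicked paper.
    one_step = Counter()
    for paper in clicked:
        for neighbour, weight in links.get(paper, {}).items():
            if neighbour not in clicked:
                one_step[neighbour] += weight

    # Column 2: neighbours of column 1 that are not already clicked or in column 1.
    two_step = Counter()
    for paper in one_step:
        for neighbour, weight in links.get(paper, {}).items():
            if neighbour not in clicked and neighbour not in one_step:
                two_step[neighbour] += weight

    def ranked(counter):
        return [paper for paper, _ in counter.most_common()]

    return ranked(one_step), ranked(two_step)

closely_related, other_related = safety_net_columns(links, ["A"])
print(closely_related)
print(other_related)
```

If a clicked paper has no entry in `links`, both columns come back empty, which matches the first of the two issues noted for the feature.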

Two important issues:

  • For this to work it requires clicks; if the documents you’ve looked at have no clicks, you’ll get no results.
  • This is being released as a ‘beta’ bit of software, as in we’re still developing it.  At present it is available to both free and Premium users of Trip.  However, this is likely to change in the near future.

Restricting search results by clinical area

Just over three weeks ago I published Clinical area tagging of documents which highlighted a really useful but fairly neglected part of the site.  In short it’s a system that tags documents, by clinical area, as they are added to Trip.  There are multiple clinical areas e.g. cardiology, urology, oncology.  Users can then search for an item of interest and restrict the search results to a given clinical area.

The motivation for this came, many years ago, from a Professor of Anaesthetics I wanted to demonstrate Trip to.  After two weeks of use they reported back, saying that the results were poor.  Further investigation revealed they were interested in awareness under anaesthesia and they had searched for ‘awareness‘.  If you repeat the search yourself (click here) you’ll see very few of the results are related to anaesthesia.  However, if you restrict a search of awareness to anaesthesia (click here) the results are really focused and would have impressed the Professor much more!

We’ve recently overhauled and significantly enhanced the tagging process making it even more powerful.  Give it a try and let me know how you get on.

Below is a brief screencast to show you how to use it.

Finally, for those interested in the mechanism of action around the tagging of documents it’s fairly simple.  We have a list of terms associated with each clinical area.  So, words such as cholesterol, hypertension, statins, angina are associated with cardiology.  The number of words used per area varies, but in some clinical areas it’s well over one hundred. If any article in Trip contains any of these words in the title it’s tagged with the appropriate area.  So, an article on hypertension in children, would be tagged as both cardiology and pediatrics.  Due to the nature of the process it can’t be assumed to be perfect, but it is usually very powerful. 
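For the curious, the title matching can be sketched as below. The cardiology terms come from the description above; the pediatrics terms and the function name are assumptions of mine, and the real lists are much longer.

```python
import re

# Hypothetical term lists; real lists can run to well over one hundred terms per area.
AREA_TERMS = {
    "cardiology": {"cholesterol", "hypertension", "statins", "angina"},
    "pediatrics": {"children", "child", "infant"},  # assumed terms, for illustration only
}

def tag_title(title):
    """Return the clinical areas whose terms appear as whole words in the title."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return sorted(area for area, terms in AREA_TERMS.items() if words & terms)

print(tag_title("Hypertension in children"))  # ['cardiology', 'pediatrics']
```

Matching on single whole words is the simplest choice; multi-word terms and spelling variants would need extra handling, which is one reason this kind of tagging can’t be assumed to be perfect.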

Are you a luddite?

Many will be familiar with my post A critique of the Cochrane Collaboration; it’s been the most viewed article ever published on this blog.  Continuing the theme were Some additional thoughts on systematic reviews and, more recently, Evidence, hourglasses and uncertainty.

They all point to the current methods employed in systematic reviews (as exemplified by Cochrane) being a mess! As a type of summary, a few problems:

  • They can’t be relied upon to be accurate
  • They’re financially costly
  • They’re typically out of date
  • They carry a significant opportunity cost

More evidence that my perspective is correct is shown in the presentation Does access to clinical study reports from the European Medicines Agency reduce reporting bias? submitted to the Cochrane Colloquium in Vienna.  The conclusions:

Unpublished clinical study reports held by EMA may be a useful source to reduce outcome reporting bias.

It’s a testimony to Cochrane’s openness to allow such ‘dissent’ to be published.  It’s a dissenting view as the current Cochrane methods rely almost exclusively on published journal articles; unpublished clinical study reports are virtually unheard of. But that’s Cochrane; on one hand a business trying to maximise its business model but on the other a collection of individuals doing their best to improve methods.  As an outsider, I find this tension fascinating. Why? Because the best one can say for Cochrane’s methods (alongside most other SR producers) is that they are likely to produce ‘ball park’ accuracy – in most cases you simply cannot rely on the methods to produce results you can trust.  And this is where the tension comes: if they were being transparent, they would say ‘buy the Cochrane Library, most of the SRs are likely to be out of date and many are likely to be inaccurate’ which even I can see is not great for sales.

But this brings me nicely to diffusion of innovations – an area I studied for the PhD I never completed!  The Wikipedia article summarises it nicely by saying it “..is a theory that seeks to explain how, why, and at what rate new ideas and technology spread through cultures.” The spread of an innovation is characterised as an S-shaped curve:

In relation to systematic review methods, where are you on this scale?  If, like me, you think there are serious problems with relying on published journal articles you’re probably at the innovator side of things.  If, however, you think current methods are great and there is no need to change you may well be a laggard.  But, the majority (those that are feeling uneasy) are likely to be in the middle.

Clinical area tagging of documents

Around 6 weeks ago I wrote the article Logging in to Trip which provoked much negative comment. I have clearly not rushed to reply properly but that’s because I’ve wanted to properly think through the issues.  In short, the reason for asking people to log in was so that we would know more about the users and could therefore improve the service we deliver.  As I see it there are two connected issues of concern:

  • Logging in – it’s a pain for users.
  • Profiling users and altering the results accordingly.

One suggestion (from Paul, see comments in previous post), in relation to the second point, is to focus on our refine by clinical area feature.  This is an already available system which tags documents by clinical area.  So, an article titled ‘Cholesterol and the elderly‘ would be tagged as cardiology and geriatrics.  If a user did a search for, say, cholesterol the above document would be returned in the results, alongside many others.  But if they decided to refine by geriatrics, the above document (alongside others tagged with geriatrics) would be moved to the top of the results.
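The ‘moved to the top’ behaviour can be sketched as a stable sort. The result structure and the two extra titles are hypothetical; only ‘Cholesterol and the elderly’ comes from the example above.

```python
def refine_by_area(results, area):
    """Move documents tagged with `area` to the top, otherwise keeping the original order."""
    # sorted() is stable, so documents with the same key keep their relevance order.
    return sorted(results, key=lambda doc: area not in doc["areas"])

# Hypothetical results for a search on 'cholesterol', already in relevance order.
results = [
    {"title": "Statins for primary prevention", "areas": {"cardiology"}},
    {"title": "Cholesterol and the elderly", "areas": {"cardiology", "geriatrics"}},
    {"title": "Dietary cholesterol overview", "areas": {"cardiology"}},
]

refined = refine_by_area(results, "geriatrics")
print([doc["title"] for doc in refined])
```

Because the sort is stable, nothing is dropped from the results; the tagged documents are simply promoted ahead of the rest.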

Below are two images showing how it currently works:

The differences are clear and show the potential for the system.  For me, it’s about helping users find the answers they need really quickly.  An anaesthetist, interested in ‘awareness’, would find the results of the ‘normal’ Trip disappointing but by selecting anaesthesiology all the results are relevant.

As mentioned above this system is available already and can be accessed as shown below:

It’s not particularly developed and we could definitely improve it:

  • Machine learning to improve the document tagging
  • Better user interface so it’s more apparent and more intuitive to use.

Ultimately, it might serve a similar role as profiling users in improving the search results.

A multi-lingual Trip

I have the great pleasure of being part of an EU-funded project called KConnect.  It’s a group of academics, health care providers and commercial organisations (such as Trip) working together with the broad aim of innovating to improve search.

One early result has been a collaboration with the Institute of Formal and Applied Linguistics at Charles University in Prague to introduce a very nice multi-lingual tool for Trip.  It allows users to search in French, German or Czech, with more languages due in the next 6-12 months.

As you’ll see in the image above we have added a discreet link for ‘language options’ which, when pressed, reveals the three language options.  Once pressed the user can enter search terms in the selected language.

In this second image you’ll see that German has been selected and the search term bluthochdruck added.  This has been translated to hypertension and the results have also been translated into German.

It’s a very simple yet powerful system which will only improve over time as the translations get even better and more languages are added.

Related article test

For many years I’ve admired PubMed’s related articles feature.  If I was searching for an answer to a clinical question and found a useful article, related articles was a great way to see similar articles.  These similar articles had a good chance of being useful as they were so similar.  PubMed has now renamed the feature Similar Articles and this is what it does:

The Similar Articles link is as straightforward as it sounds. PubMed uses a powerful word-weighted algorithm to compare words from the Title and Abstract of each citation, as well as the MeSH headings assigned. The best matches for each citation are pre-calculated and stored as a set.
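To give a feel for what word-weighted matching means in general, here is a toy sketch using TF-IDF weights and cosine similarity on titles. This is a generic stand-in technique chosen for illustration; it is not PubMed’s actual algorithm, which also uses abstracts and MeSH headings.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    words = (w.strip(".,:;()").lower() for w in text.split())
    return [w for w in words if w]

def tfidf_vectors(docs):
    """Weight each word by its count in the document and its rarity across documents."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(w for tokens in tokenized for w in set(tokens))
    n = len(docs)
    return [
        {w: count * (math.log(n / doc_freq[w]) + 1.0)
         for w, count in Counter(tokens).items()}
        for tokens in tokenized
    ]

def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

titles = [
    "Screening for prostate cancer",
    "Lycopene for the prevention of prostate cancer",
    "Metformin vs insulin in the management of gestational diabetes: a systematic review and meta-analysis",
]
vectors = tfidf_vectors(titles)
sim_same_topic = cosine(vectors[0], vectors[1])
sim_other_topic = cosine(vectors[0], vectors[2])
print(sim_same_topic > sim_other_topic)  # True
```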

Trip’s related articles use a completely different approach – clickstream data.  Does it matter?  Does it work as well, worse or better?

Below are three comparisons.  But these are not necessarily fair. For instance, Trip’s approach relies on users clicking on the articles – so it won’t work on brand new articles.  Also, as you’ll see below a couple of the examples only have 4 related articles.  This is down to paucity of data.

In the first example below I believe that Trip’s approach is superior but I’m not sure about the other two examples; I’d call it close! But I’d value any input from others – those less biased than me!

Bottom line: it’s a really powerful demonstration of the potential of clickstream data but requires data, another reason to log in to Trip!

One final point, this approach is phase 1.  Phase 2 will be to start to use an approach closer to PubMed’s – using linguistic and semantic approaches.

Paper 1: Screening for prostate cancer. Cochrane 2013

PubMed’s related articles

  • Screening for prostate cancer. Cochrane Database Syst Rev. 2013
  • Screening for prostate cancer. Cochrane Database Syst Rev. 2006
  • Lycopene for the prevention of prostate cancer. Cochrane Database Syst Rev. 2011
  • Prophylactic platelet transfusion for prevention of bleeding in patients with haematological disorders after chemotherapy and stem cell transplantation. Cochrane Database Syst Rev. 2012
  • Chemoprevention of colorectal cancer: systematic review and economic evaluation. Health Technol Assess. 2010

Trip’s related articles

  • Screening for prostate cancer: a review of the evidence for the U.S. Preventive Services Task Force DARE. 2011
  • Population screening for prostate cancer: an overview of available studies and meta-analysis. DARE. 2012
  • PSA Test to Screen for Prostate Cancer. theNNT 2011
  • Update of evidence for prostate-specific antigen (PSA) testing in asymptomatic men. New Zealand Guidelines Group 2010
  • Focal therapy using high-intensity focused ultrasound (HIFU) for localised prostate cancer. National Institute for Health and Clinical Excellence – Interventional Procedures 2012

Paper 2: Comparison of conventional pulmonary rehabilitation and high-frequency chest wall oscillation in primary ciliary dyskinesia. Pediatric pulmonology 2014

PubMed

  • Comparison of conventional pulmonary rehabilitation and high-frequency chest wall oscillation in primary ciliary dyskinesia. Pediatr Pulmonol. 2014
  • Short-term comparative study of high frequency chest wall oscillation and European airway clearance techniques in patients with cystic fibrosis. Thorax. 2010
  • Effectiveness of treatment with high-frequency chest wall oscillation in patients with bronchiectasis. BMC Pulm Med. 2013
  • A pilot study of the impact of high-frequency chest wall oscillation in chronic obstructive pulmonary disease patients with mucus hypersecretion. Int J Chron Obstruct Pulmon Dis. 2011
  • Comparison of high-frequency chest wall oscillation with differing waveforms for airway clearance in cystic fibrosis. Chest. 2007

Trip

  • High frequency oscillation in patients with acute lung injury and acute respiratory distress syndrome (ARDS): systematic review and meta-analysis DARE. 2010
  • Effect of high-frequency chest wall oscillation on the central and peripheral distribution of aerosolized diethylene triamine penta-acetic acid as compared to standard chest physiotherapy in cystic fibrosis. Chest 2006
  • CNE article: pain after lung transplant: high-frequency chest wall oscillation vs chest physiotherapy. American journal of critical care.  2013
  • Effect of high-frequency chest wall oscillation versus chest physiotherapy on lung function after lung transplant. Applied nursing research. 2014


Paper 3: Glibenclamide, metformin, and insulin for the treatment of gestational diabetes: a systematic review and meta-analysis. BMJ 2015

PubMed

  • Glibenclamide, metformin, and insulin for the treatment of gestational diabetes: a systematic review and meta-analysis. BMJ. 2015
  • Metformin vs insulin in the management of gestational diabetes: a systematic review and meta-analysis. Diabetes Res Clin Pract. 2014
  • The use of oral hypoglycaemic agents in pregnancy. Diabet Med. 2014
  • Screening and diagnosing gestational diabetes mellitus. Evid Rep Technol Assess (Full Rep). 2012
  • Benefits and risks of oral diabetes agents compared with insulin in women with gestational diabetes: a systematic review. Obstet Gynecol. 2009

Trip

  • Effect comparison of metformin with insulin treatment for gestational diabetes: a meta-analysis based on RCTs. Archives of gynecology and obstetrics. 2014
  • The efficacy and safety of DPP4 inhibitors compared to sulfonylureas as add-on therapy to metformin in patients with Type 2 diabetes: A systematic review and meta-analysis. Diabetes research and clinical practice 2015
  • Evaluation of the potential for pharmacokinetic and pharmacodynamic interactions between dutogliptin, a novel DPP4 inhibitor, and metformin, in type 2 diabetic patients. Current medical research and opinion 2010
  • Metformin vs insulin in the management of gestational diabetes: a meta-analysis. PloS one 2013

Article analytics, again

Earlier today in the post Article analytics I said “This latest feature will be released soon.”  Little did I realise it would be live by the end of the day!

In the above image I’ve highlighted four key areas:

  • Analytics – appears under every link (for Premium users only), this is clicked to generate the data below.
  • Related by viewer – these are articles that users clicked on during the same search session in which they clicked on the main article (Canadian clinical practice guidelines for the management of anxiety, posttraumatic stress and obsessive-compulsive disorders).
  • Viewers by country – this highlights where the users originate from who did the clicking!
  • Viewers by profession – as above but broken down by profession

NOTE: the above example is very rich as it’s clearly a very popular article.  Others will have considerably less data, another reason why we’re keen to get users to log in!
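As a rough sketch, the three panels could be built from click records like this. The record fields, values and function name are assumptions made for illustration, not a description of Trip’s data model.

```python
from collections import Counter

# Hypothetical click records: one per click, carrying the session and the user's profile.
clicks = [
    {"session": 1, "article": "main", "country": "UK", "profession": "Doctor"},
    {"session": 1, "article": "other_1", "country": "UK", "profession": "Doctor"},
    {"session": 2, "article": "main", "country": "Canada", "profession": "Nurse"},
    {"session": 2, "article": "other_1", "country": "Canada", "profession": "Nurse"},
    {"session": 2, "article": "other_2", "country": "Canada", "profession": "Nurse"},
    {"session": 3, "article": "other_2", "country": "Spain", "profession": "Pharmacist"},
]

def article_analytics(clicks, article):
    """Summarise who viewed an article and what else they clicked in the same session."""
    views = [c for c in clicks if c["article"] == article]
    sessions = {c["session"] for c in views}
    related = Counter(
        c["article"]
        for c in clicks
        if c["session"] in sessions and c["article"] != article
    )
    return {
        "related_by_viewer": related.most_common(),
        "viewers_by_country": Counter(c["country"] for c in views),
        "viewers_by_profession": Counter(c["profession"] for c in views),
    }

report = article_analytics(clicks, "main")
print(report["related_by_viewer"])
```

An article with few clicks produces short or empty panels, which is the sparse-data problem the note describes.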

Article analytics

This latest feature will be released soon.  For a given article premium users will be able to see related articles (based on clickstream data) as well as information on total views, views by country and views by profession…

Evidence, hourglasses and uncertainty

Long-term readers of this blog will know I struggle with many aspects of the systematic review process.  At the time of writing, my ‘A critique of the Cochrane Collaboration‘ has been viewed over 18,300 times and ‘Ultra-rapid reviews, first test results‘ nearly 10,000 times.

I believe the main justification given for conducting systematic reviews is to obtain a really accurate assessment of the effectiveness (or ‘worth’) of an intervention.  So, the thinking goes that spending 12-24 months is worth the cost (financial, opportunity, etc) due to the accuracy of the prediction it then gives.

My immediate response is that this is demonstrably false. In my article ‘Some additional thoughts on systematic reviews‘ (just under 5,000 views) the evidence is clear that if you rely on published journal articles to ‘inform’ your systematic reviews (which is the case in the vast majority of systematic reviews) there is approximately a 50% chance that the effect size is likely to be out by over 10%.

But, even if we suspend being evidence-based and believe that systematic reviews can be relied upon to give us an accurate estimate of an effect size, is everything fine? I don’t think so and the image below illustrates my thinking.

It’s an hourglass!  At the top are all the unsynthesised trials, all floating around and the uncertainty is moderate.  Someone then spends 12-24 months pulling these together in a systematic review (likely of published trials and therefore ‘a bit dodgy’) and the uncertainty is reduced at the aperture of the hourglass.  But then, when you apply it to the real world of patient care, the uncertainty flares out again.  In the above example the intervention has a NNT of 6, so the intervention needs to be given to 6 people to obtain the desired outcome in 1 person.  Which is the 1 person?  Where’s the certainty?

If we were to spend significantly less time doing a review it might indicate a wider hourglass aperture (perhaps suggesting an NNT of 5-7).  In what situations does that matter?  I don’t think we’ve even started to explore these issues. In other words, when is it appropriate to spend 12-24 months on a systematic review and when is a significantly less resource-intensive approach ‘ok’?
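To put numbers on the aperture, recall that NNT is the reciprocal of the absolute risk reduction (ARR). The short sketch below converts the NNT values mentioned above back into ARR; the function name is my own.

```python
def absolute_risk_reduction(nnt):
    """NNT = 1 / ARR, so the absolute risk reduction is 1 / NNT."""
    return 1.0 / nnt

for nnt in (5, 6, 7):
    arr = absolute_risk_reduction(nnt)
    # Formats the ARR as a percentage with one decimal place.
    print(f"NNT {nnt}: ARR = {arr:.1%}, so on average 1 in {nnt} treated patients benefits")
```

So an NNT of 5-7 corresponds to an absolute risk reduction of roughly 14% to 20%, against about 17% for an NNT of 6. Either way, most of the people treated do not obtain the desired outcome because of the intervention, and nothing in the review says which ones will.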

Is it irony that the reality is the type of review (systematic versus ‘rapid’) doesn’t alter the effectiveness of an intervention?  After all the compound remains the same, untroubled by the efforts of trialists.  Sorry, getting sociological there – must be time to sign off for now.
