
Trip Database Blog

Liberating the literature

Some additional thoughts on systematic reviews

In April 2013 I published A critique of Cochrane, which was based on my presentation at Evidence Live 2013.  That article has been read over 8,500 times and has opened up numerous separate discussions around systematic reviews and the nature of evidence.  It’s still a topic I find fascinating, and my thinking has moved on further.

This post follows my presentation at the Rethinking Evidence-Based Medicine: from rubbish to real meeting a short while ago….

In 2005 Richard Smith, former editor of the BMJ, wrote the article Medical Journals Are an Extension of the Marketing Arm of Pharmaceutical Companies in PLoS Medicine.  It starts with a quote from the editor of The Lancet, Richard Horton: “Journals have devolved into information laundering operations for the pharmaceutical industry“.  It’s well worth a read but the salient point is that journals are being co-opted by pharma to help push a skewed view of the research base for a given intervention.  As Richard’s article states:

The companies seem to get the results they want not by fiddling the results, which would be far too crude and possibly detectable by peer review, but rather by asking the “right” questions….but there are many ways to hugely increase the chance of producing favourable results, and there are many hired guns who will think up new ways and stay one jump ahead of peer reviewers.

He has recently repeated the same message on Twitter.

Another ‘trick’ is publication bias, so well presented in Turner’s 2008 NEJM article Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy.  As Turner et al. point out:

In the United States, the Food and Drug Administration (FDA) operates a registry and a results database. Drug companies must register with the FDA all trials they intend to use in support of an application for marketing approval or a change in labeling. The FDA uses this information to create a table of all studies. The study protocols in the database must prospectively identify the exact methods that will be used to collect and analyze data. Afterward, in their marketing application, sponsors must report the results obtained using the prespecified methods. These submissions include raw data, which FDA statisticians use in corroborative analyses. This system prevents selective post hoc reporting of favorable trial results and outcomes within those trials.

So the FDA see many trials, but are they all published in peer-reviewed journals?  The article reported that there were 38 studies favouring an antidepressant, of which 37 were published.  Of the 36 with negative or questionable results, only 3 were published as negative.  If you do a meta-analysis on the published trials alone you get a distorted view: Turner compared meta-analyses based on FDA data with meta-analyses based on published trials and found a 32% increase in apparent effect size when only published trials were used.
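To make the mechanism concrete, below is a minimal sketch of how pooling only the published, positive trials inflates a fixed-effect (inverse-variance) pooled estimate.  To be clear, the effect sizes and standard errors are made up for illustration – this is not Turner’s data or method:

```python
# Minimal sketch: fixed-effect (inverse-variance) meta-analysis pooled
# over published trials only vs. all trials. All numbers are hypothetical.

def pooled_effect(trials):
    """Inverse-variance weighted average of trial effect sizes."""
    weights = [1 / se ** 2 for _, se in trials]
    return sum(w * es for w, (es, _) in zip(weights, trials)) / sum(weights)

# (effect_size, standard_error) pairs -- made-up values
published = [(0.45, 0.10), (0.50, 0.12), (0.40, 0.09)]     # positive trials
unpublished = [(0.05, 0.11), (-0.02, 0.10), (0.10, 0.12)]  # negative trials

pub_only = pooled_effect(published)
all_trials = pooled_effect(published + unpublished)
print(f"published trials only: {pub_only:.2f}")
print(f"all trials:            {all_trials:.2f}")
print(f"apparent inflation:    {100 * (pub_only - all_trials) / all_trials:.0f}%")
```

The exact numbers don’t matter; the point is that the published-only estimate is systematically higher because the missing trials all sit at the unflattering end of the distribution.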

Tamiflu is another example, and one that raises even more issues.  It is well worth reading more about it on the BMJ’s special Tamiflu webpage.  The work that has gone on, and is ongoing, has highlighted further problems with relying on journal articles (published or unpublished).  Articles are effectively summaries of the trial.  Much more information is contained within the Clinical Study Report (CSR) – standardized documents representing the most complete record of the planning, execution, and results of clinical trials, which are submitted by industry to government drug regulators.  I tend to view journal articles as abstracts of the CSR.  So, an abstract is a summary of the journal article, and the journal article is an abstract of a CSR.  As abstracts are summaries they miss lots of data, and in the case of Tamiflu, when the authors looked at the CSRs, they found lots of information about adverse events, discrepancies with blinding etc.

So, above I’ve highlighted two major issues that affect systematic reviews:

  • Publication bias
  • Using journal articles as opposed to CSRs

The AllTrials campaign is working very hard and with great success in trying to ensure more trials are published but it’s a work in progress, that’s for sure, and I hope that all readers of this article will support the campaign.

Publication bias and systematic reviews

But how do Cochrane deal with unpublished trials?  Now, to clarify, I think Cochrane do some great things and their thousands of volunteers work tirelessly to improve healthcare.  In my previous article I was critical, but hopefully constructively so, and I will be again later in this article; it’d be rare for any large organisation to be perfect.  Two further significant pluses about Cochrane:

  • The Bill Silverman prize, started by Iain Chalmers (using his own money), which “acknowledges explicitly Cochrane’s value of criticism, with a view to helping to improve its work”.  Iain himself asked me to submit my previous article for consideration for the prize.
  • They are more transparent than any other systematic review producer.  A lot of my criticisms were based on information from their own publications.

Anyway, back to Cochrane and unpublished trials. In 2013 Schroll, Bero and Gøtzsche published Searching for unpublished data for Cochrane reviews: cross sectional study and concluded:

Most authors of Cochrane reviews who searched for unpublished data received useful information, primarily from trialists. Our response rate was low and the authors who did not respond were probably less likely to have searched for unpublished data. Manufacturers and regulatory agencies were uncommon sources of unpublished data.

In other words, unpublished trials are rarely obtained from manufacturers or regulators, and those that are found are located in an unsystematic manner.

My cynicism/realism aside (delete depending on perspective), what does the evidence say about relying on published trials? The 2011 paper Effect of reporting bias on meta-analyses of drug trials: reanalysis of meta-analyses by Hart et al. used methods similar to Turner’s, comparing meta-analyses of published trials with meta-analyses of FDA data, but across many more drugs/drug classes – 41 interventions in total.  They reported:

Overall, addition of unpublished FDA trial data caused 46% (19/41) of the summary estimates from the meta-analyses to show lower efficacy of the drug, 7% (3/41) to show identical efficacy, and 46% (19/41) to show greater efficacy.

I charted the range of differences in effect size, and that is shown below.  I have aligned the effect sizes of the meta-analyses based on published trials on the vertical zero line, and the horizontal bars represent the discrepancy when FDA data are used:

Quite a range. In just under 50% of the cases the discrepancy was greater than 10%. But what is equally troubling is that the results are unpredictable. There is no way of knowing whether the result of a meta-analysis based on published trials is likely to under-estimate or over-estimate the true effect size.
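For anyone wanting to recreate this kind of diverging bar chart, here is a minimal matplotlib sketch.  The discrepancy values below are hypothetical placeholders, not Hart’s data:

```python
# Sketch of the chart described above: one horizontal bar per
# meta-analysis, showing the % change in effect size when unpublished
# FDA data are added. Published-only estimates are zeroed at x = 0.
import matplotlib.pyplot as plt

discrepancies = [-35, -22, -15, -8, -3, 0, 4, 9, 14, 21, 30]  # made-up values

fig, ax = plt.subplots()
ax.barh(range(len(discrepancies)), discrepancies)
ax.axvline(0, color="black", linewidth=1)  # the vertical zero line
ax.set_xlabel("% change in effect size after adding FDA data")
ax.set_ylabel("Meta-analysis (one bar each)")
plt.tight_layout()
plt.show()
```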

Clinical study reports and systematic reviews

As far as I can tell the Cochrane Tamiflu review (proper title: Neuraminidase inhibitors for preventing and treating influenza in healthy adults and children) is the only Cochrane review to reject journal articles and rely solely on CSRs.  I am also unaware of any major systematic review producer that routinely uses CSRs (although some invariably will).  However, I’m confident that the vast majority of systematic reviews avoid CSRs.  The reasons are pragmatic: CSRs are massive documents with poor structure.  Systematic reviews take an age using research articles; using CSRs would multiply the workload ten-fold (or even more).

Where does that leave us?

It is evident that systematic reviews cannot be relied upon for an accurate assessment of the average effect size of an intervention.  The Hart paper (and others) has demonstrated that including regulatory data alters – in an unpredictable way – the estimated effect of the intervention.  So, if you see a systematic review with a given effect size, there is an approximately 50% chance of the effect size being out by over 10% (notwithstanding the chance that the systematic review is out of date). And, to reiterate the point, you’ve no way of knowing if the review is one of those reviews and, if it is, whether the estimate is under or over.

So, all I can conclude is that a systematic review gives a ball-park figure for the actual effect size. Systematic reviews are possibly the most accurate way of assessing likely effect size, but who knows?

This is problematic for a large number of reasons, including:

  • They take an awful lot of time and cost to arrive at this approximation.  As I previously highlighted, the workload on Cochrane is significant and only around a third of their reviews are up to date.
  • Original Cochrane reviews were not the massive beasts they are now.  They were done quickly and were short.  Over time, it appears to me, methodologists have tried to eliminate bias from published data while completely ignoring the likely larger effect of missing-data bias, and also the workload implications of each new technique they roll out.  But to what end?  Accuracy?
  • We use systematic reviews as a heuristic for accuracy.  I have myself said, and heard many others say, something along the lines of ‘There’s a recent Cochrane systematic review, so no need to look any further‘.  It is very easy, and convenient, to uncritically accept a review’s findings.  This point is perhaps no-one’s fault.  I assumed systematic reviews were accurate, and I suppose they are in that they are invariably good at saying whether an intervention is good, bad or indifferent.  But if you want an estimate of effect, they simply don’t deliver in a predictable way.  And confidence intervals don’t adequately represent the uncertainty: they are based on the included trials, so they cannot and do not capture the uncertainty arising from the 30-50% of missing trials.

So, the future is very much as laid out in my initial post, arguably the case is more compelling than ever.

I often wonder, given the problems highlighted above, whether giving an effect size framed by narrow confidence intervals is misleading.  Can we really say any more than ‘This is likely to be effective and the effect size is likely to be in the range of…’?  That seems more honest to me. I do feel much of the ‘problem’ I have with systematic reviews is linked to my perception.  Systematic review producers probably don’t say that they produce accurate results; it’s assumed.  But they probably don’t do enough to highlight the shortcomings either.  Accuracy sells, ballpark doesn’t.

Finally, a bit of mischief based on the assumption that 30-50% of trials are never reported…

Trip in systematic reviews

Trip is mentioned thousands of times in articles in Google Scholar – over 300 times in the last year alone.  Most of these are systematic reviews, and I found two new articles just today.

Needless to say Trip was not alone and other databases mentioned in these SRs were:

  • PubMed
  • Medline
  • Cochrane Library
  • Google Scholar
  • EMBASE
  • Web of Science
  • CINAHL
  • ProQuest

What’s different about Trip? Size: only one paid employee (me, at one day per week)!

Twitter and the dissemination of research evidence

Trip aggregates some wonderful content.  The main route for people finding this evidence is via search, or by registering with Trip and indicating which topic areas they’re interested in (in which case we email the user with the latest research matching their interests).

Towards Christmas I started to experiment with using Twitter as a dissemination route.  Basically, I created two topic areas (Primary care and Cancer) and started tweeting simply the title and URL of relevant articles recently added to Trip.  The Trip techie (Phil) suggested I use some tracking to see if people were actually clicking on the articles, so I started using a site called Bitly, which has been brilliant.  It showed that people were clicking on the tweeted articles in quite large numbers. So, 5 days ago, I started three more topic areas (respiratory, child health and CVD).  The results:

As of writing, there have been 2,050 clicks.

In addition, we can track which articles have been clicked on the most, and some have been spectacular; here are the top 5:

  1. Diagnosing pneumonia in patients with acute cough: clinical judgment compared to chest radiography Eur Respir J – 128 clicks
  2. Prophylactic antibiotic therapy for chronic obstructive pulmonary disease (COPD) Cochrane – 62 clicks
  3. NSAIDs and cardiovascular safety: the truth makes my heart hurt Tools for Practice – 42 clicks
  4. Sulphonylureas and risk of cardiovascular disease: systematic review and meta-analysis Diabet Med. – 41 clicks
  5. Once or twice daily versus three times daily amoxicillin with or without clavulanate for the treatment of acute otitis media Cochrane – 38 clicks

The above figures have far exceeded my fairly modest expectations.

The five topic areas are the limit of what I can realistically do manually, but if we find some resource we could automate the whole system.  So, we could create many more topic areas (e.g. dermatology, women’s health, allergy, neurology) and even individual conditions (e.g. depression, diabetes, myocardial infarction).

In summary – an exciting innovation!

A look back at 2013

As 2013 comes to an end I find myself being reflective and wanted to share the disappointments and highlights of the year.  I’ll also glance forward into 2014 and see what that might bring…

Disappointments.  It can’t all be plain sailing and there have been a few disappointments:

  • Continued financial insecurity, what can I say about this – it’s the same every year!
  • Professorship.  It was very nice – while it lasted – but an academic I’ve worked with suggested I apply for an honorary professorship at their university.  My application passed through the first two rounds but came unstuck at the last.
  • Employment.  Trip is not my day job – I work for the NHS in Wales.  Within our organisation they have been creating an Evidence Service, something I was very keen to lead.  However, in the middle of the month I was unsuccessful in securing the ‘top job’ and will therefore have no role in the service.

As my Mother would say: ‘it could be worse, you could be dying’! So we need to keep the above in context.

Triumphs. Thankfully there are a significant number of these, and they more than outweigh the above disappointments.  A few highlights below:

  • 4 million searches of Trip from around the globe
  • The Trip advisory group have been invaluable and I appreciate all the input I’ve received.
  • Leaflets have finally been produced and I was particularly pleased to have them translated into Spanish by Netzahualpilli Delgado Figueroa and Daniel Gonzalez Padilla (from the advisory group).
  • The ability to link from Trip to an institution’s full-text articles, something I’ve struggled with for years.  We currently have over 300 institutions signed up to this feature.
  • Trip has been involved in two main bits of research this year, and both have been spectacular.  Firstly, my hunch about the social networks of articles has been proved right.  Secondly, my work on clinician similarity and the potential to improve search has again been a great success.  The trick with both of these is to roll them out onto Trip and improve search performance!
  • Rapid reviews. This has been a great source of intellectual challenge for me, building on my 15 years’ experience in the area of clinical question answering.  In March I presented at the wonderful EvidenceLive conference in Oxford, the title of my talk being ‘Anarchism, Punk and EBM’.  It was a broad critique of Cochrane and was written up as a blog article, which has now been read over 8,000 times.  It has led to me submitting a large research grant application to the NIHR, and to numerous invitations to take part in projects around rapid reviews from around the world.
  • Ultra-rapid reviews. With our usual low budget we created a world first: a five-minute system to review and synthesize multiple research articles.  Not quite, but nearly, a five-minute systematic review.
  • Call me superficial, but it was very nice to see myself being quoted by a hero of mine Iain Chalmers in his address to the Cochrane Colloquium!  Also, in the Christmas edition of the BMJ I was name-checked, at various points, by Richard Smith.

But a constant joy, and this has been the same over the years, is the new relationships I make as a result of Trip.  I get daily emails from health professionals and/or information experts reaching out, offering advice, support and praise.

2014, the future

This is a bit harder to predict, so I’ll keep this brief:

  • I am very keen to obtain security for Trip, and investment to roll out both the social networks of articles and the clinician similarity measures.  I am fortunate to have three separate avenues to explore – one in early January.
  • I really need to write a follow-up to my Critique of Cochrane and to further explore the place of systematic reviews.
  • Answer engine – fingers crossed we can roll this out, albeit modestly to start with.

TILT – my biggest regret

This post signalled my biggest professional disappointment!  A key phrase being:

I think it’s fair to say that it has failed

TILT stands for ‘Today I Learnt That’ and, in a nutshell, allows health professionals to record things they’ve learnt recently.  A longer explanation can be viewed here.  Some examples of recorded learning:

  • The commonest causes of postural hypotension are medications and conditions that cause hypovolaemia
  • Even after extensive evaluation, about a third of patients with persistent, consistent postural hypotension have no identified cause

Why do I like it?  These are nuggets of learning/evidence that have typically been distilled from a larger document.  The person TILTing has removed all the unnecessary background information and just recorded what’s important to them.  Also, the user will only record what they previously didn’t know – so fresh learning that’s likely to be valuable to others.

In this post I highlighted why I think it failed. Things have moved on since then, in relation to the notion of sharing; design has moved on as well.  But I still think the main reasons for failure are the same.  In a nutshell, it needs to be easier to use, and we need to communicate better what we’re trying to achieve.

Can we do it?  Is it worth it?  I want it to happen but sometimes you’ve got to know when to stop.  I guess I’ll be seeking opinions to see what people think – so let me know.  If it works it’ll be magic, but it’s a big ‘if’!

How to take Trip forward

I write articles like this from time to time, all prompted by the lack of a robust business model for Trip.  As I’ve stated before, it’s no way to run a business/service/website when there’s the constant fear of money running out.  I do long for some security (perhaps that’s a sign of getting old).

Still, the issue has become more pressing with the three great bits of research we’ve recently completed.

I want these to be implemented.  They will improve Trip and make it easier for users to find the evidence they need to improve patient care.

We recently welcomed registered user number 100,000; I want to get to 1,000,000. The reason is that the top two bits of research get better with scale – the more data, the better the results.  We currently get around 500,000 page views per month; let’s get that higher – to 5,000,000 – and the data and improvements to Trip will be amazing.

So, to implement these initiatives, work on others and boost traffic, we need investment.  I’ve had initial discussions with a large publisher and, separately, with an investment specialist with a view to securing venture capital.

So these are real possibilities.  The advantages are clear; the disadvantages less so.

UPDATE: It’s not just money that’s important.  Also important is a business partner – someone who understands business and can help drive that side of things, to make Trip sustainable.

Uncertainties and search

Shops, they connect the producer to the consumer.

A supermarket contains a large number of products from a large number of producers.  Consumers come in and wander round, picking products off the shelves.  Problems arise in a number of ways, and one clear example is when a shopper can’t find a product.  The shopper’s need is unmet.  The shopper is dissatisfied.

In many ways Trip is a supermarket – a supermarket of evidence.  Consumers come to the site with a wide variety of needs and we do our best to match the consumer with the producer.  The consumers are doctors, nurses etc.  The producers are the likes of NICE, AHRQ, Cochrane, BMJ etc.

Problems arise in a number of ways, and one clear example is when a user can’t find the evidence. The health professional’s need is unmet.  The health professional is dissatisfied.

I can’t help feeling the likes of Tesco, Carrefour, Spar, Walmart etc really understand their consumers and try to understand their unmet needs/frustrations.  A few years ago a celebrity chef/cook mentioned a product (some sort of bird’s egg), which caused a huge increase in interest and a large unmet need at the supermarkets.  The supermarkets realised they were missing a market and desperately sought appropriate stock.

In Trip, we record most things a user does on the site.  This allows us to better understand the research landscape and draw informative (and pretty) graphs such as the one seen here.  One thing we’re not good at is mining the data on dissatisfied users.  As ever, time is a problem – there’s only one of me!  But, in truth, I don’t think I’ve ever given it a great deal of thought.

Arguably, a user coming to the site, searching and not clicking on an article is a clear sign that their information need has not been met.  But I wonder if it’s more sophisticated than that.  It might be that, on average, 2 articles are clicked on per search.  Can we spot trends where, for a given search term, fewer than the average number of articles are clicked on?  (A rough sketch of this idea appears after the list below.)

This could have two effects:

  • Trip could try and locate producers of evidence in this area and bolster our index.
  • It might be that the evidence has not been produced at all, in which case the challenge falls to the producers to help meet this unmet need.
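As promised, here is a rough sketch of how such trends might be spotted.  The log format, field names and threshold are my own assumptions, purely for illustration:

```python
# Rough sketch: flag search terms whose average clicks-per-search fall
# below the site-wide average -- a possible proxy for unmet needs.
from collections import defaultdict

# One (session_id, search_term) row per article click; made-up data
click_log = [
    ("s1", "pregnancy uti"), ("s1", "pregnancy uti"), ("s1", "pregnancy uti"),
    ("s2", "rare syndrome"),                        # a single, lonely click
    ("s3", "pregnancy uti"), ("s3", "pregnancy uti"),
]

# Clicks per individual search (one search = one session/term pair)
clicks_per_search = defaultdict(int)
for session_id, term in click_log:
    clicks_per_search[(session_id, term)] += 1

# Total clicks and number of searches per term
per_term = defaultdict(lambda: [0, 0])  # term -> [clicks, searches]
for (session_id, term), n in clicks_per_search.items():
    per_term[term][0] += n
    per_term[term][1] += 1

site_avg = sum(clicks_per_search.values()) / len(clicks_per_search)
flagged = [term for term, (clicks, searches) in per_term.items()
           if clicks / searches < site_avg]
print(f"site average: {site_avg:.1f} clicks per search")
print(f"below-average terms: {flagged}")
```

Terms flagged this way could then feed the two actions above: hunt for producers of evidence in those areas, or signal a genuine evidence gap.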

Just a thought.

Structure in Trip (Social networks part two)

Last week I posted an article, The social networks of articles.  In the ten days since then things have moved on, with further analysis provided by Orgnet, LLC. As an aside, have a look around their site – the case studies are brilliant!

We supplied the same data as last time – a sample of UTI searches (either alone or with additional terms) – and what came back was stunning (amazing what you can do with the appropriate expertise and software, in this case InFlow 3.1):

The image above reinforces my initial analysis that there is a rich structure within the data.  Each node is a unique article in Trip and the links are made from the clickstream data (see previous post).

But does the structure have a meaning?  That required additional analysis; see the annotated image below:

I don’t think anyone can appreciate how exciting this was for me!  Even with a small sample of data we’re revealing a rich structure; a rich structure that has meaning. In effect, by using Trip, our users are curating the content, crowdsourcing the organisation of it.

I’ve had these images for a few days now and I’m still reflecting on the next steps.  In keeping with my Clinical Like Me idea, I’d be really interested in seeing how networks compare between similar clinicians.  At a high level, the network for UTI would likely differ depending on whether the search came from a general/family practitioner, a urologist or a paediatrician.  But there are loads of other potential applications, from speeding up the review process to highlighting related articles.

The social networks of articles

Years ago I did a lot of work on social network analysis, but I then moved away from the area.  However, my interest was renewed when I read this article in PLOS ONE: Clickstream Data Yields High-Resolution Maps of Science.

Clickstream data is the data a website records when a user comes to the site.  In the case of Trip a user would visit the site, conduct a search and then click on a number of articles.  Below is an example set of data:

0blzia55j3krmi55nuqc24nc    11/07/2013    Pregnancy UTI    5
0blzia55j3krmi55nuqc24nc    11/07/2013    Pregnancy UTI    6
0blzia55j3krmi55nuqc24nc    11/07/2013    Pregnancy UTI    7
0blzia55j3krmi55nuqc24nc    11/07/2013    Pregnancy UTI    8

The first column is the session ID, the second is the date, the third is the search term used and the fourth is the unique document ID (this is typically a long number but I’ve transformed it for the sake of the analysis).  So, in the above example, a user searched for pregnancy UTI and clicked on documents 5, 6, 7 and 8.
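As an aside, here is a minimal sketch of how rows in this format can be turned into the network described below – two documents become linked whenever the same session clicked on both.  This is purely illustrative; Trip’s actual processing may differ:

```python
# Build a co-click graph from clickstream rows: documents clicked in
# the same session are linked; repeated co-clicks raise the edge weight.
from collections import defaultdict
from itertools import combinations

rows = [  # (session_id, date, search_term, doc_id), as in the sample above
    ("0blzia55j3krmi55nuqc24nc", "11/07/2013", "Pregnancy UTI", 5),
    ("0blzia55j3krmi55nuqc24nc", "11/07/2013", "Pregnancy UTI", 6),
    ("0blzia55j3krmi55nuqc24nc", "11/07/2013", "Pregnancy UTI", 7),
    ("0blzia55j3krmi55nuqc24nc", "11/07/2013", "Pregnancy UTI", 8),
]

# Collect the set of documents each session clicked on
docs_by_session = defaultdict(set)
for session_id, date, search_term, doc_id in rows:
    docs_by_session[session_id].add(doc_id)

# Every pair of documents clicked in the same session gains an edge
edge_weights = defaultdict(int)
for docs in docs_by_session.values():
    for a, b in combinations(sorted(docs), 2):
        edge_weights[(a, b)] += 1

print(dict(edge_weights))  # e.g. {(5, 6): 1, (5, 7): 1, (5, 8): 1, ...}
```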

Big deal you may say!  Well, I think it is a big deal; I think it has the potential to be really important.  And here’s why.  The user came to Trip with an intention and they clicked on documents 5-8.  They have told us that, for their search intention, those documents are linked.  The ‘intention’ bit is vital, as search is improved the more knowledge we have about the user’s intention.  In isolation this might be meaningless, but over thousands of users you get lots of really useful data.  Data that can be analysed.  Below is a network map I produced (crudely, as I’m not skilled in social network software packages).  NOTE: for the eagle-eyed, it’s not based on the same data as the example given above.

Each number represents a unique article and the lines represent a relationship between them (a relationship is formed when a user clicks on two articles in the same session).  As you’ll see, there is a structure, and where there is structure there is value.  I’m not talking financial value.

I was convinced that there would be a structure to the clickstream data, which has proved correct (as shown above), and I’m convinced that there is value in the structure.  The next step is to understand it.  I’ve got some help in analysing the key structural elements (of a sample of searches for UTI) and from there, who knows.  I’ll report back ASAP.

In the interim, the same website as above (Orgnet) has two further articles I’d highlight as being of interest.
