
Trip Database Blog

Liberating the literature

Autosynthesis – an example of the significant challenges ahead?

This was a sobering exercise.

As part of the update of Trip I came across this article: Efficacy of 8 Different Drug Treatments for Patients With Trigeminal Neuralgia: A Network Meta-analysis. So, I excitedly went to see how well our automated review system did for trigeminal neuralgia.  On an initial examination, we did well on just one of the 8 interventions – so, 1 out of 8 – that’s a fail in anyone’s book. However, it’s not as it first seems….

Lidocaine – we gave it an overall score of 0.01 (a pretty neutral score). This was based on three very small studies; we discount really small studies due to their inherent unreliability.  The network meta-analysis (NMA) also referenced three studies (but not the same three!):

Of which our system only incorporated the top one.  We included two others:

What confuses me is that the two references we didn’t find – from the network meta-analysis (NMA) – are not specifically about trigeminal neuralgia. So, I’m thinking our result is potentially better than theirs!!  I’ve emailed the author for clarification!

Botulinum toxin type A – we scored it as 0.45 (maximum score is 1) so it fits with their analysis.

Carbamazepine – A big failure on our part: we scored it -0.03. We included two studies of carbamazepine, neither of which belonged there, so we should have reported no trials. It should not have even featured in our results.

Tizanidine – We scored it -0.03; our system found a single trial, A clinical and experimental investigation of the effects of tizanidine in trigeminal neuralgia, which was very small and reported “The limited efficacy of TZD“. It scores near zero because, due to its size, we consider it unreliable and therefore discount the score.

The actual NMA referenced one other study, Tizanidine in the management of trigeminal neuralgia.  This is not in the Trip index (a failure of our RCT system, as it is included in PubMed). And that paper reported “The results indicate that tizanidine was well tolerated, but the effects, if any, were inferior to those of carbamazepine.” – hardly a glowing endorsement of the efficacy of tizanidine!

I actually think our assessment is reasonable, and it seems a stretch for the paper to report tizanidine as superior to placebo (even if they don’t claim statistical significance).

Lamotrigine – we found no trials.  Trip includes one of the trials the NMA included, Lamotrigine (lamictal) in refractory trigeminal neuralgia: results from a double-blind placebo controlled crossover trial, but for some reason it wasn’t tagged properly. Something to investigate.

Oxcarbazepine – we found no trials and Trip includes no trials, so our system didn’t fail; the gap is simply due to the fact that Trip doesn’t contain all published clinical trials.

Pimozide – we found no trials. Trip includes one of the trials, Pimozide therapy for trigeminal neuralgia, but for some reason it wasn’t tagged properly. Something to investigate.

Proparacaine – We scored it -0.07 and the NMA reported it as no better than placebo. In hindsight I think this is what our system found. The system compares interventions with placebo: towards 1 = better than placebo, -1 = worse than placebo and 0 = similar to placebo.

So, having gone through each entry I actually think our system did better than it first appeared.

Correct

  • Botulinum toxin type A
  • Proparacaine

Uncertain, I think our system did better than the paper (on the evidence I’ve seen)

  • Lidocaine
  • Tizanidine

Wrong, as we found no trials – though there are no trials in Trip – and so did not report the intervention (not too bad, as we made no claim on efficacy)

  • Oxcarbazepine

Wrong, as we found no trials despite trials being in Trip, and so did not report the intervention (not too bad, as we made no claim on efficacy)

  • Lamotrigine
  • Pimozide

Failure, due to us falsely including two trials and making a ‘claim’ for its efficacy. It should not have featured at all!

  • Carbamazepine

Conclusion: When I first looked I was fairly depressed by the results. However, now I’ve understood them I’m actually quite pleased.  Of the eight interventions in the NMA we only clearly got one wrong (Carbamazepine), where we wrongly assigned a score.  We omitted giving a score for three (though we should have scored two of those, Lamotrigine and Pimozide); however, as that does not create any prediction by our system I’m fairly relaxed about it – but will still investigate why.  There are still two unclear results (Lidocaine and Tizanidine) where I actually think our results are better – but I will wait to see what the authors report back.

Interestingly the CKS guidance on trigeminal neuralgia (sorry only available in the UK) suggests using carbamazepine as the first line, before stating:

If carbamazepine is contraindicated, ineffective, or not tolerated, seek specialist advice. Do not offer any other drug treatment unless advised to do so by a specialist.

This indicates a lack of faith in any other intervention! CKS reference the NICE guidance on Neuropathic pain in adults which has a section “2.3 Carbamazepine for treating trigeminal neuralgia” which reports:

Carbamazepine has been the standard treatment for trigeminal neuralgia since the 1960s. Despite the lack of trial evidence, it is perceived by clinicians to be efficacious. Further research should be conducted as described in the table below.

So, it’s not surprising there are no trials but the recommendation itself seems to lack an evidence base.

Bottom line: Initially a ‘fail’ but actually a ‘reasonable pass’.

 


Automated reviews – explaining some issues using real examples

A reminder: the automated review system is a proof of concept. Using the example of obesity I’d like to point out problems and explain why they are happening. In part this is to acknowledge them, but more importantly to give further insight into how the system works!

Two evidence blobs stood out to me: antibiotics and probiotics.

Antibiotics:  The positive, low risk of bias RCT was “Efficacy of prophylactic antibiotic administration for breast cancer surgery in overweight or obese patients: a randomized controlled trial“.  So, our system has mis-classified this by not picking up on the breast cancer. It’s a similar issue with the two other trials included; both are about surgery in obese patients.

I’m going to see if we can exclude trials where two ‘populations’ (breast cancer and obesity) are mentioned for a given trial. Although I wonder if that causes more problems than it solves!

Probiotics: There was a recent systematic review “Effects of probiotics on body weight, body mass index, fat mass and fat percentage in subjects with overweight or obesity: a systematic review and meta-analysis of randomized controlled trials“, it concludes:

Administration of probiotics resulted in a significantly larger reduction in body weight (weighted mean difference [95% confidence interval]; -0.60 [-1.19, -0.01] kg, I2 = 49%), BMI (-0.27 [-0.45, -0.08] kg m-2 , I2 = 57%) and fat percentage (-0.60 [-1.20, -0.01] %, I2 = 19%), compared with placebo; however, the effect sizes were small. The effect of probiotics on fat mass was non-significant (-0.42 [-1.08, 0.23] kg, I2 = 84%).

So, it’s a positive review – albeit with small effect sizes. Our system cannot distinguish large or small effect sizes – simply positivity or negativity.  Hence it appears as one of the better interventions!

I’m not sure how to overcome that one…!

Automated review system – known issues

As we find issues with our automated review system we’ll post them here. Keep them coming, we need the feedback!

  • NEW: Autosynthesis – an example of the significant challenges ahead?, a new, challenging (for us) blog post with some more real-world examples. But actually quite positive in the end.
  • NEW: Automated reviews – explaining some issues using real examples, a blog post highlighting some real world examples of problems, the causes and possible solutions!
  • As noted above, none of the automated systems is 100% accurate, although most are around 90% accurate.
  • Sometimes articles are added to the wrong ‘blob’.
  • Sometimes our system has not correctly assigned a sample size.
  • Our system is biased towards drug therapies, so certain interventions are ignored for now.
  • The y-axis is labelled ‘likelihood of effectiveness’. This is speculative and has not been validated!  It should, perhaps, be ‘positivity of evidence’.

Automated review system – explained

Our automated review system is experimental/proof of concept and should be treated with scepticism. And, to be clear, this is a fully-automated system that relies on techniques such as machine learning and natural language processing (NLP). There is a list of known issues – please read it!

At the simplest level the system aggregates studies – be they randomised controlled trials (RCTs) or systematic reviews (SRs) – that explore the same condition and intervention. Each evidence ‘blob’ indicates a different intervention for a given condition. The system assesses:

  • whether the intervention is effective
  • whether it is based on biased data and
  • in the case of RCTs, how big the trial is.

It uses these to arrive at an estimate of effectiveness (visualised by relative position along the y-axis).  Bias is demonstrated by the shade of the evidence ‘blob’.

Now, for the more complex explanation.

Identifying the condition and intervention: We use a mixture of NLP and machine learning to try to extract the condition and intervention elements for all the RCTs and SRs within Trip. At this stage we only use studies with no active comparison – so we only use trials/reviews that compare against things like placebo and usual care.

Effectiveness: We use sentiment analysis to decide if the intervention is positive (favours the intervention) or negative (shows no benefit over placebo or usual care).

Sample size: Using a rules-based system we identify the sample size of RCTs.
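As a rough illustration, a rules-based extractor of this kind can be as simple as a few regular expressions over the abstract. This is a hypothetical sketch, not Trip’s actual rules – the patterns and function name are made up:

```python
import re

# Hypothetical sketch of a rules-based sample-size extractor (not Trip's
# actual rules): try a few common phrasings and return the first number found.
PATTERNS = [
    r"(\d[\d,]*)\s+(?:patients|participants|subjects)\s+were\s+(?:randomised|randomized|enrolled|recruited)",
    r"a\s+total\s+of\s+(\d[\d,]*)\s+(?:patients|participants|subjects)",
    r"\bn\s*=\s*(\d[\d,]*)",
]

def extract_sample_size(abstract):
    for pattern in PATTERNS:
        match = re.search(pattern, abstract, flags=re.IGNORECASE)
        if match:
            return int(match.group(1).replace(",", ""))
    return None  # no rule matched; the trial gets no sample size

print(extract_sample_size("A total of 1,020 patients were randomised."))  # 1020
```

Real abstracts are messier (per-arm numbers, screened vs randomised counts, ranges), which is partly why sample-size assignment sometimes goes wrong, as noted in the known issues above.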

Bias: For RCTs we use RobotReviewer to assess for bias.  Trials are categorised as ‘low risk of bias’ or ‘high/unknown risk of bias’. For SRs we have been pragmatic and cautious: we have counted Cochrane reviews as low risk of bias and all the rest as high/unknown.

Creating the overall score: For each trial or review we start with the score of either 1 or -1 (positive or negative). We then adjust using the sample size and bias score.

  • Sample size: If the trial is large we don’t adjust on that variable, but the smaller the trial the greater the adjustment. So, a very small trial – due to inherent instability – will score very little.
  • Bias: If the trial has low risk of bias we do not adjust the score further but if it has a bias score of high/unknown we reduce the score further.

So, a large, positive, trial with low risk of bias will score 1 while a very small, positive, trial with a high/unknown bias will score very little (not much more than zero).
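The per-study adjustment described above can be sketched as follows. The shrinkage curve, the bias penalty factor and the names here are illustrative assumptions, not Trip’s actual parameters:

```python
# Illustrative sketch of a per-study score (assumed parameters, not Trip's
# actual formula): start at +1/-1, then shrink towards zero for small trials
# and again for high/unknown risk of bias.
def study_score(positive, sample_size, low_risk_of_bias,
                full_weight_n=1000, bias_penalty=0.5):
    score = 1.0 if positive else -1.0
    # the smaller the trial, the greater the adjustment towards zero
    score *= min(sample_size / full_weight_n, 1.0)
    if not low_risk_of_bias:
        score *= bias_penalty
    return score

print(study_score(True, 1000, True))   # large, positive, low risk -> 1.0
print(study_score(True, 30, False))    # tiny, positive, biased -> 0.015
```

With any parameters of this shape, a large unbiased positive trial keeps its full score of 1 while a tiny biased one ends up barely above zero, matching the behaviour described above.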

We then combine the separate scores, depending on what trials/reviews we find:

  • Only RCTs: The scores are weighted based on the sample size.  If we have two trials, one with a sample size of 100 and a score of 0.20 and another with a sample size of 900 and a score of 0.80, we – in effect – create a score based on ((100*0.2) + (900*0.8))/1000 = 0.74
  • Mix of RCTs and SRs: If there is an unbiased SR we take that as a definitive answer and use its score (irrespective of trials published beforehand – we assume the SR found those). However, any trials or reviews published in the same year or later are used to modify the score (as outlined in the ‘Only RCTs’ scoring system).  So, an unbiased SR with a positive score will have a score of 1, and any later RCTs or SRs (high/unknown risk of bias) that score negatively will bring the overall score down – depending on their sample size and bias scores.
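The ‘Only RCTs’ rule above is just a sample-size-weighted average of the per-trial scores. A toy version (variable and function names are mine):

```python
# Sketch of the 'Only RCTs' combination rule: a sample-size-weighted
# average of the per-trial scores.
def combine_rct_scores(trials):
    """trials: list of (sample_size, score) pairs."""
    total_n = sum(n for n, _ in trials)
    return sum(n * score for n, score in trials) / total_n

print(combine_rct_scores([(100, 0.20), (900, 0.80)]))  # 0.74, as in the example
```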

Understanding the visualisation

The size of each blob represents the sample size. The larger the blob, the bigger the combined sample size.

The colour of the blob represents how biased the content is. By that we mean the proportion deemed at low risk of bias and the proportion at high/unknown risk of bias. Light green is the lowest risk of bias.

Second level visualisation

If you click on an individual blob it reveals a detailed breakdown of the constituent parts of each blob, showing the individual trials/reviews:

Reminder: even though it shows all the data we find, we don’t necessarily use all of it in the scoring of each intervention. See ‘Creating the overall score’ above.

Next stage

To us, this is a proof of concept, and we feel the wider ‘evidence community’ can help guide developments.  However, quality is key so we want to improve the data.  Each of the automated steps is not 100% accurate (although fairly close).  So, we see two immediate needs:

  1. Improve the underlying automation systems – we will move to this shortly.
  2. Allow manual editing. We need to build a system that easily allows ‘wrong’ trials/reviews to be removed and omitted ones added. Again, this is being planned and we have lots of ideas to make it work pretty smoothly.  Assuming people participate we are contemplating allowing users to ‘publish’ their work and we’re talking to publishers about this.

Automated review system – out soon

It’s taken longer than we had hoped but it’s all ready to go! It’s been through a small-scale beta-testing round and improvements have been made.  It should be live before the end of the weekend.

As part of the testing we’ve received feedback on how people might use the system, the cautions, possible comms issues etc.  As such we’re going to release it as a ‘proof of concept’. This is to help convey the experimental nature of the approach.

Needless to say we’ll let you know when it’s actually live!

GDPR

General Data Protection Regulation (GDPR) is upon us and to reflect that we have updated the privacy notice on Trip.  While GDPR can appear irritating to both site users (being bombarded by privacy notice updates) and site owners (getting GDPR ready) it serves a purpose.

Sites have privileged access to users’ data and there is a need to explain how this data is collected, stored and used.  By doing so, users can make informed decisions about site usage.

We have tried to make the information as clear as possible but if you have any questions/concerns please email privacy@tripdatabase.com

Guidelines on Trip

Following the news about the National Guideline Clearinghouse (NGC) having its funding withdrawn we have set about capturing as many USA guidelines as possible (see Guidelines on Trip – moving forward after the demise of NGC).  So, instead of linking to NGC summaries we decided we would link directly to the guidelines on the publishers’ sites. We started migrating all the records last weekend (adding the new links and removing the old ones) and this process is now complete. We now have 3,495 US guidelines.  For comparison, NGC has around 1,350 guideline summaries.

At Trip we break guidelines down by geography so the full count is as follows:

  • USA – 3,495
  • Canada – 1,730
  • Australia and NZ – 1,149
  • UK – 3,239
  • Other – 948

So, across the board Trip links to 10,561 guidelines.

This is a unique FREE collection, one we hope to develop in the forthcoming months (more to follow in due course).

Taking control of guidelines from the USA

This weekend we are migrating our American guidelines from the National Guideline Clearinghouse to our own collection.  For further information click here.

This is ‘non-trivial’ and there may well be some teething issues.  Still, the sooner we start the sooner we get it right…..

Autosynthesis update

We’ve reached a significant milestone….

We’ve been working away, quietly, on the autosynthesis system and today we shared the beta version with a small number of testers.  There are still issues relating to data quality, but every day it’s getting better.  Allowing others to test it means we’ve reached a significant milestone.

One thing’s for sure though – I’m nervous about the feedback.  If it goes well we’ll extend the testing to a wider group.

Fingers crossed….
