One thing I’ve been working on recently has been an ultra-rapid review system, based on machine learning and some basic statistics.  In a nutshell can we take multiple abstracts, ‘read’ what they’re about and combine the results to give a ‘score’ for the intervention?  More importantly, will any score actually be meaningful?

Our latest version of the system is pretty robust and the significant amount of machine learning has proved beneficial.  So, to start testing it I thought I’d use real data.  To do this I took a relatively random selection of Cochrane Systematic Reviews, with these ‘notes’:

  • I typically avoided classes of drugs, focusing on single interventions.
  • Our system is optimised for placebo-controlled trials, so that guided my selection.
  • I used recently released systematic reviews.

So, our system works in the following step-wise way:

  1. Add search terms via two boxes: Population (e.g. diabetes, acne) and Intervention (e.g. antibiotics, zinc).
  2. From the list of results, select (via a tick box) which articles are relevant to the search.
  3. Press the ‘Analyse’ button and wait (about 5 seconds) for a result.  The result ranging from +1 to -1.  From that result we’ve started to think about assigning a narrative conclusion, as you’ll see from the table below.

So, the results are below, using ten Cochrane Systematic Reviews as a test

Our score
Our narrative conclusion
Cochrane conclusion
Agree?
0.02
unclear benefit
Paediatric oncology patients receiving chemotherapy are able to generate an immune response to the influenza vaccine, but it remains unclear whether this immune response protects them from influenza infection or its complications. We are awaiting results from well-designed RCTs addressing the clinical benefit of influenza vaccination in these patients
Y
-0.02
unclear benefit
We found insufficient evidence to determine whether acupuncture is effective for controlling menopausal vasomotor symptoms. When we compared acupuncture with sham acupuncture, there was no evidence of a significant difference in their effect on menopausal vasomotor symptoms. When we compared acupuncture with no treatment there appeared to be a benefit from acupuncture, but acupuncture appeared to be less effective than HT. These findings should be treated with great caution as the evidence was low or very low quality and the studies comparing acupuncture versus no treatment or HT were not controlled with sham acupuncture or placebo HT. Data on adverse effects were lacking.
Y
-0.12
unclear benefit
Opioids may be an appropriate choice in the treatment of acute pancreatitis pain. Compared with other analgesic options, opioids may decrease the need for supplementary analgesia. There is currently no difference in the risk of pancreatitis complications or clinically serious adverse events between opioids and other analgesia options.Future research should focus on the design of trials with larger samples and the measurement of relevant outcomes for decision-making, such as the number of participants showing reductions in pain intensity. The reporting of these RCTs should also be improved to allow users of the medical literature to appraise their results accurately. Large longitudinal studies are also needed to establish the risk of pancreatitis complications and adverse events related to drugs.
~
0.21
unclear benefit
There is reliable evidence that topical application of tranexamic acid reduces bleeding and blood transfusion in surgical patients, however the effect on the risk of thromboembolic events is uncertain. The effects of topical tranexamic acid in patients with bleeding from non-surgical causes has yet to be reliably assessed. Further high-quality trials are warranted to resolve these uncertainties before topical tranexamic acid can be recommended for routine use.
~
0.39
likely to be effective
Limited data were available when considering the impact of galantamine on vascular dementia or vascular cognitive impairment. The data available suggest some advantage over placebo in the areas of cognition and global clinical state. In both included trials galantamine produced higher rates of gastrointestinal side-effects. More studies are needed before firm conclusions can be drawn.
Y
-0.37
likely to be ineffective
Omega-3 fatty acids appear to have little haematological benefit in people with intermittent claudication and there is no evidence of consistently improved clinical outcomes (quality of life, walking distance, ankle brachial pressure index or angiographic findings). Supplementation may also cause adverse effects such as nausea, diarrhoea and flatulence. Further research is needed to evaluate fully short- and long-term effects of omega-3 fatty acids on the most clinically relevant outcomes in people with intermittent claudication before they can be recommended for routine use.
Y
0.25
unclear benefit
The value of steroids in the treatment of idiopathic sudden sensorineural hearing loss remains unclear since the evidence obtained from randomised controlled trials is contradictory in outcome, in part because the studies are based upon too small a number of patients.
Y
0.08
unclear benefit
According to the results, there is no evidence from randomised controlled trials to indicate any benefit of zinc supplementation with regards to serum zinc level in patients with thalassaemia. However, height velocity was noted to increase among those who received this intervention.There is mixed evidence on the benefit of using zinc supplementation in people with sickle cell disease. For instance, there is evidence that zinc supplementation for one year increased the serum zinc levels in patients with sickle cell disease. However, though serum zinc level was raised in patients receiving zinc supplementation, haemoglobin level and anthropometry measurements were not significantly different between groups. Evidence of benefit is seen with the reduction in the number of sickle cell crises among sickle cell patients who received one year of zinc sulphate supplementation and with the reduction in the total number of clinical infections among sickle cell patients who received zinc supplementation for both three months and for one year.The conclusion is based on the data from a small group of trials,which were generally of good quality, with a low risk of bias. The authors recommend that more trials on zinc supplementation in thalassaemia and sickle cell disease be conducted given that the literature has shown the benefits of zinc in these types of diseases.
Y
0.22
unclear benefit
Meta-analysis demonstrates that topiramate in a 100 mg/day dosage is effective in reducing headache frequency and reasonably well-tolerated in adult patients with episodic migraine. This provides good evidence to support its use in routine clinical management. More studies designed specifically to compare the efficacy or safety of topiramate versus other interventions with proven efficacy in the prophylaxis of migraine are needed.
N
0.06
unclear benefit
For people with the common cold, the existing evidence, which has some limitations, suggests that IB is likely to be effective in ameliorating rhinorrhoea. IB had no effect on nasal congestion and its use was associated with more side effects compared to placebo or no treatment although these appeared to be well tolerated and self limiting. There is a need for larger, high-quality trials to determine the effectiveness of IB in relieving common cold symptoms
N

Those marked with a ~ means I’m unsure if they are right or not (possibly a shortcoming of the narrative system I’ve used).

But, two clearly wrong (the bottom two), one could argue that’s not too bad.  However, I did have a dig round to see why they might have been wrong and found that:

  • The system analysed 17 trials, two were assessed wrong – so if re-assessed the score = 0.31 (likely to be effective)
  • Analysed 3 trials, one was assessed wrong – so if re-assessed = 0.28 (likely to be effective)

I actually take this as a positive (the initial incorrectness, followed by the subsequent correctness!).  The current testing system is not the finished article, that should be available in 2-3 weeks.  This will improve on the above in two main ways:

  • It will use two different types of machine learning to assess results (in this test we used a single type), making it easier to identify wrongly classified results.
  • The system will make it much easier to edit our systems assessment of the scores.

In other words, the newer system will make it much easier to deal with the issues that caused the incorrect assessment of results.

In conclusion, this is early days and our first testing steps.  The results have been very encouraging and when our new system is out it’ll be even better.  But much more testing is required!

Oh yes, the time taken – if you’re interested, then scroll down.

With the exception of the second to last result they all took around 2-3 minutes.  The second to last one took approximately 5 minutes (as I had to scroll through around 55 results to select the 17 that we used).