NOTE: since publishing this we have added a linked post, Automated reviews – why?, which complements this one!

This work has been supported via the KConnect project (a Horizon 2020 EU-funded project), and on Wednesday we presented the work of the whole consortium to the EU in Luxembourg.  The response was overwhelmingly positive, so I wish to share a bit more of the work.

A little context: I have been involved in a number of automation projects, and these were typically seen as ways to supplement the human methodology – for instance, speeding up risk-of-bias assessment via tools such as RobotReviewer.  While these are great initiatives, I wanted to explore what could be done fully automatically.

In Luxembourg we presented our ‘product’.  The product is a system that automatically synthesises randomised controlled trials (and potentially systematic reviews).  It is not finished, but it is a very good first effort.  I’m not sure whether to classify it as a ‘proof of concept’ or an ‘alpha’ version – I’m not sure it matters!

All the images below are based on asthma, and I have deliberately blurred out the intervention names.  The reason is that we are still improving the results (more below), and I would hate for people to be put off by making judgements about a system that has yet to be optimised.

The first image shows the current default view: the interventions (y-axis) listed alphabetically, with likely effectiveness on the x-axis.  Note, as there are an awful lot of interventions for asthma we can’t show them all – this is a single screen-grab; the actual graph is considerably taller!

To orientate you:

  • Each ‘blob’ represents a single intervention.  The size of the blob indicates the total population across that intervention’s trials – so a big blob means more participants.
  • The horizontal position is based on our estimate of effectiveness (not effect size): the further right, the more effective the system estimates the intervention to be.  This is reinforced by the colours – green being better and red worse (traffic-light colouring)!
  • Above the graph you can sort the interventions by number of trials, sample size, score etc.
  • The sample-size refinement lets users exclude trials below the size entered – as we know, small trials tend to be less reliable.
  • The risk-of-bias refinement automatically removes trials that are not considered low risk of bias – another reliability measure.  (A rough sketch of how these refinements might work is shown after this list.)
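To make the refinements concrete, here is a minimal, hypothetical sketch in Python of the data model and filters behind such a graph.  Every name in it (Trial, summarise, effect_estimate and so on) is my own invention for illustration – the real system’s internals may look nothing like this.

```python
# Hypothetical sketch: per-intervention 'blobs' built from trials,
# with the sample-size and risk-of-bias refinements plus sorting.
from dataclasses import dataclass

@dataclass
class Trial:
    intervention: str
    sample_size: int
    low_risk_of_bias: bool
    effect_estimate: float  # stand-in for a trial's contribution to the score

def summarise(trials, min_sample_size=0, low_rob_only=False, sort_by="name"):
    """Aggregate trials into per-intervention blobs after applying refinements."""
    kept = [t for t in trials
            if t.sample_size >= min_sample_size            # sample-size refinement
            and (t.low_risk_of_bias or not low_rob_only)]  # risk-of-bias refinement
    blobs = {}
    for t in kept:
        b = blobs.setdefault(t.intervention,
                             {"n_trials": 0, "population": 0, "score": 0.0})
        b["n_trials"] += 1
        b["population"] += t.sample_size   # blob size = total participants
        b["score"] += t.effect_estimate    # placeholder for the real estimator
    keys = {"name":   lambda kv: kv[0],
            "trials": lambda kv: -kv[1]["n_trials"],
            "size":   lambda kv: -kv[1]["population"],
            "score":  lambda kv: -kv[1]["score"]}
    return sorted(blobs.items(), key=keys[sort_by])
```

Under these assumptions, something like summarise(trials, min_sample_size=100, low_rob_only=True, sort_by="score") would correspond to the ‘arranged by effectiveness’ view with both refinements applied.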

This second graph shows the results arranged by effectiveness:

And, to reiterate, this is fully automatic and always up to date.  As new RCTs are published (PubMed only at present), they will be added automatically.  To me, that’s incredibly cool.
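For flavour, here is a hedged sketch of the kind of PubMed polling that could drive such updates, using NCBI’s public E-utilities API.  This is illustrative only – it is not our actual ingestion code, and the query term is an assumption on my part.

```python
# Hypothetical sketch: ask PubMed's E-utilities for RCTs on a condition
# published on or after a given date, so new trials can be pulled in.
import requests

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def new_rct_pmids(condition: str, since: str) -> list[str]:
    """Return PMIDs of RCTs on `condition` published on/after `since` (YYYY/MM/DD)."""
    params = {
        "db": "pubmed",
        "term": f"{condition} AND Randomized Controlled Trial[pt]",
        "datetype": "pdat",   # filter on publication date
        "mindate": since,
        "maxdate": "3000",    # open-ended upper bound
        "retmax": 200,
        "retmode": "json",
    }
    resp = requests.get(EUTILS, params=params, timeout=30).json()
    return resp["esearchresult"]["idlist"]

# e.g. new_rct_pmids("asthma", "2016/01/01") -> PMIDs to feed into the synthesis
```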

While I would love to share this more widely, there is still some work to do before I’m happy to open it up.  In internal testing we have identified a number of areas that could, realistically, be improved.  Nothing major – we’re just making sure we’ve done as much as we can, under the fully automatic banner, to ensure the biggest impact.

As far as I know, this is the first fully automated evidence synthesis/review tool.  This is such a disruptive bit of technology that people will need to be convinced of its worth, and that will come with use and understanding.  David Moher wrote an editorial about the various synthesis methods being part of a family.  Is our technique the screaming baby of evidence synthesis, the eccentric uncle, the angry adolescent?  You tell me…!