Trip Database Blog

Liberating the literature

Automated clinical question answering (ACQA) update 1

Amazingly, in around a fortnight we have created a rudimentary ACQA system, and the initial results are mixed. We’ve run through a number of questions: some answers have been pretty good, others pretty bad. An example of a good ACQA was Is brexpiprazole effective in schizophrenia? with our system generating the following answer:

Brexpiprazole has shown effectiveness in treating schizophrenia based on a post hoc analysis of various short- and long-term studies. These studies demonstrated that brexpiprazole led to clinically relevant improvements in functioning for patients with schizophrenia in both short and long durations. Patients on brexpiprazole experienced greater improvements in Personal and Social Performance (PSP) scores and all PSP domains compared to those on placebo. The Global Assessment of Functioning (GAF) scale also indicated better functional remission for brexpiprazole-treated patients. These improvements were noticeable during the treatment of acute schizophrenia and generally persisted over a one-year period. However, low-dose brexpiprazole might not be as effective as standard-dose brexpiprazole in improving efficacy and safety for acute schizophrenia, potentially carrying a risk of weight gain. Additionally, a study highlighted that brexpiprazole treatment reduced impulsive behavior and associated brain activation in patients with schizophrenia. Overall, brexpiprazole appears to be effective in managing symptoms and improving functioning in schizophrenia.

I’m not saying it’s perfect, but it’s pretty good for a first attempt. This answer was fully referenced and mostly created from secondary evidence. However, a worse answer was generated for Is zolmitriptan effective in migraines?:

Zolmitriptan, a medication commonly used for migraines, belongs to the triptan class of drugs. It works by constricting blood vessels in the brain and reducing inflammation. Studies have shown its effectiveness in relieving migraine symptoms such as headache, nausea, and sensitivity to light and sound. It can provide relief within two hours of taking it. However, its effectiveness may vary and it may have side effects. Consultation with a healthcare professional is important. The provided text does not directly address the effectiveness of zolmitriptan in migraines.

But understanding the process, and why some answers are so poor, has given us a load of learning; some of it easy to respond to, some less so. Currently, the biggest struggle is transforming the question into appropriate search terms. We’ll continue to use LLMs to try to solve this, but an easier route might be to involve the users more and make the process semi-automatic. In many ways I favour this: it gives users more understanding of the process and it’s less ‘black box’. It might add an extra step, but the extra reassurance might be worth it.

Early days and all to play for!

Automated clinical question answering

We started the Trip Database due to our work answering clinical questions. Trip has now been running for over 25 years and we’ve never strayed far from clinical question answering. So, it is really pleasing to say that we’ve just started work on an automated Q&A system. More precisely, users can ask a free-text question and we’ll deliver an answer.

In our internal testing the system worked amazingly well, including on questions deemed clinically difficult. We were able to expose the references used, indicating the likely robustness of the answer. Our internal testing was done manually, but all the steps can be automated – now we know the process works. So, we’re now automating it to create a very basic test system. When we get there, we’ll ask users to test it and we’ll go from there.

Coupling clinical Q&A with our hierarchy of evidence is really exciting!

Survey results

Hundreds of responses meant it was more time-consuming to analyse, but here are the headlines:

Profession of respondents

  • 48% doctors/physicians
  • 14% nurses
  • 14% librarians/information specialists
  • 8% academics
  • 16% other

Reasons for using Trip

  • 42% literature review
  • 30% clinical Q&A
  • 15% keeping up to date
  • 8% teaching
  • 5% research

Any suggestions for new sites to add

The top 5 suggestions being:

  • Cochrane
  • Cortellis
  • Duodecim
  • ECRI
  • Embase

Suggested improvements

Very few suggestions got more than one mention:

  • Advanced search – already on our ‘to do’ list
  • Friendlier presentation – will have to dig deeper to understand that
  • Easier storage of searches – possibly can be rolled into the advanced search work
  • Make it all free – alas, not possible
  • More full articles – we’re trying our best
  • More Public Health content – we can add extra public health journal content, not sure of any other sources

Is there anything else you would like to share about your experience on Trip?

It was nice to see overwhelmingly positive comments (including my favourite: “Go further, please, you are unique!”), along with a few constructive comments and some negatives:

  • ChatGPT accepts questions and answers in many languages….
  • a DSI (selective dissemination of information) newsletter would be interesting where we ask to stay updated on a topic
  • I often don’t get great results from TRIP – I do try! Don’t get on with advanced or pico search functionality, although
  • Not great. Also need a better advanced search way better; like Ebsco databases
  • Preplexity, elicit and evidence hunt are all great alternatives

New guideline scores added

We launched our guideline scoring system two months ago; as a reminder this is what it looks like:

Since then we have been very busy adding scores for another 50 guideline publishers, including:

  • Ministry of Health, Malaysia
  • Scandinavian Society of Anaesthesiology and Intensive Care
  • World Society of Emergency Surgery
  • British Society for Sexual Medicine
  • European Association for the Study of the Liver
  • European Psychiatric Association

We now have scores for almost all the guideline publishers we cover, nearly 300 in total!

Quality and updated synonyms

A user alerted us to a set of poor results on Trip, in this case relating to a search for breast cancer. It took me a few seconds to realise what was wrong:

For a hot topic such as breast cancer you’d expect almost exclusively green results (signifying higher quality). And, one result I was looking for – a recent Cochrane systematic review – was at result #28. So, what was going on?

We figured out it was a synonyms issue. We had mammary as a synonym of breast, so our system effectively saw three matching terms in the title (breast, cancer and mammary) and judged those results really, really relevant.

We have now edited the synonyms and the results now look as we’d expect:

All green and there are two Cochrane reviews in the top 5.

As it happens this issue only affected a small number of results (it also affected colorectal cancer but to a lesser extent) but for a popular search, such as breast cancer, the impact would have been widespread.
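The over-counting described above can be sketched in code. This is a hypothetical illustration, not Trip’s actual ranking logic: it assumes a simple title-matching score where each expanded query term found in the title adds to relevance, so the `breast → mammary` synonym inflates the count.

```python
# Hypothetical sketch of the synonym-expansion bug: expanding "breast" to
# include "mammary" makes a title containing both words appear to match
# three query terms instead of two, inflating its relevance score.

SYNONYMS = {"breast": ["mammary"]}  # the problematic mapping


def expand(terms):
    """Expand each query term with its configured synonyms."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded.update(SYNONYMS.get(term, []))
    return expanded


def title_match_count(query_terms, title):
    """Count how many expanded query terms appear in the title."""
    title_words = set(title.lower().split())
    return len(expand(query_terms) & title_words)


# A title containing both "breast" and "mammary" gets an extra match,
# so it outranks an equally relevant title without the synonym.
print(title_match_count({"breast", "cancer"}, "breast cancer mammary screening"))  # 3
print(title_match_count({"breast", "cancer"}, "breast cancer screening update"))   # 2
```

Removing (or down-weighting) the synonym makes both titles score equally, which is why editing the synonym list fixed the ordering.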

We’re committed to making Trip better and improvements like this are significant steps forward.

Survey time, your feedback matters

User feedback is a cornerstone of Trip’s development and, over the years, it has proved invaluable.

To help us gather user feedback we have devised a short questionnaire. Please help us to make Trip better and take the survey.

CLICK HERE to participate.

Connected articles, a progress update

At the end of last year I posted Connected/related articles to highlight some thinking about combining different connections between documents, to help ensure users can quickly find articles related to the documents they’ve already clicked on. In other words, as users click on documents of interest, we collate connected articles to present to the user, helping them ensure they’ve not missed important documents.

I have had the pleasure of testing out our text version and it’s really, really good. It’s not yet had the design treatment, but you can start to see the power. I did this search on Trip, bisphosphonates prostate cancer, and clicked on these results:

  • Contemporary Population-Based Analysis of Bone Mineral Density Testing in Men Initiating Androgen Deprivation Therapy for Prostate Cancer
  • Hypocalcaemia in patients with prostate cancer treated with a bisphosphonate or denosumab: prevention supports treatment completion
  • The role of bisphosphonates or denosumab in light of the availability of new therapies for prostate cancer
  • Use of bisphosphonates and other bone supportive agents in the management of prostate cancer-A UK perspective
  • Bone Health in Patients with Prostate Cancer

The top four are from PubMed and the bottom one is a Canadian guideline. Our connected articles system output the following:

There’s a lot going on in the above screenshot, so here are some explanations of the scoring. Note, this is still in the testing phase so these weights/scores are liable to change. The following factors are shown:

  1. N – number of results to display.
  2. Clicks – this is based on our co-click data. If any of the 5 documents I clicked on have been co-clicked in a previous search session this is noted and added to the list of connected articles. We have currently weighted this by a factor of 3, as we feel this is a really important factor.
  3. Rel – stands for related articles. This is only available, at present, for PubMed articles and we extract the related articles from the 5 documents clicked. These related articles are added to the list of connected articles.
  4. Ref – references. We extract the references used from the 5 documents clicked. These referenced articles are added to the list of connected articles.
  5. Cites – this looks to see if any of those 5 clicked documents have been cited by other documents. These citing articles are added to the list of connected articles.
  6. Incl – shows if the document is already in the Trip index.
  7. Words – this explores if the connected articles contain the initial search terms in the document title. The more words the documents find in the title the closer we judge it to be to the initial search and therefore the user’s intentions.

So, from the above, points 2-5 are about identifying connected articles while the others are additional factors we’re using.
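The combination of these factors can be sketched as follows. This is a hypothetical illustration under stated assumptions: the x3 click weighting comes from point 2 above, but the one-point-per-signal scoring for the other factors, and the `Candidate` fields, are illustrative inventions, not Trip’s actual formula.

```python
# Hypothetical sketch of combining the connected-articles signals described
# above. Co-clicks are weighted x3 (per the post); the remaining weights
# are assumptions for illustration only.

from dataclasses import dataclass

CLICK_WEIGHT = 3  # co-click data is treated as the strongest signal


@dataclass
class Candidate:
    title: str
    clicks: int  # times co-clicked with a seed document (factor 2)
    rel: int     # appearances as a PubMed "related article" (factor 3)
    refs: int    # times referenced by a seed document (factor 4)
    cites: int   # times it cites a seed document (factor 5)


def score(candidate: Candidate, search_terms: set) -> int:
    """Combine connection signals plus search-term overlap in the title (factor 7)."""
    words = len(search_terms & set(candidate.title.lower().split()))
    return (CLICK_WEIGHT * candidate.clicks
            + candidate.rel + candidate.refs + candidate.cites
            + words)


terms = {"bisphosphonates", "prostate", "cancer"}
c = Candidate("Bisphosphonates in prostate cancer", clicks=2, rel=1, refs=0, cites=1)
print(score(c, terms))  # 3*2 + 1 + 0 + 1 + 3 = 11
```

Candidates would then be sorted by this score, with the top N shown to the user, and the ‘Incl’ flag (factor 6) noting whether each one is already in the Trip index.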

Here are the top 5 articles that our system generated with hyperlinks:

Some observations:

  • They are all highly relevant documents (although I note that one returned article was one we originally clicked on – this will be fixed before we release this).
  • The connected articles tend to be older. This makes sense as the Trip algorithm favours newer articles so these tend to be shown first. So, connected helps unearth older, important, papers that a user may have missed.
  • The 3rd article is a Cochrane systematic review. Interestingly, it’s not the most up-to-date version, but the system still unearths it and a user can quickly navigate to the latest version.
  • One of the top 5 articles was not in the main Trip index (and in the larger list of 11 documents, over half were not in Trip). That was the Cochrane review – we have the more up-to-date version. But it’s nice that the system can highlight possibly important papers outside of Trip.

I’m possibly biased but I’m really excited by the possibilities of this system to help users find the best articles they need. And remember, this is a test system and we have some fixes/improvements to roll out. Once we’ve done that we need to incorporate this into Trip and that highlights the next challenge – the user interface/design. We need to balance making it obvious to users yet not too intrusive. Given how good the system is I consider this a nice problem to have!

De-duplications: more quality improvements

One of the development strands of Trip is improving the quality of existing content and functionality. De-duplication was mentioned in a recent post on quality, and we’re pleased to announce significant progress.

Given the complex nature of Trip and the variety of content sources, we have generated a number of duplicate records – two (or more) copies of the same article. Often they are identical, but sometimes one links to the abstract and another to the full text. Having two copies of the same article is good for no-one and just adds ‘noise’ to the search results. Identifying and removing these proved to be a challenging piece of work, but it is now done: we identified a total of 143,218 duplicates, which are currently being removed from the index.
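One common way to tackle this kind of de-duplication is to group records by a normalised form of the title and keep a single record per group. The sketch below is a minimal, hypothetical illustration of that idea – the record fields and the full-text-over-abstract preference are assumptions, not Trip’s actual pipeline.

```python
# Minimal de-duplication sketch: group records by a normalised title key
# and keep one per group, preferring full-text links over abstract-only
# links. Record fields here are illustrative, not Trip's real schema.

import re
from collections import defaultdict


def normalise(title: str) -> str:
    """Reduce a title to a comparison key: lower-case, alphanumerics and spaces only."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()


def deduplicate(records):
    """records: list of dicts with 'title' and 'link_type' keys."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalise(rec["title"])].append(rec)
    kept = []
    for recs in groups.values():
        # Prefer the full-text copy when a group contains both variants.
        recs.sort(key=lambda r: r["link_type"] != "fulltext")
        kept.append(recs[0])
    return kept


records = [
    {"title": "Statins in primary prevention.", "link_type": "abstract"},
    {"title": "Statins in Primary Prevention", "link_type": "fulltext"},
]
print(len(deduplicate(records)))  # 1
```

In practice exact-key matching like this misses near-duplicates (subtitle variations, differing journal metadata), which is part of what makes the real task challenging.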

Are we now duplicate free? Almost certainly not, but we’ve probably caught the vast majority. If you do spot one, please let us know.

Up Next

As the de-duplication finishes, our next quality task is to remove PubMed articles that contain no abstract. We never used to include them, but this was overlooked with the new system so they’ve crept back in. PubMed articles with no abstract contain little or no actionable information, so they add ‘noise’ to the results and very little ‘signal’.

Introducing our RCT score

Hot on the heels of releasing our guideline score, we’re releasing our RCT score. We’ve been working with the wonderful RobotReviewer team for years now, and one of their products is a Risk of Bias (RoB) score for RCTs. We introduced it in 2016, when we classified all the trials into categories of ‘low risk of bias’ and ‘high/unknown risk of bias’. When we recently re-wrote the site we did not immediately include the RoB score, in part because, since 2016, the thinking and technology have developed considerably. So, we’re very pleased to reintroduce it to the site.

The new score does not categorise the RoB into ‘low’ or ‘high/unknown’; instead it gives a score for the likely RoB on a linear scale. We take that score and transform it into a graphic similar to the one used for the guideline score:

RCTs are important in the world of EBM and, as with guidelines, they are not all equally good! This score reflects the likelihood of bias and should help our users make better sense of the evidence base.
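Turning a linear score into a graphic can be sketched very simply. This is a hypothetical illustration only: the 0–1 scale and the five-segment bar are assumptions for demonstration, and Trip’s actual rendering (and RobotReviewer’s score range) may differ.

```python
# Hypothetical sketch: render a linear risk-of-bias score as a simple
# segmented bar. Scale and segment count are illustrative assumptions.

def rob_bar(score: float, segments: int = 5) -> str:
    """Render a RoB score (0 = low risk, 1 = high risk) as a text bar."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    filled = round(score * segments)
    return "#" * filled + "-" * (segments - filled)


print(rob_bar(0.2))  # '#----'
print(rob_bar(0.8))  # '####-'
```

The same bucketing idea underlies any graphic on a linear scale: the continuous score is mapped to a small number of discrete visual segments.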
