Trip Database Blog

Liberating the literature

Author

jrbtrip

Quality Control in Action: Guidelines Reflect Yesterday’s Evidence

Because quality control is central to the growth of AskTrip, we invest a lot of time in it. I’d like to share a couple of examples where we take a systematic review and ask AskTrip the very clinical question the review set out to answer.

Example 1: A Systematic Review And Meta-Analysis Of Randomized Trials Of Therapeutic Intraarticular Facet Joint Injections In Chronic Axial Spinal Pain

AskTrip question: What is the evidence for intra-articular facet joint injections in treating chronic axial spinal pain?

We asked the neutral(ish) ChatGPT 5 to compare the results:

Both sources agree that intra-articular facet joint injections offer at best short-term pain relief in chronic axial spinal pain, with limited or low-certainty evidence and weak/negative support from guidelines. The systematic review/meta-analysis (ONE) takes a narrow, RCT-only lens and downgrades the evidence to Level IV with low certainty, stressing the absence of robust long-term benefit. The broader narrative and guideline-based synthesis (TWO) reaches a similar conclusion but adds clinical context: short-term improvements are sometimes seen, yet effects are transient, major guidelines (e.g., NICE, BMJ) recommend against routine use, and alternatives such as radiofrequency ablation generally provide more durable relief. Thus, while both sources converge on limited efficacy, ONE emphasizes strict evidence grading, whereas TWO highlights comparative effectiveness, guideline positions, and practical considerations such as imaging and safety.

Example 2: Prophylactic Antibiotics for Upper Gastrointestinal Bleeding in Patients With Cirrhosis: A Systematic Review and Bayesian Meta-Analysis

AskTrip question: In patients with cirrhosis and upper gastrointestinal bleeding, should prophylactic antibiotics be administered to reduce mortality or complications like infection or re-bleeding?

ChatGPT 5 comparison:

The 2024 systematic review and Bayesian meta-analysis casts doubt on the mortality benefit of prophylactic antibiotics in cirrhotic patients with upper GI bleeding, showing that shorter or no prophylaxis was likely non-inferior for mortality and rebleeding, though antibiotics did reduce reported infections; overall, it highlights low–moderate quality evidence and questions the current 5–7 day guideline standard. In contrast, the AskTrip answer aligns with NICE and earlier meta-analyses, presenting prophylactic antibiotics as evidence-based standard care that reduces mortality, infections, and rebleeding, particularly in decompensated cirrhosis, and recommending 5–7 days of treatment with ceftriaxone or quinolones depending on resistance. Thus, while the SR emphasises uncertainty and possible overtreatment, the AskTrip answer reflects guideline consensus and stronger claims of clinical benefit.

A really interesting finding: our answer reflects current guideline recommendations, which support prophylactic antibiotics in cirrhotic patients with upper GI bleeding, but a new 2025 systematic review questions the mortality benefit and suggests shorter or no courses may be just as effective. It’s a clear example of how new evidence can challenge established guidelines, and of why keeping answers under review is so important.

The Key to Efficient Evidence Searching: Structure the Question First

Clinicians waste a lot of time searching for clinical evidence! In AskTrip, we’ve seen that when our automated answers are limited, it’s often not because the evidence doesn’t exist, but because the question itself was too vague.

Evidence searching is like diagnosis: a fuzzy question leads to fuzzy answers. The fastest way to get to the right evidence is to sharpen the question before you even touch the search bar.


Structure the Question: The Foundation of Evidence Retrieval

A clear, well-framed question is possibly the single biggest factor in cutting search time.

A vague query like “asthma treatment” returns thousands of scattered results. Reframed using PICO, the question becomes much more precise: “In children with asthma (Patient), how effective are inhaled corticosteroids (Intervention) compared with leukotriene antagonists (Comparator) in reducing exacerbations (Outcome)?”

This is the PICO framework:

  • Patient (or Problem)
  • Intervention
  • Comparator
  • Outcome

You don’t need every element every time, but just adding a comparator or outcome can transform your results from broad noise to focused evidence.

And to make this even easier, Trip includes a dedicated PICO interface with four search boxes—one for each PICO element. This helps you break down your question into its core components and avoid the common pitfall of vague searching.


Using PICO in Trip

Once you’ve identified the PICO elements, you can:

  • Enter them directly into Trip’s standard search, combining terms to sharpen your results. For the example question, the search might be: children asthma AND inhaled corticosteroids AND leukotriene antagonists AND exacerbations. This returns just 265 results, of which 39 are higher-quality secondary evidence.
  • Or use Trip’s dedicated PICO interface, which has four search boxes, one for each PICO element. Unlike the standard search, this isn’t designed to be exhaustive. Instead, it aims to return a handful of the most relevant documents, the ones most likely to answer your question quickly.
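
The standard-search example above can be sketched in code. This is a minimal illustration, not Trip's implementation: the `build_pico_query` helper is hypothetical, and it simply joins whichever PICO elements are present with AND, mirroring the example query string.

```python
def build_pico_query(patient="", intervention="", comparator="", outcome=""):
    """Join the non-empty PICO elements with AND (hypothetical helper)."""
    elements = [patient, intervention, comparator, outcome]
    # Skip blank elements: not every question needs all four parts.
    terms = [e.strip() for e in elements if e and e.strip()]
    return " AND ".join(terms)


query = build_pico_query(
    patient="children asthma",
    intervention="inhaled corticosteroids",
    comparator="leukotriene antagonists",
    outcome="exacerbations",
)
print(query)
```

Because blank elements are skipped, the same helper works when only a patient and an intervention are known.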

Looking Ahead: Smarter AI Support

We’re enhancing Trip’s PICO interface with AI and large language model (LLM) tools, so clinicians can automatically uncover more relevant evidence without extra effort.

This is just the beginning. In future blogs, we’ll explore how to speed up other stages of evidence searching—using filters effectively, navigating the evidence pyramid, and more.

Strengthening the link between Trip and AskTrip

Trip began in 1997, growing out of our work in clinical question answering; it was designed to make answering questions manually more efficient. With AskTrip, we’ve come full circle: Trip is now helping to automatically answer clinical questions.

Last week we began indexing AskTrip Q&As into the main Trip database, allowing users to discover them directly through search.

We’re also developing a system to integrate the two more closely, intelligently suggesting clinical questions as users engage with Trip. Here’s a real example from our test site: when a user searches for “measles”, a broad query, the system responds by suggesting broad clinical questions:

  • What are the common signs and symptoms of measles in pediatric patients?
  • What are the current vaccination recommendations for preventing measles in adults?
  • How should complications of measles, such as measles encephalitis, be managed in hospitalized patients?

The user then scrolls down and clicks on the document ‘Guidance for risk assessment and infection prevention and control measures for measles in healthcare settings’. This gives the system more ‘signal’ and updates the questions to:

  • What are the current risk assessment strategies for managing measles outbreaks in healthcare settings?
  • What are the recommended infection prevention and control measures for measles in hospitals and clinics?
  • How should healthcare facilities handle exposure to measles in patients and staff?

These new questions reflect the user’s interest. When the user next clicks on ‘Managing measles in asylum seeker accommodation settings’, the system adapts again:

  • What are the guidelines for risk assessment and infection prevention and control measures for measles in healthcare settings?
  • How should measles outbreaks be managed in asylum seeker accommodation settings?
  • What infection prevention strategies are most effective in controlling measles transmission in densely populated accommodation settings like those for asylum seekers?

This seamless integration makes finding evidence faster and more intuitive. By turning a search into a precise, question-driven process, we’re helping clinicians move quickly from a general need to targeted, evidence-based answers. It’s an effortless way to bridge the gap between searching for information and getting the exact answers you need.

Can the Eye Reveal Alzheimer’s Early? Retinal and Plasma Biomarkers for Detection

One of the things we love about AskTrip is seeing the connections that emerge between different clinical questions.
Recently, four questions came in, apparently from two different users, that all circled a shared theme: the retina, Alzheimer’s disease, and early biomarkers.

That got us curious. Were these topics just coincidentally linked, or was there a meaningful thread here? We pulled the latest evidence from AskTrip to find out.


1. How reliable are retinal OCT measurements?

Two of the questions focused on the reliability of optical coherence tomography (OCT) — both angiography and retinal thickness measurements.
The evidence shows that OCT performs well:

  • Angiography: Intraclass correlation coefficients (ICCs) indicate good repeatability, especially for small scan sizes, though variability can increase with larger scans or different measurement setups.
  • Retinal thickness: High repeatability and strong ICC values across devices, suggesting this is a robust measurement tool for research and clinical monitoring.

Why it matters: For any biomarker to be clinically useful, we need to trust the measurement. These results suggest OCT has the reliability needed for potential inclusion in early-detection pathways.


2. Shared vascular mechanisms in the retina and brain

Another question asked about parallels between vascular leakage in the retina and brain, linked to pericyte dysfunction in Alzheimer’s disease.
The literature points to common processes: pericyte loss or dysfunction undermines vascular integrity, disrupting both the blood–brain and blood–retina barriers.
In Alzheimer’s, this could contribute to early pathology in both systems — supporting the idea that retinal imaging might give a non-invasive “window” into brain health.


3. Plasma biomarkers — how do they compare?

The final question looked at plasma levels of Aβ42/Aβ40 for distinguishing mild cognitive impairment or preclinical Alzheimer’s from normal controls.
Reported area under the curve (AUC) values range from 0.68 to 0.80 — suggesting moderate discriminative ability. This is promising, though still far from a stand-alone diagnostic tool.


Why we’re watching this space

When several AskTrip questions start circling the same idea, we take notice. Here, the connection is clear:

  • Retinal OCT is reliable enough to measure subtle changes.
  • Pathophysiological parallels between retina and brain make those measurements relevant to Alzheimer’s research.
  • Blood biomarkers like Aβ42/Aβ40 add another dimension to early detection efforts.

This combination of imaging and blood-based biomarkers could, in the future, help detect Alzheimer’s before symptoms appear. The evidence isn’t there yet for widespread screening — but the building blocks are taking shape.


10 Ways AskTrip Is About to Get Better

We’ve now answered over 1,800 questions, and every week we’re learning more about how to make AskTrip stronger, faster, and more useful.
Here’s a look at what’s in the works and what’s coming next.


Currently in Development (Expected August–September)

Improvement | Problem | Solution | Benefit
Limited answers – better options | When little evidence exists, answers can be incomplete. | Offer two alternative approaches when evidence is scarce. | More useful responses even with limited research.
Smarter related questions | “Related questions” aren’t always relevant. | Upgrade embedding models for better matches. | Find answers faster, with less searching.
Monthly Q&A highlights email | New/popular Q&As can be missed. | Monthly email tailored to interests. | Stay up to date automatically.
Save your Q&As | Useful Q&As can be hard to find again. | Bookmark questions for quick access. | Instantly revisit important answers.
Prompt upgrade for better recall | Some relevant documents aren’t retrieved. | Implement upgraded prompt for higher recall. | More complete, evidence-rich answers.
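
The “smarter related questions” item relies on embeddings: each question is mapped to a vector, and related questions are those whose vectors point in a similar direction. Here is a minimal sketch, assuming cosine similarity as the matching measure and using made-up three-dimensional vectors. Real embedding models produce far more dimensions, and AskTrip's actual model and scoring are not described in this post.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
stored_questions = {
    "measles vaccination in adults": [0.9, 0.1, 0.0],
    "measles infection control in hospitals": [0.7, 0.6, 0.1],
    "D-dimer for suspected DVT": [0.0, 0.1, 0.9],
}
new_question = [0.8, 0.5, 0.0]

# Rank stored questions from most to least similar to the new one.
ranked = sorted(
    stored_questions,
    key=lambda q: cosine_similarity(stored_questions[q], new_question),
    reverse=True,
)
print(ranked)
```

A better embedding model would, in this picture, place genuinely related questions closer together, so the top of the ranked list becomes more relevant.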

Next on the List (Planned for late September)

Improvement | Problem | Solution | Benefit
Dedicated drug information | Drug answers can lack specialist detail. | Add targeted medicines content. | More precise and clinically useful drug info.
Consistent answers | Similar questions give inconsistent answers. | Reuse past high-quality answers. | Greater accuracy and consistency.
Export answers | No easy way to share/store Q&As. | Download as PDFs and other formats. | Easy sharing, offline access, and record-keeping.
Citation suggestions | Users lack ready citation formats. | Auto-generate “How to cite this” statements. | Saves time and ensures accurate referencing.
Transparency mode | The answer process isn’t visible. | Show key steps in generating answers. | Builds trust through openness.

Why we’re excited:
The prompt upgrade, dedicated medicines content, and answer consistency will have the biggest impact on quality – especially the new prompt, which is already showing excellent results in testing. Once these updates are live, AskTrip will be more accurate, robust, and user-friendly, paving the way for bigger ideas like educational modules, deeper topic reviews, and even the occasional quiz 🙂

Adding AskTrip to your mobile phone (iPhone/Android) or tablet

Want AskTrip to work just like an app? You can!

Add it to your home screen for one-tap access, full-screen view, and instant updates — with no storage space needed.

Below are the steps for iPhone (iPad is the same). Android instructions are at the bottom.

Step One – Open Safari (or your preferred browser) and navigate to AskTrip.

Step Two – If you don’t see the Share icon at the bottom of Safari, scroll or swipe up slightly and it will reappear. Tap the Share icon.

Step Three – In the menu that appears, scroll down and tap Add to Home Screen.

Step Four – The default name will appear as Trip Database. You can rename it to AskTrip before pressing Add. Note: We’ll be adding the AskTrip icon soon — for now, you’ll see the Trip icon.

Step Five – That’s it! AskTrip will now appear on your home screen. Tap it anytime to launch – just like any other app.

Android Instructions

The process is similar on Android (Chrome):

  1. Open AskTrip in Chrome.
  2. Tap the ⋮ menu (top right).
  3. Select Add to Home Screen or Install App.
  4. Tap Add (and confirm if prompted).
    ✅ Done! AskTrip now launches in full-screen mode.

Updating PICO

The current PICO search has not changed much since it launched in 2012.

[Screenshot: the PICO interface at launch]

[Screenshot: the PICO interface currently]

So, there has been a design change, but the underlying mechanism has not changed much ‘under the hood’ (read about it here). We’re now working on enhancing it. I’ve not loved the feature for two main reasons:

  • I was answering clinical questions before PICO was widely known. The history is vague, but it’s believed it was first described between 1995 and 1997 and was not widely adopted until the early 2000s. By then I had answered thousands of questions and was comfortable with converting a question to search terms. However, I acknowledge I’m an edge case!
  • The results were sometimes disappointing.

I was pleased to have the opportunity to trial a new approach—combining our existing method with a newer one that embraces AI and large language models (LLMs). Interestingly, this wasn’t the original intention. I had assumed we would replace the old with the new, but testing has shown that this may be sub-optimal.

To illustrate, consider this PICO example:

P – deep vein thrombosis
I – D-dimer
C – ultrasonography
O –

Using both approaches, we identified several overlapping articles. However, each method also surfaced four relevant articles that were unique to it:

Current PICO system

  • Serial 2-point ultrasonography plus D-dimer vs whole-leg color-coded Doppler ultrasonography for diagnosing suspected symptomatic deep vein thrombosis: a randomized controlled trial
  • A randomized trial of diagnostic strategies after normal proximal vein ultrasonography for suspected deep venous thrombosis: D-dimer testing compared with repeated ultrasonography
  • D-dimer testing as an adjunct to ultrasonography in patients with clinically suspected deep vein thrombosis: prospective cohort study
  • Left Rule, D-Dimer Measurement and Complete Ultrasonography to Rule Out Deep Vein Thrombosis During Pregnancy

AI/LLM approach

  • Safety of D-dimer as a stand-alone test for the exclusion of deep vein thrombosis compared to other strategies
  • Lower-Extremity Venous Ultrasound in DVT-Unlikely Patients with Positive D-Dimer Test
  • Comparison of the Accuracy of Emergency Department-Performed Point-of-Care-Ultrasound (POCUS) in the Diagnosis of Lower-Extremity Deep Vein Thrombosis
  • Test Characteristics of Emergency Physician-Performed Limited Compression Ultrasound for Lower-Extremity Deep Vein Thrombosis
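
One way to picture the combined approach is as a merge of two result lists. The sketch below is an assumption about how such a merge could work, not a description of Trip's system: `merge_results` is a hypothetical helper, and the document IDs are placeholders rather than real Trip records.

```python
def merge_results(current, llm):
    """Merge two ranked lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for doc_id in current + llm:
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Placeholder IDs: "shared-*" stands for articles both methods found.
current_pico = ["shared-1", "pico-1", "shared-2", "pico-2"]
llm_pico = ["shared-2", "llm-1", "shared-1", "llm-2"]

overlap = set(current_pico) & set(llm_pico)
only_current = set(current_pico) - set(llm_pico)
only_llm = set(llm_pico) - set(current_pico)
merged = merge_results(current_pico, llm_pico)

print(sorted(overlap))
print(sorted(only_current), sorted(only_llm))
print(merged)
```

The set differences correspond to the "unique to each method" lists above, and the merged list is what a user would see if both methods' results were shown together.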

We’re excited to be releasing this updated feature over the summer. It’s been a rewarding challenge to modernise such a longstanding system by integrating cutting-edge AI and LLM technology. While the core mechanism remains familiar, the enhancements deliver a clear improvement, broadening the scope of results and offering deeper insights. It’s a great example of how old and new can work better together than either alone.

What Scaling Taught Us About AskTrip

The quality of AskTrip’s answers is fundamental to earning users’ trust, and we’ve recently shared the key areas we’re focusing on to make improvements [see: 1,400 Qs = lots of learning]. But as we’ve passed the 1,500-question mark, something important has become clear: scaling up is revealing issues that weren’t visible in our earlier testing.

We manually review every Q&A and flag any that we feel don’t meet our standards. So far, we’ve identified 13 clear failures – less than 1% of the total. We suspect a similar number aren’t outright bad but also aren’t good enough. So let’s say 26 out of 1,500 (about 1.7%) are sub-optimal. While that’s a small number, we’re determined to drive it down further.
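
For anyone who wants to check the percentages, the arithmetic is short. The variable names are illustrative, and the count of borderline answers is the post's own estimate rather than a measured figure.

```python
total_questions = 1500
clear_failures = 13
suspected_borderline = 13  # assumed equal to the clear failures, as estimated in the text

failure_rate = clear_failures / total_questions
suboptimal_rate = (clear_failures + suspected_borderline) / total_questions

print(f"Clear failures: {failure_rate:.2%}")          # Clear failures: 0.87%
print(f"Sub-optimal overall: {suboptimal_rate:.2%}")  # Sub-optimal overall: 1.73%
```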

As noted in our previous post, we’re analysing these issues closely and have already identified concrete steps that should lead to significant improvements. But this phase has also highlighted a broader insight: these kinds of flaws only emerge at scale.

Just as randomized controlled trials often lack the power to detect rare side effects, early pilots of AI systems – like our initial 250-question evaluation – can miss edge-case failures. It’s only through broader, real-world use that such issues surface. And that’s invaluable. These findings help us better understand the limits of our system and guide the next wave of improvements.
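
The power analogy can be made concrete with a standard statistical result, the "rule of three": if no failures are seen in n independent trials, the 95% upper bound on the true failure rate is roughly 3/n. The sketch below applies it to a 250-question pilot. It assumes, purely for illustration, that the pilot showed zero failures and that questions behave as independent trials, neither of which is stated in this post.

```python
def upper_bound_zero_events(n, confidence=0.95):
    """Exact upper confidence bound on the event rate when 0 events occur in n trials."""
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1 - (1 - confidence) ** (1 / n)


n_pilot = 250
exact = upper_bound_zero_events(n_pilot)
rule_of_three = 3 / n_pilot

print(f"Exact 95% upper bound: {exact:.2%}")
print(f"Rule-of-three approximation: {rule_of_three:.2%}")
```

Under those assumptions, even a clean 250-question pilot would be compatible with a true failure rate of about 1.2%, which is why rates of around 1% only become visible with more questions.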

We increasingly see AskTrip as a journey. The launch went well, and now we’re building on that strong foundation with meaningful refinements. Will it ever be perfect? Probably not. But our commitment to continual improvement is unwavering.

It’s been an incredibly rewarding learning process so far—here’s to the next 1,500 questions.

AskTrip – Spanish version – is now available

Click here to try it now!

NOTE: The site is running a bit slow as the system is working hard on the translations!
