Trip Database Blog

Liberating the literature

HTML Scissors

When I first started in clinical Q&A nearly 30 years ago with ATTRACT, we often received questions from general practitioners that I knew could be answered by the excellent clinical guidelines available at the time (I think they were called Prodigy then). The challenge wasn’t the lack of guidance – it was that the guidelines were long, and pinpointing the relevant section was difficult. For many questions, our real task was simply to extract the key information buried within a mass of content, most of which wasn’t directly relevant.

Even then, I felt that if the guidelines were broken into bite-sized pieces, they would be far easier to use. I used to talk about taking a pair of “HTML scissors” to cut them up, so GPs could more easily find the specific information they needed for themselves.

Fast forward to today, and at AskTrip we face a related challenge – one that has reminded me of those early “HTML scissors” conversations. Our system searches documents and sends the entire text (guidelines, systematic reviews, and so on) to the AI model, asking it to identify and extract the relevant passage. If a document happens to be 5,000 words long, this process takes time – and incurs unnecessary computational cost – just to locate the key section.

By coincidence, the idea behind those old “HTML scissors” has become a recognised approach in modern information retrieval. It’s now a standard technique, widely used in AI pipelines, and it even has a name: chunking.

Chunking divides large documents into smaller, coherent sections to make them easier and faster to process. Instead of treating a guideline as a single 5,000-word block, chunking breaks it into major thematic units – such as causes, diagnosis, initial management, monitoring, or special populations. Within each of these larger chunks, the content can be divided even further into sub-chunks, which capture more granular pieces of information. For example, a diagnosis chunk might be split into sub-chunks for individual diagnostic tests, criteria, red flags, and decision pathways. These sub-chunks retain enough local context to stand alone, allowing the AI system to pinpoint very specific information without processing the entire guideline or even the full section.
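To make that concrete, here is a minimal sketch of hierarchical chunking in Python. The heading detection and the 300-word sub-chunk size are illustrative assumptions, not a description of how AskTrip actually splits documents:

```python
def chunk_guideline(text: str, max_words: int = 300) -> list[dict]:
    """Split a guideline into section chunks, then into word-limited sub-chunks.

    Heading detection and chunk size here are illustrative assumptions only.
    """
    chunks: list[dict] = []
    section_title = "Preamble"
    section_lines: list[str] = []

    def flush() -> None:
        words = " ".join(section_lines).split()
        # Sub-chunks: slice the section into pieces small enough to stand alone,
        # each labelled with its parent section so it keeps local context.
        for i in range(0, len(words), max_words):
            sub = " ".join(words[i : i + max_words])
            if sub:
                chunks.append({"section": section_title, "text": sub})
        section_lines.clear()

    for line in text.splitlines():
        # Treat short Title Case lines (e.g. "Initial Management") as headings.
        if line.strip() and len(line.split()) <= 6 and line.istitle():
            flush()
            section_title = line.strip()
        else:
            section_lines.append(line)
    flush()
    return chunks
```

Because every sub-chunk carries its parent section title, it can be retrieved and read on its own without the rest of the guideline.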

The result is faster retrieval, lower computational cost, and more accurate matching between a clinician’s question and the part of the guideline that truly answers it. Because the AI is working with smaller, well-defined blocks of text, it can zero in on precise details – such as a dosing adjustment, a diagnostic threshold, or a management step – without being distracted by the surrounding material. This not only reduces latency and improves user experience but also increases reliability: the system is less likely to miss key details or return irrelevant passages, making the overall process both more efficient and more clinically useful.

So, our next major improvement to AskTrip is the introduction of chunking for large documents. This will allow us to deliver clearer, more precise answers, generated more quickly and at a much lower computational cost. And we’re not stopping there. To push performance even further, we’re developing vector search to improve how we target the most relevant chunks in the first place. I’ve written a brief explanation of vector search already, and I’ll share more updates as this work progresses—but together, these advances mark a significant step forward in making AskTrip faster, smarter, and more efficient for everyone who relies on it.

New on Trip: Linking RCTs to Trial Registrations and Systematic Reviews

Released today: We’ve added a new feature to Trip that helps you understand clinical trials in their full context. When you view an RCT, Trip now automatically attempts to link to:

  • its ClinicalTrials.gov registration, and
  • any systematic reviews that include the study.

This makes it easier to verify protocols, spot outcome discrepancies, and see how a trial fits into the wider evidence base – all without extra searching. This is how it looks:

The top RCT links to three trial registrations. The second RCT links to one trial registration and is linked to four systematic reviews. And, finally, for the third RCT we have not been able to find a trial registration or an inclusion in a systematic review. NOTE: just because we can’t find a trial registration, it doesn’t mean the trial hasn’t been registered – it simply means we haven’t been able to identify it using the scraping technology we’ve employed.

If you click on the ‘Details’ link, a drop-down appears:

This is really cool and it’s part of our ongoing effort to make high-quality evidence quicker and easier to use.

What Is Vector Search?

Vector search is becoming increasingly prominent. At Trip we’re exploring its use and – in the spirit of transparency – we want to share what it is and how it differs from keyword (lexical) search. And, to be clear, we’re at the start of the journey!

From Keywords to Concepts: How Vector Search Is Changing Information Retrieval

For decades, information retrieval has been built on keyword search — matching the words in a user’s query to the same words in documents. It’s the logic behind databases, search engines, and Boolean queries, and it has served information specialists well, particularly when controlled vocabularies like MeSH are used.

But language is slippery. Two people can describe the same idea in very different ways — “heart attack” vs. “myocardial infarction,” “blood sugar” vs. “glucose.” Keyword search struggles when users and authors use different terms for the same concept.

That’s where vector search comes in — a new approach that focuses on meaning rather than exact wording.

What Is Vector Search? (An Intuitive Explanation)

At its core, vector search represents meaning mathematically.
Instead of treating text as a bag of words, it converts language into numbers that capture relationships between concepts.

This transformation happens in three main steps.


1. Text to Vectors — Turning Language into Numbers

The starting point is a language model — a type of AI system trained on vast amounts of text (for example, research papers, books, and web content). During training, the model learns how words appear together and in what contexts. Over time, it builds a kind of map of language, where meanings cluster naturally.

Here’s how this works in practice:

  • Words that often appear in similar contexts, such as doctor and physician, end up close together in this semantic map.
  • Words that rarely co-occur or belong to very different contexts, like insulin and wheelchair, are far apart.

When text is processed by the model, each sentence or paragraph is represented as a vector — a list of numbers indicating its position in this high-dimensional space.
For instance:

  • “High blood pressure” → [0.13, -0.45, 0.77, …]
  • “Hypertension” → [0.12, -0.47, 0.75, …]

These numbers are coordinates on hundreds of “meaning axes” that the model has learned automatically. While humans can’t easily interpret each axis, together they capture how phrases relate semantically to everything else in the model’s training data.

You can think of these dimensions as encoding things like:

  • Whether the phrase is medical or general
  • Whether it describes a disease, treatment, or symptom
  • Its relationships to concepts such as “cardiovascular” or “chronic condition”

If two texts have vectors that are close together, it means the model recognises that they have similar meanings.

So:

  • “High blood pressure” and “hypertension” → almost identical
  • “High blood pressure” and “low blood pressure” → related but opposites
  • “High blood pressure” and “migraine” → far apart

This process — called embedding — is how modern AI systems move from words to concepts.
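For illustration, here’s what embedding looks like using the open-source sentence-transformers library – an assumed choice for this example, not necessarily the model behind any Trip feature:

```python
from sentence_transformers import SentenceTransformer

# Load a small general-purpose embedding model (an illustrative choice).
model = SentenceTransformer("all-MiniLM-L6-v2")

phrases = ["High blood pressure", "Hypertension", "Migraine"]
vectors = model.encode(phrases)  # one vector per phrase

# (3, 384): three phrases, each positioned on 384 learned "meaning axes".
print(vectors.shape)
```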


2. Measuring Similarity

When a user searches, their query is also converted into a vector. The system then compares that query vector with the document (or passage) vectors in its database using a mathematical measure of closeness – most commonly cosine similarity.

The closer two vectors are, the more related their meanings. This allows vector search to identify results that discuss the same idea even when the words are completely different.

For example, a query about “lowering blood pressure without medication” might retrieve:

  • Trials on “lifestyle modification for hypertension”
  • Reviews of “dietary sodium reduction”
  • Cohort studies on “exercise and cardiovascular risk”

— even if the exact phrase “lowering blood pressure without medication” doesn’t appear in any of those documents.
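Cosine similarity itself is simple to compute – it measures how closely two vectors point in the same direction. Here is a toy example with three-dimensional vectors (real embeddings have hundreds of dimensions, and the numbers below are purely hypothetical):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Returns ~1.0 for vectors pointing the same way, ~0 for unrelated ones."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical vectors, purely for illustration.
query = np.array([0.9, 0.1, 0.3])     # "lowering blood pressure without medication"
doc_a = np.array([0.85, 0.15, 0.35])  # "lifestyle modification for hypertension"
doc_b = np.array([-0.2, 0.9, 0.1])    # an unrelated passage

print(cosine_similarity(query, doc_a))  # ~0.996 -> semantically very close
print(cosine_similarity(query, doc_b))  # ~-0.07 -> semantically distant
```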


3. Returning Results

Instead of relying on literal matches, vector search retrieves the documents (or parts of documents) closest in meaning to the user’s query.

In contrast:

  • Keyword search finds what you said.
  • Vector search finds what you meant.

How It Differs from Keyword Search

| Feature    | Keyword Search                                          | Vector Search                                                   |
|------------|---------------------------------------------------------|-----------------------------------------------------------------|
| Basis      | Exact word matching                                     | Conceptual similarity                                           |
| Strengths  | Transparent, precise, good for controlled vocabularies | Finds semantically related content, handles synonyms and context |
| Weaknesses | Misses relevant material with different wording         | May surface loosely related material if not tuned carefully     |
| Good for   | Narrow, well-defined, reproducible queries              | Exploratory or question-based searching                         |

Many systems now use hybrid search, combining keyword and vector methods. Keywords help with precision and reproducibility; vectors help with recall and conceptual understanding.
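A minimal sketch of how such a blend might work – the equal weighting below is an illustrative assumption; real systems tune it:

```python
def hybrid_score(keyword_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Blend a lexical score (e.g. BM25, normalised to 0-1) with a vector
    similarity score. alpha = 0 is pure keyword; alpha = 1 is pure vector.
    """
    return (1 - alpha) * keyword_score + alpha * vector_score

# A document with different wording but the same meaning can still rank
# well, carried by its vector score.
print(hybrid_score(keyword_score=0.1, vector_score=0.9))  # 0.5
```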


Why It Matters for Information Specialists

For information professionals, vector search introduces both power and complexity.
It enables:

  • Retrieval of semantically related evidence, even when vocabulary differs.
  • More natural-language searching — closer to how users think and ask questions.
  • The foundation for AI-driven Q&A tools, where the system retrieves and synthesises the most relevant evidence rather than just listing papers.

But it also brings new challenges:

  • Relevance can be fuzzier and harder to explain.
  • Transparency and reproducibility — essential in evidence-based work — need careful handling.
  • Understanding how a system defines “similarity” becomes as crucial as knowing how it handles Boolean logic or MeSH terms.

The Bottom Line

Vector search doesn’t replace traditional methods — it expands them.
It’s a bridge between human language and machine understanding.

In short:

Keyword search finds the words. Vector search finds the meaning.

Together, they represent the next chapter in evidence discovery and retrieval — one that blends linguistic nuance, AI, and the information specialist’s craft.

Further improvements to AskTrip

We have just rolled out a batch of improvements to AskTrip, with three main changes:

  • Medicines information
  • Answer consistency
  • Improving the efficiency of Beyond Trip

Medicines Information

Previously, answers about medicines (e.g. side effects, dosing) relied on reports in the research literature. This was fine, up to a point, but we realised dedicated medicines information was required. So now, if we receive a question about medicines, we also include the relevant content from DailyMed and openFDA – both excellent medicines resources.
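As an illustration, openFDA offers a public API for structured drug labelling; a minimal query might look like the sketch below. This is a query against the public API, not AskTrip’s internal code:

```python
import requests

# Fetch one structured drug label from the public openFDA endpoint.
resp = requests.get(
    "https://api.fda.gov/drug/label.json",
    params={"search": 'openfda.generic_name:"metformin"', "limit": 1},
    timeout=10,
)
resp.raise_for_status()
label = resp.json()["results"][0]

# Labels contain named sections such as dosing and side effects.
print(label.get("dosage_and_administration", ["(not provided)"])[0][:200])
print(label.get("adverse_reactions", ["(not provided)"])[0][:200])
```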

Answer consistency

AI can be a bit inconsistent at times (it’s often described as non-deterministic), and this can manifest as slightly different answers, citing different references, for the same or very similar questions. Typically, these differences are small – often just nuances – but they can still feel a bit unsettling! So, we’ve introduced something we call reference stripping: when we receive a question that’s very similar to a previous Q&A, we ensure the new answer takes the earlier references into account, boosting consistency across responses.

Improving the efficiency of Beyond Trip

Beyond Trip was proving quite expensive to run, so we needed to find ways to reduce costs. Previously, the system reviewed all of the top search results we found. But we soon realised that “top” didn’t always mean relevant. Many results near the top of the list weren’t particularly useful for the actual query.

To fix this, we introduced an extra step to exclude results that are likely to be irrelevant. The remaining results are then reviewed sequentially until we’ve gathered enough evidence for a solid answer. This approach reduces costs and brings a small but welcome speed boost.
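In outline, the new flow looks something like this – the relevance screen and the evidence threshold are hypothetical stand-ins for illustration, not AskTrip’s real internals:

```python
def looks_relevant(question: str, result: dict) -> bool:
    # Hypothetical cheap screen: require some word overlap with the question.
    q_words = set(question.lower().split())
    return len(q_words & set(result["title"].lower().split())) >= 2

def review_results(question: str, results: list[dict], needed: int = 5) -> list[dict]:
    # Step 1: exclude results that are unlikely to be relevant.
    candidates = [r for r in results if looks_relevant(question, r)]
    # Step 2: review the survivors sequentially, stopping early once we have
    # enough evidence -- the early stop is where the cost saving comes from.
    evidence = []
    for result in candidates:
        evidence.append(result)  # stand-in for the expensive full review
        if len(evidence) >= needed:
            break
    return evidence
```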

AskTrip: Cluster Reviews

Clinical questions frequently form natural clusters – variations on a theme that together reveal a richer, more connected picture of evidence. For example, questions about TSH and lifestyle might include sleep, exercise, diet, stress, and psychosocial factors – each distinct, yet interrelated.

One approach we’re exploring to capture these connections is cluster reviews – analyses that group related clinical questions to uncover overarching patterns in evidence and practice. These reviews would take a bottom-up approach, grounded in real clinical questions asked by health professionals. Unlike traditional top-down reviews that begin with predefined topics or published frameworks, cluster reviews are shaped by the real-world information needs that emerge in clinical settings, offering a practice-driven view of the evidence landscape.

We’re experimenting with an interactive cluster review that brings these related Q&As together into a single, navigable experience. It allows clinicians, researchers, and learners to see how different lifestyle and psychosocial factors intersect, and to identify where evidence is thin or emerging.

The goal is to make evidence engagement interactive, modular, and cumulative – each review builds on previous answers, creating a living, evolving knowledge map rather than static summaries.

You can explore the first prototype, Lifestyle, Psychosocial, and Behavioral Influences on TSH Levels, through the interactive review – and we welcome your feedback on how this could best support your evidence needs. And, to be clear, this is a simple prototype; if we go ahead with it, we’ll work hard to make the design wonderful 🙂

CLICK HERE TO EXPLORE

If you have any specific comments, such as how to improve this, please leave a comment or email me jon.brassey@tripdatabase.com

From 10,000 Q&As in 15 Years to 5,000 in 16 Weeks: The Evolution of Evidence Access with AskTrip

AskTrip has just reached a remarkable milestone – 5,000 clinical questions answered in under 20 weeks. On its own, that’s an impressive figure. But the real story lies in the contrast with our early work and what this achievement represents for evidence-based healthcare.

From Manual Q&A to the Digital Frontier

Back in 1997, we launched ATTRACT, one of the world’s first evidence-based Q&A services for clinicians. It was followed by the National Library for Health Q&A service – both pioneering efforts; the latter ran until around 2012.

Across those 15 years, our teams of information specialists and clinicians answered around 10,000 clinical questions. Each one required 4–6 hours of careful searching, appraisal, and synthesis – a manual, time-intensive process, but one that had a huge impact on clinical decision-making.

Those services were driven by a simple belief: that busy clinicians should have quick, trusted access to the best available evidence to inform patient care. That belief remains unchanged today.

Trip’s Core Mission: Connecting Clinicians with Evidence

The Trip Database was originally created to support the work of ATTRACT, providing rapid access to high-quality evidence for the team answering clinical questions. Over time, it became clear that Trip could serve a much wider audience – helping clinicians everywhere find reliable evidence efficiently.

From those early days, Trip has always been about one thing: connecting clinical decision-makers with the best available evidence.

Over the years, it has evolved from a focused evidence search tool into a comprehensive evidence ecosystem, helping millions of users around the world find trustworthy answers faster. AskTrip is the latest, and perhaps most exciting, chapter in that ongoing story.

AskTrip: A Natural Extension of Trip’s Mission

AskTrip builds directly on Trip’s foundations but uses a radically different interface – natural language. Clinicians can now simply type a question such as “What’s the best treatment for resistant hypertension in pregnancy?” and receive a clear, concise, evidence-based summary in seconds.

What previously took hours or days of searching can now be achieved almost instantly. Yet the principles that underpin AskTrip are the same as ever: reliability, transparency, and a commitment to evidence, not opinion.

AskTrip doesn’t replace human judgment or the careful reading of full studies – it amplifies access to trusted information when it’s needed most.

A Shift in Scale

The numbers tell the story:

  • 10,000 Q&As in 15 years through manual services like ATTRACT and the NLH Q&A service.
  • 5,000 Q&As in less than 20 weeks through AskTrip.

That’s not just efficiency – it’s accessibility at scale. Thousands of clinicians have been able to get quick, high-quality answers to their clinical questions, helping improve decision-making in real-world settings.

Looking Ahead

This milestone is more than a statistic; it’s a reflection of how far evidence-based medicine has come – and how technology can help accelerate it without compromising quality.

AskTrip represents the next step in a journey that began nearly three decades ago. The tools may have changed, but the mission remains constant: to connect clinical decision-makers with the best available evidence, as quickly and clearly as possible.

We’re incredibly proud of how far we’ve come – and even more excited about what lies ahead.

AskTrip Upgrade: Smarter, Broader, and More Accurate

We’re excited to share an important update to AskTrip – not quite a version 2, but definitely a strong v1.5. This upgrade builds on what’s working well, while tackling some of the challenges we’ve seen since launch. The result: better answers, more trustworthy evidence, and less noise.

What’s New?

1. More Evidence, More Coverage

AskTrip now considers a wider pool of articles when building answers. This means you’ll benefit from a broader sweep of relevant studies, reducing the chance that useful evidence gets missed.

2. Smarter Evidence Extraction

We’ve upgraded both the prompts and the large language model behind AskTrip. These improvements sharpen how the system extracts evidence from research, cutting through complexity to surface the insights that matter.

The payoff? More accurate answers and fewer hallucinations.

3. Improved Quality Scoring

Our enhanced quality score system better balances study design, recency, and relevance. That means you’ll see more reliable evidence, ranked in a way that helps you judge its strength quickly and confidently.

4. Beyond Trip: Smarter Sourcing

Sometimes the evidence available in Trip isn’t enough. With this update, AskTrip automatically extends the search to Google Scholar and OpenAlex when needed – giving you access to a wider world of research without leaving the platform.

Why This Matters

Every improvement we make to AskTrip is guided by one principle: helping health professionals make faster, better-informed decisions. With v1.5, you’ll get answers that are broader in scope, more accurate, and underpinned by higher-quality evidence.

We’ll continue refining AskTrip in response to your feedback, so please keep letting us know what works and where we can improve.

An Apology to Our Spanish-Language Users of AskTrip (originally published in Spanish)

At AskTrip, we aim to make high-quality evidence accessible to health professionals worldwide, regardless of language. For our Spanish-speaking users, this means we translate your question into English, process it through the AskTrip system, and then translate the answer back into Spanish.

Recently, we discovered an issue in how Spanish questions were handled within AskTrip. This led to two main problems:

  1. The question appeared in Spanish, but the answer was displayed in English.
  2. Both the question and the answer were in Spanish, but the search terms were also processed in Spanish.

The second issue was especially problematic. Since Trip’s evidence base is in English, running searches in Spanish returned few results – or, in many cases, results of very poor quality.

We implemented a fix over the weekend, and since then we’ve seen no further cases of the problem. We’re confident it’s resolved.

We sincerely apologise to our Spanish-language users for this disruption. From now on, you should notice a clear improvement in the quality and consistency of your AskTrip experience.

Introducing Beyond Trip: Expanding the Evidence Horizon

Sometimes the best answer isn’t within Trip’s core collection. That’s why we’ve introduced Beyond Trip, a new feature designed to broaden the search and deliver stronger, more reliable answers when evidence is limited.

How It Works

Beyond Trip is automatically triggered when AskTrip produces an answer that’s judged to be poor:

  • Limited answers, or
  • Moderate answers with three or fewer references.

When this happens, AskTrip seamlessly expands the search to Google Scholar and OpenAlex, scanning the wider research landscape for additional evidence.

You don’t need to take any action – the process happens automatically. It adds about 20–30 seconds to the time it takes to generate an answer.
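Expressed as code, the trigger rule amounts to something like this (a sketch; the quality labels reflect AskTrip’s own grading of its draft answer):

```python
def should_trigger_beyond_trip(answer_quality: str, reference_count: int) -> bool:
    """True when a draft answer is judged poor enough to widen the search."""
    return answer_quality == "limited" or (
        answer_quality == "moderate" and reference_count <= 3
    )
```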

What You’ll See

New answers created through this process are clearly labelled as having used Beyond Trip.

Two outcomes are possible:

  1. A stronger answer: If new evidence is found, the revised response will be presented with its expanded reference base. You’ll see a note confirming that the answer has utilised Beyond Trip.
  2. A genuine evidence gap: If evidence remains poor, we’ll highlight that even after Beyond Trip, good-quality evidence could not be found. In these cases, we’ll offer five broader or related searches you can try, helping you explore areas where stronger evidence may exist.

Why It Matters

In testing, results have ranged from no change (confirming a genuine lack of evidence) to major improvements – for example, an answer going from zero references in the original output to six references after Beyond Trip.

By intelligently expanding the search only when needed, Beyond Trip ensures you’re not just getting an answer – you’re getting the best possible evidence available.
