When I first started in clinical Q&A nearly 30 years ago with ATTRACT, we often received questions from general practitioners that I knew could be answered by the excellent clinical guidelines available at the time (I think they were called Prodigy then). The challenge wasn’t the lack of guidance – it was that the guidelines were long, and pinpointing the relevant section was difficult. For many questions, our real task was simply to extract the key information buried within a mass of content, most of which wasn’t directly relevant.

Even then, I felt that if the guidelines were broken into bite-sized pieces, they would be far easier to use. I used to talk about taking a pair of “HTML scissors” to cut them up, so GPs could more easily find the specific information they needed for themselves.

Fast forward to today, and at AskTrip we face a related challenge – one that has reminded me of those early “HTML scissors” conversations. Our system searches documents and sends the entire text (guidelines, systematic reviews, and so on) to the AI model, asking it to identify and extract the relevant passage. If a document happens to be 5,000 words long, this process takes time – and incurs unnecessary computational cost – just to locate the key section.
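
To make the cost concrete, here is a minimal sketch of that whole-document approach. The function names are hypothetical, and `ask_model` is passed in as a stand-in for whatever language model call the pipeline actually makes:

```python
from typing import Callable

def answer_from_document(question: str, document_text: str,
                         ask_model: Callable[[str], str]) -> str:
    """Naive approach: hand the model the entire document and ask it to
    find the relevant passage. A 5,000-word guideline means roughly
    5,000 words of input on every single question."""
    prompt = (
        "Using only the guideline text below, extract the passage that "
        f"answers this question: {question}\n\n"
        f"--- GUIDELINE TEXT ---\n{document_text}"
    )
    # ask_model is a placeholder for the real model call, not a specific API.
    return ask_model(prompt)
```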

By coincidence, the idea behind those old “HTML scissors” has since become a standard technique in modern information retrieval, widely used in AI pipelines – and it even has a name: chunking.

Chunking divides large documents into smaller, coherent sections to make them easier and faster to process. Instead of treating a guideline as a single 5,000-word block, chunking breaks it into major thematic units – such as causes, diagnosis, initial management, monitoring, or special populations. Within each of these larger chunks, the content can be divided even further into sub-chunks, which capture more granular pieces of information. For example, a diagnosis chunk might be split into sub-chunks for individual diagnostic tests, criteria, red flags, and decision pathways. These sub-chunks retain enough local context to stand alone, allowing the AI system to pinpoint very specific information without processing the entire guideline or even the full section.
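
To give a feel for what this looks like in practice, here is a minimal chunking sketch in Python. It is an illustration rather than AskTrip’s actual implementation, and it assumes guideline sections are marked with markdown-style “## ” headings, each on its own line and separated from the body text by blank lines:

```python
import re

def chunk_guideline(text: str, max_words: int = 300) -> list[dict]:
    """Split a guideline into paragraph-level sub-chunks, each tagged
    with its section heading so it can stand alone at retrieval time.
    Assumes sections are marked with markdown-style '## ' headings."""
    chunks = []
    section = "Introduction"  # default heading for any preamble text
    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        heading = re.match(r"##\s+(.+)", block)
        if heading:
            section = heading.group(1)  # start a new thematic chunk
            continue
        # Keep sub-chunks under max_words by splitting long paragraphs.
        words = block.split()
        for i in range(0, len(words), max_words):
            chunks.append({
                "section": section,  # local context: which part of the guideline
                "text": " ".join(words[i:i + max_words]),
            })
    return chunks
```

Each sub-chunk carries its parent heading, so a fragment about, say, a diagnostic threshold still “knows” it belongs to the diagnosis section.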

The result is faster retrieval, lower computational cost, and more accurate matching between a clinician’s question and the part of the guideline that truly answers it. Because the AI is working with smaller, well-defined blocks of text, it can zero in on precise details – such as a dosing adjustment, a diagnostic threshold, or a management step – without being distracted by the surrounding material. This not only reduces latency and improves user experience but also increases reliability: the system is less likely to miss key details or return irrelevant passages, making the overall process both more efficient and more clinically useful.

So, our next major improvement to AskTrip is the introduction of chunking for large documents. This will allow us to deliver clearer, more precise answers, generated more quickly and at a much lower computational cost. And we’re not stopping there. To push performance even further, we’re developing vector search to improve how we target the most relevant chunks in the first place. I’ve written a brief explanation of vector search already, and I’ll share more updates as this work progresses – but together, these advances mark a significant step forward in making AskTrip faster, smarter, and more efficient for everyone who relies on it.
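
For the curious, the two pieces fit together roughly like the sketch below, which ranks chunks against a question by cosine similarity. The `embed` function here is a deliberately crude word-hashing stand-in for a real embedding model, included only to keep the example self-contained:

```python
import hashlib
import math

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a real embedding model: hashes each word into a
    fixed-size vector, so similarity reflects word overlap only. A real
    system would use a learned semantic embedding."""
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def top_chunks(question: str, chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the k chunks most similar to the question. Only these, not
    the whole document, would then be sent to the model."""
    q_vec = embed(question)
    return sorted(
        chunks,
        key=lambda c: cosine_similarity(q_vec, embed(c["text"])),
        reverse=True,
    )[:k]
```

In a production system the chunk embeddings would be computed once when a document is indexed and stored, so only the question needs embedding at query time.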