When the web moves: building a smarter broken-link system for Trip

Trip connects clinicians to a vast range of articles, guidelines and evidence reviews spread across thousands of external websites. That breadth is one of Trip’s great strengths, but it comes with a long-standing vulnerability.

Websites are redesigned, publishers migrate platforms and documents move. Links break. And when a link breaks, a clinician following the evidence hits a dead end.

This is a problem we have wanted to solve properly for a long time. We are now close to doing so.

Users can already report broken links using the option beneath each Trip search result, and many do. But this only catches the links that someone happens to encounter and takes the time to flag. We have never known what proportion of the broken links that users encounter are actually reported, or the true scale of link rot across a database of Trip’s size.

The new system gives us a much more systematic approach.

It checks links automatically and at scale, identifying 404 errors, server failures, timeouts and unhelpful redirects. Because some failures are temporary, flagged links are checked again before being treated as genuinely broken.
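As a rough illustration of this step (a sketch under our own assumptions, not Trip's production code), the checking logic might look something like the following. The names `FetchResult`, `classify` and `is_broken` are hypothetical, as is the rule used here for an "unhelpful" redirect (a deep link that ends up on a site's home page). The `fetch` function is passed in so that any HTTP client could be used.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlsplit


@dataclass
class FetchResult:
    """Outcome of requesting a URL (hypothetical structure for this sketch)."""
    status: Optional[int]   # final HTTP status code, or None if no response
    final_url: str          # URL reached after following any redirects
    timed_out: bool = False


def classify(original_url: str, result: FetchResult) -> str:
    """Map a fetch outcome to one of the failure types described above."""
    if result.timed_out:
        return "timeout"
    if result.status is None:
        return "no_response"
    if result.status == 404:
        return "not_found"
    if result.status >= 500:
        return "server_error"
    if result.status >= 400:
        return "client_error"
    # Assumed heuristic: a link to a specific page that now lands on the
    # site root is treated as an unhelpful redirect.
    original_path = urlsplit(original_url).path.strip("/")
    final_path = urlsplit(result.final_url).path.strip("/")
    if original_path and not final_path:
        return "unhelpful_redirect"
    return "ok"


def is_broken(url: str, fetch: Callable[[str], FetchResult], attempts: int = 3) -> bool:
    """Treat a link as broken only if every attempt fails.

    A real system would space the rechecks out over time, since temporary
    failures rarely clear within seconds; that scheduling is omitted here.
    """
    for _ in range(attempts):
        if classify(url, fetch(url)) == "ok":
            return False
    return True
```

The point of the recheck loop is that a single 503 or timeout does not condemn a link: it is only flagged as broken if it keeps failing.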

Where a link remains unavailable, the system searches for a replacement. This is where large language models have made a previously impractical task achievable at scale. Using the article’s title, date, publication and other metadata, the LLM first searches for and identifies a likely new location for the article, something it does remarkably well. It then assesses whether the proposed page is genuinely the same article, rather than relying on title similarity alone. Only when this initial recovery process fails do we move to broader searches through Google, Google Scholar and other sources.

Candidates found through these broader search routes receive an additional LLM-based validation check. The proposed URL must also pass the link checker itself.
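The staged recovery process described above can be sketched roughly as follows. This is an illustration only: every function name and parameter here is hypothetical, and the LLM, search and link-check steps are passed in as plain callables rather than tied to any particular service. We also assume, for simplicity, that candidates from both stages must pass the link checker.

```python
from typing import Callable, Iterable, Optional, Tuple


def recover_url(
    meta: dict,
    llm_locate: Callable[[dict], Optional[str]],
    llm_same_article: Callable[[dict, str], bool],
    broad_search: Callable[[dict], Iterable[str]],
    extra_validate: Callable[[dict, str], bool],
    link_ok: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return (new_url, route) for a broken record, or None if nothing passes.

    meta holds the article's title, date, publication and other metadata.
    """
    # Stage 1: the LLM proposes a likely new location, then checks that the
    # page is genuinely the same article (not just a similar title).
    candidate = llm_locate(meta)
    if candidate and llm_same_article(meta, candidate) and link_ok(candidate):
        return candidate, "llm_lookup"

    # Stage 2: only if stage 1 fails, try broader search routes. These
    # candidates must also pass an additional validation check.
    for candidate in broad_search(meta):
        if (
            llm_same_article(meta, candidate)
            and extra_validate(meta, candidate)
            and link_ok(candidate)
        ):
            return candidate, "broad_search"

    return None
```

Returning the route alongside the URL makes it possible to track how each replacement was found, which could feed into how much confidence is placed in it.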

High-confidence matches can then be updated and reindexed automatically. Uncertain cases are placed in a review queue rather than being changed on the basis of weak evidence.
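One simple way to picture this split, assuming each proposed match carries a numeric confidence score (an assumption made for this sketch, along with the `Match` structure and the 0.9 threshold), is:

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Match:
    """A proposed replacement URL for a record (hypothetical structure)."""
    record_id: str
    new_url: str
    confidence: float  # assumed score between 0 and 1


AUTO_UPDATE_THRESHOLD = 0.9  # illustrative value only


def triage(
    matches: Iterable[Match], threshold: float = AUTO_UPDATE_THRESHOLD
) -> Tuple[List[Match], List[Match]]:
    """Split matches into (auto_update, review_queue) by confidence."""
    auto_update: List[Match] = []
    review_queue: List[Match] = []
    for match in matches:
        if match.confidence >= threshold:
            auto_update.append(match)
        else:
            review_queue.append(match)
    return auto_update, review_queue
```

Nothing below the threshold is changed automatically; it simply waits for a human decision.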

We are currently completing the final testing. Once the system is ready, we will run a substantial one-off check across Trip’s eligible records, something we have never previously been able to do. This will be followed by regular checks, so that broken links can be identified and repaired rather than silently accumulating.

For a resource that exists to connect people with the best available evidence, ensuring those connections actually work is exactly the kind of unglamorous – but vital – infrastructure that matters. We are pleased to be getting it right.

Trip Database Blog

Liberating the literature