With nearly 3,700 questions answered, we’ve learned a great deal. From the very beginning, we’ve closely tracked both the questions and the answers, giving us valuable insight into the system’s strengths and areas for improvement.
Behind the scenes, we’ve been working on a major upgrade (our “v2”), which is now in testing. The key enhancements include:
- Improved search: A new approach that strengthens the link between a user’s question and the articles we identify, ensuring more relevant candidates are surfaced.
- Greater coverage: A more sensitive system that draws on a wider range of articles identified through the improved search.
- Reduced hallucinations: Specific safeguards to minimise inaccurate or invented content.
- Beyond Trip: If evidence is scarce in Trip, the search will automatically expand into the broader academic literature [learn more here].
- Answer scoring: A more refined and nuanced way of rating responses.
Each of these features has been tested individually, and we’ll soon begin testing them together as an integrated system. We’re optimistic these changes will deliver a step change in performance.
And one final comment: we’re already working on v2.1…