
Trip Database Blog

Liberating the literature

Author

jrbtrip

More full-text on Trip

Full-text is really important to our users and is one of the main benefits of Trip Pro. Historically, we have checked for full-text at the time of indexing only (indexing is the process of taking the uploaded document and making it available for users to search).

One realisation is that many documents are restricted when they are initially released and then become free full-text after 6-24 months. So, if we only check for full-text close to the time of release we miss those that subsequently turn open access.

So, we’ve introduced a re-sampling process that will periodically check documents in Trip to see if they now have free full-text access. This has been a huge success, with a large number of new full-texts identified. We can even quantify this:

  • We have 4,244,009 articles with DOIs.
  • We have 3,761,834 that link to full-text.
  • Overall, 88.64% of articles with a DOI (typically PubMed articles) link to full-text.

This is spectacular!
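For the technically minded, the re-sampling idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production code: the `Document` fields, the `due_for_recheck` helper, the 90-day interval and the example DOIs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    doi: str              # hypothetical example DOIs are used below
    has_full_text: bool   # True once a free full-text link has been found
    last_checked: date    # when full-text availability was last checked


def due_for_recheck(docs, today, interval_days=90):
    """Return documents still lacking full text whose last check is
    at least `interval_days` old (the interval is an assumed value)."""
    cutoff = today - timedelta(days=interval_days)
    return [d for d in docs if not d.has_full_text and d.last_checked <= cutoff]


docs = [
    Document("10.1000/a", False, date(2023, 1, 10)),  # old check, no full text
    Document("10.1000/b", True, date(2023, 1, 10)),   # already has full text
    Document("10.1000/c", False, date(2024, 5, 1)),   # checked recently
]

due = due_for_recheck(docs, today=date(2024, 6, 1))
print([d.doi for d in due])
```

Only the first document is selected here: the second already has full text and the third was checked too recently. Anything selected would then go back through whatever full-text check is used at indexing time.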

Survey results: How best to use AI in Trip?

Thank you to the many hundreds who took part in this survey. It has been really helpful and will definitely guide our future engagement with AI.

Overall, 51.4% of responders were health professionals, 31.8% information specialists, 9.8% academics, leaving 7% ‘other’!

We asked 4 questions, the first 3 being:

  1. Automated Q&A system: Users can ask questions in free-text format. The system would generate answers using content exclusively from Trip, explicitly mentioning the strength of the evidence and including references. How desirable is this feature for you? Please rate it on a scale from 1 to 4, with 1 being not desirable and 4 being highly desirable.
  2. Semi-automated evidence review system: Users can select a review topic, and our system will find the best available evidence, extract relevant content, and present it in an evidence table. The information would be summarised and automatically updated. How desirable is this feature for you? Please rate it on a scale from 1 to 4, with 1 being not desirable at all and 4 being highly desirable.
  3. Better results ordering: This system would allow users to perform their initial search and then they could provide additional context explaining the reason for their search. Based on this extra information, the search results would be re-ordered (using AI) to ensure the most relevant articles appear at the top. How desirable is this feature to you? Please rate it on a scale from 1 to 4, with 1 being not desirable at all and 4 being highly desirable.

Observations:

  • All ideas were popular – which is good and bad!
  • The questions could have been more discerning (linked to the above point). So, instead of only asking how desirable a feature is, we could have offset it by highlighting potential negative aspects of the approach!
  • There was little difference between the groups of responders.

Our 4th question took a slightly different format:

Focus on highest quality evidence: Currently Trip generates results from all evidence types, from the highest quality secondary evidence, through to journal articles and eTextbooks. Trip’s specialism is the higher-quality evidence and it might be the main reason you visit the site. To what extent would you want to use Trip to only see results from the highest quality evidence?

Again, very positive responses (y-axis = percentage) with little difference between types of users.

Free text responses were fascinating! The main issues being:

  • Lots of concern about accuracy/hallucinations and having the ability to check responses
  • Control – can any AI be optional?
  • Reproducibility
  • Transparency
  • Lots of very lovely comments about how people love Trip!
  • A number of very interesting ideas for new developments…!

We are delighted with the above as they are very closely aligned with our own thinking. We have been working with LLMs for many months and have a reasonable level of experience. We have also tested a few ideas out and shortly we will be meeting to discuss which elements we will be taking forward. Watch this space!

Add Trip search to your site

If you are interested in adding a Trip search box to your website then feel free to use the code below. This adds a great feature to your site and searches open in a new window, meaning you don’t lose your users!

<form method="get" action="https://www.tripdatabase.com/search" target="_blank" style="border: 1px solid #9B82B4; font-family: Verdana, Helvetica, sans-serif; padding: 5px; width:200px">
<svg id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 165.17 104.41">
  <g>
    <path d="M59.36,9.66V6.2Q59.36.62,56,.62H3.34Q0,.62,0,6.2V9.66q0,5.56,3.34,5.57H21.05v61.9H38.31V15.23H56q3.34,0,3.34-5.57M78.13,21.79q-7.47,0-12.14,8.48a17,17,0,0,1-.39-1.79,17.79,17.79,0,0,0-.5-2.06c-.15-.41-.38-.93-.67-1.56a3.23,3.23,0,0,0-1.17-1.4A6.56,6.56,0,0,0,60.31,23a38.78,38.78,0,0,0-6.74.89q-4.51.9-4.51,2.68a32.78,32.78,0,0,0,.67,4,69.65,69.65,0,0,1,.67,11.71V77.13H66.88V41.87c2.23-3.72,5.19-5.58,8.91-5.58a11.35,11.35,0,0,1,2.89.45,11.28,11.28,0,0,0,2.34.44c1,0,1.79-1.17,2.45-3.51a22.09,22.09,0,0,0,1-5.86,6.75,6.75,0,0,0-1.4-4.18c-.93-1.22-2.58-1.84-5-1.84m81.25,8q-5.79-8-15-8a17.23,17.23,0,0,0-9,2.34,15.67,15.67,0,0,0-6,6,16.76,16.76,0,0,1-.39-1.68,19.62,19.62,0,0,0-.5-2,14.55,14.55,0,0,0-.67-1.62,3.23,3.23,0,0,0-1.17-1.4,6.53,6.53,0,0,0-3-.44,38.78,38.78,0,0,0-6.74.89q-4.52.9-4.51,2.68a32.78,32.78,0,0,0,.67,4,69.65,69.65,0,0,1,.67,11.71V99.49h16.48v-21a18.85,18.85,0,0,0,11.36,3.46q11.13,0,17.32-8.09t6.18-22.08q0-14-5.79-22M138.11,68.85a15.68,15.68,0,0,1-7.8-2.12V41.87q3.78-6.59,9.13-6.58,9,0,9,16.78T138.11,68.85M99.08,0H96c-4.46,0-6.68,1.19-6.68,3.57v8.11h16.48V3.57Q105.76,0,99.08,0m0,22.85H96c-4.46,0-6.68,1.19-6.68,3.57V77.13h16.48V26.42q0-3.57-6.68-3.57" fill="#533764" fill-rule="evenodd"></path>
    <path d="M37.33,77.53v.58Q37.33,82,31,82H27.26q-6.36,0-6.36-3.9v-.58Z" fill="#63c608" fill-rule="evenodd"></path>
    <path d="M67.19,77.53v.58q0,3.9-6.66,3.9h-3.1q-6.66,0-6.66-3.9v-.58Z" fill="#0e6cbb" fill-rule="evenodd"></path>
    <path d="M106,77.53v.92c0,2.38-2.22,3.56-6.65,3.56H96.25c-4.44,0-6.66-1.18-6.66-3.56v-.92Z" fill="#00a89d" fill-rule="evenodd"></path>
    <path d="M130.7,100.06v.9q0,3.45-6.55,3.45h-3.06q-6.56,0-6.55-3.45v-.9Z" fill="#eec82f" fill-rule="evenodd"></path>
    <path d="M106,11.83v.92c0,2.38-2.22,3.56-6.65,3.56H96.25c-4.44,0-6.66-1.18-6.66-3.56v-.92Z" fill="#ba390d" fill-rule="evenodd"></path>
  </g>
</svg><br />
<input type="text" name="criteria" style="width:150px" />
<input type="submit" value=" Go " />
</form>

How best to use AI in Trip?

We’ve been experimenting with AI for around a year and are confident enough with it to start to use it! However, a big uncertainty is how best to use it. By that I mean the technology is useless unless it supports our users! So, how best to support you?

We have created a short survey to help inform our decision making. Your opinion really matters and we really do listen. So, please, click here now to do the survey.

Thank you!

Relevancy – a big change

Trip’s search is sensitive! If a user searches for ‘measles’ and the term is mentioned one time in a document of over 100,000 words, it is still returned as a result. This is not normally an issue if there are lots of results. However, if you select a facet with relatively few results (e.g. UK guidelines) then these very low relevancy results appear.

A real example: if you search for measles and go into European guidelines, there are 12 results. These are results 7-12:

  • ESPEN expert statements and practical guidance for nutritional management of individuals with sars-cov-2 infection
  • Autologous haematopoietic stem cell transplantation and other cellular therapy in multiple sclerosis and immune-mediated neurological diseases
  • Hepatitis E Virus Infection
  • Autoimmune Hepatitis
  • Guideline on the Diagnosis and Treatment of Sclerosing Diseases of the Skin
  • ECCO-ESGAR Guideline for Diagnostic Assessment in IBD Part 1: Initial diagnosis, monitoring of known IBD, detection of complications

Result 7, ESPEN expert statements and practical guidance for nutritional management of individuals with sars-cov-2 infection mentions measles 4 times in over 8,000 words. To us, this article is not about measles!
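To put numbers on why these results feel irrelevant, here is a small Python sketch of one possible measure: mentions per 1,000 words. This is an illustration only and not a description of how Trip actually scores relevancy; the function names and the threshold of 1.0 are hypothetical. The word counts come from the examples above, where they are given as "over" 8,000 and "over" 100,000 words, so the real densities are slightly lower.

```python
def mentions_per_1000_words(mentions, total_words):
    """Term density: how often the search term appears per 1,000 words."""
    return 1000 * mentions / total_words


def is_low_relevancy(mentions, total_words, threshold=1.0):
    """Flag a document whose term density falls below an assumed threshold."""
    return mentions_per_1000_words(mentions, total_words) < threshold


# ESPEN statement: 'measles' mentioned 4 times in (over) 8,000 words
espen_density = mentions_per_1000_words(4, 8000)

# The extreme case: one mention in a document of (over) 100,000 words
long_doc_density = mentions_per_1000_words(1, 100000)

print(espen_density, long_doc_density)
print(is_low_relevancy(4, 8000), is_low_relevancy(1, 100000))
```

On this measure the ESPEN statement manages at most half a mention per 1,000 words, and the one-mention document far less, so both would be flagged under the assumed threshold.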

This issue has been frustrating us, and some of our users, for years. But we’ve now made a big step forward in removing low relevancy results from Trip. This feature has been released today – for Pro users only. Repeating the search for measles, we now get 6,681 results overall and 4 European guidelines. Previously it was 7,764 results and 12 European guidelines.

Overall, there was a 14% reduction in total results, but in European guidelines the reduction was nearly 67%. This is to be expected as guidelines are typically much longer documents and therefore have more scope for mentions of low relevancy terms.
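For anyone who wants to check the arithmetic, the two reductions can be reproduced from the result counts quoted above. The `percent_reduction` helper is just an illustrative name.

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return 100 * (before - after) / before


overall = percent_reduction(7764, 6681)   # all results for 'measles'
guidelines = percent_reduction(12, 4)     # European guidelines only

print(f"Overall: {overall:.1f}%")
print(f"European guidelines: {guidelines:.1f}%")
```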

And, if you don’t like this, you can revert to the full results via a link at the foot of the results page.

I suspect this feature will not be noticed by many but it should dramatically change the quality of results in some situations. I am delighted to see this feature released, as the problem has bothered me for many years.

We’ve moved

On the weekend we switched over from hosting the website on a dedicated server to ‘the cloud’. After 36 hours there have been no major incidents, just a few ‘niggles’.

This move has sped up the site, allows greater flexibility moving forward and has saved us a reasonable amount of money – which we can use to improve the site.

Using document clustering to show evolution of a topic area

Mpox (formerly known as Monkeypox) was a WHO-designated Public Health Emergency of International Concern (PHEIC) between 23 July 2022 and 10 May 2023. Using the Carrot2 technology to cluster text (as shown yesterday), we thought it might be interesting to look at the clusters before and after the outbreak began. So, we did two searches:

  1. Documents with Mpox or Monkeypox in the title published between 1980-2021. This yielded 104 results.
  2. Documents with Mpox or Monkeypox in the title published between 2022-2024. This yielded 679 results – showing a huge interest in the topic.

1980-2021

2022-2024

Or, looking at the top ten topic areas in a different format:

The numbers are small pre-2022, but there is a clear difference in topic areas. And, as an EBM source, it is nice to see systematic reviews feature prominently.

Clustering search results using Carrot2

I’ve been wanting to use Carrot2 for ages and have finally got the chance… As it says on their website, “Carrot2 organizes your search results into topics. With an instant overview of what’s available, you will quickly find what you’re looking for.”

So, two examples to show you. The first is for a search for ‘prostate cancer screening’ and we used Carrot2 to process just over 250 results and this is the output:

Treemap view

List view

The second example is using just under 700 results for the search ‘arteriovenous malformation’.

Treemap view

List view

I can’t help feeling this topic clustering might be useful in a number of situations. For instance it’d be a nice way of refining your search or it could be useful to give you an overview of the topics covered. Let me know what you think either via comments or via email jon.brassey@tripdatabase.com.
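If you are curious what topic clustering involves, here is a deliberately simple Python sketch of the general idea: group result titles under labels they share. To be clear, this is not Carrot2’s algorithm and not how Trip uses it. It is a toy stand-in that picks the most common non-query words as labels, and the titles, stopword list and function names are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

# A tiny, assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokenize(title, query_terms):
    """Lower-case words in a title, minus stopwords and the query's own terms."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOPWORDS and w not in query_terms}


def cluster_titles(titles, query, max_clusters=3):
    """Group titles under the most frequent terms shared by two or more titles."""
    query_terms = set(query.lower().split())
    token_sets = [tokenize(t, query_terms) for t in titles]
    doc_freq = Counter(term for tokens in token_sets for term in tokens)
    # Sort by frequency, then alphabetically, so the labels are deterministic.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    labels = [term for term, n in ranked if n >= 2][:max_clusters]
    clusters = defaultdict(list)
    for title, tokens in zip(titles, token_sets):
        for label in labels:
            if label in tokens:
                clusters[label].append(title)
    return dict(clusters)


# Made-up titles, purely for illustration.
titles = [
    "PSA testing for prostate cancer screening",
    "MRI in prostate cancer screening",
    "Prostate cancer screening: PSA thresholds",
    "MRI-targeted biopsy and prostate cancer screening",
    "Shared decision making in prostate cancer screening",
]

clusters = cluster_titles(titles, "prostate cancer screening")
for label, members in clusters.items():
    print(label, len(members))
```

Here the five titles fall into a ‘psa’ group and an ‘mri’ group, with the fifth title left unclustered. Tools like Carrot2 go much further, for example by building readable phrase labels, but the refine-by-topic idea is the same.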

Moving to the cloud

Since the start of 2024 the vast majority of our development time has been taken up with moving Trip onto ‘the cloud’. Currently we use a dedicated server. This has served us well but the server was getting old and needed replacing, so moving to the cloud was a no-brainer. We didn’t expect it to take quite so long and we’re currently having to reindex over 5 million documents. This process will be over in the next day or so and then it’ll be on to testing, which will hopefully not take long.

Broadly, users shouldn’t notice any difference. We’ve upgraded the underlying search software which has improved relevancy scoring so that might make the results order change – but I can’t imagine that being huge. I’ll update, if necessary, during testing.

Sorry, boring post saying we’re busy doing stuff, in the background, that you’ll not notice – but I wanted to be transparent. After that we can move on to forward facing developments.
