
Trip Database Blog

Liberating the literature

Using document clustering to show evolution of a topic area

Mpox (formerly known as monkeypox) was a WHO-designated Public Health Emergency of International Concern (PHEIC) between 23 July 2022 and 10 May 2023. Using the Carrot2 technology to cluster text (as shown yesterday), we thought it might be interesting to compare the clusters from before and after the outbreak. So, we did two searches:

  1. Documents with Mpox or Monkeypox in the title published between 1980 and 2021. This yielded 104 results.
  2. Documents with Mpox or Monkeypox in the title published between 2022 and 2024. This yielded 679 results, showing a huge interest in the topic.
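As a rough illustration of the two searches above, here is a minimal Python sketch that splits a set of records into the two periods. The records, the `titles_in_period` function and the regular expression are all made up for this example; this is not how Trip's search is implemented.

```python
import re

# Hypothetical records (title, publication year); not real Trip data.
records = [
    ("Monkeypox in the Democratic Republic of Congo", 1997),
    ("Human monkeypox: a systematic review", 2019),
    ("Mpox vaccination: a systematic review", 2023),
    ("Clinical features of monkeypox in the 2022 outbreak", 2022),
    ("Smallpox vaccine stockpiles", 2023),
]

# Match either term as a whole word, ignoring case.
TITLE_PATTERN = re.compile(r"\b(mpox|monkeypox)\b", re.IGNORECASE)


def titles_in_period(records, first_year, last_year):
    """Return titles mentioning Mpox/Monkeypox published in the years given (inclusive)."""
    return [
        title
        for title, year in records
        if first_year <= year <= last_year and TITLE_PATTERN.search(title)
    ]


pre_outbreak = titles_in_period(records, 1980, 2021)
post_outbreak = titles_in_period(records, 2022, 2024)
print(len(pre_outbreak), len(post_outbreak))
```

Each of the two lists would then be passed to the clustering step separately, so the topic areas can be compared side by side.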

1980-2021

2022-2024

Or, looking at the top ten topic areas in a different format:

The numbers are small pre-2022, but there is a clear difference in topic areas. And, as an EBM source, it is nice to see systematic reviews featuring prominently.

Clustering search results using Carrot2

I’ve been wanting to use Carrot2 for ages and have finally got the chance… As it says on their website: “Carrot2 organizes your search results into topics. With an instant overview of what’s available, you will quickly find what you’re looking for.”
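To give a feel for what "organizing results into topics" means, here is a toy Python sketch that groups titles under terms they share. It is a deliberately naive stand-in and is not one of Carrot2's own algorithms; the titles, the stopword list and the `cluster_by_label` function are all invented for the example.

```python
from collections import Counter, defaultdict

STOPWORDS = {"a", "and", "for", "in", "of", "the", "to", "with"}


def cluster_by_label(titles, query_terms, max_clusters=3):
    """Group titles under the most frequent terms that are not stopwords or query terms."""
    skip = STOPWORDS | {t.lower() for t in query_terms}
    tokens_per_title = [
        {w for w in title.lower().split() if w not in skip} for title in titles
    ]
    # Count how many titles each candidate label appears in.
    doc_freq = Counter(w for tokens in tokens_per_title for w in tokens)
    # Keep labels shared by at least two titles; sort for a deterministic order.
    shared = [w for w, n in doc_freq.items() if n > 1]
    labels = sorted(shared, key=lambda w: (-doc_freq[w], w))[:max_clusters]
    clusters = defaultdict(list)
    for title, tokens in zip(titles, tokens_per_title):
        for label in labels:
            if label in tokens:
                clusters[label].append(title)  # a title may sit in several clusters
    return dict(clusters)


# Hypothetical titles for a 'prostate cancer screening' search.
titles = [
    "Prostate cancer screening with PSA",
    "PSA testing and prostate cancer mortality",
    "MRI in prostate cancer screening",
    "MRI targeted biopsy for prostate cancer",
    "Screening guidelines for prostate cancer",
]
clusters = cluster_by_label(titles, ["prostate", "cancer", "screening"])
for label, members in clusters.items():
    print(label, len(members))
```

Real clustering tools do far more (phrase detection, better labels, handling of word variants), but the basic idea of an overview built from shared terms is the same.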

So, two examples to show you. The first is a search for ‘prostate cancer screening’, where we used Carrot2 to process just over 250 results. This is the output:

Treemap view

List view

The second example uses just under 700 results for the search ‘arteriovenous malformation’:

Treemap view

List view

I can’t help feeling this topic clustering might be useful in a number of situations. For instance, it’d be a nice way of refining your search, or it could give you an overview of the topics covered. Let me know what you think, either via comments or via email: jon.brassey@tripdatabase.com.

Moving to the cloud

Since the start of 2024 the vast majority of our development time has been taken up with moving Trip onto ‘the cloud’. Currently we use a dedicated server. This has served us well, but the server was getting old and needed replacing, so moving to the cloud was a no-brainer. We didn’t expect it to take quite so long and we’re currently having to reindex over 5 million documents. This process will be over in the next day or so and then it’ll be on to testing, which will hopefully not take long.

Broadly, users shouldn’t notice any difference. We’ve upgraded the underlying search software, which has improved relevancy scoring, so the order of results might change, but I can’t imagine the change being huge. I’ll update, if necessary, during testing.

Sorry, boring post saying we’re busy doing stuff in the background that you’ll not notice, but I wanted to be transparent. After that we can move on to forward-facing developments.

Understanding simple searches – poll

Our previous post highlighted the fact that many of our top searches are for single concepts e.g. asthma, pregnancy, aspirin. In some situations this sort of search is fine but in others it might be imprecise.

If you use simple searches, can you answer the following question please:

If there is another reason, not listed above, please let us know. This can be done via comments or email: survey@tripdatabase.com.

Simple searches

UPDATE: please take our simple poll to help us understand simple searches (if you do simple searches)

One observation we’ve made over the years has been that many of the top searches (by frequency) are for single concepts e.g. asthma, diabetes, aspirin. This seems quite non-specific as the user possibly doesn’t want to know about, say, asthma generally. Perhaps they want to know about asthma diagnosis, asthma in children etc.

At Trip we want to help users get better results faster and we could take the above issue and try to solve it using techniques such as query expansion. However, the fact is we don’t fully understand the issue and we’ve suggested a solution (query expansion) straightaway! I have said the search is non-specific and the user doesn’t want to know about asthma generally. But there are many possible reasons for using such broad terms, for instance:

  • They just want to get a feel for the literature
  • They are new to the site and want to experiment with using the site
  • They are not confident in creating searches that mirror their intention e.g. they may want to know what the most appropriate diagnostic tests for asthma in children are, and are not sure how to make a suitable search

There are probably other reasons as well, and we’re not really sure. So leaping into a solution seems foolish; we should ask our users! My idea is that for a set proportion of searches (say every 10th search) we show a pop-up that takes the user to a questionnaire to better understand their reasons for using Trip. If we also note the search terms used, we can cross-check them against the answers, see whether there is a problem and, from there, what the solution might be.
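The "every 10th search" idea can be sketched in a few lines of Python. This is only an illustration of the sampling logic under my own assumptions: the `SurveySampler` class and its names are hypothetical, and nothing here reflects how the survey will actually be built.

```python
SURVEY_INTERVAL = 10  # "say every 10th search"; the real interval is undecided


class SurveySampler:
    """Counts searches and flags every Nth one for the survey pop-up."""

    def __init__(self, interval=SURVEY_INTERVAL):
        self.interval = interval
        self.search_count = 0
        self.prompted_queries = []  # search terms noted for later cross-checking

    def record_search(self, query):
        """Count the search; return True if the pop-up should be shown."""
        self.search_count += 1
        if self.search_count % self.interval == 0:
            self.prompted_queries.append(query)
            return True
        return False


sampler = SurveySampler()
queries = ["asthma"] * 9 + ["aspirin"] + ["pregnancy"] * 10
shown = [sampler.record_search(q) for q in queries]
print(shown.count(True), sampler.prompted_queries)
```

Keeping the search terms alongside each prompt is what allows the questionnaire answers to be cross-checked against what the user actually searched for.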

I even asked ChatGPT to suggest some Qs and it came up with these:

Q1 Search Intent Understanding:
“What is your primary objective when conducting a search with us (e.g., browsing for general information, seeking specific data or studies, looking for treatment guidelines)?”


Q2 Search Term Selection:
“How do you decide on the search terms to use? Please select all that apply: a) Based on the specificity of the information needed, b) Using terms familiar to me, c) Repeating terms from educational or professional materials, d) Other (please specify).”


Q3 Experience with Search Outcomes:
“How often do the search results meet your expectations in terms of relevance and specificity? Always, Often, Sometimes, Rarely, Never.”


Q4 Challenges in Formulating Search Terms:
“What challenges, if any, do you encounter when deciding on which search terms to use? Lack of knowledge on the topic, uncertainty about which terms will produce the best results, other (please specify).”

Q5 Interest in Search Assistance Features:
“Would you find it helpful if the search engine offered suggestions or guidance on refining search terms to improve result specificity? Yes, somewhat helpful, No, not necessary.”

These are really interesting questions (thank you ChatGPT) and I think that scrolling through the results will be incredibly useful. So, when we roll out the survey (probably in the next month or so) please consider completing it.

Restricting results to a single publisher

A simple tip to help you get results from one publisher. To start with, do a search (this one is for prostate cancer). The results look like this (with snippets turned off):

Note that the top results are all from different publishers! To restrict to articles just from Cancer Care Ontario, simply click on the publication name:

And the results look like this:

We hope this was helpful and if you’d like any more insights as to how to get the best from Trip then leave a comment or email me: jon.brassey@tripdatabase.com.

A blast from the past – 1998

The Internet Archive is great and I found the first ‘grab’ of the Trip Database here. It lists the 25 publications we covered back then:

Very few are still going; I think only these are still operating:

  • Cochrane
  • Evidence-Based Medicine
  • SIGN

In those days these sites contributed around 1,100 links and search was by title only.

Trip Clinical Evidence Review – a mock-up

We mentioned the possibility of a new LLM project improving the way we present the latest evidence to our users. In that post we mentioned asking our designer to mock something up and the first draft is below. Some things to consider:

  • It’s early days so should not be considered final
  • The content is for the clinical area of primary care
  • If we go ahead with this project it can be fully automated or we may seek clinical area specialists to act as editors; both routes have advantages and disadvantages

Let me know what you think, either in the comments or directly: jon.brassey@tripdatabase.com. NOTE: We have had to ‘cut up’ the single, large image of the mock-up into smaller images, so the blog displays them at a decent size.

EMDA MMC – an example search

We’re undertaking a rapid review to answer the client’s question “Is there high-quality evidence to support the use of EMDA MMC in combination with BCG therapy for non-muscle invasive high-risk bladder cancer (NMIHRBC)?” It’s an interesting question and I thought it might be nice to show how we use Trip to help answer it.

Understanding the Q: EMDA = electromotive drug administration, MMC = Mitomycin C and BCG = Bacillus Calmette-Guérin (the BCG vaccine).

Initial search: I keep things simple and adjust if the initial search is problematic. In this case I searched (EMDA OR electromotive) AND (MMC OR Mitomycin) AND BCG AND (“bladder cancer” OR NMIHRBC). There are four elements in this search:

  • EMDA OR electromotive – covers the first part of the search
  • MMC OR Mitomycin – using the abbreviation and the term ‘Mitomycin’. I didn’t add the C as it doesn’t seem necessary, in this case. If we got lots of results I could always replace ‘Mitomycin’ with “Mitomycin C” (note quotation marks to use it as a phrase search)
  • BCG – As above there seems no real need to add the term ‘vaccine’
  • “bladder cancer” OR NMIHRBC – ‘bladder cancer’ might seem too vague (why not search for the fuller term ‘non-muscle invasive high-risk bladder cancer’?). Well, again, if it’s problematic, and we got too many results, I could always add it to make the search more specific
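The structure of the search above (OR within each concept, AND between concepts, quotation marks for phrases) can be sketched in Python. The `build_query` helper is hypothetical and written just for this illustration; it uses straight quotation marks and simply assembles the string you would type into the search box.

```python
def build_query(concept_groups):
    """AND together concept groups; OR together the synonyms within each group."""
    parts = []
    for terms in concept_groups:
        # Quote multi-word terms so they are treated as a phrase search.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        joined = " OR ".join(quoted)
        # Only groups with more than one term need brackets.
        parts.append(f"({joined})" if len(quoted) > 1 else joined)
    return " AND ".join(parts)


query = build_query([
    ["EMDA", "electromotive"],
    ["MMC", "Mitomycin"],
    ["BCG"],
    ["bladder cancer", "NMIHRBC"],
])
print(query)
# (EMDA OR electromotive) AND (MMC OR Mitomycin) AND BCG AND ("bladder cancer" OR NMIHRBC)
```

Making the search more specific later, for instance swapping Mitomycin for the phrase Mitomycin C, would just mean changing one term in the relevant group.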

This is what it looks like via the Advanced Search:

So, the search generates 24 results. I clicked on 14 articles that looked particularly relevant and then used the Connected Articles feature (BTW see our video explainer of Connected Articles) to reveal closely connected/linked articles. Here are the top results:

In total Connected Articles returns 100 results and even at the bottom many seem relevant:

Connected Articles goes beyond the Trip content and is great for helping minimise the chance of overlooking important articles. I need to work through these 100 to look for supplementary documents to enhance the review.

So, the above is a nice example of using Trip to highlight some highly relevant documents including guidelines from the European Association of Urology, some documents from NICE and Cochrane, as well as some other systematic reviews.
