Trip Database Blog

Liberating the literature

Improving search on TRIP

We have just released a major improvement in the TRIP search results.

This has been brought about by the refinement of our ‘text cutoff tool’. Prior to its introduction, a search would return every result that contained the search terms. So a search of acute kidney injury returned the CKS guideline on ankles and sprains! I imagine most people would agree that that is not even an average result – it simply shouldn’t be there. But how did it get there? It contains all the individual terms, so it is returned.

So how does the text cutoff tool work?

Each search on TRIP looks for matches to the search term(s) used and these are ordered based on our algorithm. One component of the algorithm is a text score. The higher the text score, the more likely the result is to be pertinent to the search terms. Some factors that affect the text score include:

  • Location – where the term occurs (e.g. if someone searches on asthma, documents with asthma in the title tend to get a higher text score than those where it is only mentioned in the text).
  • Term density. If you have two documents, both 1000 words long and without the search term in the title, and one mentions the search term once and the other ten times, the latter will receive a higher text score.

The idea behind the text cutoff is simply to say that all results with a text score of lower than X do not get returned.
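As a rough illustration of the idea, here is a minimal Python sketch of a text score built from the two factors above (location and term density), followed by a cutoff filter. It is not TRIP's actual algorithm: the names Document, text_score and apply_cutoff, the weights, the naive whitespace tokenising and the cutoff value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    body: str


def text_score(doc, terms, title_weight=2.0, density_weight=100.0):
    """Toy text score (illustrative weights, not TRIP's real formula)."""
    title_words = doc.title.lower().split()
    body_words = doc.body.lower().split()
    score = 0.0
    for term in terms:
        term = term.lower()
        # Location: a term found in the title earns a fixed bonus.
        if term in title_words:
            score += title_weight
        # Term density: occurrences per 100 words of body text.
        if body_words:
            score += density_weight * body_words.count(term) / len(body_words)
    return score


def apply_cutoff(docs, terms, cutoff):
    """Drop documents scoring below `cutoff`; return the rest, best first."""
    scored = [(text_score(doc, terms), doc) for doc in docs]
    kept = [pair for pair in scored if pair[0] >= cutoff]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in kept]


# Two 1000-word documents without the term in the title (one mention vs ten),
# plus a third with the term in its title.
sparse = Document("Respiratory review", " ".join(["asthma"] + ["filler"] * 999))
dense = Document("Respiratory review", " ".join(["asthma"] * 10 + ["filler"] * 990))
titled = Document("Asthma in adults", " ".join(["asthma"] + ["filler"] * 999))

# With a hypothetical cutoff of 0.5, the single-mention document is dropped.
results = apply_cutoff([sparse, dense, titled], ["asthma"], cutoff=0.5)
print([doc.title for doc in results])
```

Moving the cutoff value up or down in this toy version shows the trade-off: a higher cutoff removes more noise but starts to drop documents you would want to keep.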

This sounds nice and simple, but it is fairly arbitrary: set the score too high and you may leave important documents out; set it too low and you allow too many in. A further complication is that there is no magic score that defines pertinence. We have tested lots of combinations of search terms and text cutoff points, and we’ve managed to remove an awful lot of noise from the searches. The cutoff point we’ve used has reduced the number of results for the acute kidney injury search from 2,609 to 440.

Ironically, the CKS guideline on ankles and sprains still remains. Raising the cutoff far enough to remove it also removed other documents from other searches.

We’re not claiming it’s perfect, but it’s pretty good!

Health and Second Life

I’ve been on Facebook for a while, but for most of that time I have not understood the appeal. I joined because of the buzz and because I wanted to understand it.

Similarly I joined Second Life, hoping to understand it. In many ways I can see the appeal of Second Life more than Facebook. I don’t use it but can appreciate the ability to immerse yourself in an alternative reality. Perhaps if I was 20 and had lots of time on my hands I’d be hooked.

But this report, Providing Consumer Health Outreach and Library Programs to Virtual World Residents in Second Life, has just been released:

“The major accomplishments of the project include: the careful and thorough development of an island in the virtual world called Second Life, replete with buildings, grounds, meeting spaces, exhibits, collections, and other information resources and services; development and deployment of a variety of informational exhibits and displays; and numerous contacts and collaborative efforts with health-related groups and organizations. Overall, HealthInfo Island has become the focal point for many health-related initiatives in Second Life. It has been very successful and well-received by the general public in Second Life, healthcare professionals, and health sciences librarians.”

Situated Question Answering in the Clinical Domain

I’m finding myself increasingly drawn to computer/statistical techniques for helping in the Q&A service. Many papers, such as the one highlighted below, look at answering the question; my main interest – currently – is in using semantic analysis in updating clinical questions.

Situated Question Answering in the Clinical Domain: Selecting the Best Drug Treatment for Diseases

Abstract: Unlike open-domain factoid questions, clinical information needs arise within the rich context of patient treatment. This environment establishes a number of constraints on the design of systems aimed at physicians in real-world settings. In this paper, we describe a clinical question answering system that focuses on a class of commonly-occurring questions: “What is the best drug treatment for X?”, here X can be any disease. To evaluate our system, we built a test collection consisting of thirty randomly-selected diseases from an existing secondary source. Both an automatic and a manual evaluation demonstrate that our system compares favorably to PubMed, the search system most commonly-used by physicians today.

Oops, we did it again

Just to show that our million-plus searches in March weren’t a fluke, we had 1,077,190 searches in April. As April only has 30 days, this is more of an achievement. Perhaps more meaningful is the average number of daily searches:

  • March – 32,302
  • April – 35,906

So roughly a 10% increase…
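For anyone who wants to check the arithmetic, here is a small Python sketch using only the figures quoted above. The function names are made up for the example.

```python
def daily_average(total_searches, days):
    """Average number of searches per day over a month."""
    return total_searches / days


def percent_increase(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


march_daily = 32_302                         # March daily average, as quoted
april_daily = daily_average(1_077_190, 30)   # April total over 30 days

print(round(april_daily))                                  # 35906
print(round(percent_increase(march_daily, april_daily), 1))  # 11.2
```

So the unrounded figure is a little over 11%.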

Q&A Standards

I’m currently preparing a talk on standards in clinical question answering services. Even though I’ve been involved in Q&A for ten years it’s proving really hard. I’ve been scribbling notes down, trying to decide what is important. Some major issues:

  • Transparency, you’re not doing a systematic review – so make that clear. I think the onus is on the Q&A service to ensure users are acutely aware of potential drawbacks in the service.
  • Feedback, be it from the person who asked the question or from other readers of the answers (assuming things are web-based). We receive lots of feedback, but we could make it easier to give – perhaps something as straightforward as a Digg-style thumbs up or thumbs down.
  • Process-standards. These are things such as length of answer, speed of answer, referencing of material etc. Not the most interesting!
  • People skills. It’s fine to say that the person asking the question is competent (or even excellent) in searching Medline, but that certainly doesn’t make them good at answering questions! Understanding of the question is important before you consider a search. How can you make a standard around that?
  • Quality control. Is there a robust system in place?

The above is not a complete list but some of the more memorable ones.

The biggest drawback I found is that there is very little research in this area on which to base standards! In the area of systematic reviews there are vast amounts of research on the actual process of conducting systematic reviews. In Q&A there is virtually nothing.

For me, and this notion hasn’t changed in ten years, the point is that we’re not trying to do a systematic review; we’re just trying to improve on what a clinician would do. I remember 8-9 years ago receiving some criticism from a civil servant suggesting what I did was negligent. I suggested to her that if our service was negligent then surely so was she, in providing Medline to clinicians who are poorly trained to search it.

Although hard work, preparing for this talk has proved very useful; it’s an area I’ve not given a great deal of thought to in the past. And there’s still over a week to go, so plenty of time for reflection.

CKS new website

Clinical Knowledge Summaries (CKS) was previously known as PRODIGY and produces some great guidelines. They have just released a redesigned website – it’ll take some getting used to!

I find browsing by topic confusing but the search is good (powered by Google!). But my biggest concern is how long it takes, once you find the correct guideline, to find the appropriate section. They have split the screen into 3 sections, with the actual content squeezed into 1 section; previously the content filled the screen. This effectively means each guideline is now around 3 times longer.

I was involved in the beta-testing and have been waiting for the release of the updated site. Perhaps the problem is that I’m an atypical user of the site, or perhaps I’m just like most other people and find change unsettling. In 3 months’ time I may love it!

The importance of Q&A

I’ve just answered the 5th question of the day and decided to check if we’ve had any feedback for the answers I posted earlier in the day. We received the following:

“I stumbled on to the service after searching the haematuria topic via google. The first time I tried you were not taking further questions so tried again and got a very helpful response. I think it is a BRILLIANT service and really useful for somebody like me (a GP) who would like to be evidence based in approaching clinical problems but often the research is just too much on top of the clinical work.

Serves a really important need. This is as important as anything else going on in the NHS now.”

I’ve been answering clinical questions for around ten years now and this sort of feedback only helps highlight the importance of clinical Q&A services. The Q&A services I run are answering hundreds of questions per month and are able to offer significant support to clinicians in accessing the evidence base. The services tend to be run with little money or other support. What does it take to have them taken seriously? I can only think that getting papers written in journals will help, but other than that I have few ideas. I still want to create a journal of clinical Q&A and may well get that off the ground after June. I see that having the following features:

  • A formally written up Q&A. I’m not sure what this might include but the following seem sensible: the actual question, search methodology, articles found, narrative description of evidence and possibly a clinical bottom line.
  • A resource review. This would highlight resources useful in answering clinical questions.
  • Theory. There are many aspects of Q&A that lack a robust theoretical underpinning so papers exploring this would be helpful.

My slight concern is that the majority of the papers would be from me or close colleagues who work on the various Q&A services.

Where have the counts gone?

Due to the large volume of traffic we have temporarily removed the counts for each category on the results page. The counts should be reinstated by the end of the month, using a new system that reduces the burden on the servers.

Since we removed the counts the speed of query-response has improved significantly.

Enhanced TRIP video

Last month I posted a video of TRIP on YouTube. The quality was poor!

Well, a month later and we’ve posted it on our own server (click here to view). The quality is much enhanced and the text readable – hurrah!

I feel using short videos is a great way of showing people how the site works and is invariably more useful to a busy clinician than 2-3 sides of paper in a leaflet.

We’ll adopt this ‘technology’ in the future. For instance, the TRIPanswers site will be fairly groundbreaking (in the clinical world) so we’ll need to explain the broad principles to users – video seems ideal.
