Trip Database Blog

Liberating the literature

Search improvements

The TRIP Database search algorithm works pretty well most of the time. But the biggest annoyance to me is the ‘over promotion’ of eTextbook records. eTextbooks have their place, but at TRIP we try to offer the highest quality material first and then users can work down the ‘quality’ scale.

We’ve identified the cause – the overly high weighting caused by title term density – I will explain!

Our general view is that if a document contains the search term(s) in the title it is likely to be more relevant than a document that mentions it only in the text. As a result we give a higher weighting to the title score. The problem we have is that our underlying software (Lucene) incorporates a title word density score. So if you have two documents:

  1. Prostate cancer
  2. Blah blah blah prostate cancer

The first gets a very high score (100% match) while the second gets a lower score (40% match). Users typically search using 1-2 terms, and eTextbooks typically have 1-2 term titles, while resources such as Cochrane and Bandolier have much longer titles. This has caused much frustration and we’ve even considered creating our own bespoke search mechanism (which would be costly and much slower).
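To make the 100% versus 40% figures concrete, here is a minimal sketch of a title word density score. It is only an illustration of the idea described above, not Lucene's actual scoring code; the function names `title_density` and `title_coverage` are made up for this example. The second function is shown purely for contrast (it ignores title length) and is not one of the fixes under discussion.

```python
def title_density(query: str, title: str) -> float:
    """Fraction of the title's words that match a query term."""
    query_terms = set(query.lower().split())
    title_words = title.lower().split()
    if not title_words:
        return 0.0
    matches = sum(1 for word in title_words if word in query_terms)
    return matches / len(title_words)


def title_coverage(query: str, title: str) -> float:
    """Fraction of the query's terms found in the title (ignores title length)."""
    query_terms = set(query.lower().split())
    title_words = set(title.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & title_words) / len(query_terms)


short_title = "Prostate cancer"
long_title = "Blah blah blah prostate cancer"

# Density rewards the short title: 2 of 2 words match versus 2 of 5.
print(title_density("prostate cancer", short_title))  # 1.0
print(title_density("prostate cancer", long_title))   # 0.4

# Coverage treats both titles the same, since each contains both query terms.
print(title_coverage("prostate cancer", short_title))  # 1.0
print(title_coverage("prostate cancer", long_title))   # 1.0
```

With density, a short eTextbook title will always beat a longer review title containing the same terms, which is the over-promotion described above.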

However, a reading of Alf Eaton’s HubLog shows he has more than a working knowledge of Lucene. A quick e-mail to Alf (he’s helped with advice on TRIP in the past) and he’s suggested a couple of fixes. We’re currently building a test system to try these out. With any luck these alterations will be in place shortly.

Although we may be tempted to wait until the current round of upgrading is over and roll everything out in one go!

TRIP Database and Lyme Disease

An interesting e-mail landed on my desk asking me to remove a guideline from the TRIP Database. The guideline in question is the Infectious Diseases Society of America practice guidelines for clinical assessment, treatment and prevention of Lyme disease, human granulocytic anaplasmosis, and babesiosis. We link to this guideline via the American government’s National Guideline Clearinghouse. The reason for this request:

“I feel it is wholly inappropriate that this document is still on this website and being used as reference guide. The authors of this document have been subpoeaned by Conneticut Attorney General in the US over the likelihood of the breaking antitrust laws because of biased and warped content of this document, furthermore the document does not take into account of any other medical practises for the treatment of Lyme Disease and is not peer reviewed, please remove this document immediately as their could be legal consequences ensuing by practioners following this protocol.”

It caused me some concern as this is the first time someone has asked for material to be removed. So I did a bit of digging round:

1) It does appear that the Connecticut Attorney General is looking into this.
2) There appears to be a great deal of controversy around Lyme disease (see, for instance, The Dirty Truth About Lyme Disease Research or the Wikipedia entry The Lyme controversy).
3) The person contacting me stated that the guideline had not been peer-reviewed, yet I found that the peer-reviewed journal ‘Clinical Infectious Diseases’ published the guideline in 2006 (click here).

Some final thoughts:

  • Ultimately, we (at TRIP) are not in a position to arbitrate on this one; the Connecticut Attorney General appears to be.
  • The statement “innocent until proven guilty” springs to mind.
  • Unless anything substantial appears, as long as the National Guideline Clearinghouse contains the guideline, so shall we.

Tagging takes off

A blog earlier this year highlighted an interesting article on the merits of tagging – focussed on accuracy. The article Patterns and Inconsistencies in Collaborative Tagging Systems: An Examination of Tagging Practices is well worth a read.

Hot on the heels of this is a BBC News article, Tagging ‘takes off for web users’, which reports on the increased use of tagging by web users.

Roll on Gwagle, shortly to hit the alpha-testing phase.

Another record month!

A very pleasing set of search statistics for January with a total of 365,855 searches. Our previous record, for November, was 274,106 (with a lull in December due to Christmas). Find below a graph of our monthly search stats. What these stats show me is that there is a real desire for clinicians to access, easily and for free, high-quality medical literature.

As I asked at the start of December, when will our increase in search stats plateau? With work due to start on our latest batch of improvements, TRIP will continue to improve so, I imagine, users will continue to use us and spread the word. Given our lack of marketing budget we are effectively reliant on ‘word of mouth’, which makes our search increases even more rewarding.

Wikipedia

An amusing post on the SEOmoz blog. Picking out the best bits:

  • Wikipedia – The encyclopedia where you can be an authority, even if you don’t know what the hell you’re talking about.
  • When Wikipedia becomes our most trusted reference source, reality is just what the majority agrees upon.
  • When money determines Wikipedia entries, reality has become a commodity.

Fortunately, medical wikis, such as Ganfyd, are above these criticisms.

Open access and Cochrane

Ben Toth frequently comments on open access issues relating to Cochrane. His most recent can be viewed here. Seeing this entry coincided with an e-mail from the rather good RD Info people. RD Info highlights new R&D funding and is often interesting to browse. One of their entries included the following:

NHS Cochrane Collaboration Programme Grant Scheme
Description: NHS Cochrane Collaboration Programme Grants will provide new funding to support the production and updating of Cochrane reviews in areas of priority or need for the NHS. The grants will build on existing Cochrane Collaboration infrastructure, supplementing rather than replacing current funding, activities and outputs of The Cochrane Collaboration.
Funding: Funding for approximately 8 grants will be available. Grants will be up to a maximum of £420,000 over three years (ie up to £140,000 pa) including institutional overheads or full economic costs at 80%. Grants will be awarded to support a coherent programme of work that includes both new Cochrane reviews and updating of existing reviews.
Amount: > £100K
Closing Date: 05 March 2007
Duration: 1 to 3 years

What struck me (and clever people like Ben have been commenting for an age on this subject) is that the NHS gives large quantities of money to Cochrane to produce the systematic reviews (as well as untold NHS staff time taken by volunteers actually producing the reviews). Cochrane (well Wiley, the publishers) then has the temerity to charge the NHS to look at them.

TRIP Updates

For a number of years TRIP has been used by a number of organisations to highlight new research evidence. Some uses are ad hoc, others more substantial. Some examples of our work include:

  1. We supply a portal site that needs new evidence to populate its site and e-mail campaigns. For this group we supply 15 links to evidence in over 20 clinical areas every month. The content includes secondary and primary research.
  2. A number of groups use our services to keep up-to-date with new research in areas in which they are carrying out, or have carried out, reviews (systematic or otherwise).
  3. Some clinicians and researchers, with a particular interest in an area, simply use our services to send them e-mails of new content every month.

Today we’ve launched TRIP Updates which will help highlight the existing and new services. Currently, e-mail is the way users receive updates (although some of our clients receive the content via XML). Shortly, we’ll be launching a comprehensive RSS feed system.

The premium service (a paid-for service) is the most comprehensive package we offer. Not only can we update users on new content in TRIP but also in Medline. We can configure our output to suit you. Do you want all new TRIP content as well as all new Medline content in your area? Or would you prefer just TRIP content and RCTs in an area?

For more information see TRIP Updates.

Number of clinical questions

We’ve just completed an analysis of clinical questions for the National Library for Health, principally based on the work of the NLH Q&A Service. One issue we addressed was the number of repeat questions. On a very crude level we have been able to map out the proportion of repeat questions based on the size of our answer bank (repository of previously answered questions). To repeat, this is very crude, but still an interesting exercise.

When we had 2,000 previous answers the repeat rate was approximately 3.5%; when the bank increased to 5,900 it rose to 9%, and when it reached 6,900 it rose to 14.5%. We’ve graphed this and it can be viewed below.

We’ve applied the MS-Excel trendline. If you extend the line, how large does an answer bank need to be for 100% of questions to be repeats? 48,000.

I don’t actually believe that figure. There will always be new interventions, so new questions will be asked about these. However, an answer bank of, say, 25,000 questions will possibly answer 50% of questions. I look forward to our next analysis, with an even larger answer bank, to see where the trendline goes!
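For anyone wanting to play with the extrapolation, here is a rough sketch using the three data points quoted above. It assumes a straight-line (least-squares) trendline, which is a guess, since the trendline type used in Excel isn't stated. The function names are invented for this example.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# (answer bank size, repeat rate in %) as quoted in the post
observations = [(2000, 3.5), (5900, 9.0), (6900, 14.5)]
slope, intercept = fit_line(observations)


def bank_size_for(rate_percent):
    """Answer bank size at which the fitted line reaches the given repeat rate."""
    return (rate_percent - intercept) / slope


def rate_for(bank_size):
    """Repeat rate (%) the fitted line predicts for a given bank size."""
    return slope * bank_size + intercept


print(round(bank_size_for(100)))   # bank size for a 100% repeat rate
print(round(rate_for(25000), 1))   # predicted repeat rate at 25,000 answers
```

Under this assumption the line reaches 100% at roughly 50,000 answers and predicts a little under 50% at 25,000, which is in the same ballpark as the figures above. The gap from 48,000 suggests the original graph used slightly different settings or was read off by eye. With only three points, none of this should be taken too seriously.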

Adding TRIP to your website

Since going free the use of TRIP has increased dramatically. Another heavily used feature of TRIP has been the incorporation of TRIP content into third-party websites. This has taken two main forms.

First, the incorporation of the TRIP search box into a website. This is simply a question of adding a few lines of HTML and away you go. For example see here and here.

Second, and more sophisticated, is the adding of TRIP via the ‘backdoor’. This means that your website has a search box (as above) but when someone enters a search term it doesn’t cause the user to leave your site and visit TRIP’s. Instead it sends the search term to TRIP via a backdoor. The results are then returned in a format that easily allows them to be displayed as you wish. You can make the results appear in your homepage with your ‘look and feel’. Not only do you retain your users, you give them high-quality evidence that has the look of your site. This has been used by a number of sites, principally portal sites. An example can be seen via NHS Scotland’s eLibrary (click here for details). TRIP supplies the bulk of the links, but as you can see the results are displayed in a non-TRIP way and integrated with other sources of information.
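As a rough illustration of the ‘backdoor’ idea, the sketch below takes results returned in a machine-readable format and renders them in the host site's own markup. Everything specific here is hypothetical: the XML layout, the element names (`results`, `result`, `title`, `url`) and the CSS class are invented for the example and are not TRIP's actual format. The step of sending the search term to TRIP is left out, since the details of that interface aren't given here.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical response format, invented for this example.
SAMPLE_RESPONSE = """<results>
  <result>
    <title>Sample result one</title>
    <url>https://example.org/doc1</url>
  </result>
  <result>
    <title>Sample result two</title>
    <url>https://example.org/doc2</url>
  </result>
</results>"""


def parse_results(xml_text):
    """Turn the (hypothetical) XML response into a list of dicts."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "url": item.findtext("url", default=""),
        }
        for item in root.iter("result")
    ]


def render_results(results):
    """Render results as an HTML list using the host site's own styling."""
    items = "\n".join(
        f'  <li><a href="{escape(r["url"], quote=True)}">{escape(r["title"])}</a></li>'
        for r in results
    )
    return f'<ul class="my-site-results">\n{items}\n</ul>'


results = parse_results(SAMPLE_RESPONSE)
print(render_results(results))
```

The point is the separation: the search happens elsewhere, while the display is entirely under the host site's control.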

If you’re interested in enhancing the content of your site, see ‘Add TRIP to Your Site’.
