Thursday, May 29, 2008

Search 4.0: Putting Humans Back In Search

A great article on where search MAY be heading - Search 4.0: Putting Humans Back In Search

I've been intrigued by the notion of human search and the site Mahalo was/is of interest. The main interest has stemmed from the following:
  • I feel the tweaking of search algorithms can only take you so far.
  • Most users will only visit the first page of results.
  • Given the experience within TRIP we have a pretty good idea what would constitute a good set of results.
  • We have some experience in this area. A while back (so not up-to-date) we created a number of reviews e.g. statins or knee osteoarthritis.

The big problem is scale. We could create human-powered search results for the top 100 searches - but that's scratching the surface. Perhaps I'll do some mock-ups and get users to 'vote' to see which they prefer.

Watch this space.

Saturday, May 24, 2008

Web users are getting more ruthless and selfish...

That's according to Jakob Nielsen (click here)!

...people are becoming much less patient when they go online.
Instead of dawdling on websites many users want simply to reach a site quickly, complete a task and leave.....Instead, many are "hot potato" driven and just want to get a specific task completed.....

The above has got me very excited as it fits in with my view based on observations over the years of running fairly successful websites. It's also something Bandolier have long advocated with their frequent references to nuggets of evidence.

So what does the above suggest? Give small bite-sized chunks (nuggets) of evidence. More than that it's giving people nuggets of evidence AND making them easy to find.

The NLH Q&A Service is great as it gives individuals rapid answers to their questions. The site is let down though by a poor search mechanism. So it's not so great for others.

Resources like Clinical Knowledge Summaries have great content (arguably they are a large collection of nuggets) but again findability is a real issue.

Is there a solution? We hope so, and the launch of TRIPanswers in mid-summer will reveal our hand...

Friday, May 23, 2008

David Rothman @ the MLA 08

David Rothman has been to the Medical Library Association meeting and taken a video camera (click here).

This is a great use of technology...

As is this telectroscope.

Monday, May 19, 2008

Advanced search

I see that PubMed have rolled out a beta version of advanced search (click here).

While I play with the new features I really need to turn my attention to TRIP's advanced search. Relative to the other parts of the site it's 'weak'. Something for the last half of 2008!

Semantic Web & Calais

I spotted a semantic web product called Calais a while back (Reuters Wants The World To Be Tagged). Basically, it takes any document and 'marks it up' with various semantic web tags. I didn't give it much thought until a blog post this morning, Reuters Launches Calais 2.0 - Now With Pop-Culture, which reports support for the pharmaceutical and medical world.

You can use this Calais Viewer link to try it out.

I enjoyed using it, but it still has a long way to go: the number of terms it recognises is limited.

Given my lack of vision I'm not sure how you can utilise knowing that ibuprofen is a 'product' and feverish illness is a 'medical condition'. Perhaps one day it'll become obvious...

Thursday, May 15, 2008

Further feedback on the demise of Q&A

Further to a post last Friday (click here).

  • Why will you cease operations on 27th June? Is there any point inlobbying to try & keep your service available? If so, who should I contact (from a v satisfied customer!)
  • Why is the clinical Q&A service being axed? It is excellent and will be sorely missed
  • What a desperate shame this service is to discontinue! I have only just discovered it and have found it enormously useful because it is evidence based. Please feel free to pass this message on to anyone who has influence over this decision.
  • Why is the NLH Q+A service ceasing operations? Has the funding been pulled? Will there be an alternative service?
  • This service has greatly improved my practice it has been a great resource i cant believe its being withdrawn i will find it a great loss
  • Interesting- found this site via google- I see it's a pilot and about to be withrdrawn. I never heard of it- was it something that we should have been informed of? muight have been useful (or not) but if we had known about it we might at elast had the chance to try it out. More wasted NHS money on something not discussed with users and not properly tried and evaluated?
  • We will miss your service greatly! Is it migrating somewhere else/what alternatives are there?
  • Is there a replacement once the current service close in June? it has been a very helpful service for lots of us in primary care

I'll add extra comments as and when I get a decent batch....

Quality standards in Q&A

At the start of the week I had the pleasure of presenting at the 2008 Clinical Librarian Study Day. I was tasked with talking about quality standards in Q&A. This was a tough subject to do justice. I've worked hard for ten years on Q&A yet had never really thought about 'standards'. So it took a while to distill my thinking into reasonable standards.

I came up with two types of standards:

  • Easy
  • Real

Easy standards are the ones I consider self-evident e.g.

  • Competency in searching various databases
  • Return answers in an agreed time
  • Keep responses to a reasonable (brief) length
  • Answers should be referenced

But adhering to these standards means very little.

Real standards are the ones I think mean something and are perhaps less obvious. I came up with seven:

  1. Competency of answerer
  2. Transparency
  3. Communication
  4. Feedback
  5. Correctness of answer
  6. Boundaries of Q&A
  7. Quality control

Competency of answerer. It's relatively straightforward to search medline and learn to appraise. However, it's much harder to understand the clinical context. This involves trying to understand the motive for the question, what it actually means, the sort of evidence required and knowing when the question has been answered.

Transparency. This is not as simple as linking to an article informing users about the process. It's ensuring that they actually know what the process is and its potential shortcomings.

Communication. Linked with transparency, this ranges from simple things such as using a clear narrative to more interesting challenges such as explaining uncertainty.

Feedback. Is there easy feedback from the user but also from others viewing the service? We receive a small amount of feedback; we should get more!

Correctness of answer. Is the correct answer given? A tough question to answer.

Boundaries of Q&A. A bit vague this one, but when/where should a 'quick and dirty' Q&A service operate? I often worry that we spend too little time on questions, rushing off to answer the next one. Other areas worry me, such as high-risk questions - but then we pass them through to our clinical director to check. But every now and then I worry that we're going beyond what we should be doing.

Quality control. Is there a QC system? We have internal and external systems and I'm pretty sure they're robust - but an important standard all the same.

With ATTRACT and the NLH Q&A Service we could improve, I'm thinking particularly of transparency and feedback. We'll be addressing both these issues (and others) with TRIPanswers.

Of the seven I think the two really important standards are Competency of answerer and Transparency.

Wednesday, May 14, 2008

Another milestone - 50,000+ searches in a day

Yesterday, TRIP was searched 51,021 times, the first time we've been searched over 50,000 times per day.

This increase must be down to the significant increase in users who are coming back on a more regular basis.

What's the next milestone? 2 million searches per month, 400,000 per week or 75,000 per day....

Saturday, May 10, 2008

Exporting TRIP records

We occasionally get requests from users for new features. Where possible we try and accommodate these wishes - assuming they make sense to us and we've got the money. One feature that has recently been requested more than any other is the ability to export TRIP results.

Separately, in the development of the Spanish version of TRIP, it was 'showcased' to the Spanish Ministry of Health and they too wanted an export feature.

Bottom line: We're creating an export feature.

All the results in TRIP will have a tick box and users will be able to select the ones of interest. They will then be able to export the selected records to file or e-mail to a colleague.

This should be out by the start of June.

Friday, May 09, 2008

The sad end of the NLH Q&A Service

The saddest thing for me about the ending of the NLH Q&A Service is the reaction of the users. We've just placed a notice on the site alerting users to the ending of the service. Two comments were received within 35 minutes of the notice going live:

"I have just found this answering service through a colleague. I think it is absolutely brilliant. I have just found the answer to a question within 1 minute which would normally have taken me hours, maybe days of research. I am paid £40k p.a. so this time saving is bound to be very valuable to the NHS when multiplied by all the people who use it. Why is the service ending???"

"Hello. As an EBM practitiopner I am disappointed to read that as of June 27th the NLH Q&A Service will cease operations and be unable to answer any new questions. Could you kindly inform me of the reasons behind this decision, and what will replace this very important and I think successful primary care service. Thanks,"

I'm so proud of what we've achieved, and the fact that clinicians are prepared to contact us to express surprise and disappointment is impressive. The NLH suffers from a lack of engagement with primary care health professionals, an area that needs more support than secondary care (which has access to librarians). Removing this key service further hampers engagement.

Still, the secondary care clinicians will have their services...

Wednesday, May 07, 2008

Dramatic increase in search speed

I've been worried about the search speed for some time. The removal of the number of results for each individual category was meant to improve things, but it didn't.

Yesterday (see previous post) I introduced a system that removed search results which we considered not relevant. As well as improving the search relevancy it reduced - dramatically - the number of search results returned. By accident this has resulted in a huge increase in search speed.

Not all accidents are bad!

Tuesday, May 06, 2008

Improving search on TRIP

We have just released a major improvement in the TRIP search results.

This has been brought about by the refinement of our 'text cutoff tool'. Prior to this introduction a search would return every result that contained the search terms. So a search of acute kidney injury returned the CKS guideline on ankles and sprains! I imagine most people would agree that that is not even an average result - it simply shouldn't be there. But how has it got there? It contains all the individual terms, so is returned.

So how does the text cutoff tool work?

Each search on TRIP looks for matches to the search term(s) used and these are ordered based on our algorithm. One component of the algorithm is a text score. The higher the text score the more likely the result is to be pertinent to the search terms. Some factors that affect the text score include:

  • Location - where the term occurs (e.g. if someone searches on asthma, documents with asthma in the title tend to get a higher text score than documents where it is only mentioned in the text).
  • Term density. Take two documents, both 1000 words long and without the search term in the title. If one mentions the search term once and the other ten times, the latter will receive a higher text score.

The idea behind the text cutoff is simply to say that all results with a text score of lower than X do not get returned.
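A rough sketch of the idea in Python may help. This is not TRIP's code: the `Document` class, the `text_score` formula, the title weight of 5.0 and the cutoff of 2.0 are all assumptions made up for illustration, and it only handles single-word search terms.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    body: str


def text_score(doc: Document, term: str, title_weight: float = 5.0) -> float:
    """Toy text score: a title bonus plus term density in the body.

    The formula and the weight are illustrative assumptions,
    not TRIP's actual algorithm.
    """
    term = term.lower()
    words = doc.body.lower().split()
    # Term density: occurrences of the term per 1000 words of body text.
    density = 1000 * words.count(term) / len(words) if words else 0.0
    # Location: a fixed bonus when the term appears in the title.
    title_bonus = title_weight if term in doc.title.lower().split() else 0.0
    return title_bonus + density


def apply_cutoff(docs, term, cutoff):
    """Return (score, doc) pairs scoring at least `cutoff`, best first."""
    scored = [(text_score(d, term), d) for d in docs]
    kept = [pair for pair in scored if pair[0] >= cutoff]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


docs = [
    Document("Asthma in adults", "asthma management " + "filler " * 98),
    Document("Ankle sprains", "asthma " + "filler " * 999),
    Document("Inhaler technique", "asthma " * 10 + "filler " * 990),
]
for score, doc in apply_cutoff(docs, "asthma", cutoff=2.0):
    print(f"{score:5.1f}  {doc.title}")
```

With these made-up documents the ankle sprains page mentions asthma once in 1000 words, scores below the cutoff and is dropped, while the other two are kept with the title match ranked first.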

This sounds nice and simple, but it is fairly arbitrary: set the score too high and you may miss important documents; set it too low and you allow too many in. A further complication is that there is no magic score that defines pertinence. We have tested lots of combinations of search terms and text cutoff points and we've managed to remove an awful lot of noise from the searches. The cutoff point we've used has reduced the number of results for the acute kidney injury search from 2,609 to 440.
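The kind of testing described here can be pictured as a sweep over candidate cutoffs, counting how many results each search keeps at each one. The sketch below is only that, a sketch: the search names and scores are invented, and `results_kept` and `sweep` are hypothetical helpers rather than anything in TRIP.

```python
def results_kept(scores, cutoff):
    """Count results whose text score is at least `cutoff`."""
    return sum(1 for s in scores if s >= cutoff)


def sweep(scores_by_search, cutoffs):
    """For each candidate cutoff, report how many results each search keeps."""
    return {
        cutoff: {
            name: results_kept(scores, cutoff)
            for name, scores in scores_by_search.items()
        }
        for cutoff in cutoffs
    }


# Hypothetical text scores for two searches (not real TRIP data).
scores_by_search = {
    "search A": [0.5, 1.0, 2.5, 4.0, 9.0],
    "search B": [0.2, 3.0, 3.5],
}

for cutoff, counts in sweep(scores_by_search, cutoffs=[1.0, 3.0, 5.0]).items():
    print(cutoff, counts)
```

In this toy data a cutoff of 5.0 trims search A to a single result but leaves search B with none, which is the trade-off in miniature: one cutoff has to serve every search.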

Ironically, the CKS guideline on ankles and sprains still remains. Raising the cutoff far enough to remove it also removed other documents from other searches.

We're not claiming it's perfect, but it's pretty good!

Monday, May 05, 2008

Health and Second Life

I've been on Facebook for a while, and for most of that time I have not understood the appeal. I joined due to the buzz and because I wanted to understand it.

Similarly I joined Second Life, hoping to understand it. In many ways I can see the appeal of Second Life more than Facebook. I don't use it but can appreciate the ability to immerse yourself in an alternative reality. Perhaps if I was 20 and had lots of time on my hands I'd be hooked.

But this report Providing Consumer Health Outreach and Library Programs to Virtual World Residents in Second Life has just been released:

"The major accomplishments of the project include: the careful and thorough development of an island in the virtual world called Second Life, replete with buildings, grounds, meeting spaces, exhibits, collections, and other information resources and services; development and deployment of a variety of informational exhibits and displays; and numerous contacts and collaborative efforts with health-related groups and organizations. Overall, HealthInfo Island has become the focal point for many health-related initiatives in Second Life. It has been very successful and well-received by the general public in Second Life, healthcare professionals, and health sciences librarians."

Sunday, May 04, 2008

Situated Question Answering in the Clinical Domain

I'm finding myself increasingly drawn to computer/statistical techniques for helping in the Q&A service. Many papers, such as the one highlighted below, look at answering the question; my main interest - currently - is in using semantic analysis in updating clinical questions.

Situated Question Answering in the Clinical Domain: Selecting the Best Drug Treatment for Diseases

Abstract: Unlike open-domain factoid questions, clinical information needs arise within the rich context of patient treatment. This environment establishes a number of constraints on the design of systems aimed at physicians in real-world settings. In this paper, we describe a clinical question answering system that focuses on a class of commonly-occurring questions: “What is the best drug treatment for X?”, here X can be any disease. To evaluate our system, we built a test collection consisting of thirty randomly-selected diseases from an existing secondary source. Both an automatic and a manual evaluation demonstrate that our system compares favorably to PubMed, the search system most commonly-used by physicians today.

Thursday, May 01, 2008

Oops, we did it again

Just to show our million plus searches wasn't a fluke for March, we had 1,077,190 searches in April. As April only contains 30 days this is more of an achievement. Perhaps more meaningful is the average daily searches:

  • March - 32,302
  • April - 35,906

So roughly a 10% increase...
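For anyone who wants to check the sums, here is a small Python sketch using only the figures quoted above. The `pct_change` helper is just a name chosen for this example.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


april_total = 1_077_190   # searches in April
april_days = 30
march_daily = 32_302      # average daily searches, March
april_daily = 35_906      # average daily searches, April

# The April daily average follows from the monthly total.
print(round(april_total / april_days))                  # 35906
print(f"{pct_change(march_daily, april_daily):.1f}%")   # 11.2%
```

So the increase in average daily searches works out at a little over 11%.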