At the end of last year I posted Connected/related articles to highlight some thinking about combining different connections between documents to help ensure users can quickly and find articles related to the documents they’ve already clicked on. In other words, as users click on documents of interest, we start to collate connected articles to present to the user to help them ensure they’ve not missed important documents.

I have had the pleasure of testing out our text version and it’s really really good. Currently its not had the design treatment but you can start to see the power. I did this search on Trip bisphosphonates prostate cancer and clicked on these results:

  • Contemporary Population-Based Analysis of Bone Mineral Density Testing in Men Initiating Androgen Deprivation Therapy for Prostate Cancer
  • Hypocalcaemia in patients with prostate cancer treated with a bisphosphonate or denosumab: prevention supports treatment completion
  • The role of bisphosphonates or denosumab in light of the availability of new therapies for prostate cancer
  • Use of bisphosphonates and other bone supportive agents in the management of prostate cancer-A UK perspective
  • Bone Health in Patients with Prostate Cancer

The top four being from PubMed and the bottom one a Canadian guideline. Our connected articles outputted the following:

So a lot’s going on in the above screenshot, so some explanations around the score. Note, this is still in the testing phase so these weights/scores are liable to change. The following factors are shown:

  1. N – number of results to display.
  2. Clicks – this is based on our co-click data. If any of the 5 documents I clicked on have been co-clicked in a previous search session this is noted and added to the list of connected articles. We have currently weighted this by a factor of 3, as we feel this is a really important factor.
  3. Rel – stands for related articles. This is only available, at present, for PubMed articles and we extract the related articles from the 5 documents clicked. These related articles are added to the list of connected articles.
  4. Ref – references. We extract the references used from the 5 documents clicked. These referenced articles are added to the list of connected articles.
  5. Cites – this looks to see if any of those 5 clicked documents been cited by other documents. These cited articles are added to the list of connected articles.
  6. Incl – shows if the document is already in the Trip index.
  7. Words – this explores if the connected articles contain the initial search terms in the document title. The more words the documents find in the title the closer we judge it to be to the initial search and therefore the user’s intentions.

So, from the above, points 2-5 are about identifying connected articles while the others are additional factors we’re using.

Here are the top 5 articles that our system generated with hyperlinks:

Some observations:

  • They are all highly relevant documents (although I note that one returned article was one we originally clicked on – this will be fixed before we released this).
  • The connected articles tend to be older. This makes sense as the Trip algorithm favours newer articles so these tend to be shown first. So, connected helps unearth older, important, papers that a user may have missed.
  • The 3rd article in a Cochrane systematic review. Interestingly not the most up-to-date version but it still unearths it and a user can quickly navigate to the latest version.
  • One of the top 5 articles was not in the main Trip index (and in the larger list of 11 documents over half were not in Trip). That was the Cochrane review, we have the more up-to-date version. But it’s nice that the system can highlight possibly important papers outside of Trip.

I’m possibly biased but I’m really excited by the possibilities of this system to help users find the best articles they need. And remember, this is a test system and we have some fixes/improvements to roll out. Once we’ve done that we need to incorporate this into Trip and that highlights the next challenge – the user interface/design. We need to balance making it obvious to users yet not too intrusive. Given how good the system is I consider this a nice problem to have!