My post of two days ago was slightly premature! After using the new algorithm I started noticing some strange results. Some digging around in the maths helped me discover a variable I had overlooked - inverse document frequency. Now that I understand the concept a little, I have put it to work in the algorithm.
The net result is that the algorithm is now, without doubt, a significant improvement on the algorithm we launched with just 7 weeks ago. If the algorithm we launched with was version 1 and the one from earlier this week was version 2, then I guess we're now at version 2.1.
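For the curious, here is a rough sketch of the textbook idea behind inverse document frequency. It is not our actual search code: the function name, the sample documents and the simple whitespace tokenising are all made up for illustration, and the formula is the common log(N / df) form, which may differ from the variant we use. The point is that a word appearing in almost every document tells you very little, so it gets a low weight, while a rarer word gets a higher one.

```python
import math

def inverse_document_frequency(term, documents):
    """Textbook IDF: log(total documents / documents containing the term).

    Hypothetical helper for illustration only; real search engines
    usually add smoothing and better tokenising.
    """
    # Count the documents that contain the term as a whole word.
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        # Term appears nowhere; return 0 rather than dividing by zero.
        return 0.0
    return math.log(len(documents) / containing)

# Made-up sample documents.
docs = [
    "the red bike",
    "the blue car",
    "the green bike shed",
]

for word in ("the", "bike", "car"):
    print(word, round(inverse_document_frequency(word, docs), 3))
```

In this toy example "the" appears in every document and scores zero, "bike" appears in two of three and scores a little, and "car" appears in only one and scores highest.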
Happy searching and, as ever, please let us know if you run a search and get strange results!