Re: [swish-e] Ranking errors when or-ing many search terms

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Fri Aug 24 2007 - 14:21:40 GMT
Peter Karman wrote:

> Can you assemble that repeatable test case you mentioned? It would be very 
> helpful. I think I know what is going on, but would be nice to have a test to 
> code against.
> 

I am able to reliably reproduce this now, so no need for a test case.

I can see what the issues are, but I'm not yet sure of the best way to
fix them, other than to overhaul the way rank scoring is done. The main
issue is that OR'd results are added together and then multiplied by 2
(with no check that the result will fit in an int). The rank display
normalization then uses some hardcoded scaling numbers that simply fail
when scores get big (likely because ints are now usually 32 or 64 bits
rather than 16, as they were back when the code was written).
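
To make the missing check concrete, here is a minimal sketch of an
overflow-safe version of the "add, then multiply by 2" step. This is not
the actual swish-e code: the function name combine_or_ranks is
hypothetical, and it assumes ranks are non-negative ints.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the swish-e source): combine two OR'd
 * ranks as (a + b) * 2, but refuse to compute a value that would not
 * fit in an int. Assumes ranks are never negative.
 * Returns true and stores the result in *out on success,
 * returns false (leaving *out untouched) if the result would overflow. */
bool combine_or_ranks(int a, int b, int *out)
{
    if (a < 0 || b < 0)
        return false;              /* outside the assumed input range */

    if (a > INT_MAX - b)
        return false;              /* a + b would overflow */

    int sum = a + b;

    if (sum > INT_MAX / 2)
        return false;              /* sum * 2 would overflow */

    *out = sum * 2;
    return true;
}
```

A caller could clamp to INT_MAX or switch to a wider type when this
returns false. Either way, the overflow is detected before it happens
rather than silently wrapping.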

So there's just some unusual and disparate math going on. :/

I think the best route would be to come up with a raw scoring algorithm
that doesn't depend upon the size of any given C type on any given
system (as it does now). I'll have to ask some of my smart math friends
to help (any smart math folks on this list please chime in... :)).
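
As a starting point for discussion, here is one possible shape for that,
sketched under some loud assumptions: raw scores are kept as doubles so
the OR arithmetic cannot overflow an int, and the displayed rank is
scaled relative to the best raw score in the result set. The names
or_raw_scores, normalize_rank and DISPLAY_MAX are all hypothetical, and
the 1000 scale is only an assumed display range. This is not a worked-out
algorithm, just the idea.

```c
#include <stdint.h>

/* Assumed top of the displayed rank range (hypothetical constant). */
#define DISPLAY_MAX 1000

/* Hypothetical: the same "add, then double" rule for OR'd results,
 * done in floating point so it does not depend on the width of int. */
double or_raw_scores(double a, double b)
{
    return (a + b) * 2.0;
}

/* Hypothetical: map a raw score onto 0..DISPLAY_MAX relative to the
 * largest raw score in the result set, rounding to nearest. The result
 * uses a fixed-width type, so it is the same size on every platform. */
int32_t normalize_rank(double raw, double max_raw)
{
    if (!(max_raw > 0.0) || !(raw > 0.0))
        return 0;                  /* nothing to scale against */

    if (raw >= max_raw)
        return DISPLAY_MAX;        /* best hit gets the top rank */

    return (int32_t)(raw / max_raw * DISPLAY_MAX + 0.5);
}
```

Because only the ratio raw / max_raw matters for display, no hardcoded
scaling numbers are needed, however large the raw scores become.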


-- 
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

Received on Fri Aug 24 10:21:40 2007