Re: word weight

From: <redna(at)not-real.euskalerria.org>
Date: Tue Mar 09 2004 - 08:44:31 GMT
>I posted a question before that got no reply...
>
>My understanding, based on results, is that swish-e does not discriminate
>between words. Word frequency in a document is used to compute rank, but
>the word's frequency in the overall document set is not considered. I
>remember being taught that the weight of a word in the rank should be
>inversely proportional to the number of documents it appears in. This
>would cause the word 'the' to carry less weight than the word 'democracy',
>even if (in most document sets) 'the' appears in the title and 'democracy'
>only in the body.
>
>Was discriminating among terms considered for swish-e and judged to be
>too much additional work, or was it left out because it's a bad idea?
>
>Or did the issue never come up?
>
>It seems it would give more relevant results.

I think you are right. At the moment it seems that swish-e uses the number
of hits in a document to calculate rank without considering file size.

Would the approach you propose fix this as well?
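
To make the two ideas concrete, here is a minimal sketch of the weighting discussed above: each hit count is divided by the document length (so file size is taken into account) and multiplied by an inverse document frequency (so a word found in every document counts for nothing). This is not swish-e's ranking code; the function name tf_idf_scores, the whitespace tokenisation, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query):
    """Score each document for a query: length-normalised hits times IDF."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for words in tokenised:
        df.update(set(words))

    scores = []
    for words in tokenised:
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue
            tf = counts[term] / len(words)      # hits relative to document size
            idf = math.log(n_docs / df[term])   # 0 when every document has the word
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical sample documents, for illustration only.
docs = [
    "the state of the union",
    "the democracy of the people",
    "the cat sat on the mat",
]
print(tf_idf_scores(docs, "the"))
print(tf_idf_scores(docs, "democracy"))
```

With this weighting, 'the' scores zero in every sample document because it appears in all of them, while 'democracy' scores above zero only in the one document that contains it. Whether dividing by document length is the right normalisation for swish-e is an open question; it is simply the most direct way to account for file size.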
 

>dave
>
Received on Tue Mar 9 00:44:31 2004