>I posted a question before that got no reply...
>My understanding based on results is that swish-e does not discriminate
>between words. Word frequency in a document is used to compute rank, but
>a word's frequency in the overall document set is not considered. I just
>remember being taught that the weight of a word in the rank should be
>inversely proportional to the number of documents it appears in. This would
>cause the word 'the' to be of less weight than the word 'democracy', even
>(in most document sets) 'the' appears in the title and 'democracy' only in
>Was discriminating among terms considered for swish-e and considered to be
>too much additional work, or was it not included cuz it's a bad idea?
>Or did the issue never come up?
>It seems it would give more relevant results.
I think you are right. At the moment it seems that swish-e uses the number of
hits in a document to calculate rank without considering file size.
Would the approach you propose fix this as well?
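
To make the two issues concrete, here is a minimal sketch in Python. It is not swish-e's ranking code: the function names and the sample documents are made up for illustration, and the formula is the textbook TF-IDF weighting, assumed here only as one way to do what is described above. Weighting by inverse document frequency handles the 'the' versus 'democracy' problem, while dividing the hit count by document length is a separate step that handles file size.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def rank(query, docs):
    """Return one length-normalized TF-IDF score per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears.
    df = Counter()
    for words in tokenized:
        df.update(set(words))

    scores = []
    for words in tokenized:
        counts = Counter(words)
        score = 0.0
        for term in tokenize(query):
            if term not in counts:
                continue
            # Term frequency divided by document length, so a long
            # file does not win merely by containing more words.
            tf = counts[term] / len(words)
            # Inverse document frequency: 0 for a word that occurs
            # in every document, larger for rarer words.
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores.append(score)
    return scores


docs = [
    "the history of the democracy movement",
    "the weather report for the week",
    "the the the the the",
]
print(rank("the", docs))        # every score is 0.0
print(rank("democracy", docs))  # only the first document scores above 0
```

In this sketch 'the' contributes nothing because it occurs in every document, however many times it is repeated, and 'democracy' scores only in the one document that contains it. Dropping the division by len(words) would give back raw hit counts, which is the file-size problem mentioned above.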
Received on Tue Mar 9 00:44:31 2004