Greg Ryjikh scribbled on 5/15/07 6:38 PM:
> Thanks Peter,
> Your first point explained my result. When I changed my search query
> from -w (content=test) to -w (contentlabel=test or contentbody=test)
> then I started to see the effect that MetaNamesRank gives. It all sounds
> good in general, but not "good enough" in our particular case. I provided
> this simple test data just to show the problem. In reality, the XML files we
> need to search have a couple hundred different tags, and we don't
> even know all of them in advance. We do want to search all of them, but
> give some priority to a few. I was planning to use
> UndefinedMetaTags auto
> and use the known top-level tag (or wrapper) "content" as the search
> criterion, but it seems that ranking does not work in that case. Is there
> any other way to give more "priority" to some meta tags but still search
> content in all other tags, without explicitly creating a huge search query
> with all the XML tag names?
First, make sure you read:
(though ignore the typo about MetaRankBias -- that should be MetaNamesRank and
is fixed in svn trunk.)
The thing to notice is how MetaNamesRank is used in calculating rank scores.
Basically it just artificially inflates the frequency count for a term in a
document. Therefore other factors, like the doc's relative length and the term's
IDF, will also pull the score one way or the other. One thing I haven't tried
but have considered is recompiling Swish-e with RANK_BIAS_RANGE set to
something much higher than the default '10' (like maybe 100 or 1000), because
then setting MetaNamesRank to 50 will make that feature weigh more heavily in
the algorithm.
Second, it sounds like there are at least 2 things you are asking:
1. If you want to give a priority boost to a few MetaNames that you know
in advance, you can still use "UndefinedMetaTags auto". That should work fine.
2. If you want to make searches look in 'content' by default, add an alias for
that to the 'swishdefault' MetaName. See
You don't want to bias 'content' with MetaNamesRank at all; that will have zero
net effect, since all words will be indexed under that MetaName.
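As a rough, untested sketch of both points together, the config might look
something like the fragment below. The tag names 'title' and 'keywords' are
made up for illustration (substitute the few tags you actually want to boost),
the rank value 5 is arbitrary, and I'm assuming the boosted names need to be
declared with MetaNames first:

```
# Index every tag found in the XML under its own MetaName.
UndefinedMetaTags auto

# Hypothetical tags to boost; replace with your own.
MetaNames title keywords
MetaNamesRank 5 title keywords

# Let plain searches (no MetaName given) look inside <content>.
MetaNameAlias swishdefault content
```

With something like that in place, a plain "-w test" should search everything
under 'content', while hits in the boosted tags get the inflated frequency
count. Note there is deliberately no MetaNamesRank line for 'content' itself.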
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Users mailing list
Received on Tue May 15 19:58:23 2007