Biasing the search ranking

From: Aliasgar Dahodwala <adahodwala(at)not-real.umassd.edu>
Date: Tue Jan 31 2006 - 16:48:59 GMT
We are currently using swish-e 2.4.3 to index our University webpages. 
We use the supplied spider.pl to grab all the webpages and to filter pdf 
files.

A considerable number of pdf files get indexed, and the search
results always rank the pdf files higher than the webpages. As
I understand it, this could be due to the higher frequency of the
search word in the pdf files. I tried using the
IgnoreTotalWordCountWhenRanking directive in the swish-e config file,
but that didn't help much.

Another option was to use MetaNamesRank to bias the ranks for meta
tags. The output generated by the pdf filter is always enclosed in
<pre> .. </pre> tags, so having something like

MetaNamesRank -3 pre

and including pre in the MetaNames list does the trick for now.
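
Put together, the relevant part of the config looks roughly like this.
It is only a sketch: any other metanames already in use would stay on
the MetaNames line, and the rest of the config is unchanged.

# list pre as a metaname so the rank bias below applies to it
MetaNames pre
# negative bias: matches inside <pre> .. </pre> (the pdf filter output)
# are ranked lower
MetaNamesRank -3 pre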

Is there any other way to do this that I missed? Does anyone have a
better suggestion?
Would a feature for biasing ranks based on file type be included in
future versions?


Thanks,
-- 
------------------------------------------------
  Aliasgar Dahodwala
  Application Integration Analyst
  Information Systems Integration Team
  Computing and Information Technology Services
  University of Massachusetts Dartmouth

  Phone :   508-910-6599
  email :   adahodwala [at] umassd [dot] edu
-------------------------------------------------
Received on Tue Jan 31 08:49:11 2006