Re: Indexing takes forever

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Fri May 06 2005 - 20:36:18 GMT
Nick scribbled on 5/6/05 3:28 PM:
> As far as docs go, I would like to see a few different sample swish.conf
> files (and possibly related command line options like you showed below)
> for different applications.  Generally when I am setting up something I
> like to see example setups/configs and play around with it before trying
> to fine-tune it.  If there were more example configs then a user could
> just pick one that is close to what they are looking for to get it going,
> then work from that.
> 


example config files should be installed by default in
swish_prefix/share/doc/swish-e/examples/conf/
(where swish_prefix is your install prefix).

if you installed to the default location, check
/usr/local/share/doc/swish-e/examples/conf/



> On the same note what should I put in the config file if I use the:
> 
> swish-e -c /etc/swish.conf -S prog -i DirTree.pl
> 


that command should work with your existing config file (I think). DirTree.pl
will try to load SWISH::Filter for the file formats it recognizes.

> 
> I am guessing that it was just using the default html filter to find text
> in the doc and ppt files that I searched then?  I know that it could find
> text in these binary files using my existing config, that is why I thought
> it was somehow finding the extra progs I had installed to filter the file
> types.

yes, I have been misled that way too. swish-e does its best to index whatever
text it finds, and since Word .doc files (especially) have real text mixed in
with all the proprietary formatting instructions, swish-e probably picked up
lots of chunks of text. but a proper filter will ensure you get all of the
text, as the author intended it.
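
to illustrate the point, here is a minimal Python sketch of what "grabbing
whatever text is there" amounts to. it is not swish-e's actual parser, only a
rough analogy: the function name printable_runs, the min_len threshold, and
the sample bytes are all made up for the example.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters in data."""
    # Hypothetical helper: similar in spirit to the Unix `strings` tool.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Made-up bytes standing in for a binary document: readable text mixed
# with formatting junk. Not a real .doc file.
sample = b"\x00\x01Quarterly report\x00\xff\x02ab\x00\x13final draft\x07\x00"

runs = printable_runs(sample)
print(runs)  # ['Quarterly report', 'final draft']

# Text stored as 16-bit characters has a zero byte after every letter,
# so a naive byte scan finds no usable run at all.
wide = "final draft".encode("utf-16-le")
print(printable_runs(wide))  # []
```

the first call recovers the plain 8-bit text but drops the short fragment, and
the second recovers nothing, which is the kind of gap a format-aware filter is
meant to close.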

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Fri May 6 13:36:18 2005