Re: [swish-e] index a list of files

From: Peter Karman <peter(at)>
Date: Wed Jul 09 2008 - 02:01:32 GMT
Brad Bauer wrote on 7/8/08 8:44 PM:
> SWISH-E 2.2.1
> Linux 2.4.9-e.68 #1 Thu Jan 19 18:24:23 EST 2006 i686 unknown

first, get an up-to-date version. 2.2 was last maintained over 5 years ago.

> I have begun converting from fs to spidering, but find that downloading pdfs
> considerably slows the spidering process.  So what I would like to do is
> index html/php/cgi using the spider, at the same time building a list of
> local pdfs for indexing using the considerably faster fs method.  
> Is there an easy way to feed a specific list of files into swish-e for
> indexing?

I'm guessing you are using -S http under swish-e 2.2. In the 2.4.x releases that 
method is deprecated in favor of using the Perl script in conjunction 
with the -S prog method.

I would suggest using to fetch and cache all your content, then use 
the -S prog swish-e option to index the cache. Alternately, you could 
configure to download only certain content types, and then make multiple 
spidering runs, creating multiple caches, and then either create multiple 
indexes for later merge, or index the multiple caches into a single index.
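
As a rough illustration of feeding a specific list of files through -S prog, here is a minimal sketch of a feeder program. It assumes the 2.4.x -S prog input format, where each document is a block of header lines (Path-Name and Content-Length, optionally Last-Mtime), then a blank line, then exactly Content-Length bytes of content. The script name feed_files.py, the list file, and the function names are hypothetical, and PDFs would still need converting to text (by a filter or in the script), which is not shown.

```python
import os
import sys


def format_record(path, content, mtime=None):
    """Build one document record: header lines, a blank line, then the body."""
    headers = [f"Path-Name: {path}", f"Content-Length: {len(content)}"]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")
    return ("\n".join(headers) + "\n\n").encode("utf-8") + content


def emit_files(paths, out):
    """Write a record for every readable file in paths to the binary stream out."""
    for path in paths:
        try:
            with open(path, "rb") as fh:
                content = fh.read()
            mtime = os.path.getmtime(path)
        except OSError as err:
            # Report unreadable files on stderr so stdout carries only records.
            print(f"skipping {path}: {err}", file=sys.stderr)
            continue
        out.write(format_record(path, content, mtime))


if __name__ == "__main__" and len(sys.argv) > 1:
    # argv[1] is a text file with one path per line (hypothetical list file).
    with open(sys.argv[1]) as listing:
        wanted = [line.strip() for line in listing if line.strip()]
    emit_files(wanted, sys.stdout.buffer)
```

Assuming your swish-e build accepts stdin as the -S prog input, something like `python feed_files.py pdf-list.txt | swish-e -S prog -i stdin -c swish.conf` should index just the listed files. Check the option names against the documentation for your version.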

Peter Karman  .  .  peter(at)
Received on Tue Jul 8 22:01:30 2008