Multiple sites index: configuration and performance questions

From: Gaël Lams <lamsgael(at)>
Date: Wed Feb 22 2006 - 13:04:40 GMT
Hi all,
I have to index about 14 web sites (for the time being). It's my understanding (probably wrong?) that I have to indicate in the swish-conf to use for indexing and then create the spider.config file with the array of the web sites I have to index (testing with only two for the time being).
I then run "/usr/local/bin/swish-e -S prog -c swish.conf", but only the last web site in the array seems to be indexed. The first site (, as you will see below in my configuration) does not seem to be taken into account: it does not appear in the terminal output, and search tests confirm that it has not been indexed (I'm able to search
I read the documentation but I'm probably missing something. Any help would be appreciated.
Also, as I will probably have to index +/- 50 web sites in the near future, I was wondering whether there are any "best practices" or advice for a scalable set-up.
You will find my exact configuration below:

- swish-e -V: SWISH-E 2.4.3
- perl -v: v5.8.1 built for i586-linux-thread-multi
- OS: Suse Professional 9.0, distribution's kernel 2.4.21
- swish-conf:
# Use for indexing
IndexDir
# Use's default configuration and specify the URL to spider
# run it with /usr/local/bin/swish-e -S prog -c swish-e/swish.conf
SwishProgParameters spider.config
# Allow extra searching by title
Metanames swishtitle
# Set StoreDescription for each parser to display context with search results
StoreDescription TXT* 10000
StoreDescription HTML* <body> 10000
- spider.config:
my %site1 = (
   base_url   => '',
   email      => 'internetpo(at)',
);
my %site2 = (
   base_url   => '',
   email      => 'info(at)',
);
@servers = ( \%site1, \%site2 );
1;
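
For the larger set-up, the same array could presumably be generated from a list rather than written out one hash per site. This is only a minimal sketch, assuming spider.config keeps being evaluated as ordinary Perl as it is above. The URLs and e-mail addresses are placeholders, and @sites is a name made up for illustration. It does not address the problem of the first site not being indexed.

```perl
# spider.config (sketch): build @servers from a list of sites.
# All URLs and e-mail addresses below are placeholders, not real sites.
my @sites = (
    [ 'http://site-one.example/', 'webmaster@site-one.example' ],
    [ 'http://site-two.example/', 'webmaster@site-two.example' ],
);

# One hash reference per site, using the same two keys as the
# hand-written %site1 / %site2 hashes.
@servers = map { +{ base_url => $_->[0], email => $_->[1] } } @sites;

1;
```

Adding a site would then mean adding one line to @sites.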
Received on Wed Feb 22 05:04:52 2006