I've been using swish-e for over a year now, so far with good results, but
recently two questions arose:
I currently use several swish-e based search tools from a mod_perl
application, with several index files of up to 100MB each.
The speed requirements for our application are very high: searches need
to complete in under 1 second, but recently the times got as high as 3.5
seconds.
As we're still using the old SWISH.pm with an external swish-e binary, I
suppose I could speed up the searches by using SWISH::API.
Now, my question is:
I guess the major speedup SWISH::API allows is to keep the index file
open between searches, so it needn't be reopened and reparsed for every
request. How would I best use it, especially in the context of
Apache 1 and mod_perl, where Apache forks new children? Can I open
the index files in my Apache mod_perl startup script (i.e. before the
children are forked) and have it automatically do the right thing, or
should I write Apache child startup handlers, so that they are opened
immediately after an Apache child has been forked? How much memory does
an index of that size consume when it is kept open, by the way, and how
much of that memory is shared between Apache processes?
The other question I have is regarding incremental mode. So far I've
been using the traditional mode with cron jobs to update once or twice a
day, but I'd really like to convert the search to be "real time". How
stable is incremental mode? And "how incremental" is it? Can I use it
to add/modify/remove documents from the search index on the fly, as they
are added/modified, or is it rather targeted at batch processing a larger
number of updates (i.e. merely a better merge)?
Markus Peter - SPiN AG email@example.com
Received on Thu Feb 17 02:38:20 2005