On Sat, 14 Dec 2002, Yann wrote:
> I wanted some information about performance when searching a large
> number of index files, such as 50. I want to do this so that the user
> can search only among RFCs, or only among man pages, or only the MySQL
> docs (I'm working on a large documentation site).
>
> Do you think that creating so many index files is a good way to
> achieve that, bearing in mind that most searches will query all
> the indexes? Is there any other way to do what I want?

It all depends. You need to test and see whether the speed is acceptable.
If you can write a small Perl program to generate documents (I have used
one that made documents from random words from /usr/share/dict) then it's
quite easy to generate some test indexes.
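
A rough sketch of such a generator follows, written in Python rather than
Perl to keep it short. The word-list path /usr/share/dict/words, the
one-directory-per-section layout, and every function name here are
assumptions made for illustration, not anything Swish-e requires.

```python
import html
import random
from pathlib import Path


def load_words(dict_path="/usr/share/dict/words"):
    """Read one word per line, skipping blank lines (path is an assumption)."""
    with open(dict_path, encoding="utf-8", errors="replace") as fh:
        return [line.strip() for line in fh if line.strip()]


def make_document(words, n_words, rng):
    """Return a minimal HTML document built from randomly chosen words."""
    title = html.escape(" ".join(rng.choice(words) for _ in range(3)))
    body = html.escape(" ".join(rng.choice(words) for _ in range(n_words)))
    return (
        f"<html><head><title>{title}</title></head>\n"
        f"<body><p>{body}</p></body></html>\n"
    )


def generate_corpus(out_dir, n_sections, docs_per_section, words,
                    n_words=200, seed=0):
    """Write docs_per_section files into each of n_sections subdirectories.

    Each subdirectory stands in for one section (RFCs, man pages, ...), so
    it can be indexed on its own or together with the others.
    """
    rng = random.Random(seed)  # fixed seed makes test runs repeatable
    written = []
    for s in range(n_sections):
        section_dir = Path(out_dir) / f"section{s:02d}"
        section_dir.mkdir(parents=True, exist_ok=True)
        for d in range(docs_per_section):
            path = section_dir / f"doc{d:04d}.html"
            path.write_text(make_document(words, n_words, rng),
                            encoding="utf-8")
            written.append(path)
    return written
```

Calling generate_corpus("testdocs", 50, 100, load_words()) would write 50
directories of 100 files each, which can then be indexed section by section
or all at once.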

When I tried 100 indexes it was a little slow. Opening the indexes is the
slow part, though, so you could work around that by using the Swish-e
library to keep the indexes open. You will want plenty of RAM, too. But,
again, that's something you need to test.

If the total number of documents is not huge then I'd also try creating
one index and using a metaname to limit searches to the various sections.
If some of your files need filtering before indexing, consider keeping a
compressed cache of the filtered documents that can be updated
incrementally, and have swish-e index that cache to cut indexing time.
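
For the single-index approach, the search script only has to wrap the
user's query in a metaname restriction. The sketch below assumes the
index was built with a metaname called "section" (a made-up name, which
would be declared with the MetaNames directive in the config) and that
every document carries its section name under it. It relies on the
metaname=(...) form of the Swish-e query syntax and does no validation of
the user's input.

```python
def limit_to_sections(user_query, sections, metaname="section"):
    """Restrict a query to the given sections via a metaname search.

    With no sections selected the query is returned unchanged, which
    searches the whole index.
    """
    if not sections:
        return user_query
    section_expr = " or ".join(sections)
    return f"{metaname}=({section_expr}) and ({user_query})"
```

With this, limit_to_sections("tcp window", ["rfc", "man"]) yields
section=(rfc or man) and (tcp window), which can be passed to a search
against the one combined index.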

Whatever you do, please report back your findings.

--
Bill Moseley moseley@hank.org
Received on Sat Dec 14 14:55:42 2002