Re: Will swish-e index *very* large sites?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Jul 29 2002 - 13:25:01 GMT
On Mon, 29 Jul 2002, Ace wrote:

>  So what I need is a search engine that will index .doc, .pdf, 
> .ps and all kinds of html and text, that can also deal with umlauts, 
> which doesn't crash when the amount of data to be indexed is a bit 
> bigger than usual and that will return search results within reasonable 
> time though the database might be of some GB of size.

You may be pushing the limits of these search engines and may want to
consider one of the commercial products.  http://searchtools.com might
be a good place to look.

Swish is very fast at indexing and searching.  You can build a collection
of indexes, each under the 2GB limit, and search them all at the same
time with only a minor reduction in search speed.
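
As a rough sketch of the multi-index approach (not anything shipped with
swish), a small Python helper could split the documents into batches, one
per index, and build a search command that names every index after -f.
The helper names and the byte cap are hypothetical, and capping the total
input size per batch is only an assumed proxy for keeping each index file
under 2GB.

```python
from typing import Iterable, List, Tuple


def shard_by_size(files: Iterable[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Greedily group (path, size) pairs into batches of at most max_bytes.

    Hypothetical helper: each batch would be indexed into its own index file.
    A single file larger than max_bytes still gets a batch of its own.
    """
    shards: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        # Close the current batch if this file would push it over the cap.
        if current and current_size + size > max_bytes:
            shards.append(current)
            current = []
            current_size = 0
        current.append(path)
        current_size += size
    if current:
        shards.append(current)
    return shards


def search_command(query: str, index_files: List[str]) -> List[str]:
    """Build an argument list that searches several indexes in one run.

    Relies on swish-e accepting more than one index file after -f
    when searching with -w.
    """
    return ["swish-e", "-w", query, "-f", *index_files]
```

Each batch would be indexed separately, and the resulting index file
names passed together to search_command().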

But that speed comes at the cost of scalability, and swish cannot do
incremental indexing.  For most sites with something around 100,000
docs that is not really a big issue since reindexing everything is so fast.

But if you have a very large number of docs that change often then swish
might not be the best solution.



-- 
Bill Moseley moseley@hank.org
Received on Mon Jul 29 13:28:37 2002