
Re: converting .temp indices to usable indices

From: John Angel <angel_john(at)not-real.hotmail.com>
Date: Sun Dec 07 2003 - 10:43:35 GMT
> On such a large scale you need something that lets you incrementally
> update the index.  Frankly, if the documents are available locally, I think
> completely reindexing with swish-e is often as fast as updating other
> types of indexes.  Maybe.
>
> Another one to look at, if you can stand Java, is Lucene.  I haven't tried
> it, but their goal is an Open Source large-scale search engine.  Hey, Bob
> Dylan's site uses it (although I could not get it to work).
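The full-reindex approach above works well when combined with an atomic swap: build the new index under a temporary name, then rename it over the live one, so searches keep hitting the old index until the new one is complete. A minimal sketch of that pattern (the config file and index names are placeholders; `swish-e -c … -f …` is the usual invocation, and the `echo` fallback below only simulates the build so the script runs even where swish-e is not installed):

```shell
#!/bin/sh
# Full reindex with an atomic swap: the search front end keeps reading
# the old index file until the new one is renamed into place.
set -e

LIVE=index.swish-e        # index file the search front end reads
TMP=$LIVE.new             # build target; renamed only on success

# Build the new index under a temporary name.  In production this would
# be the swish-e run itself; the echo fallback just simulates a build.
swish-e -c swish.conf -f "$TMP" 2>/dev/null || echo "simulated index" > "$TMP"

# rename(2) is atomic on the same filesystem, so a concurrent search
# sees either the complete old index or the complete new one, never a
# half-written file.
mv "$TMP" "$LIVE"
```

This is also roughly what swish-e's own `.temp` files are for: the indexer writes to a temporary file and only moves it to the final name once indexing finishes.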


There are also Mnogosearch, ht://dig, Harvest and Nutch:

- Mnogosearch is extremely slow (both indexing and searching) and completely
unusable for more than 100,000 pages.
- ht://dig doesn't have duplicate detection, but it is the fastest crawler I
have ever seen; search speed is also fine, but it is a resource eater.
- I am currently testing Harvest, but its docs state that Swish is faster.
- Nutch is promising and very fast, but it is still not even at the beta stage.
Received on Sun Dec 7 10:43:38 2003