Hi Patrick,
We have a similar problem: more than 900,000 documents totalling over
4 GB.
Fortunately for me, the documents are grouped into directories, and I
only reindex the groups that change, each into an "intermediary" index
(I actually use a Makefile to detect which directories were updated).
Then I merge all the intermediary indexes into the final index. It
still takes a while (~1 hour on a SPARC V210), but it's faster than
doing it all from scratch.
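For illustration, the "which groups changed" check could be sketched in
Python along the following lines. This is not the Makefile mentioned
above, only a rough equivalent under stated assumptions: one
subdirectory per group, one intermediary index per group named
`<group>.index`, and a config file called `swish.conf` are all
hypothetical names. The `-c`, `-i`, and `-f` switches are the usual
swish-e options for config file, input, and index file.

```python
import os
from pathlib import Path
from typing import List


def newest_mtime(group_dir: Path) -> float:
    """Return the latest modification time of any file under group_dir."""
    latest = 0.0
    for root, _dirs, files in os.walk(group_dir):
        for name in files:
            latest = max(latest, (Path(root) / name).stat().st_mtime)
    return latest


def stale_groups(docs_root: Path, index_dir: Path) -> List[Path]:
    """List group directories whose intermediary index is missing or older
    than the newest file in the group (the same rule Make applies)."""
    stale = []
    for group in sorted(p for p in docs_root.iterdir() if p.is_dir()):
        # Hypothetical naming scheme: one "<group>.index" per directory.
        index_file = index_dir / (group.name + ".index")
        if (not index_file.exists()
                or newest_mtime(group) > index_file.stat().st_mtime):
            stale.append(group)
    return stale


def index_command(group: Path, index_dir: Path,
                  config: str = "swish.conf") -> List[str]:
    """Build the swish-e command that reindexes a single group."""
    index_file = index_dir / (group.name + ".index")
    return ["swish-e", "-c", config, "-i", str(group), "-f", str(index_file)]
```

Each command returned by `index_command` can be passed to
`subprocess.run`. Like a plain Makefile rule, this timestamp check
does not notice deleted files, so an occasional full rebuild is still
worth scheduling.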
On average it's faster to merge. However, if everything changes, then
it actually takes longer than a full reindex; fortunately, that does
not happen very often.
Also, be careful with the number of "intermediary" indexes, as Swish
can only merge a few dozen at once.
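One way to stay under that limit is to merge in batches and then merge
the batch results. The Python sketch below only builds the command
lines; the batch size of 20 and the `.tmpN.M` names for the temporary
indexes are assumptions, not documented swish-e values. `-M` is the
swish-e merge switch, which takes the input indexes followed by the
output index.

```python
from typing import List


def merge_plan(indexes: List[str], final_index: str,
               batch_size: int = 20) -> List[List[str]]:
    """Return swish-e -M commands that merge `indexes` into `final_index`
    without ever passing more than `batch_size` inputs to one merge."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    if len(indexes) < 2:
        raise ValueError("need at least two indexes to merge")

    commands = []
    current = list(indexes)
    level = 0
    # Keep folding batches into temporary indexes until one merge suffices.
    while len(current) > batch_size:
        merged = []
        for n, start in enumerate(range(0, len(current), batch_size)):
            batch = current[start:start + batch_size]
            if len(batch) == 1:
                merged.append(batch[0])  # lone leftover: carry it forward
                continue
            # Hypothetical naming scheme for the temporary indexes.
            out = f"{final_index}.tmp{level}.{n}"
            commands.append(["swish-e", "-M", *batch, out])
            merged.append(out)
        current = merged
        level += 1
    commands.append(["swish-e", "-M", *current, final_index])
    return commands
```

The commands must run in the order returned, since later merges read
the temporary indexes written by earlier ones.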
I hope this helps.
Regards,
Peter Finch
________________________________
From: users-bounces@lists.swish-e.org
[mailto:users-bounces@lists.swish-e.org] On Behalf Of Patrick May
Sent: Saturday, 12 July 2008 12:26 AM
To: users@lists.swish-e.org
Subject: [swish-e] indexing performance expectations
Hello,
How should I expect indexing to perform when indexing 900,000+ very
small documents (256 MB)? Thus far, my observation is that it takes a
while. Could it be helpful to move to an incremental format?
Cheers,
~ p
--
Patrick May
135 Oak Street
New York, NY 11222
+1 (347) 232-5208
patrick@hexane.org
http://www.hexane.org
Received on Sun Jul 13 18:43:07 2008