Re: More on indexing and memory requirements in

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Aug 31 2000 - 15:08:37 GMT
At 07:36 AM 08/31/00 -0700, Frank Heasley wrote:
>OS Redhat Linux v 6.1
>pII 233
>128Mb RAM
>
>3,000 files, 1-2k each
>
>v 1.3.0: 8 minutes, 97.6% (no Meta indexing)
>v 1.3.2: 74 minutes, 99.7% of RAM (with Meta indexing)
>v2.0.1: 77 minutes, 99.5% of RAM (with Meta indexing)

Wow, what do you have in those files?  I think something is broken.  What
else is running?  Do you have any swap space?

I'm running SuSE Linux on a P550 with 128M.  I have twice your number of
files (6414), all about 1-2k each with quite a few meta tags, and it indexes
in 33 seconds.

MetaNames SUBJECT TITLE DESCRIPTION URLS IDENTIFIER KEYWORDS CREATOR
CATEGORY AUTHOR PUBLISHER
PropertyNames CATEGORY SUBJECT

> wc -w *.htm | grep total
1239639 total words

> ll | wc -l
   6434 total files

> ./swish -c swish_no_stem.conf
Indexing Data Source: "File-System"
Indexing ../docs..
Removing very common words...
8 words removed.
0 words removed not in common words array:

Writing main index...
Computing hash table ...
Writing header ...
Writing index entries ...
Writing stopwords ...
28016 unique words indexed.
Writing file index...
Writing file list ...
Writing file offsets ...
Writing MetaNames ...
Writing offsets (2)...
6414 files indexed.
Running time: 38 seconds.
Indexing done!
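
For a rough sense of how far apart the two machines are, the figures in this
thread can be turned into files per second.  This is only a back-of-envelope
sketch: files_per_second is a hypothetical helper (not part of swish), and it
uses the 38 seconds reported in the log above and Frank's 77-minute v2.0.1 run.

```shell
# files_per_second FILES SECONDS
# Hypothetical helper (not part of swish): prints indexing throughput
# in files per second, rounded to two decimals.
files_per_second() {
    awk -v files="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", files / secs }'
}

# Frank's v2.0.1 run: 3,000 files in 77 minutes
files_per_second 3000 $((77 * 60))    # prints 0.65

# The run logged above: 6414 files in 38 seconds
files_per_second 6414 38              # prints 168.79
```

That is a gap of more than two orders of magnitude, far more than the
difference in CPU would explain, which is why swapping looks like the first
thing to rule out.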


Bill Moseley
mailto:moseley@hank.org
Received on Thu Aug 31 15:12:49 2000