I gave up on indexing 40,000 files because it hit the 128MB data-size
ulimit after 3 hours of indexing (BSDI on a PPro 200). I tried indexing
subsets and merging them, with the same result. I switched to Excite's
EWS, which manages the same task in less than 80MB and in 30 minutes if
you don't enable "quality" summaries.
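
A rough sanity check, for anyone sizing a machine before starting a long
indexing run. This is only a sketch: it assumes peak memory grows linearly
with the number of documents, which nobody in this thread has measured, and
it uses the 6500-document / 32MB figure quoted further down as its reference
point. The function names are made up for illustration.

```python
import resource  # Unix-only standard library module


def estimate_peak_mb(n_docs, ref_docs=6500, ref_peak_mb=32.0):
    """Scale a measured peak linearly to another document count.

    Linear growth is an assumption, not a property of any indexer.
    """
    return ref_peak_mb * n_docs / ref_docs


def data_limit_mb():
    """Return the soft data-segment limit in MB, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft / (1024 * 1024)


def fits(n_docs, limit_mb):
    """True if the estimated peak stays within limit_mb (None = no limit)."""
    return limit_mb is None or estimate_peak_mb(n_docs) <= limit_mb


if __name__ == "__main__":
    print("estimated peak for 40000 docs: %.0f MB" % estimate_peak_mb(40000))
    print("fits under a 128MB limit:", fits(40000, 128))
    print("fits under this shell's limit:", fits(40000, data_limit_mb()))
```

Under that linear assumption the 40,000-file run comes out well above
128MB, which is at least consistent with what I saw.
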
At 08:16 PM 10/24/97 -0700, Roy Tennant wrote:
>Yes, memory is an issue, as is time. Some of our indexes take hours to
>complete (hey, isn't that why it gets dark at night?). I'm not sure of
>the amount of RAM usage, but it must be considerable. One machine we are
>using has 1 GB of RAM, so if we had a lot less I'm sure it would be
>something to watch more closely.
>On Fri, 24 Oct 1997, WWW server manager wrote:
>> Roy Tennant wrote:
>> > On Fri, 24 Oct 1997, Michael A. Tilp wrote:
>> > >
>> > > Anyone have any guesses as to the maximum size of a SWISH index? I've
>> > > seen it used on sites of 5000+ pages (the old version); I'm just wondering
>> > > how far that could go. Call me a bit afeared of beginning a large
>> > > then watching it choke a few months down the line ;)
>> > Our largest index here so far is in the 15 MB range, and probably in the
>> > neighborhood of 20,000 files. When we are through with our transition we
>> > will be creating indexes over 20 MB in size. So far no problems. This is
>> > on a DEC Alpha and a Sun SPARCcenter.
>> One thing to watch, though, is memory use during indexing - building an 8MB
>> index from around 6500 documents here takes about 32MB of memory at its peak
>> (and 17 minutes on a SPARC 10/51, an "old" and hence slow system by current
>> standards). If you don't have enough memory to avoid it (and/or the rest of
>> the system) being slowed by paging/swapping, whether or not it is possible
>> to build large indexes may be only half the story!
>> John Line
>> University of Cambridge WWW manager account (usually John Line)
>> Send general WWW-related enquiries to firstname.lastname@example.org
Received on Fri Oct 24 20:47:14 1997