Re: Re: Document properties - code sample

From: Mark Gaulin <gaulin(at)not-real.globalspec.com>
Date: Tue Sep 07 1999 - 21:03:59 GMT
The index file is packed to be as small as possible, presumably to handle
a larger number of documents with smaller files and to get some kind of
performance advantage. I'm not sure that it has much of an effect on
performance, but I could imagine it, since read-ahead caching comes into
play. That has to be balanced against the extra CPU work to decompress
things. Hopefully someone did a benchmark on it way back (when?) when the
compression part was introduced.
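
To make the size-versus-CPU trade-off concrete, here is a minimal sketch of
one common way to pack integers: 7 bits per byte, with the high bit marking
"more bytes follow". This is an illustration only; the function names
pack_uint and unpack_uint are made up, and this is not claimed to be the
scheme the index file actually uses.

```python
def pack_uint(n):
    """Encode a non-negative integer using 7 data bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: this is the last byte
            return bytes(out)


def unpack_uint(buf, pos=0):
    """Decode one integer starting at buf[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos
```

Values below 128 take one byte instead of a fixed four, which is where the
smaller file comes from. The loop in unpack_uint is the kind of extra CPU
work that has to be weighed against that saving.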

As far as an ASCII index goes, that would be slower because of all of the
numerical values that would have to be converted back from text strings to
numbers on each and every search.  Also, the index uses lots of "pointers"
(file position info) to refer to objects within the file and a pure ASCII
index would be too tempting to just "tweak" by hand, corrupting the pointer
offsets.
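
Both points can be sketched with a toy record layout. Everything here is
hypothetical (the record format, the function names, the sample words) and
is not the real index format. Each record holds a word and a count, and a
separate list of offsets plays the role of the "pointers".

```python
import io
import struct

HEADER = "<HI"  # little-endian: 2-byte word length, 4-byte count


def write_binary(words):
    """Write (word, count) records; return (data, offset of each record)."""
    buf = io.BytesIO()
    offsets = []
    for word, count in words:
        offsets.append(buf.tell())
        encoded = word.encode("ascii")
        buf.write(struct.pack(HEADER, len(encoded), count))
        buf.write(encoded)
    return buf.getvalue(), offsets


def read_binary(data, offset):
    """Read one record; the count comes back as a number, no parsing."""
    length, count = struct.unpack_from(HEADER, data, offset)
    start = offset + struct.calcsize(HEADER)
    return data[start:start + length].decode("ascii"), count


def write_ascii(words):
    """Write one 'word count' line per record; return (text, offsets)."""
    lines = []
    offsets = []
    pos = 0
    for word, count in words:
        line = "%s %d\n" % (word, count)
        offsets.append(pos)
        lines.append(line)
        pos += len(line)
    return "".join(lines), offsets


def read_ascii(text, offset):
    """Read one line; the count must be converted from text every time."""
    end = text.index("\n", offset)
    word, count = text[offset:end].split(" ")
    return word, int(count)


words = [("apple", 7), ("pear", 12)]
data, bin_offsets = write_binary(words)
text, txt_offsets = write_ascii(words)

# A hand edit that lengthens the first line shifts everything after it,
# so the stored offset for "pear" now lands inside the first line.
tweaked = text.replace("apple 7", "apple 1000")
```

With the untouched data, read_binary(data, bin_offsets[1]) and
read_ascii(text, txt_offsets[1]) both find the "pear" record. After the
hand edit, read_ascii(tweaked, txt_offsets[1]) starts reading in the middle
of the first line, which is the pointer corruption described above.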

At 06:15 PM 9/7/99 +0000, Einar Indridason wrote:
>> For the "official" release I would like to suggest something a little
>> easier to maintain... why not create a small set of routines to read &
>> write the basic data types to & from a file stream.  They might look like
>> this:
>
>This might be a naive question, but why is the index file kept in a
>binary format?
>Wouldn't it be easier (and more portable) to keep the index file in an
>ASCII format?
>
>--
>einari@complex.is
> 
Received on Tue Sep 7 16:02:34 1999