At 11:23 AM 09/21/01 +0200, jmruiz@boe.es wrote:
>> Perhaps the best solution would be to have a common routine to read in
>> a chunk from a file handle (or stream for filters and -S prog) used by
>> all parser types. After reading in a chunk it would need to look
>> backwards for white space so that the TXT parser doesn't index partial
>> words. If this seems like a good idea, I'd appreciate any suggestions
>> or help in implementing this. I guess I'd expand the fprop structure
>> to include a buffer address, and pointers to the max size, size read
>> from the stream, and the size to use for parsing. I'd also have to
>> keep track of total bytes read since with -S prog I can't just look
>> for eof.
>>
>
>Yep, you can read 4096 bytes and then delimit the buffer
>at the last white space, e.g. 4025. So, for the next read you can
>fseek the file pointer back to 4025 (-71) and read another 4096 bytes...
Oh, that makes sense.
I was thinking about reading in 4096 bytes, then, say, marking the end at
whitespace (4025). Then when I go to read more I'd shift the end of the
buffer to the front and read from there, reading in 4096 less the
bytes already in the buffer. Your way is much easier ;)
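
For reference, here is a rough sketch of the buffer-shifting version. It is
only an illustration: chunk_reader, next_chunk and CHUNK_MAX are made-up
names, not the real fprop layout. It may still be worth keeping around for
-S prog input, since fseek only works on seekable files and a pipe is not
one.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_MAX 4096          /* hypothetical buffer capacity */

/* Hypothetical reader state; the real fprop fields may look different. */
typedef struct {
    char   buf[CHUNK_MAX];
    size_t size;    /* valid bytes currently in buf */
    size_t use;     /* bytes to hand to the parser (ends at whitespace) */
    size_t total;   /* total bytes read from the stream so far */
} chunk_reader;

/* Refill the buffer and return how many bytes are safe to parse.
 * Returns 0 once the input is exhausted. */
size_t next_chunk(chunk_reader *r, FILE *fp, size_t max)
{
    assert(max > 0 && max <= CHUNK_MAX);

    /* Shift the partial word left over from the last chunk to the front. */
    size_t carried = r->size - r->use;
    memmove(r->buf, r->buf + r->use, carried);

    /* Read "max less the bytes already in the buffer". */
    size_t got = fread(r->buf + carried, 1, max - carried, fp);
    r->total += got;
    r->size = carried + got;
    r->use = r->size;

    /* A full buffer may have cut a word in half: back up to the last
     * whitespace. A short read means end of input, so use everything. */
    if (r->size == max) {
        size_t i = r->size;
        while (i > 0 && !isspace((unsigned char)r->buf[i - 1]))
            i--;
        if (i > 0)
            r->use = i;   /* otherwise one word fills the whole buffer */
    }
    return r->use;
}
```

The parser would only look at the first r->use bytes of r->buf after each
call. A word longer than the buffer is handed over whole rather than
looping forever, which is a choice made for this sketch.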
Thanks,
Bill Moseley
mailto:moseley@hank.org
Received on Fri Sep 21 14:05:23 2001