On Tue, Mar 16, 2004 at 10:03:44AM -0800, Steve Harris wrote:
> All I get from a backtrace is:
> #0 0x400e04a9 in compress3 (num=2139062143,
> at compress.c:140
> 140 _s[_i++] = _r & 127;
> #1 0x7f7f7f7f in ?? ()
> Cannot access memory at address 0x7f7f7f7f
Yep, buffer overflow.
I'll have to defer to Jose for this problem.
What's the point of indexing such a large file?
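For what it's worth, num=2139062143 is 0x7f7f7f7f in hex, the same pattern as the bogus frame #1 address, which is consistent with the stack having been overwritten with 0x7f bytes. The statement at compress.c:140 looks like a 7-bit variable-length integer encoder writing past the end of its buffer. Here is a minimal sketch of that kind of encoder with a bounds check. It is not the code in compress.c: put_varint, buf, and cap are hypothetical names, and the byte order is an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical helper, NOT the routine in compress.c.
 * Writes num as 7-bit groups, most significant group first, with the
 * high bit set on every byte except the last.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t put_varint(unsigned long num, unsigned char *buf, size_t cap)
{
    /* Enough room for every 7-bit group of an unsigned long. */
    unsigned char tmp[sizeof(unsigned long) * 8 / 7 + 1];
    size_t n = 0;
    size_t i;

    /* Collect the 7-bit groups, least significant first. */
    do {
        tmp[n++] = (unsigned char)(num & 127);
        num >>= 7;
    } while (num != 0);

    /* Refuse to write rather than run past the end of buf. */
    if (n > cap)
        return 0;

    /* Emit most significant group first; flag all but the last byte. */
    for (i = 0; i < n; i++) {
        unsigned char b = tmp[n - 1 - i];
        if (i + 1 < n)
            b |= 128;
        buf[i] = b;
    }
    return n;
}
```

With a 4-byte buffer, put_varint(2139062143UL, buf, 4) returns 0 instead of writing the five bytes that value needs.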
> The file it's processing is quite large:
> $ wc /raid/swh/lit_index/segv.lit
> 5065943 9424230 50550321 /raid/swh/lit_index/segv.lit
> and contains some 8-bit characters, but if I run it through sort | uniq it
> doesn't cause problems. It's a fairly simple file, with one phrase per line;
> the longest line is 255 characters.
> There are a few thousand similar files in the directory tree that parse
> fine, but this is by far the largest. It doesn't appear to matter at what
> position it appears in the parse order.
> I've made the file available at http://triplestore.aktors.org/~swh/segv.lit
> in case anyone wants to test it.
> - Steve
Received on Tue Mar 16 11:20:46 2004