At 11:22 AM 09/01/02 -0700, Rodney Barnett wrote:
>I just ran into a segmentation fault while using the prog method. I tracked
>the trigger down to a very long "word" (in this case, it was roughly 429,000
>characters long). I certainly don't want that "word" to be indexed, but the
>program shouldn't crash either.
I haven't been able to duplicate the problem. I added a printf statement
and wrote a -S prog program to generate a word 429,000 chars long and
another 2,000,000 chars long and I see this:
> perl prog.pl | ./swish-e -S prog -i stdin
Indexing Data Source: "External-Program"
word is too long [2000000 bytes]. Skipping
word is too long [429000 bytes]. Skipping
Removing very common words...
no words removed.
Writing main index...
err: No unique words indexed!
Perhaps you have some characters that are causing the problem, and it's not
just the length of the data.
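For anyone who wants to repeat the test above, here is a rough sketch of a
generator along those lines. It is not the prog.pl I used; it is a Python
stand-in, and the names make_doc and long-word-N.txt are made up for
illustration. It assumes the usual -S prog input format: a Path-Name and a
Content-Length header, a blank line, then exactly that many bytes of content.

```python
import sys


def make_doc(path, body):
    """Build one document in the assumed -S prog format:
    headers, a blank line, then exactly Content-Length bytes of content."""
    data = body.encode("ascii")
    header = f"Path-Name: {path}\nContent-Length: {len(data)}\n\n".encode("ascii")
    return header + data


def main():
    out = sys.stdout.buffer  # write bytes so Content-Length stays exact
    for n in (429_000, 2_000_000):
        # One very long "word" followed by a newline (hypothetical file names).
        out.write(make_doc(f"long-word-{n}.txt", "x" * n + "\n"))


if __name__ == "__main__":
    main()
```

Saved as prog.py (a hypothetical name), it would be piped into swish-e the
same way as above, with python3 prog.py in place of perl prog.pl. Swapping
the "x" for other bytes is one way to check whether particular characters,
rather than the length, trigger the crash.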
>I was first using swish-e from a snapshot from a week or two ago, but
>switched to today's CVS and the problem's still there.
Can you send me a test case off-list? Something like the example above?
>I'm not using libxml2 and I have not changed the MaxWordLimit parameter from
>Are there any other details that are important?
A backtrace from gdb might give some clues.
Received on Mon Sep 2 17:03:46 2002