Re: input conversion failed

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sun Oct 26 2003 - 01:11:03 GMT
On Sat, Oct 25, 2003 at 05:51:51PM -0700, J Robinson wrote:
> Hello Everyone;
> 
> Sometimes when indexing HTML using the HTML2 backend,
> I get messages like these from SWISH-E:
> 
> input conversion failed due to input error
> Bytes: 0x25 0x00 0x61 0x3E

That message is generated by libxml2, not by swish-e.  Code in swish-e
causes it to be printed, though, so there should be a way to print the
name of the file along with it.

> I know that it's multi-byte files that are causing
> the errors. Does anyone know if there's an easy
> workaround to avoid getting these, for example, to
> detect that a file is multi-byte in your -S prog and
> not index it?

I wonder if it's more a problem of libxml2 not figuring out the encoding
correctly -- or perhaps truly an invalid sequence of bytes for the given
encoding.  How to deal with it probably depends on what the problem is.
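
If the failing files turn out to be UTF-16 or UTF-32 (the 0x00 in the
reported bytes hints at that, but it is only a guess), a -S prog program
could check each file before handing it to swish-e.  Below is a minimal
Python sketch of such a check.  The function name and the 4096-byte probe
size are made up for illustration; nothing here is part of swish-e or
libxml2.  It will not catch other multi-byte encodings such as Shift_JIS,
and it also flags binary files that happen to contain NUL bytes.

```python
import codecs

# Byte-order marks that identify UTF-16 and UTF-32 input.
WIDE_BOMS = (
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
)


def looks_wide_encoded(data: bytes, probe: int = 4096) -> bool:
    """Heuristic: True if the data is probably UTF-16/UTF-32 (or binary).

    Checks the first `probe` bytes for a UTF-16/32 byte-order mark or
    for any NUL byte, which should not occur in single-byte or UTF-8
    HTML.  This is a guess, not a real encoding detector.
    """
    head = data[:probe]
    return head.startswith(WIDE_BOMS) or b"\x00" in head
```

In the program, something like "if looks_wide_encoded(raw): continue"
before printing the document to swish-e would skip those files.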

-- 
Bill Moseley
moseley@hank.org
Received on Sun Oct 26 01:24:38 2003