Re: error in use libxml2

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Mar 22 2005 - 16:27:02 GMT
On Tue, Mar 22, 2005 at 06:44:37AM -0800, ??????????? ?????? wrote:
> Hi
> I use swish 2.4.3 with libxml2.
> After updating libxml2 to 2.6.18 I get an error on documents that
> contain this text:
> <META http-equiv=Content-Type content="text/html; charset=iso-8859-2">
> 
> Swish dumps core while indexing.
> 
> #0  0xa80bd6b4 in ISO8859xToUTF8 () from /usr/local/lib/libxml2.so.5

Which version of libxml2 did you upgrade from?  There was a change in
libxml2 that required a patch in swish (the patch is in the development
version).

Still, that looks more like an error inside libxml2.

I wrote a small test program the other day -- maybe this can help
isolate the problem outside of swish.  If you need to post to the
libxml2 list it will be helpful to show the test code used.


#include <stdio.h>
#include <string.h>
#include <libxml/HTMLparser.h>
#include <libxml/encoding.h>   /* declares UTF8Toisolat1() */
#include <libxml/xmlerror.h>

typedef struct {
    htmlParserCtxtPtr    ctxt;
} USER_DATA;


static void char_hndl(void *data, const char *txt, int txtlen);

int main ( int argc, char **argv )
{
    htmlSAXHandler       SAXHandlerStruct;
    htmlSAXHandlerPtr    SAXHandler = &SAXHandlerStruct;
    htmlParserCtxtPtr    ctxt;
    USER_DATA            user_data;
    size_t res;
    FILE *f;
    char buf[1002];

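    /* argv[1] is used below, so at least one argument is required */
    if ( argc < 2 )
    {
        printf("usage: %s <html-file>\n", argv[0] );
        return 1;
    }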

    if ( !(f = fopen( argv[1], "r"))  )
    {
        printf("failed to open %s\n", argv[1] );
        return 1;
    }
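    res = fread( buf, 1, 1000, f );
    fclose( f );

    /* the first 5 bytes go to the push context, the rest to htmlParseChunk */
    if ( res <= 5 )
    {
        printf("failed to read enough data (need more than 5 bytes)\n");
        return 1;
    }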


    memset( SAXHandler, 0, sizeof( xmlSAXHandler ) );
    SAXHandler->characters = (charactersSAXFunc)&char_hndl;

    /* Create the context and read first few char */
    ctxt = htmlCreatePushParserCtxt(
                SAXHandler,
                &user_data,
                buf,
                5,
                argv[1],
                0
    );

    if ( !ctxt )
    {
        printf("failed to create parser context\n");
        return 1;
    }

    user_data.ctxt = ctxt;

    /* feed the rest of the buffer; the final 1 says we are done */
    htmlParseChunk( ctxt, &buf[5], res-5, 1 );
    htmlFreeParserCtxt( ctxt );



    return 0;
}


static void char_hndl(void *user_data, const char *txt, int txtlen)
{
    char    out[1000];
    int     outlen = sizeof(out) - 1;  /* leave room for the trailing '\0' */
    int     ret;


    printf("input length = %d\n", txtlen );

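    /* outlen and txtlen are updated to bytes written / bytes consumed */
    ret = UTF8Toisolat1( (unsigned char *)out, &outlen,
                         (const unsigned char *)txt, &txtlen );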

    if ( ret > 0 )
        printf("Wrote %d bytes to buffer. Consumed %d of input\n",
                outlen, txtlen );

    if ( ret >= 0 )
    {
        out[outlen] = '\0';
        printf("[%s]\n", out );
        return;
    }

    xmlParserWarning(
            (xmlParserCtxtPtr)((USER_DATA *)user_data)->ctxt,
            "Failed to convert. ret=%d\n", ret
    );
}
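
In case it helps when reading the callback above: the converter takes the
buffer sizes in *outlen and *inlen and overwrites them with the number of
bytes written and consumed.  Here is a rough, self-contained sketch of that
calling convention.  The names utf8_to_latin1 and convert_string are made up
for illustration and are not libxml2 functions, and the conversion itself is
deliberately simplified (for example, it rejects a truncated sequence at the
end of the input where a real converter might just stop there).

```c
#include <string.h>

/*
 * Hypothetical stand-in for a UTF-8 -> ISO-8859-1 converter.
 * On entry *outlen and *inlen hold the sizes of the two buffers; on return
 * they hold the bytes written and the bytes consumed.  Returns the number
 * of bytes written, or -2 if the input is malformed, truncated, or not
 * representable in Latin-1.
 */
int utf8_to_latin1(unsigned char *out, int *outlen,
                   const unsigned char *in, int *inlen)
{
    int i = 0, o = 0;

    while (i < *inlen && o < *outlen) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* plain ASCII is copied through unchanged */
            out[o++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < *inlen
                   && (in[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            out[o++] = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            /* report how far we got before giving up */
            *outlen = o;
            *inlen = i;
            return -2;
        }
    }

    *outlen = o;
    *inlen = i;
    return o;
}

/*
 * Convert a NUL-terminated UTF-8 string into dst (dstsize must be >= 1).
 * One byte is reserved for the terminator, which is the same precaution
 * the callback needs before writing out[outlen] = '\0'.
 */
int convert_string(const char *utf8, char *dst, int dstsize)
{
    int outlen = dstsize - 1;
    int inlen = (int)strlen(utf8);
    int ret = utf8_to_latin1((unsigned char *)dst, &outlen,
                             (const unsigned char *)utf8, &inlen);

    if (ret >= 0)
        dst[outlen] = '\0';
    return ret;
}
```

The point of the sketch is only the bookkeeping: the caller has to
initialise both lengths before each call and read them back afterwards,
and it must keep one byte spare if it wants to terminate the output.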


-- 
Bill Moseley
moseley@hank.org

Received on Tue Mar 22 08:27:08 2005