Re: [swish-e] html parse problem?

From: Bill Moseley <moseley(at)>
Date: Mon Feb 05 2007 - 14:42:32 GMT
On Mon, Feb 05, 2007 at 06:27:46AM -0800, Bill Moseley wrote:
> On Mon, Feb 05, 2007 at 12:25:24AM -0800, Jordan Hayes wrote:
> > I upgraded to 2.4.5 (from 2.4.3) today, but none of my Mailman archives 
> > will index anymore.
> > 
> > I've narrowed it down to this:
> > 
> > ./001946.html:3: error: htmlParseEntityRef: expecting ';'
> >    <A HREF="">Hi</A>
> >                                                          ^
> Yes, I've seen that.  It's an invalid entity according to libxml2.

Eh, not according to, but reported by.

You can set:

    ParserWarnLevel 0

to quiet that.

Or maybe find out where Pipermail is generating that broken HTML and
patch it so you are not sending invalid markup.
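
If you want to hunt down the offending pages before patching anything,
a rough scan like the one below may help.  This is only a sketch: the
regular expression is my own approximation of what "expecting ';'"
means (an '&' followed by a name or number with no closing ';'), it is
not libxml2's actual logic, and the function names are made up for
illustration.  An unescaped '&' in a URL query string is one common
way to trigger it, but that is a guess about this particular archive.

```python
import re
from pathlib import Path

# Heuristic only (an assumption, not libxml2's parser): an '&' followed by
# an entity name or numeric reference that is not closed by ';'.
UNTERMINATED = re.compile(
    r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*)(?![A-Za-z0-9;])"
)


def find_unterminated_entities(text):
    """Return (line, column, snippet) for each suspicious '&name' in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in UNTERMINATED.finditer(line):
            # Columns are 1-based, to line up with typical parser messages.
            hits.append((lineno, match.start() + 1, match.group(0)))
    return hits


def report(paths):
    """Print one line per suspicious reference found in the given files."""
    for name in paths:
        text = Path(name).read_text(encoding="utf-8", errors="replace")
        for lineno, col, snippet in find_unterminated_entities(text):
            print(f"{name}:{lineno}:{col}: unterminated entity {snippet!r}")
```

Calling report() with a list of the archive's .html files should list
the candidates.  Expect some false positives, for example inside
script blocks or comments.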

From 2.4.4:

# Changes to ParserWarnLevel

The default value for ParserWarnLevel was changed from zero to two.

ParserWarnLevel controls the error handling of the libxml2 parser.
The higher the setting, the more verbose the output. The default was
changed so that swish-e reports when libxml2 has problems parsing a
document (which often results in only part of the document being
processed).

To get the old behavior, either set ParserWarnLevel to zero in your
config file, or use the new -W command line option to set the
ParserWarnLevel at run time. If ParserWarnLevel is set in the config
file, it will override the -W option.

Also, to see UTF-8 to 8859-1 conversion errors, set ParserWarnLevel to
3 or more. Previously, these warnings were issued at a ParserWarnLevel of

Bill Moseley

