Re: [swish-e] htmlParseEntityRef: expecting ;

From: Peter Karman <peter(at)>
Date: Thu Aug 16 2007 - 02:55:33 GMT
Bohl, Thomas (StBA Krumbach) wrote on 6/25/07 10:58 AM:
> Hello everybody,
> A few hours ago I updated from 2.4.2 to 2.4.5. Now, when I create the index, I get hundreds of error messages:
> Indexing Data Source: "HTTP-Crawler"
> Indexing ""
> error: htmlParseEntityRef: expecting ';'
> <li><a href=nachricht.php?tn=1;&ID=326>Verbindungsabbrüche in Outlook</a><br>
>                                   ^
> error: htmlParseEntityRef: expecting ';'
> <li><a href=nachricht.php?tn=2;&ID=3>Einheitliche Datenablagestruktur und Dateib
>                                   ^
> error: htmlParseEntityRef: expecting ';'
> <li><a href=nachricht.php?tn=4;&ID=10>Informationen zum Datenaustausch mit GE-Of
>                                   ^
> ...and so on.
> I don't get it! Why is a semicolon expected there?
> The index is fine, but my error log is full of these messages.

Sorry for replying to such an old thread, Thomas. You may have figured out the
issue already, but for the archives:

In 2.4.5 the libxml2 parser warnings were turned on by default. You can set the 
level of warnings with the -W option at the command line, or in the config file 
with ParserWarnLevel (note that the config option overrides the -W flag if present).

The warning you are getting is most likely because your URLs contain a bare &
instead of &amp;. The parser reads "&ID" as the start of an entity reference and
expects a ';' to terminate it. &amp; is XML-compliant, while a bare & is not.
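
For the archives, here is a minimal Python sketch of that escaping. It is only an
illustration: the URL comes from the error output quoted above, and the helper
name href_attr is hypothetical (it is not part of swish-e or libxml2).

```python
from html import escape


def href_attr(url: str) -> str:
    """Build a quoted href attribute with the URL escaped for HTML/XML."""
    # escape() rewrites '&' as '&amp;'; quote=True also escapes quote characters
    # so the value is safe inside a double-quoted attribute.
    return 'href="' + escape(url, quote=True) + '"'


# Hypothetical link taken from the warning text, with a single '&' delimiter.
print(href_attr("nachricht.php?tn=1&ID=326"))
# prints: href="nachricht.php?tn=1&amp;ID=326"
```

With the ampersand written as &amp;, the parser no longer sees "&ID" as an
unterminated entity reference.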

In fact, it looks like your query strings use two delimiters, ';' and '&'. Only
one is necessary, and ';' is preferred because it needs no escaping in XML (and
is shorter than &amp;).
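
A small Python sketch of building the link with a single delimiter follows. The
function name build_query and the parameter values are illustrative assumptions,
and whether the script behind the link accepts ';' as a separator depends on the
server-side configuration, so treat this as a sketch rather than a drop-in fix.

```python
from urllib.parse import quote_plus


def build_query(params: dict, sep: str = ";") -> str:
    """Join key=value pairs with exactly one delimiter (hypothetical helper)."""
    # Each key and value is percent-encoded so the delimiter cannot appear
    # inside them by accident.
    return sep.join(
        f"{quote_plus(str(key))}={quote_plus(str(value))}"
        for key, value in params.items()
    )


# Same parameters as the first link in the error output, one delimiter only.
print("nachricht.php?" + build_query({"tn": 1, "ID": 326}))
# prints: nachricht.php?tn=1;ID=326
```

Because no '&' appears in the resulting URL, nothing in it needs entity escaping
when it is placed in an href attribute.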

