Re: XML2 parser error?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Aug 03 2004 - 03:43:43 GMT
On Mon, Aug 02, 2004 at 09:49:54AM -0700, Peter Karman wrote:
>  > I use the prog option to generate XML documents to be indexed, using the
>  > XML2 parser. To make sure that the XML2 parser does not break, I do
>  > HTML::Entities::encode_entities on the text that I enclose in XML tags.
> 
> 
> I wonder if instead the encode_entities is having the opposite effect. 
> Have you tried NOT using encode_entities at all to see what happens?

Well, if the text contains any of the five characters that XML
predefines entities for (<, >, &, " and '), then you will need to
escape them.  (Well, at least < and & always have to be escaped; >
only strictly needs it when it follows ]].)

You have to be a little careful with HTML::Entities as it can create
named entities that are valid in HTML but would need to be specified
in the DTD to be valid XML  (I think &copy; is an example).

$ perl -MHTML::Entities -le 'print encode_entities("<>foo&bar©")'
&lt;&gt;foo&amp;bar&copy;
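
One way to avoid HTML-only named entities such as &copy; is to escape just the five characters XML predefines entities for and leave everything else alone. Here is a minimal sketch of that idea in plain Perl. The helper name xml_escape and the %XML_ENTITY table are made up for illustration and are not part of HTML::Entities or swish-e. (HTML::Entities' encode_entities also accepts a second argument listing the characters to encode, which may be enough if you would rather keep using the module.)

```perl
use strict;
use warnings;

# Hypothetical helper, not from HTML::Entities: maps each of the five
# characters with predefined XML entities to its entity reference.
my %XML_ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

sub xml_escape {
    my ($text) = @_;
    # Single pass over the string, so the '&' in an already-emitted
    # entity is never escaped a second time.
    $text =~ s/([&<>"'])/$XML_ENTITY{$1}/g;
    return $text;
}

print xml_escape('<>foo&bar'), "\n";   # prints &lt;&gt;foo&amp;bar
```

Any other character (a copyright sign, say) passes through unchanged, so the generated XML has to be in an encoding the parser expects.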

-- 
Bill Moseley
moseley@hank.org
Received on Mon Aug 2 20:44:05 2004