On Mon, Aug 02, 2004 at 09:49:54AM -0700, Peter Karman wrote:
> > I use the prog option to generate XML documents to be indexed, using the
> > XML2 parser. To make sure that the XML2 parser does not break, I do
> > HTML::Entities::encode_entities on the text that I enclose in XML tags.
> I wonder if instead the encode_entities is having the opposite effect.
> Have you tried NOT using encode_entities at all to see what happens?
Well, if the text has any of the standard five internal entities then
you will need to escape them. (Well, at least <, > and & have to be
escaped.)
You have to be a little careful with HTML::Entities as it can create
named entities that are valid in HTML but would need to be specified
in the DTD to be valid XML (I think &copy; is an example).
$ perl -MHTML::Entities -le 'print encode_entities("<>foo&barę")'
Received on Mon Aug 2 20:44:05 2004