Re: problem preserving specific special characters/unicodes characters

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Mar 11 2004 - 13:41:12 GMT
On Wed, Mar 10, 2004 at 11:33:52PM -0800, Prashant Badhe wrote:
> Hi,
> 
>     Can anybody give some idea about how to preserve some specific
> characters such as copy right symbol, endash, emdash, smart quotes etc.
> that are appearing in our input XML files??

Libxml2 converts to utf-8.  (Entities are also converted by libxml2.)
Swish-e is 8-bit only, so it has to convert utf-8 to an 8-bit encoding,
which is currently hard-coded to 8859-1.  Characters that have no
8859-1 equivalent are lost in that conversion.

My guess is your source is encoded in Windows-1252, which contains
characters that do not map to 8859-1.  I thought copyright was ok,
though.  Trademark will not convert.
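
To see which of the characters you mention can make it into 8859-1, here
is a small Python sketch.  It is only an illustration of the encoding
limits, not Swish-e's actual conversion code; the names SAMPLES,
fits_latin1 and survivors are made up for this example.

```python
# Illustration only: checks which characters have an ISO-8859-1 code point.
# This is not how Swish-e does the conversion internally.

# Characters from the question, written as Unicode escapes.
SAMPLES = {
    "copyright": "\u00a9",
    "trademark": "\u2122",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "left double quote": "\u201c",
    "right double quote": "\u201d",
}


def fits_latin1(ch: str) -> bool:
    """Return True if the character can be encoded as ISO-8859-1."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def survivors(samples: dict) -> dict:
    """Map each sample name to whether it survives an 8859-1 conversion."""
    return {name: fits_latin1(ch) for name, ch in samples.items()}


if __name__ == "__main__":
    for name, ok in survivors(SAMPLES).items():
        print(f"{name:20s} {'kept' if ok else 'lost'}")
```

Copyright (U+00A9) is part of 8859-1, so it should be reported as kept.
The dashes, smart quotes and trademark sit in the 0x80-0x9F range of
Windows-1252, which has no printable characters in 8859-1.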


-- 
Bill Moseley
moseley@hank.org
Received on Thu Mar 11 05:41:24 2004