Re: Indexing International Files

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Aug 24 2004 - 12:56:34 GMT
On Tue, Aug 24, 2004 at 12:19:28AM -0700, Roman Chyla wrote:
> (note also, you may index utf-8 with libxml2)

libxml2 can parse many encodings, including UTF-8, and libxml2
outputs UTF-8.  But swish-e converts the output from libxml2 to
ISO-8859-1, so you can't really index UTF-8 in swish-e.  If the source
document contains characters that do not map to ISO-8859-1, you will
get a warning and each such character will be replaced with a space.
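
To illustrate the effect, here is a rough Python sketch of that kind of
conversion.  It is not swish-e's actual code (swish-e is written in C),
and the function name to_latin1 is made up for this example.  It only
shows what happens to text when every character without an ISO-8859-1
equivalent is replaced with a space.

```python
def to_latin1(text):
    """Encode text as ISO-8859-1, replacing unmappable characters with a space.

    Returns the encoded bytes and a list of the characters that were replaced,
    which a caller could use to print a warning.
    """
    out = bytearray()
    unmapped = []
    for ch in text:
        try:
            out += ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            # No ISO-8859-1 equivalent: remember it and substitute a space.
            unmapped.append(ch)
            out += b" "
    return bytes(out), unmapped


# "caf\u00e9" contains e-acute, which maps to ISO-8859-1;
# "\u20ac" is the euro sign, which does not.
converted, unmapped = to_latin1("caf\u00e9 \u20ac5")
print(converted)
print(unmapped)
```

In this sketch the e-acute survives as the single byte 0xE9, while the
euro sign becomes a space, so a search for it could never match.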

-- 
Bill Moseley
moseley@hank.org

Received on Tue Aug 24 05:57:05 2004