Re: Swish roadmap

From: Peter Karman <peter(at)>
Date: Thu Mar 31 2005 - 15:05:24 GMT

Roman Chyla scribbled on 3/31/05 7:02 AM:
>>By the way, could anyone on the list recommend a UTF-8 capable indexing tool?
> Lucene

Lucene does UTF-8.

> I believe that htdig can index u8 too

ht://Dig does not. From its FAQ:

4.27. How can I get htdig to index Chinese, Japanese or Korean text?

You can't do that yet. Current versions of ht://Dig only support 8-bit 
characters, so languages such as Chinese, Japanese and Korean, which require 
16-bit characters, are not currently supported. The same goes for documents in 
any language if the document is encoded in anything but simple 8-bit character 
sets. Unicode and UTF-8 documents are not supported. There are long-range plans 
to add support for these, but it's a huge task that no developer has taken up yet.

That last sentence sounds just like Swish-e...
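
For anyone wondering why the 8-bit limitation in that FAQ entry rules out UTF-8, here is a rough Python sketch. It is only an illustration and is not taken from ht://Dig or Swish-e; the sample string and variable names are made up. It simulates a byte-per-character tool by decoding UTF-8 bytes as Latin-1.

```python
# Illustration only: how a byte-per-character indexer would see UTF-8 input.
text = "日本語 text"  # hypothetical sample: 3 CJK characters, a space, 4 ASCII letters

utf8_bytes = text.encode("utf-8")

# Each of these CJK characters takes three bytes in UTF-8, so the byte
# count is larger than the character count.
char_count = len(text)
byte_count = len(utf8_bytes)

# A tool that assumes 8-bit characters treats every byte as one character.
# Decoding the same bytes as Latin-1 simulates that view of the data.
as_8bit = utf8_bytes.decode("latin-1")

print(char_count, byte_count, len(as_8bit))
print(as_8bit == text)
```

The ASCII part survives unchanged, but each multi-byte character is split into unrelated 8-bit "characters", so the words an 8-bit indexer stores would not match what a user searches for.
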
Peter Karman  .  .  peter(at)
Received on Thu Mar 31 07:05:36 2005