Re: Fw: Re: 8-bit chars

From: david moreau <davidjmoreau(at)>
Date: Tue Dec 16 2003 - 01:08:13 GMT
Open source means if you want a feature, you can implement it. But I think
Bill probably has more urgent things to fix. Time is a limited resource and
every feature implemented involves opportunity costs.

The main problem I see is that a search engine should return relevant and
complete results. To get such results using the proposed scheme, you need
to know the numeric representation of each letter in each encoding and map
between them. Otherwise, when a user types 'dog' on your web site and clicks
on search, you might miss many relevant 'dog' documents while retrieving
many irrelevant 'cat' documents (I'm alluding to the earlier example).
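
To make the mapping problem concrete, here is a minimal Python sketch. It is
not swish-e code; the helper names and the sample word are my own
assumptions, chosen only for illustration. It shows that one byte value
stands for different letters in different 8-bit encodings, so comparing raw
bytes can match the wrong documents unless each document is first mapped to
a common representation.

```python
# Illustration only: hypothetical helpers, not part of swish-e.

RAW = b"\xe4"  # one byte, many possible meanings
ENCODINGS = ["iso-8859-1", "iso-8859-5", "iso-8859-7", "koi8-r"]


def decode_all(raw, encodings):
    """Decode the same bytes under each encoding."""
    return {enc: raw.decode(enc) for enc in encodings}


def byte_match(query, query_encoding, raw_doc):
    """Naive search: compare raw bytes, ignoring the document's encoding."""
    return query.encode(query_encoding) in raw_doc


def mapped_match(query, raw_doc, doc_encoding):
    """Search after mapping the document to Unicode text."""
    return query in raw_doc.decode(doc_encoding)


decoded = decode_all(RAW, ENCODINGS)

# A Russian word stored as KOI8-R; its first byte is 0xE4.
doc = "Дом".encode("koi8-r")

# The user types a Latin-1 'ä', which is also the byte 0xE4.
false_hit = byte_match("ä", "iso-8859-1", doc)
real_hit = mapped_match("ä", doc, "koi8-r")

if __name__ == "__main__":
    for enc, ch in decoded.items():
        print(enc, repr(ch))
    print("byte-level match:", false_hit)
    print("match after mapping:", real_hit)
```

The byte-level comparison reports a hit even though the document does not
contain the letter the user typed. The mapped comparison avoids that, but
only because the encoding of the document is known in advance.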

Is that going to make swish-e a better tool?

dave moreau

Bill wrote:
>> We agreed that utf-8 is the right thing, but who knows when it will be
>> implemented.
>> I repeat the question - what is the alternative until utf-8 support is
>> implemented? You don't have one. Proposed solution is something which can
>> be used in the meantime.
>Ok.  Send the patches.
Received on Tue Dec 16 01:08:20 2003