Re: Problems with sorting German Umlaut

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Jan 31 2005 - 17:26:51 GMT
On Mon, Jan 31, 2005 at 05:33:03AM -0800, swishe wrote:
> Hello swish-e group,

Hello swishe!

> 
> we are using swish-e 2.4.3.
> We feed swish-e by putting XML files into it.
> One xml tag which is used for sorting the search result
> contains german umlauts.

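Properties are pre-sorted at indexing time.  The function that does
this is called Compare_Properties() in docprop.c.  For string
properties flagged as "case:compare" it uses the library function
strncmp(), which compares raw byte values and does not take
LC_COLLATE into consideration.  For strings marked as "case:ignore"
it uses strncasecmp(), which is locale-sensitive only in how it folds
case (on common C libraries such as glibc that is governed by
LC_CTYPE); it does not collate according to LC_COLLATE either.

If your property is flagged as case:ignore then check your locale
(LC_CTYPE or LC_ALL) setting, but note that this affects case folding
only, not where umlauts sort relative to other letters.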
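
To illustrate the strncmp() behavior described above, here is a
minimal sketch, not the swish-e code.  It assumes the property values
are ISO-8859-1 encoded (so "Ä" is the single byte 0xC4); the names
sort_bytewise(), bytewise_cmp() and MAX_CMP are made up for the
example.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_CMP 64  /* hypothetical upper bound on bytes compared */

/* qsort() comparator: plain byte-wise comparison, locale is ignored. */
static int bytewise_cmp(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strncmp(*sa, *sb, MAX_CMP);
}

/* Sort an array of C strings the way a strncmp()-based sort would. */
void sort_bytewise(const char **words, size_t count)
{
    qsort(words, count, sizeof words[0], bytewise_cmp);
}
```

Sorting "\xC4" "rger" (Ärger), "Zebra" and "Apfel" this way gives
Apfel, Zebra, Ärger: strncmp() compares bytes as unsigned char, and
0xC4 is larger than any ASCII letter, so the umlaut lands after "Z"
no matter what LC_COLLATE says.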

There's a strcoll() function to replace strcmp(), but the code would
need to be rewritten since the strings are not null terminated.
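
One possible shape for such a rewrite, as a rough sketch only: copy
each length-counted buffer into a temporary null-terminated string
and hand those to strcoll().  The names collate_counted() and
dup_counted() are hypothetical and do not exist in swish-e, and the
sketch assumes the buffers contain no embedded NUL bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a counted buffer into a newly allocated, null-terminated string. */
static char *dup_counted(const char *buf, size_t len)
{
    char *s = malloc(len + 1);
    if (s != NULL) {
        memcpy(s, buf, len);
        s[len] = '\0';
    }
    return s;
}

/* Compare two counted buffers using the current LC_COLLATE rules. */
int collate_counted(const char *a, size_t alen, const char *b, size_t blen)
{
    char *sa = dup_counted(a, alen);
    char *sb = dup_counted(b, blen);
    int result;

    if (sa == NULL || sb == NULL) {
        /* Allocation failed: fall back to a byte-wise comparison. */
        size_t n = alen < blen ? alen : blen;
        result = memcmp(a, b, n);
        if (result == 0)
            result = (alen > blen) - (alen < blen);
    } else {
        result = strcoll(sa, sb);
    }

    free(sa);   /* free(NULL) is a no-op */
    free(sb);
    return result;
}
```

The program would have to call setlocale(LC_COLLATE, "") (or
LC_ALL) once at startup for strcoll() to pick up the user's locale;
without that call it runs in the "C" locale and orders strings the
same way strcmp() does.  Allocating on every comparison is slow for
a large sort, so a real change would likely reuse buffers or use
strxfrm() to precompute sort keys.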

You can check your property's case setting by running

   swish-e -f myindex -T index_metanames

Use PropertyNamesIgnoreCase to set properties to ignore case.



-- 
Bill Moseley
moseley@hank.org

Received on Mon Jan 31 09:26:52 2005