Re: Problems with sorting German Umlaut

From: Bill Moseley <moseley(at)>
Date: Mon Jan 31 2005 - 17:26:51 GMT
On Mon, Jan 31, 2005 at 05:33:03AM -0800, swishe wrote:
> Hello swish-e group,

Hello swishe!

> we are using swish-e 2.4.3.
> We feed swish-e by putting XML files into it.
> One xml tag which is used for sorting the search result
> contains german umlauts.

Properties are pre-sorted at indexing time.  The function that does
this is called Compare_Properties() in docprop.c.  For string
properties flagged as "case:compare" it uses the library function
strncmp(), which does not take LC_COLLATE into consideration.  For
strings marked as "case:ignore" it uses strncasecmp(), which does check
the locale.

If your property is flagged as case:ignore then check your locale
(LC_COLLATE) setting.
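
To illustrate why the bytewise comparison puts umlauts in the wrong
place, here is a minimal sketch. The helper name is made up for this
example and is not swish-e code, and it assumes the property text is
Latin-1 encoded, where "Ä" is the single byte 0xC4:

```c
#include <string.h>

/* Hypothetical helper, not part of swish-e: returns nonzero when `a`
 * sorts before `b` under a plain bytewise strncmp() comparison of the
 * first `n` bytes.  strncmp() compares bytes as unsigned char and
 * ignores the locale entirely. */
int sorts_before_bytewise(const char *a, const char *b, size_t n)
{
    return strncmp(a, b, n) < 0;
}
```

With this, sorts_before_bytewise("Zebra", "\xC4rger", 5) is true: the
byte 0xC4 is larger than any ASCII letter, so "Ärger" lands after
"Zebra" instead of near "Apfel", where a German reader would look for it.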

There's a strcoll() function that could replace strncmp(), but the code
would need to be rewritten: strcoll() takes no length argument and
expects null-terminated strings, and the property strings are not
null-terminated.
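
One way such a rewrite might look, purely as a sketch: copy each
length-delimited buffer into a temporary null-terminated string and hand
those to strcoll(). The function name and signature below are
hypothetical and do not match what Compare_Properties() actually does.
It also assumes the buffers contain no embedded NUL bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of swish-e: compare two buffers that are
 * NOT null-terminated, using the current LC_COLLATE rules. */
int collate_buffers(const char *a, size_t a_len,
                    const char *b, size_t b_len)
{
    char *a_copy = malloc(a_len + 1);
    char *b_copy = malloc(b_len + 1);
    int result;

    if (a_copy == NULL || b_copy == NULL) {
        /* Out of memory: fall back to a bytewise comparison. */
        size_t n = a_len < b_len ? a_len : b_len;
        result = memcmp(a, b, n);
        if (result == 0 && a_len != b_len)
            result = a_len < b_len ? -1 : 1;
    } else {
        /* Make null-terminated copies so strcoll() can be used. */
        memcpy(a_copy, a, a_len);
        a_copy[a_len] = '\0';
        memcpy(b_copy, b, b_len);
        b_copy[b_len] = '\0';
        result = strcoll(a_copy, b_copy);
    }

    free(a_copy);   /* free(NULL) is a no-op */
    free(b_copy);
    return result;
}
```

The caller would need to call setlocale(LC_COLLATE, "") once at startup;
without that the program stays in the "C" locale and strcoll() behaves
like strcmp(). Allocating on every comparison is slow for a large sort,
so a real change would probably reuse buffers or precompute keys with
strxfrm().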

You can check your property's case setting by running:

   swish-e -f myindex -T index_metanames

Use PropertyNamesIgnoreCase to set properties to ignore case.

Bill Moseley

Received on Mon Jan 31 09:26:52 2005