Re: Problems indexing german umlauts

From: Sven Schupp <sven.schupp(at)not-real.gmx.net>
Date: Mon May 10 2004 - 10:03:53 GMT
Hi,

Bill Moseley wrote:
> On Fri, May 07, 2004 at 07:45:51AM -0700, Sven Schupp wrote:
> moseley@bumby:~$ swish-e -w Überbrückungsgeld
> # SWISH format: 2.5.1
> # Search words: Überbrückungsgeld
> # Removed stopwords: 
> # Number of hits: 1
> # Search time: 0.001 seconds
> # Run time: 0.055 seconds
> 1000 uber "uber" 19
> .
> 
> (and lower:)
> 
> moseley@bumby:~$ swish-e -w überbrückungsgeld
> # SWISH format: 2.5.1
> # Search words: überbrückungsgeld
> # Removed stopwords: 
> # Number of hits: 1
> # Search time: 0.001 seconds
> # Run time: 0.049 seconds
> 1000 uber "uber" 19
> .
> 
> Can you repeat the above?

Yes, I can reproduce that. So it does not seem to be a problem in
swish-e, but a Perl problem: I tried to lowercase the umlauts in my
CGI script with lc($searchword), but that did not work.

My solution for now is (line 1019 in swish.cgi):

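     for ( $query ) {
         # lowercase the German umlauts that lc() left untouched;
         # /g so that every occurrence is replaced, not just the first
         s/Ü/ü/g;
         s/Ä/ä/g;
         s/Ö/ö/g;
         # trim trailing and leading whitespace
         s/\s+$//;
         s/^\s+//;
     }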
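
A possible reason lc() had no effect: on a plain byte string, without "use locale" or a decoded character string, Perl's lc() only changes ASCII letters. A more general approach might be to decode the query first, lowercase it, and encode it again. The following is only a rough sketch of that idea, not what swish.cgi does. The helper name normalize_query is made up for illustration, and ISO-8859-1 is an assumption about the encoding the form submits; it would need to match the real one.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of swish.cgi): lowercase and trim a
# query that arrives as raw bytes in the given charset.
sub normalize_query {
    my ($bytes, $charset) = @_;
    my $text = decode($charset, $bytes);   # bytes -> Perl character string
    $text = lc $text;                      # lc() now also handles Ü, Ä, Ö
    $text =~ s/^\s+|\s+$//g;               # trim both ends
    return encode($charset, $text);        # back to bytes for the search
}

# Assumed input: " Überbrückungsgeld " as ISO-8859-1 bytes.
my $query = normalize_query(" \xDCberbr\xFCckungsgeld ", 'ISO-8859-1');
print $query eq "\xFCberbr\xFCckungsgeld" ? "ok\n" : "not ok\n";
```

Unlike the three substitutions above, this would also cover other accented capitals (É, À and so on), at the cost of having to know the input encoding.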

Thanks for the hint,

sven
Received on Mon May 10 03:03:54 2004