Re: URL case during multiple index search

From: Bill Moseley <moseley(at)>
Date: Mon Sep 30 2002 - 01:25:12 GMT

At 05:55 PM 09/29/02 -0700, Trond Nilsen wrote:
>Am I right in assuming that when Swish-E performs a search on multiple
>indexes and the results are merged, the merge is case-sensitive?

Results are not merged when searching multiple indexes. Here the same file
is indexed into two indexes, and it comes back once per index:

~/swish-e/src $ ./swish-e -i index.c -f 1 -v0
~/swish-e/src $ ./swish-e -i index.c -f 2 -v0 
~/swish-e/src $ ./swish-e -w not dkdkd -f 1 2 -H0
1000 index.c "index.c" 81446
1000 index.c "index.c" 81446
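
Since both hits come back, any de-duplication has to happen after the
search. A minimal post-processing sketch in Python, assuming each result
line has the shape shown above (rank, path, quoted title, size) and that
the path contains no whitespace. The name dedupe_results is made up for
this example and is not part of Swish-E:

```python
def dedupe_results(lines):
    """Keep the first result for each path, comparing paths case-insensitively.

    Assumes each line looks like: rank path "title" size
    and that the path itself contains no whitespace.
    """
    seen = set()
    unique = []
    for line in lines:
        fields = line.split(None, 2)  # rank, path, rest of the line
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        key = fields[1].lower()
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique
```

Feed it the lines printed by the search and print what it returns. The
first hit for each path wins, so the rank ordering is preserved.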

Are you talking about a -M merge, where the indexes are merged first and
the combined single index is then searched?

>So, is there any way to get Swish to ignore case when merging? I'm having 
>trouble spidering a large site over which I have no editorial control, where 
>the writers have been lazy and specified pages with both cases. I can solve 
>the problem with some post-processing, but I figured I'd check first :)

If you are talking about -M merge then check out:

I think you can set the swishdocpath as case-insensitive.

The other thing is to lowercase the URL when spidering by editing the
spider program ( swishspider or ).
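
A rough sketch of that kind of normalization, written in Python rather than
the spider's own Perl and meant only as an illustration: lowercase_url is a
hypothetical helper, not something in the Swish-E spider. It assumes the
server treats paths case-insensitively, and it leaves the query string
alone since that is often case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url):
    """Lowercase the scheme, host and path of a URL.

    The query string and fragment are left untouched. Only safe when the
    server treats paths case-insensitively (an assumption here).
    """
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),
        parts.query,
        parts.fragment,
    ))
```

Applying the same function to every URL before it is stored means that two
spellings of one page end up with the same path in the index.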

The right solution is to convince the site owner to fix their broken URLs.
All it would take is a short Perl script...

Bill Moseley
