Re: Fw: Re: 8-bit chars

From: John Angel <angel_john(at)not-real.hotmail.com>
Date: Sun Dec 14 2003 - 20:05:42 GMT
Exactly. The results depend on the charset specified in the search form
(and then passed to the search script).
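
A minimal sketch of why the form charset matters, assuming the index
stores raw 8-bit bytes with no record of their encoding. ISO-8859-1 and
ISO-8859-2 are stand-ins chosen for illustration, not the charsets named
in the thread, and encode_query is a hypothetical helper, not part of
swish-e:

```python
# Illustration only: the same bytes are different words under different charsets.

def encode_query(term: str, form_charset: str) -> bytes:
    """Hypothetical helper: turn a search term into the bytes the
    search script would receive for a given form charset."""
    return term.encode(form_charset)

# An Icelandic word, stored in the index as ISO-8859-1 bytes.
indexed = encode_query("það", "iso-8859-1")

# The same byte sequence, read by someone viewing in ISO-8859-2.
as_latin2 = indexed.decode("iso-8859-2")

# A searcher typing that ISO-8859-2 string produces identical bytes,
# so a byte-level index cannot tell the two words apart.
collides = encode_query(as_latin2, "iso-8859-2") == indexed

print(indexed, as_latin2, collides)
```

Matching on bytes alone therefore returns the same hits for both terms,
which is why the charset sent with the form decides what the results mean.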


----- Original Message ----- 
From: "Frances Coakley" <frances@fcoakley.net>
To: "Multiple recipients of list" <swish-e@sunsite.berkeley.edu>
Sent: Sunday, December 14, 2003 20:15
Subject: [SWISH-E] Re: Fw: Re: 8-bit chars


>
> > > There is NO WAY to store more than one encoding in the index as it is
> > > currently designed.
>
> Doesn't the meta charset give you the encoding used in the original
> document? Assuming that the 8-bit chars are the more unusual chars, it
> is possible that a word in the Icelandic charset maps onto the same
> sequence of 8-bit chars as a different word in the Norse charset. But
> if the searcher is viewing with the charset set to Icelandic, then
> searching for Meta Charset=Icelandic and word=whatever will find the
> Icelandic word. Those pages not encoded under the Icelandic charset
> cannot by definition contain that char.
> Or have I misunderstood the problem?
> Frances Coakley - website http://www.manxnotebook.com
>
>
Received on Sun Dec 14 20:05:48 2003