> Think of your suggestion. One document is 1250 and it includes a word
> with the "d"-slash character. That word gets indexed -- since the index
> stores numbers (not characters) that stored word includes the F0 byte.
> The next document is in 8859-1 and it includes some word with the "eth"
> character (it's an Icelandic document, I suppose) and that gets indexed,
> and again there's a word that includes byte F0 in the index.
> Now you have a value in the index "F0" that represents more than one
> character. So when searching are you looking for a 1250 char or 8859-1
> char? You can't tell.
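
To make the quoted scenario concrete, here is a minimal Python sketch. It assumes "1250" means the Windows-1250 code page (Python codec name cp1250) and that the index stores raw bytes, as described above. The document names, the sample tokens and the build_byte_index helper are hypothetical, made up only for illustration.

```python
# The same byte, read under the two charsets from the discussion.
RAW = b"\xf0"
as_cp1250 = RAW.decode("cp1250")    # U+0111, d with stroke
as_latin1 = RAW.decode("latin-1")   # U+00F0, eth


def build_byte_index(docs):
    """Map each word's raw bytes to the set of documents containing it.

    docs: {doc_id: (charset, text)}. Hypothetical structure; the charset
    is used only to produce the bytes and is NOT stored in the index.
    """
    index = {}
    for doc_id, (charset, text) in docs.items():
        for word in text.split():
            index.setdefault(word.encode(charset), set()).add(doc_id)
    return index


# Two made-up one-word documents whose tokens differ only in the last letter.
DOCS = {
    "doc_cp1250": ("cp1250", "me\u0111"),    # ends in d with stroke
    "doc_latin1": ("latin-1", "me\u00f0"),   # ends in eth
}

index = build_byte_index(DOCS)
hits = index[b"me\xf0"]  # one byte-level key, both documents match
```

Both tokens collapse to the single key b"me\xf0", so a byte-level lookup returns both documents. The index alone cannot say which character was meant; that information lives only in each document's declared charset.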
It doesn't matter, as long as you find that character.
Why doesn't it matter? Because I will put the charset directly in the HTML. The search
script just has to find F0 every time; it is not important what character is
Received on Fri Dec 12 15:43:57 2003