Sorry for two posts in two days. I hope that I can solve this one as
easily as yesterday's...
Finally, I had time to work on my multilanguage search.
Mostly it is going swimmingly, using the following conf:
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.
IgnoreFirstChar .-
IgnoreLastChar .-
BeginCharacters abcdefghijklmnopqrstuvwxyz0123456789
EndCharacters abcdefghijklmnopqrstuvwxyz0123456789
IndexReport 2
IgnoreTotalWordCountWhenRanking yes
IndexComments 0
BumpPositionCounterCharacters |.
FuzzyIndexingMode Stemming_es
DefaultContents XML
MetaNames sphotogs categories sort_date qphotographer image_restrictions id agents_off crop profile
UndefinedMetaTags index
PropertyNamesDate sort_date
PropertyNames id photographer subject released orig_id date_shot weight image_restrictions short_caption tsize siteowner adweight
SwishProgParameters sp
This hands off to my script, which creates XML with the declaration:
<?xml version='1.0' encoding="ISO-8859-1"?>
The indexing is going through great, and searches on the word espana
(with the ~ over the n) return the correct results, as do searches on
all words with Spanish characters.
It would seem that the problem has been solved, but it has not. Now I
need to get espana (without the Spanish n) to return the same set of
results, but alas, I have not been able to. I tried the
TranslateCharacters option, but that ended up removing all the results.
I could do some kung fu in the indexer to index both versions of the
words, but it seems clunky.
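To make the "index both versions" idea concrete, here is a minimal
sketch in Python. It assumes the indexer script can post-process each
word before writing it into the XML. The helper names fold_accents and
both_forms are hypothetical and are not taken from the actual script,
which is not shown in this post.

```python
import unicodedata


def fold_accents(word):
    """Return the word with combining marks stripped (n-tilde becomes n)."""
    # NFD splits an accented letter into its base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", word)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def both_forms(word):
    """Return the original word, plus its unaccented form when that differs."""
    folded = fold_accents(word)
    if folded == word:
        return [word]
    return [word, folded]
```

Emitting both forms for every accented word would let either spelling
match, at the cost of a larger index and duplicated word positions,
which is the clunky part.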
So any advice is appreciated, and since I am most likely leaving out
critical pieces, I am fully prepared for Bill to send me requests for
more specifics.
Brad
------------------------------------------------------------
Brad Miele
Technology Director
AuroraPhotos.com
(207) 828-8787 x110
bmiele@auroraphotos.com
Oh, I don't blame Congress. If I had $600 billion at my disposal, I'd
be irresponsible, too.
-- Lichty & Wagner
Received on Wed Feb 25 10:57:37 2004