I am working on using swish-e and the included spider to index a web
site which uses the occasional accented character. The most common
one is the acute e, or in html: &eacute;. A specific example is entr&eacute;e.
So, from what I can tell, the html encoding of entr&eacute;e is
being stored as entrée (the accented e translated to its proper
encoding) in the database. I can then search for entree (unaccented
e) and get results that had the html encoded entr&eacute;e.
However, and this is the problem I need to solve: the results are
returned as entrée (the accented e translated to its proper encoding)
rather than the html encoded entr&eacute;e. I need to have the text
as it was originally presented, not as it was translated. What is the
best way to do this?
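To make the three forms concrete, here is a minimal Python sketch of what I believe is happening, plus the kind of re-encoding step I would like to end up with. This is not swish-e code and it is not how swish-e works internally; the function names are hypothetical and it uses only the Python standard library. It is only meant to illustrate the entity form, the decoded form, and the unaccented search form.

```python
import html
import unicodedata
from html.entities import codepoint2name


def decode_entities(text):
    """Turn HTML entities into characters: 'entr&eacute;e' -> 'entrée'."""
    return html.unescape(text)


def fold_to_ascii(text):
    """Drop accents so a plain-ASCII query can match: 'entrée' -> 'entree'.

    This only approximates what an ASCII translation step does; it is an
    assumption for illustration, not swish-e's actual table.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def encode_entities(text):
    """Re-encode non-ASCII characters as HTML entities for display.

    ASCII characters (including &, < and >) are left untouched.
    """
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code in codepoint2name:
            parts.append("&" + codepoint2name[code] + ";")   # named entity
        else:
            parts.append("&#" + str(code) + ";")             # numeric fallback
    return "".join(parts)


if __name__ == "__main__":
    original = "entr&eacute;e"
    stored = decode_entities(original)
    print(stored)                   # the form I get back in results
    print(fold_to_ascii(stored))    # the form I can search with
    print(encode_entities(stored))  # the form I want in results
```

If nothing in swish-e itself can return the original text, a post-processing step along the lines of encode_entities applied to each result would be an acceptable workaround for me.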
I am using SWISH-E 2.4.4 on Gentoo Linux.
In my config I have set TranslateCharacters :ascii7:
Anything else you need to know?
Thanks in advance for any help.
Users mailing list
Received on Wed Oct 24 16:56:20 2007