Re:

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Apr 19 2002 - 14:08:05 GMT
At 06:53 AM 4/19/2002 -0700, Cristiano Corsani wrote:
>I have xml data in iso8859 encoding. Does swish support it? How can I manage
>strings like this:
>
>"Wolfgang St&#117;&#782;rner"
>
>does swish index such special character? how can I solve the problem?

It's easy to check:

> echo 'Wolfgang St&#117;&#782;rner' > 1.html

> ./swish-e -i 1.html -v0 -T indexed_words   
Indexing Data Source: "File-System"
    Adding:[1:swishdefault(1)]   'wolfgang'   Pos:1  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   'sturner'   Pos:2  Stuct:0x1 ( FILE )
Indexing done!

swish-e uses its own entity converter for HTML, and libxml2 has its own,
so we might as well check both (DefaultContents HTML2 selects the libxml2 parser):

> echo 'DefaultContents HTML2' > c
> ./swish-e -c c -i 1.html -v0 -T indexed_words
Indexing Data Source: "File-System"
    Adding:[1:swishdefault(1)]   'wolfgang'   Pos:1  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   'sturner'   Pos:2  Stuct:0x1 ( FILE )
Indexing done!
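
If you want to see what those numeric entities actually refer to, here is a rough sketch, assuming a POSIX-style shell and a grep that supports -o. The entity_codepoints function is a made-up helper for illustration and is not part of swish-e. It lists each decimal character reference next to its code point:

```shell
# Hypothetical helper (not part of swish-e): print every decimal numeric
# character reference found in a string, with its hexadecimal code point.
entity_codepoints() {
    printf '%s\n' "$1" |
    grep -o '&#[0-9][0-9]*;' |   # one reference per line, e.g. "&#117;"
    tr -d '&#;' |                # strip the markup, leaving just the number
    while read -r n; do
        printf '&#%d; = U+%04X\n' "$n" "$n"
    done
}

entity_codepoints 'Wolfgang St&#117;&#782;rner'
# &#117; = U+0075
# &#782; = U+030E
```

So &#117; is a plain ASCII "u", and &#782; is U+030E, a combining mark that lies outside ISO-8859-1, which may be why only 'sturner' shows up in the index. A u-umlaut in ISO-8859-1 would be &#252; (or &uuml;), and it can be tested the same way.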

Bill Moseley
mailto:moseley@hank.org
Received on Fri Apr 19 14:09:31 2002