On Mon, Dec 12, 2005 at 12:03:22PM -0800, Thomas Nyman wrote:
> I made a word document for testing.
> The document contains the following two words
>
> Överskottslager
>
> boy
>
> when i run swish-e -c swish_se.conf -i test.doc -T indexed_words -v0
>
> i get the following
>
> Adding:[1:swishdocpath(11)] 'test' Pos:1 Stuct:0x1 ( FILE )
> Adding:[1:swishdocpath(11)] 'doc' Pos:2 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'a' Pos:1 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'verskottslager' Pos:2 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'boy' Pos:3 Stuct:0x1 ( FILE )
Odd, works for me.
moseley@bumby:~$ cat word
Överskottslager
boy
moseley@bumby:~$ swish-e -i word -T indexed_words -v0
Adding:[1:swishdefault(1)] 'överskottslager' Pos:5 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'boy' Pos:6 Stuct:0x9 ( BODY FILE )
moseley@bumby:~$ cat c
TranslateCharacters :ascii7:
moseley@bumby:~$ swish-e -i word -T indexed_words -c c -v0
Adding:[1:swishdefault(1)] 'overskottslager' Pos:5 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'boy' Pos:6 Stuct:0x9 ( BODY FILE )
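
The effect of TranslateCharacters :ascii7: in the run above can be
approximated in Python. This is only a sketch under an assumption: it
strips accents via Unicode decomposition, which is not necessarily how
swish-e builds its translation table, and the name fold_to_ascii is
hypothetical.

```python
import unicodedata


def fold_to_ascii(word: str) -> str:
    """Lowercase a word and strip diacritics (rough stand-in for :ascii7:)."""
    decomposed = unicodedata.normalize("NFKD", word.lower())
    # Drop combining marks, e.g. the diaeresis left over after decomposing 'ö'.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_to_ascii("Överskottslager"))  # overskottslager
```
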
Is it possible your config or source file is in a different encoding?
It doesn't seem likely, but I can't think of another reason it wouldn't
work. I cut the text straight from your email, so it should be in the
same encoding.
moseley@bumby:~$ od -t x1c word
0000000 d6 76 65 72 73 6b 6f 74 74 73 6c 61 67 65 72 0a
         Ö  v  e  r  s  k  o  t  t  s  l  a  g  e  r \n
0000020 62 6f 79 0a
         b  o  y \n
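
The first byte in the dump is d6, which is how ISO-8859-1 (Latin-1)
stores Ö in a single byte. The short Python sketch below shows what the
same word looks like in UTF-8 for comparison. The idea that text
extracted from a Word document might reach swish-e as UTF-8 is only an
assumption here, not something confirmed in this thread.

```python
word = "Överskottslager"

latin1_bytes = word.encode("latin-1")
utf8_bytes = word.encode("utf-8")

print(latin1_bytes[:2].hex(" "))  # d6 76
print(utf8_bytes[:3].hex(" "))    # c3 96 76

# Reading UTF-8 bytes as if they were Latin-1 turns 'Ö' into two characters.
misread = utf8_bytes.decode("latin-1")
print(len(word), len(misread))    # 15 16
```

If the input were UTF-8 but handled one byte at a time, the pair c3 96
would stand in for Ö. That might be consistent with a stray leading
word followed by 'verskottslager', though it is speculation until the
bytes of the actual input are checked with od as above.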
--
Bill Moseley
moseley@hank.org
Received on Mon Dec 12 17:39:58 2005