various problems on windows

From: Philippe A. <futhark77(at)>
Date: Fri Sep 22 2006 - 23:47:29 GMT
I am having several small problems with swish-e 2.4.3 on Windows. I have ActivePerl
5.8.8. Any assistance is most welcome. I apologize in advance if I missed
anything obvious.


1. Accented characters do not get translated properly

My cfg is as follows:

IndexFile hrtool.index
IndexDir ../docs
IndexOnly .doc
FileFilter .doc ./lib/swish-e/ '"%p" "%P"'
TranslateCharacters :ascii7:

A word spelled "montréal" gets converted to "montrcal", as shown by -T:
    Adding:[7:swishdefault(1)]   'montrcal'   Pos:2  Stuct:0x9 ( BODY FILE =
    Adding:[7:swishdefault(1)]   'montrcal'   Pos:3  Stuct:0x9 ( BODY FILE =

Other accented letters produce similar odd results.

I tried both options; neither helps:

TranslateCharacters :ascii7:
#TranslateCharacters é e

If I omit TranslateCharacters, words get cut at the position of each accented letter.
"Montréal" becomes two words: "montr" and "al".

I need to be able to parse English and French documents. I don't mind
"losing" accented letters during indexing; in fact, I was quite happy when I
read that swish could do that for me.
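
To make the expected behaviour concrete, here is a minimal Python sketch of the two outcomes described above: a word splitting at the accented letter when only a-z count as word characters, and the same word staying whole once accents are folded to their base letters. This is not how swish-e implements TranslateCharacters; the function names `strip_accents` and `ascii_words` are hypothetical and only illustrate the result I am hoping for.

```python
import re
import unicodedata


def strip_accents(text):
    """Return text with accented letters replaced by their base letters."""
    # NFKD splits "é" into "e" plus a combining acute accent (U+0301).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep every other character unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_words(text):
    """Split text into lowercase words made only of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


if __name__ == "__main__":
    print(ascii_words("Montréal"))                 # ['montr', 'al']
    print(ascii_words(strip_accents("Montréal")))  # ['montreal']
```

Something like this could also serve as a pre-processing step on the filter output, assuming the text reaches it correctly decoded.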

2. Can't locate object method "filter" via package "SWISH::Filter"

I am running the following command:
swish-e -c ..\hrtool.cfg -S prog

swish_filter reports an error and nothing gets indexed.
Can't locate object method "filter" via package "SWISH::Filter"

My cfg is as follows:

IndexFile hrtool.index
SwishProgParameters ../docs
TranslateCharacters :ascii7:

In 2.4.2, I have a different error:

Indexing Data Source: "External-Program"
Indexing ""
External Program found: C:\phil\pgms\swish\swish-
Use of uninitialized value in concatenation (.) or string at
C:\phil\pgms\swish\swish-2.4.2\lib\swish-e\perl/SWISH/ line 341.
Failed to set content type for file reference ''Use of uninitialized value
in concatenation (.) or string at C:\phil\pgms\swish\swish-
2.4.2\lib\swish-e\ line 53.
 - Not filtered:  (../docs)
Use of uninitialized value in print at C:\phil\pgms\swish\swish-
2.4.2\lib\swish-e\ line 56.
Removing very common words...
no words removed.
Writing main index...
err: No unique words indexed!

With a different config, swish_filter will work under 2.4.2 (but never unde=

IndexFile hrtool.index
IndexDir ../docs
IndexOnly .doc
FileFilter .doc ./lib/swish-e/ '"%p" "%P"'
TranslateCharacters :ascii7:

But needless to say, I'd prefer not to have to define individual filters.
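
For reference, my understanding of what a program run with -S prog has to print for each document is a small header block, a blank line, and then the content. The Python sketch below builds one such record. It is only an illustration under that assumption, not a replacement for the bundled Perl programs, and the helper name `format_prog_record` is hypothetical.

```python
import sys


def format_prog_record(path, content, mtime):
    """Build one document record: headers, a blank line, then the content."""
    headers = (
        f"Path-Name: {path}\n"
        # Content-Length counts the bytes of the content that follows.
        f"Content-Length: {len(content)}\n"
        f"Last-Mtime: {int(mtime)}\n"
        "\n"
    )
    return headers.encode() + content


if __name__ == "__main__":
    record = format_prog_record("docs/example.txt", b"hello world\n", 0)
    # Write bytes directly so Windows does not turn "\n" into "\r\n",
    # which would make Content-Length disagree with the actual output.
    sys.stdout.buffer.write(record)
```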

3. Systematic error on PDF files: "May not be a PDF file"

The complete error is the following:
Error: May not be a PDF file (continuing anyway)
Error (0): PDF file is damaged - attempting to reconstruct xref table...
Error: Couldn't find trailer dictionary
Error: Couldn't read xref table

I obtain this error with PDF generated by OpenOffice or a PDF printer in

Options I use to parse them are the following:
IndexOnly .pdf
FileFilter .pdf ./lib/swish-e/pdftotext.exe '"%p" "%P"'
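
One check that may help narrow this down is whether the file handed to the filter really begins with a PDF header. The Python sketch below looks for the "%PDF-" marker near the start of the data. The 1024-byte search window is an assumption modelled on how PDF readers commonly behave, and the function names are hypothetical.

```python
PDF_MAGIC = b"%PDF-"
HEADER_SEARCH_BYTES = 1024  # assumed search window, not taken from pdftotext


def looks_like_pdf(data):
    """Return True if the PDF marker appears near the start of the data."""
    return PDF_MAGIC in data[:HEADER_SEARCH_BYTES]


def file_looks_like_pdf(path):
    """Read the start of a file in binary mode and check for the marker."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read(HEADER_SEARCH_BYTES))
```

If a file that opens fine in a PDF viewer passes this check but the filter still complains, the problem is more likely in how the path is passed to the filter than in the file itself.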

Received on Fri Sep 22 16:47:34 2006