Re: Help Getting the PDF Filter to Work on a Windows Machine

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Nov 04 2003 - 17:21:35 GMT

On Mon, Nov 03, 2003 at 06:48:43PM -0800, Nathan Schile wrote:
> I am trying to filter pdf files with SWISH-E.  My pdf file is located at
> F:/SWISH-E/TeenSnapshot.pdf
> 
> I used the example8.conf as my base point:
> 
>     IncludeConfigFile "F:/SWISH-E/conf/example4.config"
>     IndexDir "F:/SWISH-E/"
>     IndexOnly .pdf
>     FileFilter .pdf "F:/SWISH-E/lib/swish-e/_pdf2html.pl"

I have Windows 98 so I can't run perl scripts directly.  I have to use:

E:\SWISH-E>cat c
FileFilter .pdf perl 'e:/swish-e/lib/swish-e/_pdf2html.pl %p'


> I also made the following change in the _pdf2html.pl file
>
>      $ENV{PATH} = 'F:/SWISH-E/lib/swish-e/'

I used this:

E:\SWISH-E>fgrep lib lib/swish-e/_pdf2html.pl
$ENV{PATH} = 'E:/swish-e/lib/swish-e;' . $ENV{PATH};

> When I run the index command, I receive the following output:
> 
> F:\SWISH-E>SWISH-E -c "F:\SWISH-E\conf\example8.config"
> Indexing Data Source: "File-System"
> Indexing "F:/SWISH-E/"
> 
> Checking dir "F:/SWISH-E"...
> 'pdfinfo' is not recognized as an internal or external command,
> operable program or batch file.

Wow, that's better than the old "Bad command or file name"!

Otherwise, I don't see anything wrong.  The filter is just trying to run a
program called pdfinfo, and Windows can't find it.  My only suggestion is
to make sure you can run pdfinfo from the command line.  Adjust your path
before running swish, then test pdfinfo from the command line, and then
test _pdf2html.pl from the command line.
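
If you'd rather script the "can pdfinfo be found?" check, here is a rough
sketch in Python.  It is not part of swish-e; the find_tool name is made up
for illustration, and the directory is just the one from my setup above.  It
only looks a program up on a search path, optionally with one extra
directory put in front, which is the same idea as the $ENV{PATH} line in
_pdf2html.pl.

```python
import os
import shutil


def find_tool(name, extra_dir=None, env_path=None):
    """Return the full path of `name` if it is on the search path, else None.

    extra_dir is searched first, mimicking a directory prepended to PATH.
    env_path defaults to the current PATH environment variable.
    """
    if env_path is None:
        env_path = os.environ.get("PATH", "")
    # Drop empty pieces so we never end up with a dangling separator.
    parts = [p for p in (extra_dir, env_path) if p]
    return shutil.which(name, path=os.pathsep.join(parts))


if __name__ == "__main__":
    # Assumed location of the helper programs; adjust for your install.
    location = find_tool("pdfinfo", extra_dir="E:/swish-e/lib/swish-e")
    if location:
        print("pdfinfo found at", location)
    else:
        print("pdfinfo not found; fix the path before running swish-e")
```

If this reports "not found", the filter will fail the same way no matter
what the swish-e config says, so sort out the path first.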

E:\SWISH-E>swish-e -c c -i hp-cms.pdf
Indexing Data Source: "File-System"
Indexing "hp-cms.pdf"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 439 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
439 unique words indexed.
4 properties sorted.
1 file indexed.  18026 total bytes.  817 total words.
Elapsed time: 00:00:01 CPU time: 00:00:00
Indexing done!



-- 
Bill Moseley
moseley@hank.org
Received on Tue Nov 4 17:34:28 2003