Re: How to Index PDF and Doc files in WinNT Enviro

From: makroindia <makroindia(at)not-real.satyammail.com>
Date: Fri Dec 13 2002 - 07:14:35 GMT
Hi Eleazar,

Thanks for your advice. Following your instructions, I have created my
.config file. It is as follows:

IndexDir D:/pdfs/
IndexContents TXT2 .pdf
FileFilter .pdf E:/xpdf/pdftotext.exe "%p -"


Is this .config file correct? Also, please advise me how to run this
.config file with swish-e.exe from the command prompt, or could you
please tell me how you are using swish-e.exe?

Thanks a lot.

S.K.Charan

-----Original Message-----
From: swish-e@sunsite.berkeley.edu
[mailto:swish-e@sunsite.berkeley.edu] On Behalf Of Eleazar Morales Chaires
Sent: Friday, December 13, 2002 11:22 AM
To: Multiple recipients of list
Subject: [SWISH-E] Re: How to Index PDF and Doc files in WinNT Enviro



It's quite easy to do that.

First, go to

- http://www.ice.ru/~vitus/catdoc/
- http://www.foolabs.com/xpdf/

and grab the binaries. The first program converts DOC files into TXT
files; the second converts PDF files into TXT files.

The next step is to add these lines to your config file:

- IndexContents TXT2 .pdf .txt .doc
- FileFilter .pdf C:/Projects/Xpdf/pdftotext.exe "%p -"
- FileFilter .doc C:/Projects/CatDoc/catdoc.exe '-s8859-1 -d8859-1 "%p"'

The first line tells Swish-e to use the TXT parser (TXT2) for DOC and
PDF files. The second and third lines run those programs to do the
conversion, so that Swish-e receives plain text.

I'm currently using Windows 2000 with SWISH-E ver. 2.2.1 and I haven't
had any problems. I hope this helps you out.
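
Putting the pieces of this thread together, a PDF-only config file and the
commands to run it might look roughly like the sketch below. IndexDir and
the pdftotext path come from the config posted above. The IndexFile and
IndexOnly lines, the index location, and the file name pdf.config are
assumptions added for illustration, so adjust them to your own setup.

```
# Sketch only. IndexDir and the FileFilter path are taken from the thread;
# IndexFile, IndexOnly, and the index location are assumed for illustration.
IndexDir D:/pdfs/
IndexFile D:/pdfs/index.swish-e
IndexOnly .pdf
IndexContents TXT2 .pdf
FileFilter .pdf E:/xpdf/pdftotext.exe "%p -"

# From the command prompt, assuming this file is saved as pdf.config
# (a hypothetical name):
#   swish-e -c pdf.config                           builds the index
#   swish-e -f D:/pdfs/index.swish-e -w someword    searches the index
```

Here -c names the config file used for indexing, -f names the index file,
and -w gives the search words. IndexOnly is included so that only .pdf
files under D:/pdfs/ are indexed, since the index file itself is written
into that directory.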

Received on Fri Dec 13 07:14:47 2002