On Mon, Jun 14, 2004 at 11:40:52AM -0400, Kaplan, Andrew H. wrote:
> I've continued work on trying to get Swish-e to be able to index the pdf
So, to be clear, the problem is what?
> I went through the motions
> of setting up the swish.conf file according to the instructions listed on
> the website. Here is what the file
> text looks like:
> IndexDir spider.pl
> SwishProgParameters default http://localhost/www
> Metanames swishtitle swishdocpath
> StoreDescription HTML* <body> 200000
> StoreDescription TXT* <body> 200000
I will note for the archives that those StoreDescription directives will
only work if the -S prog program tells swish-e the document type (as
spider.pl does). Otherwise you need DefaultContents or IndexContents to
map a file extension to a type like HTML* or TXT*.
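
As a rough illustration of that mapping, a config fragment might look like
the following. The extensions listed are only examples and are not taken
from the original poster's setup; adjust them to the files actually being
indexed.

```
# Only needed when the -S prog program does not report a document type.
# Extensions below are illustrative assumptions, not from the original config.
IndexContents HTML* .htm .html .shtml
IndexContents TXT* .txt .log

# Fallback parser for documents that match none of the lines above.
DefaultContents HTML*

# Taken from the quoted config: store up to 200000 characters of <body>.
StoreDescription HTML* <body> 200000
```

With spider.pl none of this should be required, since the spider sends the
document type with each document.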
> I ran the command swish-e -S prog -c swish.conf and the result was the
> Indexing Data Source: "External-Program"
> Indexing "spider.pl"
> External Program found: /usr/local/lib/swish-e/spider.pl
> Removing very common words...
> no words removed.
> Writing main index...
> err: No unique words indexed!
> I have had no luck in resolving this issue.
Now, have you read any of my responses?
So no words are being indexed. Why not? To repeat: run the spider by
itself. Is it generating output? Yes or no? If no, figure out why by
turning on debugging as I explained before. If yes, then figure out why
swish-e isn't indexing that output.
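
The check described above can be sketched in shell roughly as follows. The
spider path and URL are the ones from the quoted message; the function names
and the capture file are hypothetical, and the sketch assumes spider.pl
honors the SPIDER_DEBUG environment variable and writes its debug messages
to stderr.

```shell
# Sketch only: function names and the capture file are hypothetical.
# The spider path and URL come from the quoted message.
SPIDER=/usr/local/lib/swish-e/spider.pl

# Step 1: run the spider by itself and capture what it would feed swish-e.
# Assumption: SPIDER_DEBUG=url makes the spider report each URL on stderr.
run_spider() {
    SPIDER_DEBUG=url "$SPIDER" default http://localhost/www > "$1"
}

# Step 2: count documents in the captured output. In the -S prog format
# each document begins with a "Path-Name:" header line.
count_docs() {
    grep -c '^Path-Name:' "$1"
}

# Step 3: if the count is above zero, feed the same capture to swish-e
# to see whether the problem is on the indexing side.
index_captured() {
    swish-e -S prog -i stdin -c swish.conf < "$1"
}
```

For example, `run_spider spider.out; count_docs spider.out` answers the
yes/no question: a count of 0 means the spider is the problem, anything
else means swish-e is.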
If the spider isn't producing output because it can't convert the PDF
files, use the swish-filter-test program to test the filters directly. See
the debugging and testing comments at:
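
For the filter check itself, a minimal sketch might look like this. It
assumes the PDF filter relies on the pdftotext and pdfinfo programs from
xpdf being on the PATH; the function names and the sample file path are
hypothetical.

```shell
# Sketch only: helper names and the sample PDF path are hypothetical.

# Succeeds if the named program can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the helper programs the PDF filter is assumed to need
# (pdftotext and pdfinfo, both from xpdf) are installed.
check_pdf_tools() {
    for tool in pdftotext pdfinfo; do
        if have_tool "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: MISSING"
        fi
    done
}

# Then try the filter on one PDF by hand, for example:
#   swish-filter-test /path/to/sample.pdf
```

If either helper shows up as missing, the filter has nothing to convert the
PDF with, which would explain a spider that fetches files but produces no
indexable text.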
> I am at the point where I am ready to install a pdf to word converter
> program that will change all the pdf files to .doc and/or .rtf files.
> Unless there is something else that I have missed, I have run out of
My vote is you missed something.
Received on Mon Jun 14 16:46:00 2004