I've spent part of the last few weeks trying to get swish-e to index an
intranet web site with all kinds of documents (HTML, doc, xls, pdf,
rtf, images, ...) and to set up swish.cgi to perform searches.
Everything finally seems to work correctly and, if there is interest,
I could post the whole "story" as a reference for others.
However, there is one problem I can't resolve. I'm using spider.pl for
indexing. Certain types of documents cause swish-e to fail while
reading the spider.pl output, with the error: "External
program failed to return required headers Path-Name:".
This happens when spider.pl finds some types of binary documents (.dwg
or .psd files, for example; I don't even know what these files are). I
configured spider.pl not to index those files and the problem went
away. But if someone uploads a file to the portal whose type triggers
the same error and is not registered as no_index in the configuration
file, the error will appear again.
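
One way to avoid chasing every new file type would be to turn the rule
around: index only extensions known to work and skip everything else.
Below is a minimal sketch of that idea in Python, purely as an
illustration. The extension list and the should_index name are
hypothetical, nothing here is part of spider.pl or swish-e, and the
same check would still have to be ported into the spider's own
configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allow-list: only extensions this setup is known to handle.
INDEXABLE_EXTENSIONS = {".html", ".htm", ".doc", ".xls", ".pdf", ".rtf"}


def should_index(url, allowed=INDEXABLE_EXTENSIONS):
    """Return True if the URL's file extension is on the allow-list."""
    # Look only at the path so query strings do not hide the extension.
    path = urlsplit(url).path
    ext = posixpath.splitext(path)[1].lower()
    # Assumption: URLs without an extension (e.g. "/" or "/docs/") are HTML pages.
    if ext == "":
        return True
    return ext in allowed
```

With this rule an unknown type such as .dwg or .psd is skipped by
default, so a newly uploaded file of an unexpected type never reaches
swish-e. The cost is that a new, legitimate type stays unindexed until
it is added to the list.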
My question is: is there any way to get swish-e not to abort after this
error and to continue with the next document?
Thank you in advance.
Received on Wed Jun 22 00:48:30 2005