That did it! Thanks...
[mailto:email@example.com] On Behalf Of Bill Moseley
Sent: Friday, February 11, 2005 2:03 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re:
On Fri, Feb 11, 2005 at 10:52:14AM -0800, Shaffer, Chris wrote:
> Hi... I've gotten swish-e (using spider.pl) to crawl a couple of our
> intranet sites. The filters seem to be working okay for excel. And
> it seems to be looking at word documents. However, (using swish.cgi),
> I don't get any descriptions for those word docs.
> Any idea where I can look? I have no idea where to begin digging.
Sure. spider.pl just writes to stdout, so you can run it on a few test
docs and see what it outputs. Do it on a file that generates a
description and then another that doesn't and compare.
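One way to make that comparison easier is to look only at the header block that spider.pl prints before each document body. The sketch below assumes the headers are separated from the body by a blank line; `doc_headers` is a hypothetical helper name, not part of swish-e.

```shell
# Hypothetical helper: print only the header lines (everything before the
# first blank line) of the first document read from stdin.
doc_headers() {
    awk '/^$/ { exit } { print }'
}

# Example use (assumes spider.pl is installed at this path, as in the thread):
# SPIDER_QUIET=1 /usr/local/lib/swish-e/spider.pl default http://localhost/apache/test.doc | doc_headers
```

Running it once against a document that gets a description and once against one that does not should make any difference in the headers easy to spot.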
> StoreDescription HTML* <body> 200000
Make sure in the spider.pl output that the document's header is indeed
$ SPIDER_QUIET=1 /usr/local/lib/swish-e/spider.pl default http://localhost/apache/test.doc | head
That's saying the document is TXT*, so you would need to add another
StoreDescription line for TXT*.
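As a rough sketch, the resulting config might look like the fragment below. The first line is the one quoted earlier; the second is an assumption based on the documented form of StoreDescription for text documents, which takes a size but no tag. Reusing 200000 as the size is also an assumption.

```
StoreDescription HTML* <body> 200000
StoreDescription TXT* 200000
```

The index would need to be rebuilt after the change for the new descriptions to be stored.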
Received on Fri Feb 11 11:34:39 2005