> Huh? Why is
> Path-Name: http://arena.internet2.edu:80/sample.htm
> Content-Length: 33
> Last-Mtime: 1013569857
> <HTML>Sample document</HTML>
> showing up? That's stdout from the spider.cgi script that should be
> captured by swish that's running the spider. You will note that was not
> my example.
I did just notice that. I'm curious about how swish reads the spider's stdout.
I can capture the web documents to be indexed in one file by putting this in
the swish config file:
Then the file output.txt looks something like this:
<HTML>....code for page here...</HTML>
<HTML>....more html code here...</HTML>
..etc for all web pages spidered
Is there some way (a function call in swish?) to have swish read
output.txt as if it were coming directly from spider.pl on stdout, so
that the effect (multiple web pages indexed) is the same? Thanks.
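
To make the difference between the two outputs concrete, here is a minimal Python sketch of the framing seen in the quoted spider output: each document is preceded by Path-Name, Content-Length and Last-Mtime lines. The function names frame_document and frame_all and the example.invalid URLs are hypothetical, and the blank line between the headers and the body is an assumption about what swish expects, not something visible in the quoted text. It is only a sketch of the record layout, not the spider's actual code.

```python
import sys


def frame_document(path_name: str, body: bytes, mtime: int) -> bytes:
    """Prefix one document with the three headers seen in the quoted output.

    The trailing blank line before the body is an assumption.
    """
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(body)}\n"  # length of the body in bytes
        f"Last-Mtime: {mtime}\n"          # modification time, seconds since epoch
        "\n"
    )
    return headers.encode("utf-8") + body


def frame_all(documents) -> bytes:
    """Concatenate framed records for (path_name, body, mtime) tuples."""
    return b"".join(frame_document(p, b, m) for p, b, m in documents)


if __name__ == "__main__":
    # Hypothetical sample data; real URLs and bodies would come from the spider.
    docs = [
        ("http://example.invalid/a.htm", b"<HTML>page one</HTML>\n", 1013569857),
        ("http://example.invalid/b.htm", b"<HTML>page two</HTML>\n", 1013569857),
    ]
    sys.stdout.buffer.write(frame_all(docs))
```

An output.txt that holds only the raw HTML, one page after another, has no such headers, so a reader cannot tell where one page ends and the next begins. If the installed swish can take this kind of stream on its standard input (an assumption to check against its documentation), a file written in this framed form could be piped back in later.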
Received on Wed Feb 13 18:42:26 2002