Re: Indexing University Site

From: Shaffer, Chris <Chris.Shaffer(at)not-real.BELLSOUTH.COM>
Date: Mon Aug 30 2004 - 14:28:11 GMT

> Does this come close?  

I saw that...  Yeah, it does come close...  Since the real power of
spidering is in the customization of the spider, it would be nice if
that section contained a 'sane default' spider.conf file.  It can be
overwhelming for a novice user to be dropped at an all-inclusive
list of options.  It took me a little while to figure out what was
important and what wasn't...

Something like the included sample file, with the comments and some
'sane defaults'.

I hope I don't sound like I'm ragging on the docs...  They're
wonderfully comprehensive...  It's just that when you're getting started,
it's a bit overwhelming...


Chris Shaffer

On Sun, Aug 29, 2004 at 01:10:38PM -0700, Shaffer, Chris wrote:
> David,
> I had a similar situation.  Because some of our sites are dynamic in 
> nature, we chose to go with spidering.  However, I found some 
> documentation around setting up spidering a little confusing (there 
> was a lot of it, it was just ordered a little weird).  I think what 
> the documentation could use is a Spidering Getting Started Guide.

Doesn't really describe how to customize the spider, though, other than

 Detailed instructions on using the swish.cgi script and debugging tips
can be found by running:

    $ perldoc swish.cgi

I understand that it can be confusing, so let me know if you have any
suggestions or edits.


> 	# Only index .html .htm and .q files
> 	IndexOnly .html .htm .txt

IndexOnly does nothing in this case -- the spider determines what files
to index.
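
To illustrate the point that file selection happens in the spider rather
than in the indexer's IndexOnly setting, here is a rough sketch of an
extension filter applied to URLs before they are fetched.  It is written
in Python purely for illustration and is not spider.pl's actual code or
API; the names should_fetch and ALLOWED_EXTENSIONS are hypothetical, and
the example.edu URLs are placeholders.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allow-list, mirroring the extensions in the quoted IndexOnly line.
ALLOWED_EXTENSIONS = {".html", ".htm", ".txt"}


def should_fetch(url, allowed=ALLOWED_EXTENSIONS):
    """Return True if a crawler should fetch this URL, judging by its path extension."""
    # Look only at the path, so query strings and fragments are ignored.
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lower()
    # Assumption: extensionless paths (directory-style URLs) are treated as pages.
    return ext == "" or ext in allowed


if __name__ == "__main__":
    for candidate in (
        "http://www.example.edu/index.html",
        "http://www.example.edu/images/logo.gif",
        "http://www.example.edu/dept/",
    ):
        print(candidate, should_fetch(candidate))
```

A filter like this only sees URLs, so anything it rejects never reaches
the indexer at all, which is why an extension list on the indexer side
has nothing left to act on.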

Bill Moseley

