Re: Problem with spider?

From: Ron Klatchko <ron(at)not-real.library.ucsf.edu>
Date: Tue Jan 26 1999 - 22:13:21 GMT
That is very strange.  I just tried your configuration file and I was able
to spider your site properly.  Could you try running swishspider manually?
Run the following command:
  /usr/users/bowler/swishe/src/swishspider.pl /tmp/moo http://www.bigelow.org/

and check that the following three files are created:
  /tmp/moo.response
  /tmp/moo.contents
  /tmp/moo.links

and then see if moo.links has anything inside of it.  Also, do you get any
errors from swishspider?
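
If it is easier, the file check above could be scripted along these lines.
This is only a rough sketch: the function name check_spider_output is made
up for illustration and is not part of swish-e, and it assumes swishspider
has already been run with the given output prefix (/tmp/moo above).

```shell
#!/bin/sh
# check_spider_output PREFIX  (hypothetical helper, not shipped with swish-e)
# Reports whether PREFIX.response, PREFIX.contents and PREFIX.links exist,
# and whether PREFIX.links is non-empty. Returns 0 only if every check passes.
check_spider_output() {
    prefix=$1
    rc=0
    for ext in response contents links; do
        if [ -f "$prefix.$ext" ]; then
            echo "found:   $prefix.$ext"
        else
            echo "missing: $prefix.$ext"
            rc=1
        fi
    done
    # -s is true only when the file exists and has a size greater than zero
    if [ -s "$prefix.links" ]; then
        echo "links:   $(wc -l < "$prefix.links") line(s) in $prefix.links"
    else
        echo "links:   $prefix.links is empty or missing"
        rc=1
    fi
    return $rc
}
```

After running the swishspider command, call it as
  check_spider_output /tmp/moo
A non-zero exit status means a file is missing or moo.links came back empty.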

moo


At 12:53 PM 1/26/99 -0800, Bruce Bowler wrote:
>Hi,
>
>I'm a swish-e newbie so forgive me.  I searched the archive but didn't find
>anything that looked relevant.  Maybe my expectations are off...
>
>I run swish-e as follows....
>
># /usr/local/bin/swish-e -S http -c bcb.config
>Indexing Data Source: "HTTP-Crawler"
>retrieving http://www.bigelow.org/ (0)...
> (122 words)
>
>Removing very common words... no words removed.
>Writing main index... 96 unique words indexed.
>Writing file index... 1 file indexed.
>Running time: 1 minute, 7 seconds.
>Indexing done!
>#
>
>It's possible that there are 122 words on the main page, but there are also
>lots of links that I would have expected to be followed but apparently
>weren't.
>
>What I would like from swish-e is to give it a starting point (like
>http://www.bigelow.org/) and have it index all of the local pages
>referenced from there, either directly or indirectly.  
>
>My config file looks like
>
>	IndexDir http://www.bigelow.org/
>	IndexFile ./index.swishe
>	IndexName "Bigelow Index"
>	IndexDescription "This is the index of our site."
>	IndexPointer "http://www.bigelow.org/swish/index.html"
>	IndexAdmin "Bruce Bowler (bbowler@bigelow.org)"
>	MetaNames first author
>	IndexReport 3
>	FollowSymLinks yes
>	IgnoreLimit 50 1000
>	IndexComments 0
>	MaxDepth 0
>	Delay 60
>	TmpDir /tmp
>	SpiderDirectory /usr/users/bowler/swishe/src
>	EquivalentServer http://www.bigelow.org http://alpha1.bigelow.org
>
>I'm using perl 5.00404 and I think I've installed all of the modules that
>are documented as being needed.
>
>Any ideas?
>
>Bruce
>
>Bruce Bowler                             207.633.9600 (voice)
>Research Associate                       207.633.9641 (fax)
>Bigelow Laboratory for Ocean Sciences    bbowler@bigelow.org
>West Boothbay Harbor ME  04575           http://www.bigelow.org/
>
>
----------------------------------------------------------------------
          Ron Klatchko - Manager, Advanced Technology Group           
           UCSF Library and Center for Knowledge Management           
                        ron@library.ucsf.edu                
Received on Tue Jan 26 14:08:21 1999