On Sat, 2002-07-13 at 12:31, Michael wrote:
> > perl swishspider ./testing http://members.aol.com/CamelsRFun/
> > Does this take 5-10 minutes and use 100% CPU?
> 22725 diabetes 19 0 4484 4484 1552 R 99.3 1.8 0:28 perl
> ~5-6 minutes later
> 22725 diabetes 18 0 4496 4496 1552 R 98.2 1.8 5:09 perl
> ~ 7-8 minutes
> 22725 diabetes 17 0 4496 4496 1552 R 98.6 1.8 7:03 perl
Well, I'm not sure what the problem is, but whatever is wrong seems
related to Perl and/or the swishspider script. It looks like it would be
a problem with the Perl interpreter, but I'm not sure.
You might try using the prog method with spider.pl. In the prog-bin
directory are some example scripts. Grab spider.pl and make sure the #!
line points to perl.
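A quick way to check the #! line is sketched below. check_shebang is a hypothetical helper name, not something shipped with swish-e. It only looks at the first line of the script and tests whether the interpreter named there exists and is executable. It assumes a POSIX shell with head and sed available.

```sh
# Hypothetical helper (not part of swish-e): report the interpreter on a
# script's #! line and whether that path exists and is executable.
check_shebang() {
    script=$1
    # Only the first line of the script matters for the #! check.
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*) ;;
        *) echo "no #! line in $script"; return 1 ;;
    esac
    # Strip the leading "#!" plus optional spaces, then drop any
    # interpreter arguments such as -w.
    interp=$(printf '%s\n' "$first_line" |
        sed -e 's/^#![[:space:]]*//' -e 's/[[:space:]].*$//')
    if [ -x "$interp" ]; then
        echo "ok: $interp"
    else
        echo "missing: $interp"
        return 1
    fi
}
```

Calling it as "check_shebang ./spider.pl" prints either an "ok:" or a "missing:" line. If the script uses an env-style #! line, the helper reports the env binary rather than perl itself.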
Then try this:
./spider.pl default http://members.aol.com/CamelsRFun/
You should see each document printed to your terminal. If that script
also has problems, then I'd be looking around to see if there are
problems with the Perl interpreter.
$ cat c
IndexDir ./spider.pl
SwishProgParameters default http://www.insulin-pumpers.org/
IndexFile ./swish.index
IndexName "Insulin Pumpers Mail Archive"
IndexDescription "no other index was specified."
IndexPointer "www.insulin-pumpers.org"
IndexAdmin "webmaster@insulin-pumpers.org"
MetaNames author description datamodified
IndexReport 3
UseStemming yes
PropertyNames author description datamodified
IgnoreTotalWordCountWhenRanking yes
MinWordLimit 4
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.-_'"
IgnoreLimit 80 1000
IndexComments 0
TmpDir ./
$ ./swish-e -c c -S prog -v3
Parsing config file 'c'
Indexing Data Source: "External-Program"
Indexing "./spider.pl"
./spider.pl: Reading parameters from 'default'
http://www.insulin-pumpers.org/ - Using DEFAULT (HTML) parser - (340 words)
..
--
David Norris
Dave's Web - http://www.webaugur.com/dave/
Augury Net - http://augur.homeip.net/
ICQ - 412039
Received on Sat Jul 13 19:16:45 2002