Re: swish-e 2.1 hangs for a very long time

From: David L Norris <dave(at)not-real.webaugur.com>
Date: Sat Jul 13 2002 - 19:10:51 GMT
On Sat, 2002-07-13 at 12:31, Michael wrote:
> >   perl swishspider ./testing http://members.aol.com/CamelsRFun/
> > Does this take 5-10 minutes and use 100% CPU?
> 22725 diabetes  19   0  4484 4484  1552 R    99.3  1.8   0:28 perl
> ~5-6 minutes later
> 22725 diabetes  18   0  4496 4496  1552 R    98.2  1.8   5:09 perl
> ~ 7-8 minutes
> 22725 diabetes  17   0  4496 4496  1552 R    98.6  1.8   7:03 perl

I'm not sure what the problem is.  But your top output shows the perl
process itself pinned at ~99% CPU, so whatever is wrong is related to
Perl and/or the swishspider script.  It looks like a problem with the
Perl interpreter, though I can't say for certain.

You might try using the prog method with spider.pl.  The prog-bin
directory has some example scripts.  Grab spider.pl and make sure the #!
line points to your perl binary.
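
If you want to check the #! line quickly, something like this rough
sketch should do.  check_shebang is just a name I made up for
illustration; it is not part of swish-e.  It only handles the plain
"#!/path/to/perl [args]" form and assumes spaces, not tabs, after "#!".

```shell
# check_shebang FILE
# Print the interpreter named on FILE's #! line and report whether that
# path exists and is executable.  Hypothetical helper, not part of swish-e.
check_shebang() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!'*) ;;
        *) echo "no #! line in $1"; return 1 ;;
    esac
    # Strip "#!", any leading spaces, and any arguments after the path.
    interp=${first_line#'#!'}
    interp=${interp#"${interp%%[! ]*}"}
    interp=${interp%% *}
    if [ -x "$interp" ]; then
        echo "ok: $interp"
    else
        echo "missing: $interp"
        return 1
    fi
}

# Example:
#   check_shebang ./spider.pl
```

If it reports "missing", edit the first line of spider.pl to match
whatever "command -v perl" prints on your system.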

Then try this:
  ./spider.pl default http://members.aol.com/CamelsRFun/

You should see each document printed to your terminal.  If that script
also has problems, then I'd start looking for problems with the Perl
interpreter itself.
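
If it's hard to tell a slow run from a real hang, putting a time limit
on the command helps.  A rough sketch, assuming the GNU coreutils
"timeout" command is installed; run_with_limit is a made-up helper, not
part of swish-e, and the 60 second limit in the example is arbitrary.

```shell
# run_with_limit SECONDS COMMAND [ARGS...]
# Run COMMAND with its output discarded and report whether it finished
# within SECONDS.  Relies on GNU timeout, which exits with status 124
# when the time limit is reached.
run_with_limit() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" > /dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "finished with exit status $status"
    fi
    return "$status"
}

# Example:
#   run_with_limit 60 ./spider.pl default http://members.aol.com/CamelsRFun/
```

A timeout here with the CPU still pegged would point at the script or
the interpreter rather than at a slow remote server.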


$ cat c
IndexDir ./spider.pl
SwishProgParameters default http://www.insulin-pumpers.org/
IndexFile ./swish.index
IndexName "Insulin Pumpers Mail Archive"
IndexDescription "no other index was specified." 
IndexPointer "www.insulin-pumpers.org"
IndexAdmin "webmaster@insulin-pumpers.org"
MetaNames author description datamodified
IndexReport 3
UseStemming yes
PropertyNames author description datamodified
IgnoreTotalWordCountWhenRanking yes
MinWordLimit 4
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.-_'"
IgnoreLimit 80 1000
IndexComments 0
TmpDir ./


$ ./swish-e -c c -S prog -v3
Parsing config file 'c'
Indexing Data Source: "External-Program"
Indexing "./spider.pl"
./spider.pl: Reading parameters from 'default'
http://www.insulin-pumpers.org/ - Using DEFAULT (HTML) parser -  (340 words)
..

-- 
 David Norris
  Dave's Web - http://www.webaugur.com/dave/
  Augury Net - http://augur.homeip.net/
  ICQ - 412039
Received on Sat Jul 13 19:16:45 2002