Re: Spider

From: PropheZine Owner <bob(at)not-real.prophezine.com>
Date: Thu Feb 24 2000 - 14:11:03 GMT
Mike:

I installed the patch, but the spider still stopped on the site's home page.
I then removed the following two lines from that page and, sure enough, it is
now indexing. However, I am not sure how many of the HTML files on the site
contain these lines, so I am going to perform a global replace to remove them.
I'll then re-index and see whether more files are indexed.

<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<meta NAME="GENERATOR" CONTENT="Microsoft FrontPage 3.0">

I have no idea how to change the Perl to handle this.
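
The global replace described above could be done along these lines. This is
only a minimal sketch in Python, not part of swishspider or SWISH-E: the
function names, the site directory passed to clean_tree, the list of file
extensions, and the ".bak" backup naming are all assumptions made for
illustration. It removes any line holding one of the two <meta> tags quoted
above and leaves everything else in the file untouched.

```python
import re
from pathlib import Path

# Matches a whole line holding either of the two <meta> tags quoted above
# (the Content-Type one or the FrontPage GENERATOR one), in any letter case.
META_LINE = re.compile(
    r'^[ \t]*<meta\s+(?:http-equiv="Content-Type"|name="GENERATOR")[^>]*>'
    r'[ \t]*\r?\n?',
    re.IGNORECASE | re.MULTILINE,
)


def strip_meta_lines(html):
    """Return (cleaned_text, number_of_lines_removed)."""
    return META_LINE.subn("", html)


def clean_tree(root, patterns=("*.html", "*.htm", "*.shtml")):
    """Strip the tags from every matching file under root (hypothetical path).

    A copy of each changed file is kept next to it with ".bak" appended.
    Returns a dict mapping each changed file to the number of lines removed.
    """
    changed = {}
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            # Read and write bytes so the original line endings are preserved.
            original = path.read_bytes().decode("iso-8859-1")
            cleaned, count = strip_meta_lines(original)
            if count:
                backup = path.with_name(path.name + ".bak")
                backup.write_bytes(original.encode("iso-8859-1"))
                path.write_bytes(cleaned.encode("iso-8859-1"))
                changed[str(path)] = count
    return changed
```

Calling clean_tree on a copy of the site first (the directory name is whatever
applies locally) and looking at the returned dict would also show how many
files actually contained these lines before re-indexing.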

Bob Lally
PropheZine
bob@prophezine.com



-----Original Message-----
From: swish-e@sunsite.berkeley.edu
[mailto:swish-e@sunsite.berkeley.edu]On Behalf Of Mike Downey
Sent: Wednesday, February 23, 2000 8:18 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: Spider


Could it be that the SWISHSPIDER hasn't been patched? I noticed <meta
HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> in the
head of the document, and that same tag stopped my indexing until I got the
latest version of the spider.


----- Original Message -----
From: Ron Samuel Klatchko <rsk@brightmail.com>
To: Multiple recipients of list <swish-e@sunsite.berkeley.edu>
Sent: Wednesday, February 23, 2000 5:00 PM
Subject: [SWISH-E] Re: Spider


> PropheZine Owner wrote:
> > I finally have the spider working.  Well, almost.
> >
> > I have tried all combinations for the IndexDir entry to get the spider to
> > index my entire web site.
> >
> > If I use  http://www.prophezine.com/  it only indexes the index.shtml page.
> > I tried http://www.prophezine.com/index.shtml  and it still only indexes the
> > index.shtml page.
>
> I just invoked swishspider manually on that URL and it worked properly.
> Perhaps something else in your config is incorrect.  Try running
> swishspider with an IndexReport level of 3 and see what it says.
>
> moo
> ------------------------------------------------------------
>            Ron Samuel Klatchko - Software Jester
>             Brightmail Inc - rsk@brightmail.com
>
Received on Thu Feb 24 09:15:41 2000