Spider Design Flaw!

From: PropheZine Owner <bob(at)not-real.prophezine.com>
Date: Sat Feb 26 2000 - 13:17:56 GMT
Hi:

As you can tell from the number of posts I have sent in, I am experimenting
with spidering and also AutoSwish.  Thank you all for your help.

Here is a design flaw.  I'm not knocking anyone, as I think the software is
wonderful.  I wish I knew C and Perl well enough to offer modifications.

I have a website that is 4+ years old.  Back then we created a directory
(actually we have this problem in many directories) and instead of an
index.html we had a file named archives.html.  We added SSI at some point,
and since the search engines already had archives.html indexed, we created
archives.shtml and turned archives.html into a redirect page.

Later we created an index.html and inserted this code:

  <META HTTP-EQUIV="Refresh" CONTENT="1; URL=http://www.prophezine.com/search/database/archives.html">

It turns out that when I put http://www.prophezine.com/search/database/ in
the config file, the spider indexes only the index.html page that is
returned.  That page has some meta tags but no body, so the redirect target
is never indexed.

What is needed is a change to the spider so that it follows the refresh tag.
I am not sure of all the possible tags, so there may be others that should be
followed too, but this one definitely should be.
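
To illustrate what following the refresh tag would involve, here is a rough
sketch in Python.  It is not the spider's actual code, and the names
MetaRefreshParser and refresh_target are made up for this example.  It
assumes the spider already has the page's HTML and the URL it was fetched
from, and it only pulls out the redirect target so the spider could add it to
its list of links to visit.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Records the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").strip().lower() != "refresh":
            return
        # CONTENT looks like "1; URL=http://..."; the delay comes first.
        _, _, rest = attr.get("content", "").partition(";")
        match = re.match(r"\s*url\s*=\s*(.*)", rest, re.IGNORECASE | re.DOTALL)
        url = (match.group(1) if match else rest).strip().strip("'\"")
        if url:
            self.target = url


def refresh_target(html, base_url):
    """Return the absolute URL a page refreshes to, or None if there is none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    # Resolve relative targets against the URL the page was fetched from.
    return urljoin(base_url, parser.target)


page = ('<META HTTP-EQUIV="Refresh" CONTENT="1;\n'
        'URL=http://www.prophezine.com/search/database/archives.html">')
print(refresh_target(page, "http://www.prophezine.com/search/database/"))
```

A spider using something like this would also want to remember which URLs it
has already visited, since in my case index.html refreshes to archives.html,
which is itself a redirect page.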

Thoughts?

Bob
Received on Sat Feb 26 08:21:43 2000