This is a message for Ron Samuel Klatchko:
Ron, I have tried the following change in your spider program. It seems to let the spider follow the links in frameset HTML documents, that is, the "src" attribute of each "frame" tag.
# This is the whole of "sub linkcb"
# The small bit that I added runs from "Start of addition" to "End of addition"
sub linkcb {
    my($tag, %links) = @_;
    if (($tag eq "a") && ($links{"href"})) {
        my $link = $links{"href"};
        #
        # Remove fragments
        #
        $link =~ s/(.*)#.*/$1/;
        #
        # Remove ../ This is important because the abs() function
        # can leave these in and cause never ending loops.
        #
        $link =~ s/\.\.\///g;
        print LINKS "$link\n";
    }
    # Start of addition
    # Extract frameset links
    if (($tag eq "frame") && ($links{"src"})) {
        my $link = $links{"src"};
        #
        # Remove fragments
        #
        $link =~ s/(.*)#.*/$1/;
        #
        # Remove ../ This is important because the abs() function
        # can leave these in and cause never ending loops.
        #
        $link =~ s/\.\.\///g;
        print LINKS "$link\n";
    }
    # End of addition
}
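
Since the two branches in linkcb differ only in the tag and attribute names, they could perhaps be folded together. The following is only a sketch, not something taken from the spider itself: the %link_attr table and the clean_link helper are names made up for illustration, and the in-memory LINKS handle at the bottom exists only so the example runs on its own. The two substitutions are copied unchanged from the code above.

```perl
use strict;
use warnings;

# Hypothetical table (not in the original spider): for each tag we
# care about, the attribute that carries the URL.
my %link_attr = (
    a     => "href",
    frame => "src",
);

# Hypothetical helper: the same two clean-up steps used in linkcb.
sub clean_link {
    my ($link) = @_;
    $link =~ s/(.*)#.*/$1/;    # Remove fragments
    $link =~ s/\.\.\///g;      # Remove ../ so abs() cannot cause loops
    return $link;
}

sub linkcb {
    my ($tag, %links) = @_;
    my $attr = $link_attr{$tag} or return;   # tag we do not follow
    my $link = $links{$attr}    or return;   # attribute missing or empty
    print LINKS clean_link($link), "\n";
}

# Demo only: write LINKS to an in-memory buffer instead of a file.
open(LINKS, '>', \my $out) or die "cannot open in-memory handle: $!";
linkcb("a",     href => "page.html#top");
linkcb("frame", src  => "../menu.html");
linkcb("img",   src  => "logo.gif");       # ignored: not in %link_attr
close(LINKS);
print $out;    # prints "page.html" and "menu.html", one per line
```

With a table like this, following another tag would only mean adding one more entry to %link_attr, assuming its links need the same clean-up.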
Chris Humphries
Received on Tue Feb 22 07:22:32 2000