Another problem with frames is that the framed pages are often served from a
different domain. I came across a site where virtually all the content came
from a domain other than the one that held the main page. All the pages in
the frameset were indexed, but the site also had other important content
reachable only through href links, and that content was never indexed, even
when spidering, because Swish-E rejected those links with "Wrong method or
server". Perhaps Swish-E needs an option to spider documents from other
domains, albeit not to the same level that it spiders documents from the
same domain (NetAttache from Tympani has this feature).
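
To make the suggestion concrete, here is a rough sketch in Python of how a
spider might apply a shallower limit to other hosts. This is not an existing
Swish-E option, and I don't know how NetAttache implements it. The function
names and both limits are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical limits, invented for this sketch (not Swish-E settings).
SAME_HOST_MAX_DEPTH = 5   # hops allowed from the starting page overall
OTHER_HOST_MAX_DEPTH = 2  # consecutive hops allowed off the starting host


def same_host(url_a, url_b):
    """Return True when both URLs name the same host (case-insensitive)."""
    return (urlsplit(url_a).hostname or "") == (urlsplit(url_b).hostname or "")


def should_follow(start_url, link_url, depth, foreign_depth):
    """Decide whether the spider should fetch link_url.

    depth         -- number of hops from start_url to link_url
    foreign_depth -- number of consecutive hops spent off the starting
                     host, counting this one if link_url is off-host
    """
    if depth > SAME_HOST_MAX_DEPTH:
        return False
    if same_host(start_url, link_url):
        return True
    # Off-host links are followed, but only a short way in.
    return foreign_depth <= OTHER_HOST_MAX_DEPTH
```

With these limits, a frame sourced from another server and the pages it links
to directly would be fetched, but the spider would not wander further into
the foreign site.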

From: Ron Samuel Klatchko [SMTP:email@example.com]
Sent: Sunday, February 27, 2000 1:40 AM
To: Chris Humphries
Subject: Re: [SWISH-E] Re: Swish-E and HTML documents with frames

On Sat, 26 Feb 2000, Chris Humphries wrote:
> This is very true, and if one were spidering indiscriminately, it would be
> a problem because there is probably no way of knowing that the page you
> found *was* indirectly referenced. However, most of my indexing so far has
> been just the first page of a Web site, which means that my approach to
> reading through the frames is probably safe. Each Web site will already
> have been looked at by a human being and its basic structure understood.
> If you can think of a case you would like to see handled that isn't covered
> by the approach I am using, I would really appreciate it if you could
> supply a URL for me to try out.
That was the only thing. I get the feeling we're in agreement that
there's no practical solution for the problem I brought up. But I have a
feeling that you're going to have people banging on you about this issue.
If I may put in my two cents, I'd definitely consider putting in a major
disclaimer about this.
Received on Mon Feb 28 07:22:20 2000