Web Consultant, Dept. 3187
NORTEL, Signaling Solutions Group
* Phone: (919) 905-4975 ESN 355-4975
* Fax: (919) 905-8313 ESN 395-8313
* Email: Christian.Stalberg.email@example.com
From: Ron Klatchko [SMTP:firstname.lastname@example.org]
Sent: Friday, October 16, 1998 3:14 PM
To: Multiple recipients of list
Subject: [SWISH-E] RE: swish-e spider does not go beyond index.html
Christian Stalberg wrote:
> Oops, someone reminded me that starting with a frames webpage will
> not work. I have changed the IndexDir to a TOC page for the frames and
> it seems to be working. Is there any special wisdom anyone can share
> re. using SWISH-E to index frames webpages?
Frames are a tricky situation when it comes to searching. It would be
simple to make the spider follow frame links as well, but what happens
on retrieval? SWISH would return the URL of the individual frame that
contained what the user was searching for, and the user would see only
that frame instead of the nicely constructed frameset you built.
There might be a solution to that in some clever file layout and use of
ReplaceRules. One idea is below.
Another possibility would be to have a no-frames version with the same
content. SWISH can spider that currently. This also has the nice
benefit of opening your site to non-frames-aware browsers.
Unfortunately, even frames-aware browsers would end up with the
non-frames version when they search.
So, going back to the clever layout/rewrite idea. Let's assume that
swishspider can now follow frame links. Also assume you have a
frame set with the left side as a table of contents and the right side
with your various data pages.
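To make "follow frame links" concrete, here is a minimal sketch of the
extraction step, assuming the spider should treat <frame src> the same
way it treats <a href>. It is written in Python for illustration and is
not swishspider's own code; the class and function names and the
example.com URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameLinkParser(HTMLParser):
    """Collects targets of <a href> and <frame src> as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a":
            target = attr_map.get("href")
        elif tag == "frame":
            # The extra case a frame-aware spider would need.
            target = attr_map.get("src")
        else:
            target = None
        if target:
            # Resolve relative links against the page they came from.
            self.links.append(urljoin(self.base_url, target))


def extract_links(html, base_url):
    """Return every link found in html, in document order."""
    parser = FrameLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Fed the main index.html of the layout described below, this would hand
the spider both toc.html and the right-hand data page to visit next.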
In order to do this, you'll need a main directory and one subdirectory
for each page.
The main directory contains index.html, which defines your frameset, and
toc.html, which is your table of contents. You have a series of
directories called page1, page2, etc. inside of which you have
page1.html, page2.html, etc. Also, each of these directories has its own
index.html. The difference between this and the main index.html is the
starting page for the right side; the main index.html points to
page1.html, whereas the ones in the subdirectories point to their own
page (page2/index.html has page2.html as the right hand side). For an
example of this structure you can check out
If you then introduce the rule:
ReplaceRules remove "page[0-9]+.html"
a search that gets ../pageN/pageN.html gets rewritten to ../pageN/, which
preserves the frame set.
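The effect of that rule can be checked with a small sketch. It is in
Python rather than SWISH itself, and it assumes the quoted string is
treated as a regular expression and that every match is deleted from
the path; the function name is hypothetical.

```python
import re

# Pattern copied from the ReplaceRules line above. Treating it as a
# regular expression is an assumption of this sketch.
PAGE_PATTERN = re.compile(r"page[0-9]+.html")


def apply_remove_rule(url):
    """Delete every match of the pattern from url."""
    return PAGE_PATTERN.sub("", url)


if __name__ == "__main__":
    for url in ("../page1/page1.html", "../page2/page2.html", "../toc.html"):
        print(url, "->", apply_remove_rule(url))
```

Only the file name matches, because the directory name pageN is
followed by "/page" rather than "html", so the directory part of the
URL survives and the server returns that directory's index.html.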
More complicated use of frames would require even more thought, but it
is a possibility.
Are people interested in doing such a thing? Should I modify
swishspider to be able to follow framelinks?
Ron Klatchko - Manager, Advanced Technology Group
UCSF Library and Center for Knowledge Management
Received on Tue Oct 27 09:39:43 1998