Re: [swish-e] Can I index by filename and directory over http

From: Peter Karman <peter(at)>
Date: Tue Feb 12 2008 - 14:55:22 GMT

On 02/12/2008 08:47 AM, David Annis wrote:
> I am a longtime user of htdig and would like to switch to swish-e, but I
> need to be able to index part of sites in several ways.  I need to be able
> to do particular page(s) on one site, a directory on a second and a set of
> pages on a third that all use a common naming convention, but the page that
> links to them does not.
> Here's an example and how I think the swish configuration might work.  I
> want to index:
> anything in
> And all of the pages linked from
> that match flowers_*.html
> I think that the first two would be:
> IndexDir
> IndexDir
> But the third line of the config is harder.  I don't see how to start at one
> page (products.html) that I really don't care to have indexed but follow its
> links or how to use a regex on the results only from the links on that
> particular page.  Is this doable with swish-e?

If you use to aggregate your docs, you can define a regex check with a callback:

You might even consider creating 3 separate indexes, one for each site, and then merging
them. That might be easier to debug, since each site's config can be tested on its own.
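
To make the regex-check idea concrete, here is a rough, generic sketch in Python. It is not swish-e code and does not use any swish-e API; it only illustrates the logic a URL-filtering callback would apply: read the start page for its links, never select the start page itself, and keep only the links whose file name matches a pattern. The names LinkCollector, links_to_index and EXAMPLE_HTML, and the example.com URLs, are made up for illustration.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_index(start_url, start_html, name_pattern):
    """Return absolute URLs linked from the start page whose file name matches.

    The start page itself is never returned: it is only read for its links.
    """
    collector = LinkCollector()
    collector.feed(start_html)
    wanted = re.compile(name_pattern)
    selected = []
    for href in collector.hrefs:
        absolute = urljoin(start_url, href)
        filename = posixpath.basename(urlparse(absolute).path)
        # Keep each matching URL once, in the order it appears on the page.
        if wanted.fullmatch(filename) and absolute not in selected:
            selected.append(absolute)
    return selected


# Hypothetical page content and URLs, for illustration only.
EXAMPLE_HTML = """
<a href="flowers_roses.html">Roses</a>
<a href="/catalog/flowers_tulips.html">Tulips</a>
<a href="tools.html">Tools</a>
<a href="flowers_roses.html">Roses again</a>
"""

if __name__ == "__main__":
    start = "http://example.com/products.html"
    for url in links_to_index(start, EXAMPLE_HTML, r"flowers_.*\.html"):
        print(url)
```

The list this returns is what would be handed to the indexer. The pattern is applied to the file name only, so a link like tools.html is dropped even though it sits on the same page.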

Peter Karman  .  peter(at)  .

Received on Tue Feb 12 09:55:22 2008