> On Wed, Apr 14, 2004 at 09:05:58AM -0700, Rob de Santos AFANA wrote:
> > This is done. All the files are .asp files but saved as .asp.html to
> > make them visible to Swish-e.
> Bill Moseley wrote:
> That should not be necessary. Swish doesn't do anything
> special with ".html" files unless told to.
Understood. This is easily changed via the wget options.
> > The problem now is that it does not appear that Swish-e is indexing
> > the necessary directory in total:
> > http://www.afana.com/www.othersite.com/afl/
> You can use -v (indexing verbose) to see what files are being
> indexed. You can also use -T properties to list the files as
> they are indexed. So you should be able to see what files are
> indexed. Use -T and -v and you might get an idea how
> ReplaceRules is working.
It seems ReplaceRules is working just fine. Because I am using -S prog
and not -S fs (see below), not all the files in the directory in question
are indexed, but the rest of the site is spidered and indexed just fine.
> > Apparently, the other 600 files in my directory are skipped. Because
> > they are extracted from the dynamically generated pages at the other
> > site, they aren't necessarily linked in a "spiderable" chain from the
> > index file, but all of them need to be indexed.
> Makes sense. So either use -S fs method to index (instead of
> spidering) or maybe try the --convert-links option of wget.
> Read the wget man page for details.
I know about --convert-links and it doesn't do what I need. At this
point it's simply a matter of getting this one directory included in the
index. Wget is getting all the files, and the rest of the Swish-e index
is working just fine.
So, is there a way via the configuration file to tell Swish-e to index
this one directory via the "fs" method and still do the rest of the
site via spidering? Or do I need to build two indexes, merge them, and
rename the result? [I gather from reading the docs that when merging
with swish-e -M the out_index must not previously exist, so it would
have to be renamed each time to the one used for searching.]
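To make the two-index idea concrete, here is a rough sketch of what I
have in mind. The config file names (spider.conf, fs.conf), the index
names, and the local path ./www.othersite.com/afl are placeholders of
mine, and I am assuming the merged index comes with a companion .prop
file that has to be moved along with it. It defaults to a dry run that
only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: config names, index names and the directory path are hypothetical.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

rebuild() {
    # 1. Spider the site as before (-S prog); spider.conf is assumed to
    #    name the spider program via IndexDir.
    run swish-e -c spider.conf -S prog -f site.index || return 1
    # 2. Index the one directory straight from disk (-S fs); fs.conf is
    #    assumed to hold the ReplaceRules that map file paths to URLs.
    run swish-e -c fs.conf -S fs -i ./www.othersite.com/afl -f afl.index || return 1
    # 3. Merge into a fresh name, since the merge output must not already exist.
    run rm -f merged.index.new merged.index.new.prop || return 1
    run swish-e -M site.index afl.index merged.index.new || return 1
    # 4. Move the new index (and its assumed .prop companion) over the
    #    one used for searching.
    run mv merged.index.new merged.index || return 1
    run mv merged.index.new.prop merged.index.prop
}

rebuild
```

If the merge and rename turn out to be more trouble than they are worth,
I could presumably skip step 3 and 4 and just list both indexes after -f
at search time.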
Perhaps I can use multiple configuration files so swish-e does each task
in one indexing job? Thanks in advance for any advice.
Received on Mon Apr 19 09:24:00 2004