Re: Re: Re: Swish-E with incremental index building

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sun Dec 05 2004 - 16:48:24 GMT
On Sun, Dec 05, 2004 at 06:54:52AM -0800, Peter Karman wrote:
> yes. the -i inputstuff is the key. if you tell it to use the same 
> inputstuff as when creating the db initially, it will.
> 
> i.e., these are the same
> 
> swish-e -u -i file
> swish-e -i file

I have not tested, but my guess is that's due to the bug reported by
Paul a few days ago.  -i would cause it to be just normal indexing.
The fix is in cvs.

Also note that -r or -u only set a mode in swish.c -- you shouldn't
use both.  If you do then only the last one will work, and that would
cause unexpected results.

I suppose -u or -r should check for an existing mode and give an
error.  -i only sets the mode to normal indexing if -u or -r has not
been given.  Maybe the command should be

    {-i|-u|-r} file/path

so they all take a path.  That might make it clearer, would you
agree?  At least making swish complain if using mixed settings would
be smart.  Poor old "IndexDir" setting is a bit outdated.  It can be a
list of files and directories with -S fs, a list of urls with -S http
and a list of programs with -S prog.

Maybe someone can come up with a more modern alias for "IndexDir".

And it gets even more confusing when you use -S prog to index but then
wish to update or remove.  For example, if using -S prog and spider.pl
it would be smart to do a HEAD request first, ask swish if it knows that
file, and only fetch the document with a GET request if swish doesn't
know it or the HEAD request shows that the file is newer.
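
To make the HEAD-before-GET idea concrete, here is a rough sketch of just
the decision step.  The function name and parameters are made up for
illustration (spider.pl is Perl and does not contain this), and fetching
when the server sends no Last-Modified header is my assumption, not
something swish does today.

```c
#include <time.h>

/*
 * Hypothetical helper: decide whether a GET is worth doing.
 *   known              - nonzero if the index already has this URL
 *   indexed_mtime      - modification time stored in the index
 *   head_last_modified - Last-Modified from the HEAD response,
 *                        or (time_t)-1 if the header was missing
 * Returns 1 to fetch the document, 0 to skip it.
 */
int needs_fetch(int known, time_t indexed_mtime, time_t head_last_modified)
{
    if (!known)
        return 1;   /* never indexed: must fetch */
    if (head_last_modified == (time_t)-1)
        return 1;   /* assumption: no date available, fetch to be safe */
    /* fetch only if the server copy is strictly newer than the indexed one */
    return difftime(head_last_modified, indexed_mtime) > 0.0;
}
```

Treating an equal timestamp as "not newer" keeps unchanged documents from
being fetched again on every run.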

Then keeping track of what files may need to be removed (because they
were removed on the web site that's being indexed) is another problem.
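
One possible way to find removal candidates, sketched under the assumption
that you can get a sorted list of paths already in the index and a sorted
list of paths seen during the current crawl.  Nothing here is existing
swish code; find_stale and its arguments are hypothetical.

```c
#include <string.h>

/*
 * Both lists must be sorted in strcmp() order with no duplicates.
 * Stores pointers to entries that are in indexed[] but not in seen[]
 * into stale[] (which needs room for n_indexed entries) and returns
 * how many were stored.
 */
size_t find_stale(const char **indexed, size_t n_indexed,
                  const char **seen, size_t n_seen,
                  const char **stale)
{
    size_t i = 0, j = 0, n = 0;

    while (i < n_indexed) {
        /* once seen[] is exhausted, everything left in indexed[] is stale */
        int c = (j < n_seen) ? strcmp(indexed[i], seen[j]) : -1;

        if (c == 0) {           /* still on the site: keep it */
            i++;
            j++;
        } else if (c < 0) {     /* indexed but not seen: removal candidate */
            stale[n++] = indexed[i++];
        } else {                /* seen but not indexed: new file, skip */
            j++;
        }
    }
    return n;
}
```

The entries left in stale[] are the ones you would then pass to a remove
run.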

I'll at least add logic to prevent mixing -u and -r, but I'd like
input on whether -u and -r should take a parameter like -i (that is, you
use one of the three instead of -r or -u plus -i to list the files).
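
As a rough idea of what the check for mixed -u and -r could look like,
here is a sketch.  The enum and function names are invented for this
example and are not what swish.c uses; it only models the rule described
above (-u or -r picks the mode, -i means normal indexing only when
neither was given, and mixing -u with -r is an error).

```c
#include <string.h>

/* Hypothetical names, not taken from swish.c. */
typedef enum { MODE_UNSET, MODE_INDEX, MODE_UPDATE, MODE_REMOVE } index_mode;

/*
 * Scan argv for -i, -u and -r and work out the indexing mode.
 * Returns 0 and stores the mode in *mode on success.
 * Returns -1 if both -u and -r were given, in either order.
 */
int pick_mode(int argc, char **argv, index_mode *mode)
{
    index_mode m = MODE_UNSET;
    int saw_i = 0;
    int i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-u") == 0 || strcmp(argv[i], "-r") == 0) {
            index_mode want = (argv[i][1] == 'u') ? MODE_UPDATE : MODE_REMOVE;
            if (m != MODE_UNSET && m != want)
                return -1;      /* mixed -u and -r: refuse instead of last-one-wins */
            m = want;
        } else if (strcmp(argv[i], "-i") == 0) {
            saw_i = 1;
        }
    }

    /* -i only means normal indexing when neither -u nor -r was given */
    if (m == MODE_UNSET && saw_i)
        m = MODE_INDEX;

    *mode = m;
    return 0;
}
```

With this, "-u -i file" comes out as update mode regardless of the order
of the switches, and "-u -r" is rejected up front.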

I trust Jose will speak up if anything I said above is wrong.


-- 
Bill Moseley
moseley@hank.org

Received on Sun Dec 5 08:48:35 2004