Index headers

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Sep 19 2000 - 13:33:57 GMT

I have a few questions here:

Stop words:
-----------

For phrase highlighting I need to know what stop words were used to create
the index.  I'd like a switch that would make swish print the stopwords when
printing the headers.  I'm not sure what the best switch letter would be;
-W is perhaps too close to -w.

swish -x -f index.file
Stopwords: and if the a an

-x could be used to say "extended headers" and so if -x was used additional
headers such as Wordcharacters, IgnoreFirst, and Stopwords would be included
in the header display.

Does -x seem like a good switch letter for this?
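
To make the use case concrete, here is a minimal sketch of how a
highlighting script might consume such a header.  It assumes the proposed
"Stopwords:" line looks like the example above; the function names are
hypothetical and are not part of swish.

```python
def parse_headers(lines):
    """Collect 'Name: value' header lines into a dict."""
    headers = {}
    for line in lines:
        # Tolerate an optional leading '#', as in "# Number of hits: 13".
        text = line.lstrip("#").strip()
        name, sep, value = text.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


def stopword_set(headers):
    """Return the stop words as a set (empty if the header is missing)."""
    return set(headers.get("Stopwords", "").lower().split())


def significant_words(phrase, stopwords):
    """Drop the words of a query phrase that the index never stored."""
    return [w for w in phrase.lower().split() if w not in stopwords]
```

A highlighter could then match only the words returned by
significant_words(), since the stop words never made it into the index.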

BTW -- if enough info were stored in the index headers, I could see
reindexing an index just by saying:

   swish-e -C -f index.file



Library version of swish-e and reindexing:
------------------------------------------

Does SwishOpen() really open the index file?  Or is the file opened and
closed on each search?

I ask this as I could see a mod_perl application where SwishOpen() is
called once on the first request, but then the index is left open for the
life of the Apache child process.  So if the file was reindexed you might
end up searching an old index file until that Apache child dies.

Perhaps SwishSearch() could stat the index file to see if it changed on
disk and reopen?  Or maybe it would be better for the application to stat
the index file and look for changes.
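
The application-side check could be sketched roughly as below.  This is
only an illustration under assumptions: the IndexWatcher class is
hypothetical, it makes no swish library calls, and it leaves the actual
close and reopen to the caller.

```python
import os


class IndexWatcher:
    """Remember an index file's stat signature and report when it changes."""

    def __init__(self, path):
        self.path = path
        self._signature = self._stat()

    def _stat(self):
        st = os.stat(self.path)
        # Modification time, size, and inode together catch both in-place
        # rewrites and a new file renamed over the old one.
        return (st.st_mtime_ns, st.st_size, st.st_ino)

    def changed(self):
        """Return True once after each change seen on disk."""
        current = self._stat()
        if current != self._signature:
            self._signature = current
            return True
        return False
```

A long-lived process would call changed() before each search and reopen
the index whenever it returns True.  That costs one stat per request.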


Multiple indexes:
-----------------
When searching multiple indexes, swish processes one index file at a time.
You end up with headers like:

  # Search words: ( foo )
  # Number of hits: 13

These headers are repeated for each index file searched, with the results
mixed in between.

Is there any way to process multiple index files as if they were a merged
index file?  That is, get one set of headers where Number of hits: is equal
to the total hits of all index files (and where the -b sort would sort ALL
the results)?

I have two index files -- one is indexed once a week, and the other is
indexed whenever a new entry is made during the week.  I don't want to
merge the weekly index with the incremental index every time a new entry is
made.
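
The behavior I am after could be approximated in the calling script,
roughly as sketched below.  This assumes each index's results have already
been parsed into (rank, path) tuples and that ranks from different indexes
are comparable, which may not hold; merge_results is a hypothetical helper,
not a swish function.

```python
def merge_results(per_index_results):
    """Combine hit lists from several indexes into one ranked list.

    per_index_results is a list of lists of (rank, path) tuples,
    one inner list per index file searched.
    """
    merged = [hit for hits in per_index_results for hit in hits]
    # Highest rank first; the sort is stable, so ties keep their
    # original order (earlier index first).
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return len(merged), merged
```

The first value returned is the combined hit count, and the second is the
single sorted result list across all index files.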

Thanks,


Bill Moseley
mailto:moseley@hank.org
Received on Tue Sep 19 13:34:15 2000