I have used Swish to index and search document collections, and now want to
"filter" documents before indexing, using the same query syntax. That is,
given a document, I will extract its text and run a Swish-format
query against that text; if the text matches the query, I
will add the document to my collection.
The simplest method is to add everything to a collection and run a Swish
search on that collection, but I'm looking for something more efficient,
especially when only a small percentage of documents match.
Can anyone suggest anything?
I looked at the parse_swish_query and tokenize_query_string functions, but
reusing them on their own gets too complicated quickly.
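To make the goal concrete, here is a rough Python sketch of the behaviour I'm
after. It is not Swish's parser: the names matches and tokenize are made up
for illustration, and it only handles bare words, and/or/not, and parentheses
(no phrases, wildcards, or metanames). It evaluates and/or left to right and
treats adjacent words as an implicit "and", which is an assumption and may not
match Swish exactly.

```python
import re


def tokenize(query):
    # Split into parentheses and bare words; operators are ordinary tokens.
    return re.findall(r"\(|\)|[^\s()]+", query.lower())


def matches(query, text):
    """Return True if text satisfies a simplified boolean query."""
    words = set(re.findall(r"\w+", text.lower()))
    tokens = tokenize(query)
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] != ")":
            op = "and"  # assumption: adjacent terms are implicitly ANDed
            if tokens[pos] in ("and", "or"):
                op = tokens[pos]
                pos += 1
            right = parse_term()
            result = (result and right) if op == "and" else (result or right)
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        if tok == "not":
            return not parse_term()
        if tok == "(":
            value = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok in words  # plain word: is it present in the text?

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return result


if __name__ == "__main__":
    doc = "Swish can index documents"
    if matches("swish and (index or search)", doc):
        print("would add document to the collection")
```

Something like this avoids building an index for documents that are going to
be rejected, but it obviously does not share Swish's word-splitting or
stemming rules, which is why I would prefer to reuse the real query code.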
Thanks in advance for any ideas and comments.
Received on Wed Nov 17 19:57:37 2004