Skip to main content.
home | support | download

Back to List Archive

Mind Freeze

From: Bill Moseley <moseley(at)>
Date: Fri Nov 14 2003 - 20:39:06 GMT
Ok, you programmers.  Anyone take compiler theory classes and studied
recursive descent parsers?

I'm back at looking at rewriting the swish-e query parsing code.
I've been reading a bit about recursive descent parsers but seem to have
a block in implementing one.  Maybe someone here can help me past that

Erik Corry offered up this almost a year ago:

but I'm not seeing how the final parse tree should look from that

We already have scanning code to tokenize a query string into an array
of search words and operators.  So now I'm trying to work on the code to
generate the parse tree.

For a simple query like 

   foo and bar

I can see a tree with three nodes, with AND at the top and two leafs
of "foo" and "bar".

But how to then represent these two queries?:

    foo and bar or baz
    foo and (bar or baz)

Swish-e currently only has three boolean operators, "and", "or", and
"phrase_and".  phrase_and is the special case of the and where the
position numbers must be only one apart.

Parenthesis and double quotes are for grouping -- I guess they represent
a sub-query.

The "not" operator is really just a search modifier.  It simply negates
the results of a list of results, which can be either a single term or
applied to the result of a sub-query started by parenthesis or phrase
quotes.  Metanames are somewhat similar in that they can apply to a
single search words, or a group.

   foo and meta=bar or baz # only on bar
   foo and meta=(bar or baz) # apply to both

I'm not sure what this should do:

   foo and meta="some phrase"

BTW -- the Swish-e distribution contains a Perl module to parse a query
into words based on metanames.  It's somewhat what I'm trying to do, but
I want to end up with a parse tree that can then be passed off to the
search code for processing.*checkout*/swishe/swish-e/example/modules/SWISH/

Bill Moseley
Received on Fri Nov 14 20:39:22 2003