On Fri, Aug 22, 2003 at 07:39:26AM -0700, David Hoare wrote:
> One of the things I would like to do however is if a search did not return
> a hit then check the search words against the indexed words and do some
> approximate matching (agrep type of thing) and return a link to a new
> "corrected" search. The equivalent of google's "did you mean _BLAH_" when
> you misspell something.
> I use a Linux box and program in tcl or perl for preference.
There's a module included in the swish-e distribution called
ParseQuery.pm. It's supposed to take the output from the "Parsed Words:"
header and, well, parse it.
Then grab the Text::Aspell Perl module from CPAN. You can use that to
look up the words parsed by ParseQuery.pm.
You will likely want to create a dictionary of only the words in your
index. You can use one of the -T options to extract all the words
from the index for use in your dictionary. The Text::Aspell
docs describe how to create an Aspell dictionary from this list of
words. You might decide to create a dictionary for each metaname.
So, when you get "no results" back from swish you can look up each word
in the query and check if it's spelled correctly and if there are any
suggestions from Aspell.
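As a rough illustration of that lookup step, here is a minimal sketch in
Python. It is not the Text::Aspell API: it swaps in the standard library's
difflib for Aspell, and the function name, the sample word list and the
cutoff value are all made up for the example. It assumes you already have
the index words as a plain list.

```python
import difflib

def suggest_for_query(query_words, index_words, max_suggestions=3, cutoff=0.75):
    """Return {unknown_word: [close indexed words]} for words not in the index.

    difflib is only a stand-in for Aspell here; cutoff and max_suggestions
    are arbitrary example values, not anything swish-e or Aspell defines.
    """
    vocabulary = sorted({w.lower() for w in index_words})
    known = set(vocabulary)
    suggestions = {}
    for word in query_words:
        key = word.lower()
        if key in known:
            continue  # word is in the index, nothing to correct
        suggestions[key] = difflib.get_close_matches(
            key, vocabulary, n=max_suggestions, cutoff=cutoff
        )
    return suggestions

# Hypothetical word list standing in for the words dumped from the index.
index_words = ["apache", "configure", "index", "search", "server"]
print(suggest_for_query(["serach", "index"], index_words))
```

Words that are already in the index are skipped, so only the unknown ones
come back with candidates.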
It's a bit complicated because a misspelled word can return many, many
suggestions -- and there can be more than one misspelled word in a
query. Also, with boolean searches you can get no results even when
every term is a perfectly good search word.
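One way to keep those complications manageable is to take only the best
suggestion for each unknown word and build a single corrected query. The
Python sketch below does that under some loud assumptions: difflib again
stands in for Aspell, the operator words are assumed to be and/or/not, and
the function name and cutoff are invented for the example. It returns
nothing when every term is already in the index, which covers the boolean
case where the words are fine but the combination has no hits.

```python
import difflib

# Assumed boolean operator words; adjust to whatever your queries use.
OPERATORS = {"and", "or", "not"}

def did_you_mean(query, index_words, cutoff=0.75):
    """Build one corrected query string, or return None if there is none.

    Only the single best match per unknown term is used, which avoids
    showing "many, many" suggestions. difflib is a stand-in for Aspell.
    """
    vocabulary = sorted({w.lower() for w in index_words})
    known = set(vocabulary)
    corrected = []
    changed = False
    for term in query.lower().split():
        if term in OPERATORS or term in known:
            corrected.append(term)
            continue
        best = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
        if not best:
            return None  # no plausible correction for this term
        corrected.append(best[0])
        changed = True
    # All terms known: the empty result is not a spelling problem.
    return " ".join(corrected) if changed else None

index_words = ["apache", "configure", "index", "search", "server"]
print(did_you_mean("serach and servr", index_words))
```

The returned string is what you would put in the "did you mean" link.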
<he says without the time to do it>
It would not be that hard to hack swish-e to link directly with Aspell
and do the word lookup at search time and offer a list of suggestions
for each word not found in the index.
Received on Fri Aug 22 14:57:29 2003