Remember that thesaurus expansion needs to be something you can turn
off, so using the stemmer to fold the synonyms together in the index
will be a problem: once they are merged at index time, the expansion
cannot be switched off for a given query. Also, a thesaurus is most
often used to add not only synonyms to the user's query but also
"narrower" terms, assuming the thesaurus is a hierarchy of some sort.
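To make the point concrete, here is a minimal sketch of query-time expansion that can be switched off, written in Python for brevity. Everything in it is an assumption for illustration: the THESAURUS data, the expand_query and to_query_string names, and the two flags are hypothetical, and the output is just a generic boolean string with AND, OR and parentheses, so check it against the query syntax of your Swish-e version.

```python
# Hypothetical in-memory thesaurus: term -> related terms by relation type.
# The entries are made-up sample data, not taken from any real thesaurus.
THESAURUS = {
    "car": {"synonyms": ["automobile"], "narrower": ["sedan", "hatchback"]},
    "sedan": {"synonyms": ["saloon"], "narrower": []},
}


def expand_query(terms, use_synonyms=True, use_narrower=True):
    """Return one list of alternatives per query term.

    Expansion happens at query time, so either kind of expansion
    can be turned off without touching the index.
    """
    groups = []
    for term in terms:
        entry = THESAURUS.get(term, {})
        alternatives = [term]
        if use_synonyms:
            alternatives += entry.get("synonyms", [])
        if use_narrower:
            alternatives += entry.get("narrower", [])
        groups.append(alternatives)
    return groups


def to_query_string(groups):
    """Join the alternatives of each term with OR, and the terms with AND."""
    parts = []
    for group in groups:
        if len(group) == 1:
            parts.append(group[0])
        else:
            parts.append("(" + " OR ".join(group) + ")")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(to_query_string(expand_query(["car", "red"])))
    print(to_query_string(expand_query(["car", "red"], use_synonyms=False,
                                       use_narrower=False)))
```

With both flags set to False the query passes through untouched, which is exactly what you lose if the synonyms are merged at index time.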

I wrote a quick-and-dirty thesaurus server in Perl, just using hashes,
and I have swish.pl communicating with that server via a socket. This
works well, but I plan to rewrite the thesaurus server so that it uses a
reverse index instead of sucking up huge amounts of memory with giant
hashes. I can probably use Swish-e itself to store the thesaurus
data simply by feeding the indexer artificial "documents", each of which
contains the relationships (such as synonyms and narrower terms) for a
From: email@example.com [mailto:firstname.lastname@example.org]
On Behalf Of David L Norris
Sent: Friday, December 12, 2003 2:20 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: SWISH - IE -Integration - possible?
On Fri, 2003-12-12 at 14:10, Bill Moseley wrote:
> I can't really see how that would work for WordNet -- as it seems to
> generate so many cross references.
Seems like it would be easiest to do outside of SWISH-E, in the CGI
script or whatever front end you use. After indexing completes you could
dump the word list and generate a simple cross-reference table: Column A
is a list of synonyms and Column B is the word each one maps to in the
SWISH-E index.
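A rough sketch of that cross-reference table, in Python, might look like the following. The function names, the sample word list and the synonym groups are all hypothetical; it assumes you already have the dumped index words as a plain list and the synonym groups from some thesaurus source. As a simplification, each synonym maps to a single indexed word (the first in sorted order) even when several members of its group are indexed.

```python
def build_cross_reference(index_words, synonym_groups):
    """Build the two-column table as a dict: synonym -> indexed word.

    index_words    -- words dumped from the index after indexing (assumed input)
    synonym_groups -- lists of words that mean the same thing (assumed input)
    """
    indexed = set(index_words)
    table = {}
    for group in synonym_groups:
        # Only words that really occur in the index are useful targets.
        targets = sorted(word for word in group if word in indexed)
        if not targets:
            continue  # nothing in the index to map this group to
        for word in group:
            if word not in indexed:
                # Simplification: map to the first indexed word only.
                table[word] = targets[0]
    return table


def rewrite_query(terms, table):
    """Replace each query term by its indexed equivalent, if it has one."""
    return [table.get(term, term) for term in terms]


if __name__ == "__main__":
    table = build_cross_reference(
        ["automobile", "house"],
        [["car", "auto", "automobile"], ["home", "house"], ["cat", "feline"]],
    )
    print(table)
    print(rewrite_query(["car", "house", "dog"], table))
```

Because the table is built from the dumped word list, it only ever contains targets that exist in the index, which keeps it much smaller than a full set of WordNet cross references.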
Received on Fri Dec 12 19:30:00 2003