One thing I find interesting is that although Lucene is an Apache project, if
you search the Apache website, you'll find the search engine is ... swish-e.
Guess they don't eat their own dogfood over there (and I mean that in a very
nice, not-mean way).
I assume that's the case for the reasons you describe: Lucene doesn't have a web
front-end as part of the package. They don't have a Bill Moseley to write all
the friendly Perl front-end stuff.
Plucene I have only read about. I love Perl, but for performance reasons I just
can't imagine using Plucene in any kind of production environment. It seems fine
for hobbyists, personal websites, etc., but Perl just isn't robust enough, imo, to
stand up against Lucene (Java) or Swish-e (C).
There are a number of features in Lucene that I would love to see in Swish-e,
among them the things you mention:
* solid incremental indexing (optional db backends, too)
* multi-byte indexing
* lots of query syntax options: proximity, native fuzziness, rank biasing, etc.
But as we (developers) keep saying, doing those things well (esp. multi-byte)
requires a lot of code rewriting, and it isn't a high enough priority yet for
any developer. We're waiting for some venture capitalist to fund us...
Eric Lease Morgan scribbled on 5/9/05 9:34 AM:
> At the risk of starting a religious war, what do folk here think of
> My favorite indexer is swish-e. Quick. Easy. Well documented. Feature
> rich. Comes with a Perl API as well as APIs for other languages. The
> query syntax is straightforward.
> As the amount of content I plan to index increases I begin to need an
> incremental indexing feature. I also need multi-byte character
> indexing. Lucene/Plucene offer these features at the expense of greater
> complexity. Lucene/Plucene is 100% a toolbox, no application. It does
> not index files, but rather data structures. This means I cannot point
> it at a file system and have it automatically extract the metadata,
> content, etc.
> What experience do others here in Swish-E World have with
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Mon May 9 08:51:21 2005