On Thu, May 12, 2005 at 11:50:35AM -0700, Bill Moseley wrote:
> On Thu, May 12, 2005 at 02:20:55PM -0400, John Paige wrote:
> > So, if someone is deleting in the same frequency as adding files in
> > the index (for example user's mailbox), the best approach would be to,
> > use incremental -r option to delete, and periodically, re-index and
> > remove the old index file.
> Incremental is good for a mailing list where you never delete.
> Searching an active mail box is another question. I've been thinking
> about setting up swish for a long time on my mail. But, I get
> hundreds of emails each day and delete almost that many. Actually, I
> get thousands -- but most get dropped or rejected early. So it would
> be hard to keep up with all the updates. Plus, I often move messages
> around -- from one folder to another.
I have this little toy that I have been playing with (and breaking
swish-e incremental indexing while doing so) called Mail::Box Web Search.
Svnweb is at
but I can assemble some kind of tar.gz if anybody is interested. It's
basically a thin wrapper tying together the Mail::Box module, code that
produces a swish-e index from it, and a local http server using the SWISH
perl API.
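
The indexing half can be sketched roughly like this. It is Python rather
than Perl, purely for illustration, and it is not the MWS code. It reads an
mbox and writes each message in the header format that swish-e accepts from
an external program (-S prog). The "path#key" naming and the function names
are made up for this sketch, and the TXT* document type is an assumption
about how the index is configured.

```python
import mailbox
import os


def swish_record(path_name, body, mtime):
    """Build one document in the header format read by swish-e -S prog."""
    payload = body.encode("utf-8")
    header = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(payload)}\n"  # length in bytes, not characters
        f"Last-Mtime: {int(mtime)}\n"
        "Document-Type: TXT*\n"  # assumption: index messages as plain text
        "\n"
    ).encode("utf-8")
    return header + payload


def message_text(msg):
    """Join the decoded text/plain parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        raw = part.get_payload(decode=True)
        if raw is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(raw.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name in the message; fall back to UTF-8.
            parts.append(raw.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def dump_mbox(mbox_path, out):
    """Write every message of an mbox to a binary stream as swish-e records."""
    mtime = os.path.getmtime(mbox_path)
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key, msg in box.iteritems():
            text = f"{msg.get('Subject', '')}\n\n{message_text(msg)}"
            # Hypothetical naming scheme: mbox path plus the message key.
            out.write(swish_record(f"{mbox_path}#{key}", text, mtime))
    finally:
        box.close()
```

A small script would call dump_mbox(path, sys.stdout.buffer) and be handed
to swish-e as the program to run when indexing with -S prog.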
A quick warning: the Mail::Box modules are big and bulky. I should probably
rewrite that using the newer Email modules on CPAN, but I just haven't had
time. Currently it doesn't support incremental indexing, but adding that
should be a fun weekend project.
> I guess I'd use incremental indexing and when searching make sure the
> mail still exists before presenting the results. What's a few stat
I'm also using the mbox format for my archive, so I can't just stat to see
if a message is deleted. I guess I could convert that to maildir, but I'm
just lazy. I also planned on supporting remote IMAP and POP servers
(with the Mail::Box module they basically come for free).
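
To make the difference concrete, here is a rough sketch (Python for
illustration, not MWS code) of filtering stale hits at search time. With
maildir each hit is a file, so one stat per hit is enough, although a
message that changed flags or moved from new/ to cur/ gets a new file name
and would look deleted. With mbox the whole file has to be scanned. The
sketch assumes the index stores a Message-ID with every hit, and all
function names are hypothetical.

```python
import mailbox
import os


def live_maildir_hits(hit_paths):
    """Keep only hits whose message file still exists (one stat per hit)."""
    return [path for path in hit_paths if os.path.exists(path)]


def mbox_message_ids(mbox_path):
    """Collect the Message-ID of every message currently in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return {msg["Message-ID"] for msg in box if msg["Message-ID"]}
    finally:
        box.close()


def live_mbox_hits(hit_ids, mbox_path):
    """Keep only hits whose Message-ID is still present in the mbox.

    Assumption: the search index returns a Message-ID for each hit.
    """
    present = mbox_message_ids(mbox_path)
    return [mid for mid in hit_ids if mid in present]
```

The mbox variant reads the whole archive on every search, so in practice
the set of Message-IDs would want caching keyed on the file's mtime.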
> Also, I've thought about installing Mairix since it's just an apt-get
> away. http://www.rpcurnow.force9.co.uk/mairix/
I'm using mairix with mutt. It's very fast, and I thought about adding
it as an indexing engine to MWS (which has an abstraction layer for the
indexing engine, and some limited support for Plucene and CLucene), but I
haven't written a perl wrapper around it yet.
However, using mairix is somewhat limiting for MWS, because I plan to add
file-system search (locate just doesn't work for me any more), so I really
need swish-e :-)
If I let my imagination run wild, a daemon running in the background and
indexing changed files on the file-system would make it even more useful.
Add an RSS feeder which creates a searchable database of the weblogs that I
read. Or of web pages that I browsed (reading Firefox history or using
tricks with wwwoffle, for example).
Oh, why do I need to re-invent Google desktop, dammit? :-))
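
For what it's worth, the "index only what changed" part of such a daemon
could start out as a plain polling loop. A minimal sketch in Python (for
illustration only; changed_since is a made-up name, and a real daemon
would more likely use file-system change notification than a full walk):

```python
import os


def changed_since(root, last_run):
    """Yield paths under root modified after last_run (epoch seconds)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > last_run:
                    yield path
            except OSError:
                # File vanished between listing and stat; skip it.
                continue
```

The caller would remember the time of the previous pass and feed only the
yielded paths to the indexer.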
Dobrica Pavlinusic 2share!2flame email@example.com
Unix addict. Internet consultant. http://www.rot13.org/~dpavlin
Received on Fri May 13 00:13:50 2005