Re: [swish-e] Swish3 vs Omega

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Thu Feb 05 2009 - 14:32:46 GMT
Kevin Bowling wrote on 02/04/2009 11:57 PM:

>> Ranking should be somewhat better.
> 
> I look forward to the better rankings!
> 

Have you tried the different RankScheme options in Swish-e 2.4?

> What really kills performance is PDF/PS to HTML conversions on my box.  It 
> would be really nice to thread the indexing and converting so it doesn't block 
> on this case.

You can do that yourself. Just filter your PDF/PS files in a separate
step and cache the output, then index the cache instead of the
originals. That's a common approach.
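
A rough sketch of that filter-and-cache step, in case it helps. This is
only an illustration, not anything shipped with Swish-e: the function
names (refresh_cache, run_converter, stale) are made up for the example,
and it assumes pdftotext (poppler/xpdf) and ps2ascii (Ghostscript) are
installed and on your PATH. It converts only files that are new or
changed since the last run, and runs several conversions at once so one
slow PDF does not hold up the rest.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumption: these external tools are installed; swap in whatever you use.
CONVERTERS = {
    ".pdf": lambda src, dst: ["pdftotext", src, dst],
    ".ps": lambda src, dst: ["ps2ascii", src, dst],
}


def run_converter(src, dst):
    """Run the external converter for src, writing plain text to dst."""
    ext = os.path.splitext(src)[1].lower()
    subprocess.run(CONVERTERS[ext](src, dst), check=True)


def stale(src, dst):
    """True if the cached copy is missing or older than the source."""
    return not os.path.exists(dst) or os.path.getmtime(dst) < os.path.getmtime(src)


def refresh_cache(src_root, cache_root, convert=run_converter, workers=4):
    """Convert new or changed PDF/PS files under src_root into cache_root.

    Returns the list of cache files that were (re)generated.
    """
    jobs = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            if os.path.splitext(name)[1].lower() not in CONVERTERS:
                continue  # HTML/TXT need no conversion; index them directly
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(cache_root, rel + ".txt")
            if stale(src, dst):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                jobs.append((src, dst))
    # Threads are enough here: the real work happens in child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda job: convert(*job), jobs))
    return [dst for _src, dst in jobs]
```

Run it from cron (or before each indexing run) with your document root
and a cache directory, then point swish-e's -i option at the cache
directory alongside your HTML and TXT files. One thing to watch: search
results will carry the cache paths, so you would need to map them back
to the original URLs, for example with ReplaceRules in your config.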

> 
>>> What I am confused about is that it now uses Xapian.  I haven't tried
>>> Xapian but I know they have their own system called Omega.  How does
>>> Swish3 differ from it?  I just need a local filesystem indexer for a
>>> website with 200k+ HTML, PDF, TXT and PS files.  Are Swish-e and Omega
>>> the only two FOSS contenders?
>> Oh no. There are many. Lucene and its clones. KinoSearch. HyperEstraier
>> (though it seems to have fallen out of support). There are many others.
> 
> I tried several solutions at one time, but I'm really not interested in 
> writing an indexer, interface, or anything else as many of these are just 
> libraries.  To me, web search for a case like this (static documents) should 
> be pretty turn-key.  Lucene seems to be really nice but it suffers from this.  
> I couldn't find a simple, direct file system indexer.

Yes. That's what Swish3 will try to do: turn IR libraries like Lucene,
Xapian, et al. into turnkey apps. Omega is like that too, but it only
works with Xapian.

> 
> That is what I like about Swish-e.  It was somewhat easy to set up, and at 
> least straight forward.

Exactly. See http://blog.peknet.com/projects/swish/whySwish3

> 
>> If all you need is a local filesystem indexer for a website with 200k+ docs
>> (which I would call medium-sized -- these days folks deal with
>> multi-million doc collections), and you don't need UTF-8 or incremental
>> indexing, Swish-e 2.4.5 is about as good as it gets. Don't let its age fool
>> you. :)
> 
> Yes improvements and a nicer interface are really all I would like to see.  
> It's 2009 and the interface looks like it is 10+ years old (not that that is a 
> bad thing, but a 'modern' interface would be nice as well).

By 'interface' do you mean the swish.cgi script? Or the options to the
swish-e CLI? Or ...?


-- 
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

Received on Thu Feb 5 09:32:46 2009