Well, it's quite true. swish-e IS easy to use, freaking fast, scalable
to GB-level indexing, and extendable to index anything you can
normalise to HTML via Perl.
The good news aside, I think swish-e needs to keep moving forward on
utf8 and (date) range support. The mail below only confirms that
swish-e is in demand internationally, and that it is natural for users
to want support for content in their local languages.
Is that something we can all look forward to this year? :)
Roy Tennant wrote:
> Forwarded by permission.
> ------ Forwarded Message
> From: Prashant Iyengar <firstname.lastname@example.org>
> Date: Mon, 25 Dec 2006 09:44:02 +0530
> To: <email@example.com>
> Subject: Swish-e
> Dear Mr. Roy Tennant, I am an independent researcher in India and
> currently a fellow of the Open Society Institute, Budapest. I've
> recently finished implementing a database website of 22000 Indian
> Supreme Court decisions from 1950 onwards (http://judis.openarchive.in)
> for which I make use of Swish-e.
> I'd like to extend my heartfelt thanks to you and your team for
> providing such a fabulous tool for free to the open source community.
> My website certainly would not have been possible but for the Swish-e
> search engine. Being my first such enterprise, I had started off
> building my database without a thought to whether I would be able to
> search such a bulky database quickly and comprehensively. I had rather
> naively assumed that Mysql's fulltext search capabilities would be
> adequate for my purposes. I discovered later, unfortunately, that Mysql
> capitulates under the weight of 22000+ documents with an average size
> of 27kb. Left in the lurch, I tried various other open source search
> engines like Perlfect, htdig etc., which either didn't support
> such a voluminous database - taking hours to index - or were difficult
> to install on my webhost. At the point at which I experimented with
> Swish-e, I was all but ready to give up the enterprise as lost.
> Fortunately, swish-e turned out to be this marvellous tool that it is,
> and from that point onwards everything just fell into place. It has
> taken me about 5 days to configure swish-e to my liking - constantly
> experimenting and adding new capabilities along the way - and I'm now
> the "possessor" of an extremely robust search engine of Indian Supreme
> Court cases that people are already thanking me for. But I know, and
> will always make it known that the real "heroes" of this project are
> the people who built Swish-e and gave it away for free. Hope you will
> always continue to do so, so that the tribe of independent hackers like
> me is maintained.
> I don't know how I can give back for what I've got from swish-e. I'm
> only a part-time coder and have little knowledge and no experience of
> coding in Perl. If there is anything that strikes you as something I
> could do to help, please do let me know.
> Warm Regards
> Prashant Iyengar.
> Ps. Some facts about my database that might interest you
> 1) No of docs: 22070
> 2) Average size - 27 kb, Full size 530 MB
> 3) Time taken to index : 10 - 12 minutes
> 4) Size of index: 190 MB
> 5) Average search time: Doesn't exceed 1 second (average less than half
> a second)
> Some problems I encounter:
> 1) Sorting by dates doesn't work with dates prior to 1970 (since it is
> stored as unsigned long). I got around this problem by creating a
> pseudo date in each of my documents which was the actual date plus 20
> years (since my documents begin from 1950). So swish-e sorts by the
> pseudo date and displays the actual date.
> 2) Still trying to figure out how to get one numeric property to
> "reverse sort" by default.
> Prashant Iyengar
> IPF Policy Fellow
> firstname.lastname@example.org www.policy.hu/iyengar/
> ------ End of Forwarded Message
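
For anyone wanting to try the pre-1970 workaround from the forwarded
message, here is a rough sketch of the idea in Python. It is only an
illustration, not swish-e code: the function names are made up, and it
assumes the sort property is stored as non-negative seconds since the
Unix epoch (UTC). Each date is shifted forward by 20 years for sorting,
while the real date is kept for display.

```python
import calendar
from datetime import date

# From the message: documents begin in 1950, the Unix epoch begins in 1970.
SHIFT_YEARS = 20


def pseudo_date(actual: date, shift_years: int = SHIFT_YEARS) -> date:
    """Shift a date forward by whole years (hypothetical helper)."""
    try:
        return actual.replace(year=actual.year + shift_years)
    except ValueError:
        # 29 February landing in a non-leap target year: fall back to the 28th.
        return actual.replace(year=actual.year + shift_years, day=28)


def sort_key(actual: date) -> int:
    """Epoch seconds (UTC) of the shifted date; non-negative from 1950 onwards."""
    return calendar.timegm(pseudo_date(actual).timetuple())


# Sort by the pseudo date, display the actual date.
decisions = [date(1973, 4, 24), date(1950, 1, 26), date(1967, 2, 27)]
for actual in sorted(decisions, key=sort_key):
    print(actual.isoformat(), sort_key(actual))
```

Shifting by a multiple of four years keeps 29 February valid for the
range in question, so the fallback branch is only a safety net.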
Received on Thu Jan 4 23:17:20 2007