The current stable release is 2.4.7.
Swish-e is continually under development. This page contains a laundry list of requested features planned for a future Swish-e release. To request new features, bug fixes, or (best of all) to submit code patches, send e-mail to the Swish-e mailing list.
Swish-e source is available for anonymous public download from the swish-e subversion server.
The daring and adventurous can download the daily build snapshot from the swish-daily page. This is not an official release of Swish-e but rather the current development version. There is no guarantee that these packages will run. Please do not use this code in production.
For Windows development binary (pre-compiled) snapshots, please
The most current Windows development version is here.
Questions regarding daily development builds, or about using Swish-e in general, should be directed to the Swish-e mailing list.
Features planned for 2.6
- Remove expat and other older parsers. Libxml2 will be the default (and only) parser.
- Remove -S http method.
- Documentation overhaul.
Features planned for 3.0
Swish-e 3.0 (abbreviated Swish3) will be a complete overhaul of the code. You can track development progress here. Major feature improvements will include:
- Unicode support
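- Unicode is the international standard for character encodings. Swish3 will implement support for the UTF-8 encoding, which should handle all major languages in the world (the original UTF-8 design could encode up to 2,147,483,648 unique characters; the current standard, RFC 3629, limits it to the Unicode range ending at U+10FFFF).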
The Swish-e developers need input from non-English language experts. Please contribute to the discussion at the Swish-e mailing list.
Some significant known issues include:
- lowercase vs. UPPERCASE
- Version 2.x uses tolower() to lowercase all characters before searching and indexing. Should the same approach be used for UTF-8? Will this have significant impact on usability for non-English languages?
- Version 2.x uses an internal table to support wildcard searching with *. The table assumes 8-bit (non-Unicode) character encoding. That approach will likely need to be re-thought for multibyte encodings like UTF-8.
- Version 2.x uses five different configuration options to control how a 'word' (token) is defined. The basic assumption is that a word is defined by which characters it includes. That assumption is based on a manageable character set of 256 characters. However, the sheer size of the Unicode character set makes that system unworkable. Instead, some kind of regular expression library will likely be used.
- The stemmers used will need full international support.
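The case-folding, wildcard and word-definition questions above can be illustrated with a short sketch. It is written in Python rather than Swish-e's C, and the function names (`bytewise_lower`, `unicode_fold`, `tokenize`, `prefix_match`) are hypothetical; none of this is Swish3 code. It contrasts ASCII-only, byte-at-a-time lowercasing (roughly the effect of applying tolower() in the default C locale to UTF-8 bytes) with Unicode-aware case folding, and uses a regular expression in place of a character table to define a 'word'.

```python
import re


def bytewise_lower(data: bytes) -> bytes:
    # bytes.lower() converts only ASCII A-Z, so the bytes of a multibyte
    # UTF-8 sequence pass through unchanged (an assumption-level stand-in
    # for a byte-oriented tolower() loop).
    return data.lower()


def unicode_fold(text: str) -> str:
    # Unicode-aware, caseless comparison form of the whole string.
    return text.casefold()


# In Python 3, \w on a str pattern matches Unicode word characters,
# so no 256-entry character table is needed.
WORD_RE = re.compile(r"\w+")


def tokenize(text: str) -> list:
    # Fold case first, then split into words with the regular expression.
    return WORD_RE.findall(unicode_fold(text))


def prefix_match(tokens, pattern):
    # Minimal trailing-* wildcard: compare folded code points, not bytes.
    prefix = unicode_fold(pattern.rstrip("*"))
    return [t for t in tokens if t.startswith(prefix)]


if __name__ == "__main__":
    word = "\u00c9COLE"  # "ECOLE" with an accented capital E
    print(bytewise_lower(word.encode("utf-8")).decode("utf-8"))
    print(unicode_fold(word))
    print(tokenize("\u00c9cole, Stra\u00dfe!"))
```

The byte-wise version leaves the accented capital untouched, so a search for the lowercase form would miss it; the folded version does not have that problem, but it also changes some words' length (for example the German sharp s), which is the kind of usability trade-off the list above asks about.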
- Configuration format
- Since Swish-e depends on a configuration file for StopWords, Character definitions, etc., the parsing of the configuration file must support UTF-8 as well. The current idea is to switch to XML-style configuration files and use Libxml2 to parse them.
- Incremental indexing
- Swish3 will support true incremental indexing. This will allow for document records to be modified, added and deleted in an existing index. This feature may or may not build on the version 2.x experimental btree/incremental feature.
- Swish3 will reliably scale to larger (multimillion) document collections.
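As a rough illustration of what modifying, adding and deleting document records in an existing index involves, here is a minimal in-memory sketch in Python. It is not Swish3's design or index format; the class `IncrementalIndex` and its methods are hypothetical, and a real implementation would also need on-disk storage, word positions and ranking data.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that can be changed without a full rebuild."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of document ids
        self.doc_words = {}               # document id -> set of its words

    def add(self, doc_id, text):
        # Re-adding an existing document acts as a modification:
        # drop the old postings first, then index the new text.
        if doc_id in self.doc_words:
            self.delete(doc_id)
        words = set(text.lower().split())
        self.doc_words[doc_id] = words
        for word in words:
            self.postings[word].add(doc_id)

    def delete(self, doc_id):
        # The per-document word list tells us which postings to touch,
        # so the rest of the index is left alone.
        for word in self.doc_words.pop(doc_id, set()):
            ids = self.postings[word]
            ids.discard(doc_id)
            if not ids:
                del self.postings[word]

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


if __name__ == "__main__":
    index = IncrementalIndex()
    index.add("a.html", "Swish indexes documents")
    index.add("b.html", "Swish searches")
    index.add("a.html", "updated text")  # modify a.html in place
    print(index.search("swish"))
```

Keeping a forward map from each document to its words is one simple way to make deletes cheap; it trades extra storage for not having to scan every posting list.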
- Indexing API
- Swish3 will include an indexing API in addition to the current searching API.
- Streamlined feature set
- Swish3 will not contain several features in the current version:
- Expat parsers
- -S http indexing method and related configuration options
- Older stemmers
- Current native index format
- Alternate index backends
- Swish3 will offer alternate index backends using available open source libraries, such as Xapian, HyperEstraier, Lucene, or Lemur.
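One way to picture alternate index backends is a small common interface with interchangeable storage behind it. The Python sketch below is only an assumption about the general shape of such a design: `IndexBackend`, `DictBackend` and `SqliteBackend` are hypothetical names, SQLite is used purely as a stand-in storage engine, and no Xapian, HyperEstraier, Lucene or Lemur API is shown.

```python
import sqlite3
from abc import ABC, abstractmethod


class IndexBackend(ABC):
    """Hypothetical interface that every backend would implement."""

    @abstractmethod
    def add(self, doc_id, words):
        """Record that doc_id contains each of the given words."""

    @abstractmethod
    def search(self, word):
        """Return the sorted list of document ids containing word."""


class DictBackend(IndexBackend):
    # Postings kept in a plain in-memory dictionary.
    def __init__(self):
        self.postings = {}

    def add(self, doc_id, words):
        for word in words:
            self.postings.setdefault(word, set()).add(doc_id)

    def search(self, word):
        return sorted(self.postings.get(word, set()))


class SqliteBackend(IndexBackend):
    # Same interface, with postings stored in an SQLite table.
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS postings "
            "(word TEXT, doc_id TEXT, PRIMARY KEY (word, doc_id))"
        )

    def add(self, doc_id, words):
        self.db.executemany(
            "INSERT OR IGNORE INTO postings VALUES (?, ?)",
            [(word, doc_id) for word in words],
        )
        self.db.commit()

    def search(self, word):
        rows = self.db.execute(
            "SELECT doc_id FROM postings WHERE word = ? ORDER BY doc_id",
            (word,),
        )
        return [row[0] for row in rows]


if __name__ == "__main__":
    for backend in (DictBackend(), SqliteBackend()):
        backend.add("a.html", ["swish", "index"])
        backend.add("b.html", ["swish"])
        print(type(backend).__name__, backend.search("swish"))
```

Code that indexes or searches only talks to `IndexBackend`, so swapping the storage library would not change the calling code.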
You can't tell the players without a program. And we wouldn't have a program without all these players! All these folks have made key contributions to Swish-e. If you are not listed here, and you should be, drop us a line.
On the Field
- Bill Moseley
- The person leading the charge. Rewrote much of the documentation and bundled it with the distribution (you now know who to complain to), added the "prog" document source feature, added Expat and libxml2 parsers, redesigned properties, and added many new and exciting features.
- Jose Manuel Ruiz
- Jose added phrase searching and has made huge contributions toward speed and memory usage improvements. He added result sorting, improved metanames and properties, merging, and searching. Swish is the powerful program it is today because of Jose. And there's more coming!
- David Norris
- David has provided ports to all flavors of Windows, as well as a Swish-e interface script written in PHP3. The Windows version is now bundled with a self-installer, making installation just a click away.
- Peter Karman
- Peter added improvements to the ranking code and a new website design. His main role is creating more work for Bill.
- Roy Tennant
- Roy was the one who originally rescued SWISH when Kevin Hughes, the original author, was no longer supporting it. He has remained active in the effort since the beginning, but can't code in C to save his life, and therefore must remain content with web site support and other such minor tasks.
Hall of Fame
- Bill Meier
- Bill improved the ranking code, and provided much help in memory optimizations and indexing speed.
- Rainer Scherg
- Rainer has worked on Swish-e for many years. Rainer added Swish-e's filters, providing ways to index many document types. Rainer also added the powerful "-x" feature to easily control Swish-e's output.
- Giulia Hill
- Giulia was the first programmer to tackle upgrading SWISH to Swish-e, back when it was a project of the UC Berkeley Library. Without her, we would not have gotten out of the starting gate.
- Ron Klatchko
- Ron added the crawling capability to Swish-e, subsequently enhanced by others.
- Kirk Hastings
- Kirk programmed a neat Perl-based tool called "AutoSwish" that allowed anyone to easily set up and maintain indexes from a web page. Unfortunately, this program is no longer a part of the release due to security issues.
- Bas Meijer
- Bas has been an active member of the Swish-e team since 1999, providing code enhancements and user support. He converted Swish-e's build process to the GNU Auto Configure script and ported Swish-e to a number of platforms. Bas has also provided add-on scripts to the Swish-e user community.
- Marc Gaulin
- Marc added code to support the document properties and stemming features, among other things.
- Warren Jones
- Prentiss Riddle, Rice University
- The source of a number of SWISH bug fixes that were implemented in the first Swish-e release.
- Mark Seiden
We owe a debt of gratitude to Kevin Hughes, without whom there would be no SWISH, and definitely no Swish-e. His dedication to building useful tools and making them widely available should be an inspiration to us all.