RE: new version of swish-e-1.3.2-PHRASE (m)

From: <Rainer.Scherg(at)>
Date: Fri Jun 09 2000 - 08:51:27 GMT
Hi Jose!

If you have added the filter patch to the updated version,
I would remove "my version" of swish-e from the net
(it would be very confusing to have many similar versions
on the net).

But one Q:

  Why not make a new subversion of
  swish-e? There are so many fixes & enhancements
  in swish-e that there could be at least 
  a 1.3.3 version (to get rid of the

hasta proxima - Rainer

-----Original Message-----
From: Jose Manuel Ruiz []
Sent: Thursday, June 08, 2000 8:52 PM
To: Multiple recipients of list
Subject: [SWISH-E] new version of swish-e-1.3.2-PHRASE (m)

Hi all,

Sorry for the delay, here is my last try...
Download it from

By popular demand, here are all the features added since the first version.

First, the good news:

New general features:
- Faster indexing and retrieval of documents (wildcard
search outperforms the old one). A hash approach has been added to speed up
searches. This reduces disk I/O. Now you can search for things like
 "a* or b* or c* or d* or e* ..." without the penalty of reading the
linked list for each word of the expanded list.

- Better use of memory. Lots of calls to free memory have been added.

- Phrase search. Example:
swish-e -w "John Smith" -f index.file
(Use " to delimit the phrase.)
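
For readers who want to see why phrase search needs word positions, here is a minimal sketch in Python. It is not the swish-e implementation; the function names build_positional_index and phrase_search, and the whitespace tokenizer, are assumptions made for illustration only.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions]} (hypothetical layout)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # Assumption: words are split on whitespace and lowercased.
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents where the words appear consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return []
    # Only documents containing every word can match the phrase.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    hits = []
    for doc_id in sorted(candidates):
        following = [set(index[w][doc_id]) for w in words[1:]]
        for start in index[words[0]][doc_id]:
            # Word i of the phrase must sit at position start + i.
            if all(start + i + 1 in s for i, s in enumerate(following)):
                hits.append(doc_id)
                break
    return hits


docs = {
    "a.html": "John Smith wrote this",
    "b.html": "Smith and John disagree",
}
print(phrase_search(build_positional_index(docs), "John Smith"))
```

Both documents contain both words, but only a.html has them next to each other, which is the reason the index has to remember positions.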

- XML MetaNames style. Example: <metaname1>SomeText</metaname1>
Nested XML Metanames are allowed:

- Other options like filtering and some patches from different
people have been added. (See previous messages).

- Better compression of numbers.

- Portable index file.

New features in config file:
- New directive TranslateCharacters to translate some characters in
the words. It takes two strings: The original characters and 
the translated characters.

TranslateCharacters - aa/

This makes word "rea" indexed as "area" and "9-1" as "9/1"
Remember that all the chars in these strings must also be in
This option is useful for non-English languages.
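
As a rough illustration of the two-string idea, the sketch below pairs the characters of the two strings by position, which is an assumption consistent with the "9-1" to "9/1" example above. It is written in Python and is not swish-e code; make_translator is a hypothetical helper name.

```python
def make_translator(original_chars, translated_chars):
    """Build a character map; character i of the first string becomes
    character i of the second (assumed positional pairing)."""
    if len(original_chars) != len(translated_chars):
        raise ValueError("both strings must have the same length")
    return str.maketrans(original_chars, translated_chars)


# Translate "-" into "/" before the word is indexed.
table = make_translator("-", "/")
print("9-1".translate(table))
```

The same table would be applied to every word at index time, and to the query words at search time so that both sides agree.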

- Special word in MetaNames. If you specify automatic in the
MetaNames directive, the indexer will try to extract all the MetaNames
dynamically. This option only works with these types of MetaNames:



<!-- META START NAME="keyName" --> someContent <!-- META END -->

(Nested MetaNames are allowed!!)

Sorry, it does not support:
<META NAME="keyName" CONTENT="someContent">
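
To make the comment-style MetaNames concrete, here is a small Python sketch that collects the names found in META START / META END comments, including nested ones. It is only an illustration, not how swish-e parses documents. The function name extract_metanames is hypothetical, and crediting text to every enclosing metaname is an assumption about how nesting behaves.

```python
import re
from collections import defaultdict

# Matches either a start comment (capturing the name) or an end comment.
TOKEN = re.compile(
    r'<!--\s*META\s+START\s+NAME="([^"]+)"\s*-->|<!--\s*META\s+END\s*-->'
)


def extract_metanames(text):
    """Return {metaname: [text chunks]} discovered in the document."""
    found = defaultdict(list)
    stack = []  # currently open metanames, innermost last
    last = 0
    for m in TOKEN.finditer(text):
        chunk = text[last:m.start()].strip()
        if chunk:
            # Assumption: content counts for every open metaname.
            for name in stack:
                found[name].append(chunk)
        if m.group(1) is not None:
            stack.append(m.group(1))  # META START
        elif stack:
            stack.pop()               # META END closes the innermost name
        last = m.end()
    return dict(found)


sample = '<!-- META START NAME="keyName" --> someContent <!-- META END -->'
print(extract_metanames(sample))
```

A <META NAME="keyName" CONTENT="someContent"> tag does not match the pattern, which mirrors the limitation mentioned above.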

New search options:
- Option -s to sort results by one or more document properties 
(those specified in PropertyNames in config file). 
(always descending)

swish-e -w test -f index.file -s cod aut

This will sort results by properties cod aut.
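
The effect of sorting on several properties can be sketched like this in Python. It is an illustration rather than the swish-e code: the result layout (a dict with a "properties" entry) and the name sort_results are assumptions, and properties are compared as plain strings.

```python
def sort_results(results, property_names):
    """Sort by the given properties in order, always descending."""
    def key(result):
        # Missing properties sort as empty strings (an assumption).
        return tuple(result["properties"].get(p, "") for p in property_names)
    return sorted(results, key=key, reverse=True)


results = [
    {"file": "a.html", "properties": {"cod": "1", "aut": "z"}},
    {"file": "b.html", "properties": {"cod": "2", "aut": "a"}},
    {"file": "c.html", "properties": {"cod": "2", "aut": "m"}},
]
for r in sort_results(results, ["cod", "aut"]):
    print(r["file"])
```

Here cod is the primary key and aut only breaks ties between results with the same cod.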

- Option -b to display results from the number specified up to the
number specified in -m.

swish-e -b 10 -m 5 -w test -f index.file 

This will show 5 results starting at the 10th position.
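
In other words, -b and -m together select a window of the result list. A minimal Python sketch, assuming that -b counts from 1 (as "10th position" suggests); result_window is a hypothetical name, not part of swish-e:

```python
def result_window(results, begin, max_results):
    """Return up to max_results items starting at the 1-based position begin."""
    start = max(begin, 1) - 1  # convert to a 0-based index
    return results[start:start + max_results]


ranked = list(range(1, 21))          # stand-in for 20 ranked hits
print(result_window(ranked, 10, 5))  # hits 10 through 14
```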

New decompress option:
- Option -D shows more information

And now, the bad news:

- This version uses more memory than the old swish-e. Like swish-e-1.3.2,
it stores all the data (words, files, properties, metanames) in memory
during the index process. But now it also stores all the word positions
in memory during the index process (positions are required for phrase
- Be careful using the IgnoreLimit directive in the config file. With this
you can get "automatic" stopwords and remove them from the index
file. The problem is that this feature is executed at the end of the
process. So, if an automatic word is found, all the word positions must
be recomputed, increasing the index time (this is a pure memory-CPU process).
It is better to add these words in the IgnoreWords directive.

Thanks to all of you for your help: Bill Moseley,
Andrew Linn, SRE, Roy Tennant, David Norris and
many others I cannot remember right now (sorry).

To do:
I am open to any suggestion.

Have a nice day 

Jose Ruiz

Received on Fri Jun 9 04:55:48 2000