I have several problems and questions concerning swish-e:
- How do you index an email archive with swish-e? Each file in the
directory is a single email message, in (I think) mbox format (nnml
spool under gnus). Are there pre-configured filters or other tools to
get the subject and other mail headers as properties? How do I instruct
swish-e not to index embedded mail attachments?
- With -S fs turned on, swish-e does not seem to honor NoContents,
FileRules or FileMatch (a cygwin problem?). swish-e seems to
scan index.swish-e.temp and index.swish-e.prop.temp, or what does
the "Warning: Substitute possible embedded null character(s) in file
index.swish-e" (and index.swish-e.temp, index.swish-e.prop,
index.swish-e.prop.temp) mean? I have set "NoContents .swish-e .temp
.prop" in my config file.
- Even with option -e I often ran out of memory: "err: Ran out of
memory(could not allocate NNNN more bytes)!", even with IgnoreWords
instead of IgnoreLimit. Is swish-e not made for scanning some
2000 email messages in a directory (some 2'000'000 words)? I have a
reasonable PC with 128MB RAM and free disk space.
- In WordCharacters I define an extended international character set
(only accented letters). Would it help to solve the memory problems if
I reduced this set of characters? I am not sure what exactly
happens with this extended character set combined with
TranslateCharacters set to :ascii7:? Is configuring both
- I tried different (development) versions of swish-e. On some
versions I also get a COALESCE_BUFFER_MAX_SIZE error, but
increasing the value in config.h (do not change this!) does not
help. Any idea?
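To make the first question concrete, here is a rough Python sketch of the
kind of preprocessing I have in mind: pull a few headers out of one message
and keep only the plain-text parts, dropping attachments. It is only an
illustration using Python's standard email module, not a swish-e feature;
the function name headers_and_text and the choice of headers are my own
assumptions, and how best to hand the result to swish-e is exactly what I
am asking.

```python
import email
from email import policy


def headers_and_text(raw_bytes):
    """Return (headers, body_text) for one raw message, skipping attachments.

    Hypothetical helper for illustration only; not part of swish-e.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)

    # Headers that could become properties in the index (assumed selection).
    headers = {name: str(msg.get(name, "")) for name in ("Subject", "From", "Date")}

    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container part, holds no text itself
        if part.get_content_disposition() == "attachment":
            continue  # skip embedded attachments
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return headers, "\n".join(parts)
```

Each file in the spool directory would be read in binary mode and passed
to this function, one message at a time.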
In short: How do I configure a "small" swish-e to index my (huge?) mail
message archive with plenty of accented characters in the mail bodies?
Any help would be greatly appreciated!
Received on Fri Nov 23 09:54:44 2001