One more comment. Seems you are using a config file from an old version
of swish-e. It's a common thing to do, it seems.
There used to be a config file with most, if not all, of the
options defined -- and some incorrectly. For example, that default
config contains both IndexOnly and NoContents -- but the NoContents
lists files that will never be touched because of the IndexOnly.
My suggestion (and that's all it is) is to have a config file with just
a few things that need to be changed from the default.
So your 160 line config file below could be reduced to:

    MetaNames dc.description dc.title dc.creator
    PropertyNames dc.description dc.title dc.creator
    ReplaceRules remove /home/hul/htdocs
    IndexOnly .pdf .html

    # Filter PDF
    # See http://swish-e.org/current/docs/INSTALL.html#Filtering_Overview
    # for a possibly faster, and better supported way
    FileFilter .pdf /usr/local/apache/swish/filter-bin/_pdf2html.pl

    # Skip these (pathname may match a file -- do you mean dirname?)
    FileRules pathname contains BudgRep

I just think that's easier to manage. But, again, that's just my
opinion.
Then use an indexing script that does something like:

    echo "Indexing Aleph staff documentation"
    swish-e -c /path/to/config \
        -i /home/hul/htdocs/ois/systems/aleph/docs/test/ \
        -f /usr/local/apache/swish-indexes/metadata3.index \

And only generates output when there's a problem. You can put the
paths in the config file if you like (and can override them with command
line options).
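
One way to get the "only generates output when there's a problem"
behaviour from the script itself is a small wrapper along these lines.
This is only a sketch: run_quiet is a made-up helper name, and it assumes
that the command signals a problem through a non-zero exit status, so
anything the command merely prints as a warning while still exiting 0
would be hidden.

```shell
#!/bin/sh
# run_quiet (hypothetical helper): run a command, capture everything it
# prints, and show that output only if the command exits non-zero.
run_quiet() {
    status=0
    output=$("$@" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        # Something went wrong: pass the captured output along on stderr.
        printf '%s\n' "$output" >&2
    fi
    return "$status"
}

# Example call, left commented out so the snippet runs without swish-e.
# The paths are the ones from the message; /path/to/config is a placeholder.
# run_quiet swish-e -c /path/to/config \
#     -i /home/hul/htdocs/ois/systems/aleph/docs/test/ \
#     -f /usr/local/apache/swish-indexes/metadata3.index
```

Run from cron, a wrapper like this stays silent on a clean run, so mail
only arrives when the command fails.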
Received on Fri Jan 16 19:25:56 2004