I updated some of the code in SWISH::Filter to work a bit differently.
This affects how individual filter modules interface with
SWISH::Filter.
So, if you have any custom filters created for use with SWISH::Filter
you will need to update your filters for the next version of Swish.
Also, if you do have a custom filter then perhaps it could be included in
the distribution.
I've also been updating spider.pl -- not as much as I'd like -- but I
added a few features.
For example, you can now test a content-type with SWISH::Filter to see
if it *would* try to filter it. So, spider.pl now does HEAD requests
to fetch the content type before actually fetching the document.
Before, spider.pl would always use a GET request. The GET request
could be aborted before downloading all the content, but that breaks
the existing keep-alive connection.
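To make the HEAD-before-GET idea concrete, here is a minimal sketch in Python (not spider.pl's own Perl, and not its actual code). The names WANTED_TYPES, media_type, would_index and fetch_if_wanted are made up for illustration, and the set of types is only an assumption standing in for "what the filter would accept". Error handling and keep-alive are left out.

```python
import urllib.request

# Hypothetical list of content types we are willing to index or filter.
WANTED_TYPES = {"text/html", "text/plain", "application/pdf", "application/msword"}

def media_type(content_type_header):
    """Drop parameters such as '; charset=utf-8' and lower-case the type."""
    return content_type_header.split(";", 1)[0].strip().lower()

def would_index(content_type_header, wanted=WANTED_TYPES):
    """Return True if the document's type is one we want to fetch."""
    return media_type(content_type_header) in wanted

def fetch_if_wanted(url, opener=urllib.request.urlopen):
    """Ask for the headers first; download the body only for wanted types."""
    head = urllib.request.Request(url, method="HEAD")
    with opener(head) as resp:
        ctype = resp.headers.get("Content-Type", "")
    if not would_index(ctype):
        return None  # skipped without ever starting a download
    with opener(urllib.request.Request(url, method="GET")) as resp:
        return resp.read()
```

The point of the design is that an unwanted document costs one small HEAD exchange instead of a GET that has to be cut off partway through.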
Another feature that should help new users is the ability to modify
the default configuration of the spider instead of just being able to
override it.
Before:
    spider.pl default <URL>
would automatically use SWISH::Filter for converting PDFs and MS Word
docs. But if you used your own config:
    spider.pl spider.config
then you had to arrange to use SWISH::Filter in spider.config if you
wanted to filter. That was a bit confusing, so you can now merge spider.config
with the "default" config if you only want to change a few things from
the default config.
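The difference between overriding and merging can be sketched like this. It is only an illustration in Python, not the spider's Perl config format; the key names (use_filter, delay_sec, max_files) and the function names are hypothetical stand-ins.

```python
# Stand-in for the spider's built-in "default" settings (hypothetical keys).
DEFAULT_CONFIG = {
    "use_filter": True,
    "delay_sec": 5,
    "max_files": 0,
}

def override(user_config):
    """Old behaviour: the user's config replaces the defaults entirely."""
    return dict(user_config)

def merge_with_default(user_config, default=DEFAULT_CONFIG):
    """New behaviour: start from the defaults, then apply the user's changes."""
    merged = dict(default)
    merged.update(user_config)
    return merged

user = {"delay_sec": 1}
print(override(user))            # filtering setting is gone
print(merge_with_default(user))  # filtering setting is kept, delay is changed
```

With the override, a config that only changes the delay silently loses the filtering setting; with the merge it keeps it.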
You can look at the dev docs if you are curious about the changes.
http://swish-e.org/dev/docs/
--
Bill Moseley
moseley@hank.org
Received on Tue Oct 5 11:59:51 2004