I updated some of the code in SWISH::Filter to work a bit differently.
This affects how individual filter modules interface with
So, if you have any custom filters created for use with SWISH::Filter
you will need to update your filters for the next version of Swish.
Also, if you do have a custom filter then perhaps it could be included in
I've also been updating spider.pl -- not as much as I'd like -- but I
added a few features.
For example, you can now test a content-type with SWISH::Filter to see
if it *would* try to filter it. So, spider.pl now does HEAD requests
to fetch the content type before actually fetching the document.
Before, spider.pl would always use a GET request. The GET request
could be aborted before downloading all the content, but that breaks
the existing keep-alive connection.
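To make the HEAD-before-GET idea concrete, here is a rough sketch in Python. It is not spider.pl's code (spider.pl is Perl), and everything named here is an assumption for illustration: `can_filter` is a hypothetical stand-in for asking SWISH::Filter whether it would filter a type, `FILTERABLE_TYPES` and the `indexable` defaults are made up, and `head`/`get` are injected callables rather than any real HTTP library.

```python
# Illustrative sketch only; names and type lists are assumptions, not spider.pl internals.
FILTERABLE_TYPES = {"application/pdf", "application/msword"}


def base_type(content_type):
    """Drop parameters such as '; charset=utf-8' and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def can_filter(content_type):
    """Hypothetical stand-in for asking the filter layer about a content type."""
    return base_type(content_type) in FILTERABLE_TYPES


def fetch_document(url, head, get, indexable=("text/html", "text/plain")):
    """Ask for the content type with HEAD, then GET only if the document is usable.

    head(url) must return the Content-Type string; get(url) must return the body.
    """
    content_type = head(url)
    if base_type(content_type) in indexable or can_filter(content_type):
        return get(url)
    # Skipped: no GET was started, so there is nothing to abort mid-transfer.
    return None
```

The point of the design is that an unwanted document costs one small HEAD round trip rather than a GET that has to be cut off, so the keep-alive connection stays usable for the next request.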
Another feature that will help new users is the ability to modify
the default configuration of the spider instead of just being able to
spider.pl default <URL>
would automatically use SWISH::Filter for converting PDFs and MS Word
docs. But if you use your own config:
then you had to arrange to use SWISH::Filter in spider.config if you
wanted to filter. A bit confusing, so you can now merge spider.config
with the "default" config if you only want to change a few things from
the default config.
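The merge behaves roughly like overlaying a few overrides on a copy of the defaults. The Python sketch below only illustrates that idea; the real spider.config is Perl, and the keys shown (`use_filter`, `delay_sec`, `max_files`) are hypothetical placeholders, not documented spider.pl settings.

```python
# Illustrative sketch only; the keys and values below are hypothetical.
DEFAULT_CONFIG = {"use_filter": True, "delay_sec": 5, "max_files": 0}


def merge_with_default(overrides, default=DEFAULT_CONFIG):
    """Return a new config: the defaults, with any overridden keys replaced."""
    merged = dict(default)    # copy, so the defaults themselves are never modified
    merged.update(overrides)  # settings from the user's config win
    return merged


# Change one setting; filtering stays enabled because it comes from the defaults.
config = merge_with_default({"delay_sec": 1})
```

Anything you do not mention keeps its default value, which is why filtering no longer has to be set up by hand in your own config.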
You can look at the dev docs if curious about the changes.
Received on Tue Oct 5 11:59:51 2004