Hello experts (and fellow lurkers too),
I've been using swish-e for almost a year now and am really happy with
it. This is my first post, as I only recently joined the mailing list.
I'm using the filesystem option only: swish traverses a number of Samba
shares and indexes the files therein. So far most users are pretty
pleased; they can search a hell of a lot of docs and generally find what
they're looking for via swish.cgi and a quick click.
I'm keen to offer a directory service too, so that they can traverse a
classification hierarchy à la Google Directory.
But I'm really, really lazy.
Has anyone done anything similar, either with a taxonomy they've
developed or, even better, one that gets generated automatically?
I'd be intrigued to see if any clever use of metadata might help. For
info, the documents indexed are almost exclusively MS Office and PDF
files.
I use a collection of filters and shell scripts to extract content and
properties/metadata (but do nothing with the properties/metadata at the
moment).
In short, I'm seeking your wisdom on:
- General Tips on using metadata
- Experience of generating a hierarchy or taxonomy -or- linking in
someone else's taxonomy
- Experience of building web-pages to traverse a classification
hierarchy or taxonomy
- Updating shed loads of office documents (~30,000) to clean their
property fields (eek!)
Thanks in advance,
Received on Fri Mar 7 12:20:30 2003