As I have to index a file server which contains OpenDocument, PDF,
.doc, .xls and other files like that, I decided to create a spider in
Perl.
It uses some embedded Perl libraries (such as CAM::PDF, File::Find and
some others), and some system calls (antiword, xls2csv...).
so I guess it may be of interest to other people ;).
I tried to put some documentation inside; it should be sufficient if
you want to modify it for your needs.
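
For readers who want a feel for the general shape of such a spider, here
is a minimal sketch of the idea: walk a directory tree and pick a text
extractor per file extension. It is not the script announced in this
post. It is written in Python rather than Perl, every function name in
it is hypothetical, and it assumes antiword and xls2csv are installed
and on the PATH. PDF handling (done with CAM::PDF in the announced
script) is left out, and OpenDocument files are read as zip archives
containing content.xml, with the XML tags stripped very roughly.

```python
import os
import re
import subprocess
import zipfile

# Extension -> external command (assumption: these tools are on PATH).
EXTERNAL_TOOLS = {
    ".doc": ["antiword"],
    ".xls": ["xls2csv"],
}

# OpenDocument extensions handled without any external tool.
ODF_EXTENSIONS = {".odt", ".ods", ".odp"}


def command_for(path):
    """Return the external command for this file, or None if no tool is mapped."""
    ext = os.path.splitext(path)[1].lower()
    tool = EXTERNAL_TOOLS.get(ext)
    return tool + [path] if tool else None


def odf_text(path):
    """Rough plain text of an OpenDocument file (a zip holding content.xml)."""
    with zipfile.ZipFile(path) as archive:
        xml = archive.read("content.xml").decode("utf-8")
    # Replace every tag with a space, then collapse runs of whitespace.
    return " ".join(re.sub(r"<[^>]+>", " ", xml).split())


def extract_text(path):
    """Return the text of one file, or None when it cannot be handled."""
    ext = os.path.splitext(path)[1].lower()
    if ext in ODF_EXTENSIONS:
        return odf_text(path)
    cmd = command_for(path)
    if cmd is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The external tool is not installed; skip this file.
        return None
    return result.stdout if result.returncode == 0 else None


def crawl(root):
    """Walk the tree under root and yield (path, text) for every handled file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            text = extract_text(path)
            if text is not None:
                yield path, text
```

Something like `for path, text in crawl("/srv/share"): ...` (the path is
only an example) would then hand each document's text to the indexer.
Keeping the extension table separate from the walking code means that
supporting a new file type only needs one more entry.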
My Perl level is not very high, so if you see mistakes or possible
improvements, please feel free to point them out to me.
PS: sorry for my English, it's not my mother tongue...
Jeanneret Internux | 078 748 03 02
Av. des Alpes 123 | 021 550 02 09
1814 La Tour-de-Peilz | skype: phoenix818
cjeanneret(at)not-real.internux.ch | http://www.internux.ch
Users mailing list
Received on Sat Mar 29 15:18:44 2008