Is there a way to get Swish-e to find and index the document properties
in Excel spreadsheets?
We are putting titles and authors in the document properties, but they
are inaccessible.
I am using the XLtoHTML filter, which doesn't appear to have a way of
doing any of these things.
TITLES:
Because we are spidering with -S prog, all our titles in Swish-e look
like
Library Chart - /tmp/sG36QWw6iW v.1536
which I gather is from this line in XLtoHTML
$ExcelFirstWorksheetName - $ExcelFilename v.$ExcelVersion
and since the filter is reading a temporary copy of the file, it loses
the actual path and filename.
I would really like to get the title out of the document properties; it
differs quite a bit from the filename.
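
A minimal sketch of the title behavior described above, written in Python
purely for illustration (XLtoHTML itself is Perl). The function name
build_title and its parameters are hypothetical; it assumes the caller
already knows the document-properties title (if any) and the real path of
the spreadsheet rather than the temporary copy.

```python
import os


def build_title(doc_title, worksheet_name, real_path, excel_version):
    """Prefer the document-properties title; otherwise fall back to the
    'worksheet - filename v.version' pattern, using the real file name
    instead of the temporary copy's path."""
    title = (doc_title or "").strip()
    if title:
        return title
    # Fallback mirrors the pattern quoted from XLtoHTML, but with the
    # real base name (an assumption: the caller can supply it).
    return "%s - %s v.%s" % (
        worksheet_name,
        os.path.basename(real_path),
        excel_version,
    )
```

With a properties title present, that title is used as-is; without one, the
fallback at least shows the real file name rather than something like
/tmp/sG36QWw6iW.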
AUTHORS:
Author appears to be coming from the "Last Saved by" value in the
spreadsheet's document properties (statistics), not the Author in
Properties | Document Summary. These are clearly not going to be the same
people much of the time.
Some threads recommend using the Win32::OLE Perl module to grab some of
this data, but we are on Unix, and apparently that module only works
for Windows versions of Perl. So I have not tried to see if this would
work.
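
For comparison, a rough sketch of reading these properties on Unix without
Win32::OLE. It is in Python and assumes the third-party olefile package is
installed; this is not part of Swish-e or XLtoHTML and is untested against
our spreadsheets. The helper names to_text, pick_author and read_summary
are hypothetical, and the latin-1 decoding is a guess at the code page.

```python
def to_text(value):
    """Return a property value as stripped text; None becomes ''."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Assumption: latin-1 is close enough; the real code page may differ.
        value = value.decode("latin-1")
    return str(value).strip()


def pick_author(author, last_saved_by):
    """Prefer the Summary 'Author'; fall back to 'Last Saved by'."""
    return to_text(author) or to_text(last_saved_by)


def read_summary(path):
    """Read title and author from an .xls (OLE2) file's summary properties."""
    import olefile  # third-party, assumed installed (pip install olefile)

    ole = olefile.OleFileIO(path)
    try:
        meta = ole.get_metadata()
        return {
            "title": to_text(meta.title),
            "author": pick_author(meta.author, meta.last_saved_by),
        }
    finally:
        ole.close()
```

The point of pick_author is only to make the preference explicit: Author
first, "Last Saved by" as a fallback when Author is empty.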
pjk
Paul J. Kissman
Library Information Systems Specialist
Massachusetts Board of Library Commissioners
648 Beacon St.
Boston, MA 02215
paul.kissman@state.ma.us
mass.gov/mblc or mblc.state.ma.us
617-267-9400 * 800-952-7403 (in-state)
Fax: 617-421-9833
Received on Thu Dec 16 07:09:56 2004