On Mon, Dec 08, 2003 at 12:56:10PM -0800, Matt Torbin wrote:
> Hey all,
>
> I'd like to be able to index the attribute values in a PDF document so
> that instead of the title of the document coming up as "whatever.pdf"
> it would come up as "This is my Document" (since that is what I have
> filled in as my document attributes). Is this possible? Has anyone
> done this? Can anyone guide me in the right direction?

Is this a different question from the one on December 4th?

http://swish-e.org/archive/6277.html

Here are the docs for that filter:

NAME
SWISH::Filters::Pdf2HTML - Perl extension for filtering PDF documents
with Swish-e
DESCRIPTION
This is a plug-in module that uses the xpdf package to convert PDF
documents to html for indexing by Swish-e. Any info tags found in the
PDF document are created as meta tags.
This filter plug-in requires the xpdf package available at:
http://www.foolabs.com/xpdf/
You may pass into SWISH::Filter's new method a tag to use as the html
<title> if found in the PDF info tags:
my %user_data;
$user_data{pdf}{title_tag} = 'title';
$was_filtered = $filter->filter(
    document  => $filename,
    user_data => \%user_data,
);
Then if a PDF info tag of "title" is found that will be used as the
HTML <title>.
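
To make the mapping above concrete, here is a rough, self-contained sketch of the idea: info tags become meta tags, and one chosen tag doubles as the <title>. This is not the module's actual code. The names info_to_head and escape_html and the sample info values are made up for illustration, and the real filter gets its info tags from xpdf rather than from a hard-coded hash.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of SWISH::Filters::Pdf2HTML): build an HTML
# <head> from a hash of PDF info tags, using one tag as the <title>.
sub info_to_head {
    my ( $info, $title_tag ) = @_;

    my @lines = ('<head>');

    # Use the chosen info tag as the <title> only if the PDF actually has it.
    if ( defined $title_tag && defined $info->{$title_tag} ) {
        push @lines,
            '<title>' . escape_html( $info->{$title_tag} ) . '</title>';
    }

    # Every info tag becomes a <meta> tag; keys are sorted for stable output.
    for my $name ( sort keys %$info ) {
        push @lines, sprintf '<meta name="%s" content="%s">',
            escape_html($name), escape_html( $info->{$name} );
    }

    push @lines, '</head>';
    return join "\n", @lines;
}

# Minimal HTML escaping so tag values cannot break the markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Assumed sample data standing in for the info tags of a real PDF.
my %info = ( title => 'This is my Document', author => 'Example Author' );
print info_to_head( \%info, 'title' ), "\n";
```

If the PDF has no info tag matching the requested name, the sketch simply emits no <title>, which would leave the indexer to fall back to whatever it normally shows (such as the file name).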
--
Bill Moseley
moseley@hank.org
Received on Mon Dec 8 21:18:44 2003