Bill Moseley wrote:
> On Mon, Nov 21, 2005 at 08:02:32AM -0800, J. David Boyd wrote:
>
>>Is it possible to hook into the index generation code of swish-e, and
>>insert my own translation code, such that when the indexer sees
>>S2171_TABLE.pdf, I can look in my translation table, and stuff the value
>>'Module 72 Tables' to be displayed in the hyperlink of the search
>>result? Of course, the hyperlink still has to point to the original file.
>
>
> Easily. It should already do that, but maybe you don't have titles in
> your pdf docs.
>
> Anyway, in SWISH::Filters::Pdf2HTML (assuming that's what you are
> using), just set the title:
>
> $title ||= lookup_title( $file_name );
>
I've been looking through SWISH::Filters::Pdf2HTML, and I realize more
and more that I'm no Perl expert.

I don't see any code that looks like what you have there. I see code in
sub filter() that sets a title. Do I monkey around in there? That
looks, to me, like a good way to break something.
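
For reference, here is a minimal sketch of the kind of lookup Bill's one-liner seems to assume. Nothing in it comes from SWISH::Filter itself: lookup_title() and the %title_for table are hypothetical names, and the only real entry is the file/title pair from my original question.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical translation table: file name => display title.
# Only this one entry comes from the thread; extend as needed.
my %title_for = (
    'S2171_TABLE.pdf' => 'Module 72 Tables',
);

# Hypothetical helper (not part of SWISH::Filter): return the display
# title for a file, or undef when the table has no entry for it.
sub lookup_title {
    my ($file_name) = @_;
    return unless defined $file_name;
    return $title_for{ basename($file_name) };
}

# Mirrors the quoted one-liner: keep a title that is already set,
# otherwise fall back to the translation table.
my $title;
$title ||= lookup_title('/docs/S2171_TABLE.pdf');
print "$title\n";
```

Because of the ||=, a title already found in the PDF would win and the table would only fill in the gaps.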
Now, as an alternative, I find that I can actually set a title in my PDF
file, using pdftk. It's kind of convoluted, but it works okay.

I see that Pdf2HTML mentions that it can store the title, but it doesn't
work by default. (By which I mean that I have manually set some titles
in my PDF files, run the indexer, and performed a search, and the result
shows the file name as the hyperlink rather than the PDF file's internal
title.)
---------------------------------
You may pass into SWISH::Filter's new method a tag to use as the html
<title> if found in the PDF info tags:

    my %user_data;
    $user_data{pdf}{title_tag} = 'title';
    $was_filtered = $filter->filter(
        document  => $filename,
        user_data => \%user_data,
    );

Then if a PDF info tag of "title" is found that will be used as the HTML
<title>.
---------------------------------
Does this mean that if I copy the actual code (skipping comments, of
course) from the above quoted section, and place it into the sub new()
function, I will be adding the ability to read the titles? If
so, where do I put it? Before the return statement, obviously (even to
me), but does it go inside of bless(), before it, or after it?
Am I even close here?
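
To make the question concrete: as far as I can tell, the quoted snippet is written from the point of view of the program that calls $filter->filter(), not of sub new() inside the module. The sketch below shows only how I imagine the hint travels. The %user_data structure is taken from the quoted documentation; choose_title() is a hypothetical stand-in I wrote for illustration and is not the actual Pdf2HTML code.

```perl
use strict;
use warnings;

# Same structure as in the quoted documentation: the caller names the
# PDF info tag it wants used as the HTML <title>.
my %user_data;
$user_data{pdf}{title_tag} = 'title';

# Hypothetical stand-in for what a filter might do with that hint.
# $info is assumed to be a hash of PDF info tags.
sub choose_title {
    my ($user_data, $info, $file_name) = @_;
    my $tag = ref $user_data eq 'HASH' ? $user_data->{pdf}{title_tag} : undef;
    return $info->{$tag} if defined $tag && defined $info->{$tag};
    return $file_name;    # no usable tag: fall back to the file name
}

# With the hint, the PDF's own title is chosen.
print choose_title(\%user_data, { title => 'Module 72 Tables' }, 'S2171_TABLE.pdf'), "\n";
# Without it, the file name is used, which matches what I am seeing now.
print choose_title(undef, { title => 'Module 72 Tables' }, 'S2171_TABLE.pdf'), "\n";
```

If that reading is right, nothing in the module would need editing; the hash would be built in whatever script drives the indexing.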
Received on Tue Nov 22 07:00:37 2005