Re: [swish-e] How swish-e returns PDF's meta description

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Thu Oct 15 2009 - 04:46:19 GMT
Daqi Li wrote on 10/14/09 2:54 PM:
> Hi,
> 
> I have swish-e-2.4.7 on Linux Fedora core 8 (see below uname -a).
> 
> I have PDF documents that have summaries in their meta description (or keywords). When I search and the keyword is found in a PDF's body or title, I need swish-e to return the file's size, last-modified date, etc., as well as the meta description (or keywords). Here is what I did:
>  
> 1. I copied your swish.cgi to /var/www/cgi-bin.
> 2. created .swishcgi.conf in /var/www/cgi-bin (as the attached).
> 3. Created swish.conf in /var/www/cgi-bin (as the attached).
> 4. Ran the command to index the files:
> 	swish-e -c swish.conf
> 5. Then browsed to the URL http://localhost/cgi-bin/swish.cgi.
> 
> Here is the search result I got:
> 1 09-71298_Spina_mem_op-signed.pdf -- rank: 1000 
> Title: 09-71298_Spina_mem_op-signed.pdf 
> Last Modified Date: 2009-10-14 12:44:14 EDT 
> Document Size: 127153 
> Description: (null) 
> Keywords: 

Try these changes in your swish.conf:

# don't know what my_pdf2html.pl looks like, but swish-filter-test
# does the trick
FileFilter .pdf /your/path/to/swish-filter-test '-headers -content %p'

# add the missing * after the parser type
StoreDescription HTML* <meta> 1000
StoreDescription TXT* 1000
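
Since you also want the keywords back in the results, here is a rough sketch of how that part of swish.conf could look. This is an assumption on my side, not something taken from your attached config: it assumes the PDF filter emits the PDF's keywords as an HTML meta tag named "keywords", and that you are not already declaring that name elsewhere in the file. If the filter uses a different tag name, substitute it.

```
# Assumption: the filtered PDF arrives as HTML containing
#   <meta name="keywords" content="...">
# The name "keywords" is illustrative; use whatever name the filter emits.

# Make the keywords searchable as a metaname
MetaNames keywords

# Store the keywords as a property so they can be returned with each result
PropertyNames keywords
```

Re-run the indexer (swish-e -c swish.conf) after changing the config; properties are only stored at index time.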

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Thu Oct 15 00:46:21 2009