Re: Indexing PDF files - reliable ?

From: David Larkin <david.larkin(at)not-real.djl.co.uk>
Date: Thu Dec 08 2005 - 22:43:46 GMT
On Thu, 8 Dec 2005 13:42:31 -0800 (PST)
Bill Moseley <moseley@hank.org> wrote:

> On Thu, Dec 08, 2005 at 01:05:59PM -0800, David Larkin wrote:
> > Is it due to PDF version number ?
> 
> Swish uses pdftotext.  Run that on the docs and see what comes out.
> 

79:{david}% grep the Samba-Developers-Guide.txt | wc -l
     206
80:{david}% grep the spm.txt  | wc -l
     437
81:{david}% grep the isj2001-final.txt | wc -l
       0
82:{david}%

The pdftotext output for isj2001-final.txt looks very strange: not a single line contains "the". I wonder if the original PDF came from a scanner or some such thing.
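
The same check can be scripted. This is only a rough sketch, assuming pdftotext is on the PATH; the function names are made up for illustration and are not part of Swish-e. It counts lines containing "the", like `grep the file | wc -l`, and treats a count of zero as a hint that the PDF may have no usable text layer (a scanned document, for example).

```python
import subprocess


def count_matching_lines(text, needle="the"):
    """Count lines that contain needle, like `grep needle file | wc -l`."""
    return sum(1 for line in text.splitlines() if needle in line)


def extract_text(pdf_path):
    """Run pdftotext on pdf_path; the '-' argument sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def probably_has_no_text_layer(text, needle="the"):
    """Heuristic only: English text with no 'the' at all is suspicious."""
    return count_matching_lines(text, needle) == 0


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        extracted = extract_text(path)
        flag = "SUSPECT" if probably_has_no_text_layer(extracted) else "ok"
        print(f"{path}: {count_matching_lines(extracted)} lines with 'the' [{flag}]")
```

The heuristic assumes English documents; for other languages a different common word would be needed.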


> -- 
> Bill Moseley
> moseley@hank.org
> 
Received on Thu Dec 8 14:44:01 2005