Re: PDF to HTML causing swish-e to crash

From: David L Norris <dave(at)not-real.webaugur.com>
Date: Fri Oct 11 2002 - 00:02:56 GMT
On Thu, 2002-10-10 at 18:47, David L Norris wrote:
> > Question: is anyone successfully indexing PDF documents on Linux with
> > swish-e-2.2.1 ?  If so, can you please post your swish-e configuration

> $ swish-e -V
> SWISH-E 2.3-dev-02

Oops, you did specify 2.2.1.  Let's try again.  Everything else is
the same; I just dropped in a freshly compiled 2.2.1.


$ ./swish-e -V
SWISH-E 2.2.1

$ ./swish-e -c _swish.conf -i Electronics/semiconductors/Common_Parts/ -v3

Parsing config file '_swish.conf'
Indexing Data Source: "File-System"
Indexing "Electronics/semiconductors/Common_Parts/"

Checking dir "Electronics/semiconductors/Common_Parts"...
  nte923.pdf - Using HTML2 parser -  (658 words)
  nte5470_76.pdf - Using HTML2 parser -  (468 words)
  nte5061a.pdf - Using HTML2 parser -  (1224 words)
  nte1690.pdf - Using HTML2 parser -  (1329 words)
  nte130.pdf - Using HTML2 parser -  (492 words)
  nte180.pdf - Using HTML2 parser -  (433 words)
  nte245.pdf - Using HTML2 parser -  (352 words)
  nte283.pdf - Using HTML2 parser -  (394 words)
  nte5562_66.pdf - Using HTML2 parser -  (284 words)

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 1070 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
1070 unique words indexed.
5 properties sorted.
9 files indexed.  236629 total bytes.  5634 total words.
Elapsed time: 00:00:01 CPU time: 00:00:00
Indexing done!
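
As a quick sanity check on the run above, the per-file word counts should
add up to the totals swish-e reports.  Here is a minimal Python sketch,
assuming the -v3 per-file lines keep the format shown above.  The
summarize() helper and the regular expression are illustrative only and
are not part of swish-e.

```python
import re

# Matches per-file lines from the -v3 output, for example:
#   nte923.pdf - Using HTML2 parser -  (658 words)
# The pattern is an assumption based only on the log lines in this post.
LINE_RE = re.compile(r"^\s*(\S+) - Using (\S+) parser -\s+\((\d+) words\)")


def summarize(log_text):
    """Return (file_count, total_words) for the per-file lines in log_text."""
    counts = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts.append(int(match.group(3)))
    return len(counts), sum(counts)


# Per-file lines copied from the indexing run above.
SAMPLE = """\
  nte923.pdf - Using HTML2 parser -  (658 words)
  nte5470_76.pdf - Using HTML2 parser -  (468 words)
  nte5061a.pdf - Using HTML2 parser -  (1224 words)
  nte1690.pdf - Using HTML2 parser -  (1329 words)
  nte130.pdf - Using HTML2 parser -  (492 words)
  nte180.pdf - Using HTML2 parser -  (433 words)
  nte245.pdf - Using HTML2 parser -  (352 words)
  nte283.pdf - Using HTML2 parser -  (394 words)
  nte5562_66.pdf - Using HTML2 parser -  (284 words)
"""

if __name__ == "__main__":
    files, words = summarize(SAMPLE)
    print(files, "files,", words, "words")  # prints: 9 files, 5634 words
```

The result agrees with the "9 files indexed ... 5634 total words" summary
line, so every PDF in the directory went through the parser and
contributed words to the index.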

-- 
 David Norris
  Dave's Web - http://www.webaugur.com/dave/
  Augury Net - http://home.webaugur.com/
  ICQ - 412039
Received on Fri Oct 11 00:06:35 2002