Re: [swish-e] Empty index file(s) error

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Mon Sep 17 2007 - 16:26:40 GMT
On 09/14/2007 04:15 PM, William M Conlon wrote:
> The indexing process is not completing, hence the temp files.
> 
> Take a look at the indexer output.
> 
> Bill
> 
> 
> On Sep 14, 2007, at 2:03 PM, Parker, Peter A CONTRACTOR WRAIR-Wash DC wrote:
> 
>> Greetings,
>> I have recently completed installation of Swish-e on an Apache server
>> machine with the following details:
>>
>> Swish-e version: 2.4.5
>> Apache version: 2.0.52
>>
>> I now have approximately 50 files in the directory indexed, including
>> Word, Excel and Powerpoint documents and PDFs. I have gone through the
>> steps outlined for indexing non-text files. Initially, when there were
>> only about 7 files in the html directory, the indexing worked fine and
>> command line searches worked flawlessly. Now, after adding more files to
>> the directory (about 50 files), the indexing is not working as it was.
>>

My guess is that one of the filter helper programs (pdftotext, catdoc, etc.) is
choking the indexer and not delivering all the content you expect. Encoding
problems are a common cause, though not the only one.

>> FileFilter .pdf share/doc/swish-e/examples/filter-bin/_pdf2html.pl

Try running that pdf2html script by itself on some docs.
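
Something along these lines would do that for a whole directory at once. It's
only a rough sketch: check_filter is a made-up name, and it assumes the filter
takes the document path as its last argument and writes the converted text to
stdout. Check how your filter is actually invoked before trusting the results.

```python
import subprocess
from pathlib import Path


def check_filter(filter_cmd, doc_dir, pattern="*.pdf", timeout=60):
    """Run filter_cmd on each matching file; return (name, status, bytes_out) tuples.

    Assumption: the filter accepts the file path as its final argument and
    prints the converted text to stdout.
    """
    results = []
    for path in sorted(Path(doc_dir).glob(pattern)):
        try:
            proc = subprocess.run(
                list(filter_cmd) + [str(path)],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # The filter hung on this document.
            results.append((path.name, "TIMEOUT", 0))
            continue
        except OSError:
            # The filter itself could not be started (bad path, not executable).
            results.append((path.name, "NOT RUN", 0))
            continue
        # A non-zero exit code or empty output both mean nothing got indexed.
        ok = proc.returncode == 0 and proc.stdout.strip()
        results.append((path.name, "ok" if ok else "FAILED", len(proc.stdout)))
    return results
```

Call it with the filter command as a list, e.g. ["perl", "/path/to/your/filter.pl"],
plus the directory you are indexing, and look at any file that is not "ok".
Those are the documents to try by hand.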

Also, I don't see any FileFilter lines for .doc, .ppt, etc. You might want to
try the DirTree.pl script instead, since it does all of its filtering through
SWISH::Filter rather than through FileFilter config options.

-- 
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

Received on Mon Sep 17 12:26:42 2007