Re: win2k unknown header problem

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Sep 25 2002 - 17:57:01 GMT
At 08:47 AM 09/25/02 -0700, Matt Kynaston wrote:
>Warning: Unknown header line: '<html>' from program prog-bin/spider.pl

Two things:

-S prog (extprog.c) opens the program in text mode, not binmode.  So on
Windows \r\n should be converted to \n before swish sees it.  You do not
want binmode in your external program.  This only makes a difference on
non-unix platforms where line endings are not \n.  I frankly do not know
how the C library does this translation -- that is, I'm not sure if \
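
As a rough illustration of what text-mode reading does to the data, here
is a minimal Python sketch. It uses Python's universal-newline reader as a
stand-in for the C runtime's text mode, which is an assumption: Python also
rewrites a lone \r, and the real runtime may behave differently. The helper
name text_mode_read and the sample record are made up for the example.

```python
import io


def text_mode_read(raw: bytes) -> str:
    """Read bytes through a text layer that turns \\r\\n into \\n."""
    # newline=None enables universal-newline translation on input.
    reader = io.TextIOWrapper(
        io.BytesIO(raw), encoding="latin-1", newline=None
    )
    return reader.read()


# What an external program might emit on Windows (hypothetical sample).
raw = b"Path-Name: test.html\r\nContent-Length: 6\r\n\r\n<html>"
seen = text_mode_read(raw)

print(repr(seen))           # no \r left in what the reader sees
print(len(raw), len(seen))  # the translated text is shorter
```

Each \r\n pair shrinks to a single character, so byte counts taken before
and after the translation differ by one per line ending.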

Now the other thing that can cause that warning is an incorrect
Content-Length header.  This might be the case if the first file is indexed,
but then you see the errors on the second file.
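
To show why a wrong length throws off everything after it, here is a small
Python sketch. It is a toy reader written for this example, not swish-e's
parser; make_record, read_records and the sample documents are
hypothetical. It only assumes the -S prog layout of header lines
(Path-Name, Content-Length), a blank line, then that many bytes of content.

```python
def make_record(path, body, declared_length=None):
    """Format one document: headers, a blank line, then the body."""
    length = len(body) if declared_length is None else declared_length
    headers = f"Path-Name: {path}\nContent-Length: {length}\n\n"
    return headers.encode("ascii") + body


def read_records(stream):
    """Toy reader: returns (documents, warnings)."""
    docs, warnings, pos = [], [], 0
    while pos < len(stream):
        headers = {}
        # Consume header lines up to the blank separator line.
        while pos < len(stream):
            end = stream.find(b"\n", pos)
            if end == -1:
                end = len(stream)
            line = stream[pos:end].decode("latin-1")
            pos = end + 1
            if line == "":
                break
            name, sep, value = line.partition(": ")
            if sep and name in ("Path-Name", "Content-Length"):
                headers[name] = value
            else:
                warnings.append(f"Unknown header line: '{line}'")
        # Trust the declared length, right or wrong.
        length = int(headers.get("Content-Length", 0))
        docs.append((headers.get("Path-Name"), stream[pos:pos + length]))
        pos += length
    return docs, warnings


second = make_record("b.html", b"<html>two</html>\n")

# Correct lengths: both documents parse cleanly.
good = make_record("a.html", b"<html>one\n</html>\n") + second
good_docs, good_warnings = read_records(good)

# First length is 8 bytes short: the leftover "</html>" line is
# then read as if it were a header of the second document.
bad = make_record("a.html", b"<html>one\n</html>\n", 10) + second
bad_docs, bad_warnings = read_records(bad)

print(good_warnings)
print(bad_warnings)
```

The first document still comes through, and the complaint only shows up
once the reader starts looking for the next set of headers.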

Here's the output on Win98.  Do you get the same results on Win2K?

E:\Program Files\SWISH-E2.2>perl prog-bin\spider.pl default http://hank.org:2342/test.html | swish-e -S prog -i stdin
prog-bin\spider.pl: Reading parameters from 'default'

Summary for: http://hank.org:2342/test.html
    Skipped:   1  (1.0/sec)
Total Bytes: 373  (373.0/sec)
 Total Docs:   3  (3.0/sec)
Unique URLs:   4  (4.0/sec)
Indexing Data Source: "External-Program"
Indexing "stdin"

Warning: Failed to properly close external program: No error
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 8 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
8 unique words indexed.
4 properties sorted.
3 files indexed.  373 total bytes.  17 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!

Not sure why it reports 
  Warning: Failed to properly close external program: No error

on Windows, but it is harmless.



-- 
Bill Moseley
mailto:moseley@hank.org
Received on Wed Sep 25 18:00:34 2002