Re: error indexing pdf files

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Apr 14 2003 - 19:11:05 GMT
On Mon, 14 Apr 2003, Jody Cleveland wrote:

> Is there a way to put things like IndexFile and StoreDescription in the
> SwishSpiderConfig.pl file?

Swish doesn't know about SwishSpiderConfig.pl -- all it knows is that it's
running a program and that program is returning documents.  spider.pl just
happens to look for SwishSpiderConfig.pl by default.
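
To illustrate "a program returning documents": under -S prog the program only
has to print each document to standard output, preceded by a few headers and a
blank line.  A minimal sketch, assuming the Path-Name and Content-Length
headers described in the swish-e documentation for the prog input method; the
helper name format_doc and the sample URL are made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: wrap one document in the headers swish-e reads
# from a -S prog program.  Content-Length is a count of bytes.
sub format_doc {
    my ( $path, $content ) = @_;
    my $length = length($content);    # assumes $content holds bytes, not decoded characters
    return "Path-Name: $path\n"
         . "Content-Length: $length\n"
         . "\n"
         . $content;
}

# Emit a single made-up document on standard output.
print format_doc( 'http://localhost/hello.html',
    "<html><body>hello</body></html>\n" );
```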

You can turn things around and call swish from a program.  So instead of:

  swish-e -c config -S prog -i /path/to/program

where swish reads input from /path/to/program, you can do:

  /path/to/program | swish-e -c config -S prog -i stdin

That "stdin" is a hack to make swish read from standard input.  You can
extend the idea and drive swish from within a program, for example:

# Write a temporary config file ('>' opens it for writing).
open CONF, ">swish.conf" or die $!;
print CONF <<EOF;
DefaultContents HTML*
StoreDescription HTML* <body>
PropertyNames foo
EOF

close CONF or die $!;

# The leading '|' opens a pipe *to* swish, so we write to its stdin.
open SWISH, "|swish-e -c swish.conf -S prog -i stdin" or die $!;
while ( $doc = fetch_next_doc() ) {
    print SWISH $doc;
}
close SWISH or die "failed to close";

unlink "swish.conf";
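
fetch_next_doc() above is left undefined.  One way it might look, as a sketch
only: it hands back one formatted document per call and undef when nothing is
left, which ends the while loop.  The @queue contents and file names are
invented, and the Path-Name / Content-Length layout is an assumption based on
the swish-e documentation for -S prog:

```perl
use strict;
use warnings;

# Invented sample data: [ path, content ] pairs waiting to be indexed.
my @queue = (
    [ 'doc1.html', "<html><body>first</body></html>\n"  ],
    [ 'doc2.html', "<html><body>second</body></html>\n" ],
);

# Return the next document with its headers, or undef when the queue is empty.
sub fetch_next_doc {
    my $next = shift @queue or return;
    my ( $path, $content ) = @$next;
    return sprintf( "Path-Name: %s\nContent-Length: %d\n\n%s",
        $path, length($content), $content );
}
```

A real version would more likely read files or query a database instead of
draining a hard-coded list.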




-- 
Bill Moseley moseley@hank.org
Received on Mon Apr 14 19:12:19 2003