Re: Fixing Swish 2.4.0 To Work on Windows

From: <moseley(at)not-real.hank.org>
Date: Fri Nov 07 2003 - 15:05:36 GMT
Note -- searching the swish-e list archive for windows and pdf2html will 
find similar tips for using pdf2html.


On Thu, Nov 06, 2003 at 11:28:47PM -0800, David L Norris wrote:

> I probably should install the example scripts (index_hypermail.pl,
> _pdf2html.pl, MySQL.pl, file.pl, DirTree.pl) somewhere other than
> lib\swish-e by default.  I think Bill made that suggestion at some
> point.  I can see how that would be extremely confusing.

They should go in share/doc/swish-e/examples.

One exception is DirTree.pl.  That's a tiny little example script, but 
perhaps it needs to be developed into something that provides the 
functionality of -S fs indexing mode, but with SWISH::Filter built in.

I was going to post with a modified DirTree.pl program that included 
SWISH::Filter for the original poster, but I have not had time to get to 
my Windows machine to test (I keep it locked away for health reasons).

Adding SWISH::Filter support is not hard, and one can use the example in
SwishSpiderConfig.pl (the example config for spider.pl) as a template.

Now, I have not tried this yet, but there's also the utility called 
swish-filter-test.  That program is modified at install time to find the 
SWISH::Filter module (which is also designed to find the binaries).

So, I suppose one could do this if using the -S fs indexing method:

  FileFilter .pdf swish-filter-test '-quiet -content "%p"'

Now, I don't recommend that in general because of the cost of Perl 
loading and compiling all those modules for every document.  That's why 
I'm suggesting modifying DirTree.pl --  it would only load the 
SWISH::Filter modules one time.

BTW - swish-filter-test is installed in the same place as the swish-e
binary, so it may be in your path.  If not, then specify the path, of
course.  Oh, I suppose Windows would need to know that it is a Perl
script, so you would need to rename it to swish-filter-test.pl on some
versions of Windows, or run it as "perl /path/to/swish-filter-test" on
other versions of Windows.

Hey Dave.  How long does it take you to install Linux these days?


> > Set correct path to pdf converters in _pdf2html.pl. e.g.
> >    $ENV{PATH} ='D:/Program Files/SWISH-E4/lib/swish-e;'. $ENV{PATH};
> 
> Yes, that's a good workaround.  I'll see if we can get that fixed in the
> next release.  We don't seem to be handling FileFilter correctly.  My
> intention is that everything in {prefix}\lib\swish-e should be directly
> executable using its base name.

Basically, those filters have been left behind.  They never had the 
right path in them unless you happened to install pdftotext and other 
binaries in your PATH.  _pdf2html.pl says:

    This filter requires two programs "pdfinfo" and "pdftotext".
    These programs are part of the xpdf package found at
    http://www.foolabs.com/xpdf/xpdf.html.

    These programs must be found in the PATH when indexing is run, or
    explicitly set the path in this program:

      $ENV{PATH} = '/path/to/programs'

It's just that we made other things work automatically but not this.
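
As a rough illustration of the PATH requirement quoted above, here is a minimal shell sketch that prepends a directory to PATH before anything else runs. The directory and the stub pdftotext are stand-ins made up for this example (a real setup would point at wherever xpdf is installed), and it assumes a Unix-like shell rather than Windows:

```shell
#!/bin/sh
# Sketch only: put a (hypothetical) xpdf directory first on PATH and
# check which pdftotext a child process would pick up.

# Stand-in for the real xpdf install directory.  A temp dir holding a
# stub pdftotext is used here only so the snippet runs anywhere.
tools_dir=$(mktemp -d)
printf '#!/bin/sh\necho "stub pdftotext"\n' > "$tools_dir/pdftotext"
chmod +x "$tools_dir/pdftotext"

# Prepend rather than replace, so the rest of PATH keeps working.
PATH="$tools_dir:$PATH"
export PATH

# Programs started from this shell inherit the modified PATH, and so
# do the commands they launch through popen().
resolved=$(command -v pdftotext)
echo "pdftotext resolves to: $resolved"
```

With the real directory substituted for the temp dir, running the indexer from that same shell should let the filter find pdfinfo and pdftotext without editing _pdf2html.pl.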

I don't really want to maintain two different sets of filters, but those 
files in the filter-bin directory are nice as examples, I think.  That's 
why I'd install them in the documentation directory.

> Bill, looks like we never fixed FilterOpen() in filter.c, line 298. 
> It's not using the new PATH stuff.  It's simply doing a popen().  I
> think it needs to be doing a get_env_path_with_libexecdir() beforehand.

We have talked about that a few times, and I've looked into it, too.  I 
wish I could remember exactly my reasoning for not making that change.  
I think I felt that FileFilter was more of a general hook and thus you 
(the user of FileFilter) would set up paths as needed.

That said, remember that there was code in there for a while that added 
libexecdir() to the path at startup of swish.  That would make the 
filters work better.  

 http://cvs.sourceforge.net/viewcvs.py/swishe/swish-e/src/swish.c

I think the problem was that setting PATH was not portable, but I think 
there was another reason I removed it.  That's the part I can't remember.


-- 
Bill Moseley
moseley@hank.org
Received on Fri Nov 7 15:05:55 2003