Re: dirTree.pl

From: <moseley(at)not-real.hank.org>
Date: Sat Nov 15 2003 - 15:52:53 GMT
On Sat, Nov 15, 2003 at 07:17:14AM -0800, Lee Thompson wrote:
> Hi,
>  
> Has anyone tried modifying dirTree.pl for use on Windows?  It does find
> all files, but swish-e doesn't seem to be able to tell where one file
> ends and the next file starts.

Make sure the version you are using does NOT use binmode (swish-e reads
in text mode).  Does Windows have a standard tool like "od" or "file" to
look at the output from DirTree.pl and see what kind of line endings it has?
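
If there is no such tool handy, a few lines of Perl can do the same job.
This is only a rough sketch: it assumes you have redirected the DirTree.pl
output to a file first, and the script name check_eol.pl and the helper
count_line_endings are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count CRLF pairs and bare LF characters in a string of raw bytes.
sub count_line_endings {
    my ($data) = @_;
    my $crlf = () = $data =~ /\r\n/g;          # Windows-style endings
    my $lf   = () = $data =~ /(?<!\r)\n/g;     # LF not preceded by CR
    return ( $crlf, $lf );
}

# Usage: perl check_eol.pl captured-output-file
if (@ARGV) {
    my $file = shift;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;    # read raw bytes so CR characters are not translated away
    my $data = do { local $/; <$fh> };
    close $fh;
    my ( $crlf, $lf ) = count_line_endings($data);
    print "CRLF: $crlf  bare LF: $lf\n";
}
```

The binmode here is only in the checking script, so it sees the bytes
exactly as they are on disk.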

If that's not it, maybe you are using UTF-8 and the content length is
wrong.  To check, output one file with DirTree.pl, open it in an editor
and note the Content-Length.  Then cut all the header lines, including
the blank line between the headers and the content, and save the file.
The resulting file size should match what the Content-Length header
said.  That's assuming you have an editor that won't screw things up by
adding a line ending at the end if there isn't already one there.
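
To avoid the editor problem, the same check could be scripted.  A minimal
sketch, assuming the output for a single document has been captured to a
file; the helper name check_content_length is hypothetical and not part
of swish-e or DirTree.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the raw bytes of one captured document (headers, blank line,
# content), return the declared Content-Length and the actual number of
# content bytes that follow the blank line.
sub check_content_length {
    my ($raw) = @_;
    my ( $headers, $body ) = split /\r?\n\r?\n/, $raw, 2;
    return unless defined $headers;
    $body = '' unless defined $body;
    my ($declared) = $headers =~ /^Content-Length:\s*(\d+)/mi;
    return ( $declared, length $body );
}

# Usage: perl check_length.pl one-document-capture
if (@ARGV) {
    my $file = shift;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;    # count bytes as stored, with no newline translation
    my $raw = do { local $/; <$fh> };
    close $fh;
    my ( $declared, $actual ) = check_content_length($raw);
    $declared = 'missing' unless defined $declared;
    $actual   = 0         unless defined $actual;
    print "declared: $declared  actual: $actual\n";
}
```

If the two numbers differ, the difference tells you how far off the
header is.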


> Should the data from dirTree.pl have
> something specific that indicates where one file ends and the next
> starts?

No.  It knows the end by the content-length.


> It does put in the same headers as spider.pl, spider.pl works
> fine here.

spider.pl uses this to determine the content length (in the event that 
the content ends up in utf-8 with multi-byte chars):

    # ugly and maybe expensive, but perhaps more portable than "use bytes"
    my $bytecount = length pack 'C0a*', $$content;

But DirTree.pl uses the length from the stat command.  Hard to imagine 
that would be wrong.
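
Still, one thing that might be worth ruling out on Windows is whether the
size from stat matches what you get when the file is read in text mode,
since text mode turns each CRLF into a single LF.  A rough sketch that
simulates this with the :crlf PerlIO layer (so it behaves the same on any
platform); size_vs_read is a made-up helper, and I am not claiming this
is what DirTree.pl does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the size reported by stat and the length of the data obtained
# when the same file is read through the given PerlIO layer
# (':crlf' imitates Windows text mode, ':raw' reads the bytes untouched).
sub size_vs_read {
    my ( $file, $layer ) = @_;
    my $stat_size = -s $file;
    open my $fh, "<$layer", $file or die "Cannot open $file: $!";
    my $data = do { local $/; <$fh> };
    close $fh;
    return ( $stat_size, length $data );
}

# Usage: perl size_check.pl some-file
if (@ARGV) {
    my $file = shift;
    my ( $stat_size, $text_len ) = size_vs_read( $file, ':crlf' );
    print "stat: $stat_size  text-mode read: $text_len\n";
}
```

For a file with CRLF line endings the text-mode length comes out smaller
than the stat size by one for every line ending.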





> The errors shown are:
>  
> ----------------------------------------
> C:\KaTS\SWISH-E>swish-e  -S prog -c conf/filetree.config -i
> ./prog-bin/DirTree.pl  -f i:\Data\Taxonomy\mydrive.swish-e
> Indexing Data Source: "External-Program"
> Indexing "./prog-bin/DirTree.pl"
> External Program found: ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Logging information for IE6Setup.exe ...'
> from program ./prog-bin/DirTree.pl
> /WINNT/Active Setup Log.txt - Using TXT2 parser -  (2819 words)
>  
> Warning: Unknown header line: 'b.dll' from program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Search fixed drives = FALSE' from program
> ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Search remote drives = FALSE' from
> program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Search removable drives = FALSE' from
> program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Search CD-ROM drives = FALSE' from
> program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Search specific directories = TRUE' from
> program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Custom directories =
> C:\WINNT\Microsoft.NET\Framework' from program ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Recurse custom dirs = TRUE' from program
> ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'Result = 0' from program
> ./prog-bin/DirTree.pl
>  
> Warning: Unknown header line: 'END: Perform action: Search for File'
> from program ./prog-bin/DirTree.pl
> err: External program failed to return required headers Path-Name:
> .----------------------------------------
>  
> If I run dirTree.pl on its own I do get all the correct swish-e header
> lines, for example:
>  
> Path-Name: /WINNT/Active Setup Log.txt
> Content-Length: 21382
> Last-Mtime: 1062197811
> Document-Type: TXT*
> 
> Lee Thompson

-- 
Bill Moseley
moseley@hank.org
Received on Sat Nov 15 15:52:59 2003