Re: Adding files from external site - suggestions?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Apr 28 2004 - 15:28:53 GMT

On Wed, Apr 28, 2004 at 10:55:04AM -0400, Rob de Santos AFANA wrote:
> Well, this worked just great.  Thanks, Bill for all of your help.  The
> files in the directory in question are now in the index.  Seems there is
> one remaining problem though.  Using a hacked version of DirTree.pl to
> feed the files to the index causes them to be indexed with the "path"
> not the directory info and the ReplaceRules not to be applied to this.

Hmm, I thought ReplaceRules still worked with -S prog.

Anyway, just rewrite the URL in DirTree.pl -- that's the natural place
to do it, and you have Perl regular expressions to make things easier.


> Not what I need so I may have to rethink this.  I get this in the index
> (these will split across two lines):
> 
> /home/afana/public_html/www.sportsdelivered.com/afl/video_detail.asp?vid
> _id=342
> 
> instead of this as the document path:
> http://www.sportsdelivered.com/cgi-bin/cgi-bin/at.pl?a=195711&e=afl/vide
> o_detail.asp?vid_id=342

Something like:

    $path =~ s[^/home/afana/public_html][http:/];

Remember you can run DirTree.pl (and spider.pl) outside of swish, so
check the output first:

    ./DirTree.pl | less
Received on Wed Apr 28 08:28:53 2004