Re: [swish-e] swish.conf problems - was ignorewords wildcard?

From: <Rene.Kloos(at)not-real.esa.int>
Date: Thu May 24 2007 - 12:32:37 GMT
BTW, if you're using the spider, won't it simply get blocked when it reaches
a directory with an .htaccess? After all, I suppose that's what the .htaccess
is for: to set up some form of access control. You could give the spider the
appropriate credentials to get in, but if that's not what you want, then
things should be fine as they are. Or is that too simplistic? :-)

Bye,
René

users-bounces@lists.swish-e.org wrote on 24/05/2007 13:33:10:

> OK, let's start over. . .
>
> I want to index the site.
> Only .htm and .html
> I don't want to index directories containing .htaccess
> I don't want to index documents beginning with "dsc_"
>
> --
> Swish-e version:  2.4.5
> OS:  RH9
> Current run string:  swish-e -S prog -c swish.conf
>
> Current swish.conf:
>
> # Swish-e config
> #
> IndexDir spider.pl
> IndexFile index.swish-e
>
> SwishProgParameters default http://nottherealsitename.com/
>
> IndexReport 3
>
> Metanames swishtitle swishdocpath
>
> IndexOnly .htm .html
>
> IgnoreWords File: /usr/local/swish-e-2.4.5/conf/stopwords/english.txt
>
> StoreDescription TXT* 10000
> StoreDescription HTML* <body> 10000
>
>
> Need some help.
>
>
> Bill Moseley wrote:
> > On Wed, May 23, 2007 at 10:35:47PM -0400, Frank Hunt wrote:
> >> this fails:
> >>
> >> IndexDir spider.pl
> >> SwishProgParameters default http://website.com/
> >> FileRules directory contains ^\.htaccess
> >>
> >> run string:  swish-e -S prog -c swish.conf2
> >
> > -S prog means you are not reading from the file system -- FileRules is
> > only for reading from the file system.
> >
>
> --
> frank hunt
> PLUG member-in-absentia
> confused linux admin
> part time windows(r) washer
> rochester hills, mi
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users

Received on Thu May 24 08:32:40 2007