On Tue, Nov 08, 2005 at 12:49:31PM -0800, Michael Porcaro wrote:
> Question 1:
> Let's say I add a new page. Do I have to spider the whole site again to
> index the 1 page?
> Question 2:
> I finally was able to spider my site, and get the search engine to work.
> One problem now:
> The spider indexed every single link when I instructed it to index .html
> by using this config file called swish.conf
> # Use spider.pl for indexing
> IndexDir spider.pl
> IndexOnly .html
IndexOnly isn't used with the -S prog input method (i.e. when using
spider.pl).
> It took about 7 hours to spider the whole site with this command:
> swish-e -e -S prog -c swish.conf
> There are a lot of useless links in the index file which is 80 megs.
> How can I filter out every page except .html? How come it didn't obey
> the config file?
http://swish-e.org/docs/spider.html should cover most of that.
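
As a rough, untested sketch of the approach those docs describe: since
IndexOnly is ignored with -S prog, the filtering would have to go in the
spider's own config file. The example below assumes a file named
SwishSpiderConfig.pl in the current directory (where spider.pl looks by
default); the base_url and email values are placeholders, and the regex
is only one possible way to pick out HTML pages.

```perl
# SwishSpiderConfig.pl: sketch only; base_url and email are placeholders
@servers = (
    {
        base_url => 'http://www.example.com/',
        email    => 'webmaster@example.com',

        # Called for each URL before it is fetched; return true to follow it.
        # Keeps paths ending in .html (or .htm) plus directory URLs ending
        # in "/", so the start page itself is not rejected.
        test_url => sub {
            my ( $uri, $server ) = @_;
            return $uri->path =~ m{(?:/|\.html?)$}i;
        },
    },
);

1;
```

Anything the callback rejects is never requested, so this should also
cut down the spidering time, not just the index size.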
Received on Tue Nov 8 20:44:39 2005