Re: [swish-e] partial indexing

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Fri Mar 27 2009 - 17:11:51 GMT

Zhou Xiang wrote on 03/27/2009 11:50 AM:
> Thank you for your reply!
> 
> I just tried the spider.pl method you suggested and I added an external link
> "http://www.amazon.com" to the list, but the spider still does not index it.
> 
> What's more, it still does not index any webpages outside the local server,
> digital.lib.lehigh.edu.
> 
> My spider config file:
> @servers = (
> {
>   base_url    => 'http://digital.lib.lehigh.edu/beyondsteel_test/admin/index.php',

You must add every host name you want included, either in base_url or in
same_hosts (depending on how you want them indexed).

Read the docs:

 http://swish-e.org/docs/spider.html#configuration_options

The default behaviour is to stay on the same host as base_url.
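
As a rough sketch of what that could look like (not a tested config), here
is a spider config with one @servers entry per host. The email address and
the max_depth value are placeholders I made up for illustration; check the
docs linked above for the exact meaning and defaults of each option.

```perl
# Sketch only: one hash per host the spider should be allowed to crawl.
@servers = (
    {
        # the local site from the original config
        base_url => 'http://digital.lib.lehigh.edu/beyondsteel_test/admin/index.php',
        email    => 'you@example.com',    # placeholder contact address
    },
    {
        # separate entry so the external host is spidered too
        base_url  => 'http://www.amazon.com/',
        email     => 'you@example.com',   # placeholder contact address
        max_depth => 1,                   # assumed value: limits how many links deep to follow
    },
);

1;
```

same_hosts is meant for alternate names of the same site (for example a
host reachable both with and without "www"), so a genuinely different site
such as amazon.com is probably better given its own entry as above.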

-- 
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Fri Mar 27 13:11:50 2009