Re: Indexing remote documents

From: Thomas Nyman <thomas(at)not-real.teg.pp.se>
Date: Sun Jun 05 2005 - 13:25:34 GMT
Sorry about the encrypted mail, that was my mistake. Which file contains
the parameters used when spidering? I found the following on the site:

#    my %ccenter = (
#            email       => 'Lance.Perry(at)not-real.ourdomain.com',
#            base_url    => 'http://our.domain.com/ccenter/',
#            delay_sec   => '0',
#            max_depth   => '1',
#            credentials => 'username:password'
#    );
#
#    @servers = ( \%ccenter );

The question is: where should this go?
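
For reference, a minimal sketch of where a block like this usually lives, assuming the "prog" input method with the spider.pl helper rather than the built-in HTTP crawler. The file name SwishSpiderConfig.pl, the email address, and the username:password pair are placeholders, not values taken from a working setup.

```perl
# SwishSpiderConfig.pl  (assumed file name, read by spider.pl at startup)
#
# Assumed matching lines in the swish-e conf file:
#     IndexDir spider.pl
#     SwishProgParameters SwishSpiderConfig.pl
# and swish-e would then be run with "-S prog" instead of "-S http".

my %archive = (
    email       => 'you@example.com',             # placeholder contact address
    base_url    => 'http://192.168.1.2/archive/', # URL from the original conf file
    delay_sec   => 0,                             # no pause between requests
    max_depth   => 1,                             # follow links one level deep
    credentials => 'username:password',           # placeholder htpasswd login
);

# spider.pl reads the list of servers to crawl from @servers
@servers = ( \%archive );

1;
```

With this layout the conf file no longer points IndexDir at the URL itself; the URL and the login both move into the spider config.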




On 5 Jun 2005, at 13:54, Thomas Nyman wrote:

> Hi
>
> I have created a conf file that contains
>
> IndexDir http://192.168.1.2/archive/
>
> I wish to index all files found in the "archive" directory on the remote
> machine. The remote machine is protected with htpasswd, so one needs a
> password to browse to it.
>
> When running swish I receive the following messages:
>
> Indexing Data Source: "HTTP-Crawler"
> Indexing "http://192.168.1.2/archive/"
> Removing very common words...
> no words removed.
> Writing main index...
> err: No unique words indexed!
>
> It seems that it's not indexing any documents.
>
> I have not made any particular changes to any other file than my conf
> file.
>
> I can successfully index on the same machine that swish is
> installed on.
>
> I'm guessing I'm missing something here but I'm not sure what. I
> would appreciate any pointers. If someone wants me to send additional
> info I will.
>
> Thanks
>
> Thomas
>
>
Received on Sun Jun 5 06:25:35 2005