Strange behaviour indexing remote site

From: Thomas Nyman <thomas(at)not-real.teg.pp.se>
Date: Mon Jun 06 2005 - 18:38:31 GMT
Hi,

I'm still struggling a bit with my remote indexing. I can index the
remote machine's directory called arkiv, but when I search using
that index I get hits on the relevant documents and also on
something called "Index of /arkiv". I don't know what that is.

An example to illustrate.

I search for a document called "Introducing CGI.doc".

Using "cgi" as the search term, I get the following hits:

1. Introducing cgi.doc

2. Index of /arkiv
   document path: http://192.168.1.2/arkiv/?D=A

3. Index of /arkiv
   document path: http://192.168.1.2/arkiv/?S=A

4. Index of /arkiv
   document path: http://192.168.1.2/arkiv/?M=A

5. Index of /arkiv
   document path: http://192.168.1.2/arkiv/?N=D

6. Index of /arkiv
   document path: http://192.168.1.2/arkiv/

The document path is http://192.168.1.2/arkiv/?D=A
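
For reference, here is a minimal sketch of how the extra hits can be told apart from real documents by URL alone. It is Python, not part of swish-e or spider.pl, and it rests on an assumption: that a trailing slash or a single-letter query such as ?D=A or ?N=D marks an automatically generated directory listing (or a re-sorted view of one) rather than a file. The helper name is_listing_url and the example.doc URL are made up for illustration; the other URLs are the ones from the hits above.

```python
import re
from urllib.parse import urlsplit

# Assumption: single-letter sort queries such as ?D=A or ?N=D come from the
# web server's automatic directory listing, not from real documents.
SORT_QUERY = re.compile(r"^[A-Z]=[AD]$")


def is_listing_url(url):
    """Return True for a directory URL or one of its re-sorted variants."""
    parts = urlsplit(url)
    return parts.path.endswith("/") or bool(SORT_QUERY.match(parts.query))


hits = [
    "http://192.168.1.2/arkiv/example.doc",  # hypothetical document URL
    "http://192.168.1.2/arkiv/?D=A",
    "http://192.168.1.2/arkiv/?S=A",
    "http://192.168.1.2/arkiv/?M=A",
    "http://192.168.1.2/arkiv/?N=D",
    "http://192.168.1.2/arkiv/",
]

# Split the hits into real documents and directory-listing pages.
documents = [u for u in hits if not is_listing_url(u)]
listings = [u for u in hits if is_listing_url(u)]

print(documents)
print(listings)
```

If the assumption holds, only the hypothetical document URL ends up in documents, and the five "Index of /arkiv" paths end up in listings.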


I might add that the repository contains six .doc files and one .txt file.

Running ls -a confirms this, with the exception of Mac OS X's
obligatory .DS_Store file.

Does anyone have an idea as to why I'm getting these results?

I might also add that I get the following messages when indexing:

Parsing of undecoded UTF-8 will give garbage when decoding entities  
at /usr/local/lib/swish-e/spider.pl line 1198.
Parsing of undecoded UTF-8 will give garbage when decoding entities  
at /Library/Perl/5.8.1/LWP/Protocol.pm line 156.
Parsing of undecoded UTF-8 will give garbage when decoding entities  
at /Library/Perl/5.8.1/LWP/Protocol.pm line 156.

I've googled and seen various responses that almost make me
think it's a bug, but I have to admit I haven't really understood the
responses regarding the UTF bit.
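
For what it is worth, the general effect the warning text describes can be reproduced outside swish-e. The sketch below is Python, not Perl, and does not use spider.pl or LWP. It only assumes that the warning refers to HTML entities being decoded while the surrounding text is still raw UTF-8 bytes; whether that is what actually happens in my setup is a guess.

```python
import html

# Raw bytes as a server might send them: a literal e-acute (U+00E9) encoded
# as UTF-8, followed by the same character written as an HTML entity.
raw = "\u00e9 &eacute;".encode("utf-8")

# Entities decoded while the bytes are still undecoded (simulated here by
# treating each byte as its own character): the literal character is garbled.
garbled = html.unescape(raw.decode("latin-1"))

# Bytes decoded as UTF-8 first, entities second: both characters agree.
clean = html.unescape(raw.decode("utf-8"))

print(repr(garbled))
print(repr(clean))
```

In the first case the literal character turns into two wrong characters while the entity comes out correctly, so the two no longer match. In the second case both come out as the same character.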



Thomas
Received on Mon Jun 6 11:38:37 2005