Re: 2.4.3 Refuses to Index Virtual Host

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Sun Apr 10 2005 - 23:16:49 GMT
caveat: you'd have to deal with the fact that recursion would include the 2nd 
site's files within the first index, since the filetree includes both sites.
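
To make the caveat concrete, here is a rough Python sketch of which files a recursive walk starting at /path/to/site1 would pick up, and what is left once the nested site is filtered out. The file list and the helper names (belongs_to, files_for_first_index) are made up for illustration; this is not how swish-e itself decides what to index.

```python
from pathlib import PurePosixPath

def belongs_to(path: str, root: str) -> bool:
    """True if path is root itself or lies anywhere beneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents

def files_for_first_index(paths, site1_root, site2_root):
    """Files under site1_root, minus anything under the nested site2_root."""
    return [p for p in paths
            if belongs_to(p, site1_root) and not belongs_to(p, site2_root)]

# Hypothetical file list; the layout mirrors the example paths in the thread.
paths = [
    "/path/to/site1/index.html",
    "/path/to/site1/about.html",
    "/path/to/site1/site2/index.html",
]

# A plain recursive walk from site1 sees all three files, site2's included.
recursive = [p for p in paths if belongs_to(p, "/path/to/site1")]

# Excluding the nested directory leaves only site1's own files.
filtered = files_for_first_index(paths, "/path/to/site1", "/path/to/site1/site2")

print(len(recursive), len(filtered))
```

In practice the exclusion would belong in config1 rather than in a script. swish-e's FileRules directive is the likely place for it, but check the configuration docs for the exact rule.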

Peter Karman scribbled on 4/10/05 6:13 PM:
> if you have access to the filesystem where the files are stored, is there some 
> advantage to using the spider at all?
> 
> otherwise you could do:
> 
>    swish-e -i /path/to/site1 -c config1
> 
> and
> 
>    swish-e -i /path/to/site1/site2 -c config2
> 
> which would be faster and would create two separate indexes for searching.
> 
> fh oregon scribbled on 4/10/05 6:02 PM:
> 
>>My goal here is to have the main site and the virtual site(s) indexed 
>>and searchable.  After mulling this over, I came up with a way to fake 
>>out the indexer.   As a test, I placed a (hidden) link on the main page 
>>directly to the /SFCC directory and !!!  It looks like it is all working 
>>now.  I need to do more testing.
>>
>>-fh
>>
>>Bill Moseley wrote:
>>
>>>On Sat, Apr 09, 2005 at 11:10:12AM -0700, fh oregon wrote:
>>>
>>>>The root of the site (frankhunt.com) is /web/httpd/htdocs.  Within that 
>>>>directory is the main index.html as well as a few other HTML documents 
>>>>and directories for other parts of the site.  One of those directories is 
>>>>/web/httpd/htdocs/SFCC, which is the root of the 
>>>>siliconforestcorvetteclub.com domain.
>>>>  
>>>>
>>>
>>>Again, the spider has NO knowledge of your directory structure.  If
>>>you spider frankhunt.com and no frankhunt.com URLs point into SFCC,
>>>then it won't spider those pages.
>>>
>>>Try it yourself.  Go to frankhunt.com and only click on links that
>>>include frankhunt.com as the host name.  That's all that will be
>>>indexed.  That link to SFCC does not use the same host name.
>>>
>>>Look, you also link to http://www.fs.fed.us/gpnf/volcanocams/msh/ --
>>>do you expect that to get indexed?  And everything it links to, also?
>>>
>>>Sounds like you are not clear on how web servers map directories.
>>>
> 
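
The same-host rule Bill describes in the quoted text above can be sketched in a few lines of Python. This is only an illustration of the idea, not the spider's actual code: the function name same_host_links and the list of hrefs are assumptions, and a real spider does far more (fetching, robots.txt, duplicate detection, and so on).

```python
from urllib.parse import urljoin, urlparse

def same_host_links(page_url, hrefs):
    """Resolve each href against page_url and keep only same-host URLs."""
    base_host = urlparse(page_url).hostname
    kept = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # relative links inherit the page's host
        if urlparse(absolute).hostname == base_host:
            kept.append(absolute)
    return kept

# Hypothetical links found on the main page.
links = same_host_links(
    "http://frankhunt.com/index.html",
    [
        "/SFCC/",                                      # same host, so it is followed
        "http://siliconforestcorvetteclub.com/",       # different host, skipped
        "http://www.fs.fed.us/gpnf/volcanocams/msh/",  # different host, skipped
    ],
)
print(links)
```

This is also why the hidden link to /SFCC works: it resolves to a URL on frankhunt.com, so it passes the check even though the same files are normally served under the other domain. The sketch compares host names exactly, so www.frankhunt.com and frankhunt.com would count as different hosts here.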

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Sun Apr 10 16:16:50 2005