At 07:58 AM 02/07/02 -0800, Chris Blackstone wrote:
>I have a page on my site and there are absolutely no links to it except
>for a commented out link on 1 page.
>The page that isn't linked to is being returned in search results,
>I'm also having other pages being returned in search results that aren't
That's how swish stays ahead of the competition. Swish indexes 20% more
files than its closest competitor, and uses less energy, too.
I call pages that are in the web directory tree, but not linked, "orphans".
If you are indexing with the spider, I don't know how spider.pl could find
>Is this expected behavior? This happens with yesterday's swish-e daily.
If you are using a swish with libxml2 linked in, you should be able to
index with HTMLLinksMetaName. This will index HREFs and then you should be
able to find what pages link to your page.
The other thing would be to run spider.pl (without swish), capture
STDERR to a file, and set the DEBUG_URL debug option. IIRC, DEBUG_URL will
print each URL and its parent. So you could grep for the page in question
and see its parent.
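
If grep isn't handy, here is a rough Python sketch of the same lookup. I'm not sure of the exact format DEBUG_URL prints, so this assumes nothing about it: it only does a substring match and prints every captured line that mentions the page. The function name and the file names in the usage line are made up for illustration.

```python
import sys


def lines_mentioning(log_lines, page):
    """Return (line_number, text) for each log line that contains `page`.

    Plain substring match only; no assumption is made about the layout
    of the spider's debug output.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if page in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def main(argv):
    # argv[1]: file holding the captured STDERR (hypothetical name below)
    # argv[2]: the page to look for, e.g. its path or file name
    log_path, page = argv[1], argv[2]
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, text in lines_mentioning(handle, page):
            print(f"{number}: {text}")


# Only run as a script when both arguments were supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Run it as something like `python find_parent.py spider.log orphan.html`, where spider.log is whatever file you sent STDERR to. If the page never shows up, the spider did not fetch it in that run, which points back at a stale index.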
Or is it possible you are still using an old index?
Received on Thu Feb 7 16:24:08 2002