Assuming you are using the spider.pl script and swish-e -S prog, the
href link should be followed.
However, there is no support for omitting the <a> text from the description.
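
One possible workaround, which is not a swish-e feature, is to rewrite each page before it reaches the indexer so that the <a> elements keep their href but lose their text. Below is a minimal sketch in Python of that rewrite, assuming you have some hook where the fetched HTML can be modified after the links have been extracted. The names AnchorTextStripper and strip_anchor_text are hypothetical and not part of swish-e or spider.pl. I believe spider.pl's filter_content callback would be the natural place for this, but that callback is Perl, so the logic would need porting.

```python
from html.parser import HTMLParser


class AnchorTextStripper(HTMLParser):
    """Re-emit HTML as parsed, except that text inside <a>...</a> is dropped.

    Hypothetical helper for illustration; not part of swish-e or spider.pl.
    """

    def __init__(self):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # greater than 0 while inside an <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
        # Tags are always kept, so the href stays available to a link extractor.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through untouched.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.depth:
            self.depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_anchor_text(html):
    """Return html with the text content of every <a> element removed."""
    parser = AnchorTextStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<a href="page.html">testing</a> this is the body of the text.'
    print(strip_anchor_text(page))
```

With the example from your message, the anchor keeps its href and the body text is left alone, so a description built from the rewritten page would start with the body text. The anchor text would then also be missing from the index itself, which you said you do not mind.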
intervolved none scribbled on 4/27/06 7:43 AM:
> Thank you for the response.
>
> Actually I do not care if the text within the <a> tag is indexed. I just want the links followed and indexed. I do not want the text in the <a> tags stored in my description property.
>
> example:
> -----------------
> ...
> <a href="page.html">testing</a> <- the page page.html should be indexed but the text not included in the description of the current page
> this is the body of the text. <- this should be the description of the page
> ...
> -------------------------------
> Peter Karman <peter@peknet.com> wrote:
> If I am understanding you correctly, you want the text within the <a>
> tagset to be indexed but not stored in the description Property. I don't
> believe there is a config option to allow that. The properties simply
> suck up all the characters they find, optionally converting entities,
> and ignoring tags.
>
> intervolved none scribbled on 4/26/06 11:29 AM:
>> I have noticed on a lot of my pages that get indexed that the
>> description displayed is from the href tags and not from the actual
>> body of the content. Is there any way to fix this? I want the links
>> to be indexed but I do not want the text to be included in the
>> description of the page.
>>
>>
>>
>>
>> Config :
>>
>> MaxDepth 0
>> Delay 0
>> Metanames keywords
>> MetaNamesRank 10 keywords
>> IndexContents HTML2 .htm .html .shtml .jsp
>> IndexContents TXT .pdf .doc
>> DefaultContents HTML2
>> StoreDescription HTML2 200
>> StoreDescription TXT 200
>> PropertyNameAlias swishdescription description
>> obeyRobotsNoIndex yes
>>
>> HTMLLinksMetaName links
>> IndexDir http://testserver/testpage.html
>>
>>
>>
>>
>> d:>\swish-e.exe -f "d:\testing\indexes\temp.index" -wdirectives -p swishdescription -d ::
>> # SWISH format: 2.4.2
>> # Search words: directives
>> # Removed stopwords:
>> # Number of hits: 1
>> # Search time: 0.000 seconds
>> # Run time: 0.015 seconds
>> 1000::http://testserver/testpage.html::My Title::932::one two three one two three one two three. four five six. seven eight nine ten, uno dos tres quatro Advance Directives and Organ Donation Page body text example
>>
>> The description is: one two three one two three one two three. four
>> five six. seven eight nine ten, uno dos tres quatro Advance
>> Directives and Organ Donation Page body text example
>>
>> Not: Advance Directives and Organ Donation Page body text example
>>
>> Html Page that is indexed:
>>
>> > valign="top">> src="/images/nav/navStd.gif" class="vimg"
>> border="0">
>> > target="">one two three one two three one two three. four five six.
>> seven eight nine ten, uno dos tres quatro
>>
>>
>>
>>
>>
>>
>> Advance Directives and Organ Donation Page body text
>> example
> test page line 1
> test page line 2
> body test line 2 more info...
>
>>
>
>>
>
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Thu Apr 27 06:14:56 2006