[swish-e] Swish-e prefs to ignore certain sentences

From: <jp(at)not-real.bobschem.com>
Date: Thu Apr 17 2008 - 16:17:26 GMT

Good morning swish-e folks!

Is there an easy way to configure swish-e to ignore a few entire
sentences common to 400+ html/php pages?

Here's what's going on:

I've got a website with roughly 500 html and php pages.  On roughly
400-425 of those pages, there are two "welcome" sentences at the top
of the page.  Those sentences have been given CSS class="subtitle".
Additionally, there's another random sentence on quite a number of
pages, like "Find out more about this subject."  These sentences are
not given any special font class.  An example of a welcome sentence
would be "Welcome to ABC, voted the best widget company."

Swish-e search results are pulling the 'welcome' sentences and placing
them at the very beginning of each search result.  When someone
searches for a keyword, the first 15-25 words of each result are the
welcome sentences.  The random 'find out more...' sentences also often
show up in search results, cluttering the swish-e results with
useless text.

I am familiar with configuring swish-e to ignore individual words
(with the stopwords file) and to even ignore meta names like <h1> or
<h3>, which I have done successfully.

I didn't find any info on how to ignore a particular sentence or
multi-word phrase, especially sentences that aren't using a tag like
<h3>, but rather a CSS class or no class at all.

Any help would be greatly appreciated.

JP

P.S.  Is there a way to instruct swish-e to ignore entire divs within  
an html/php page?
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Thu Apr 17 12:12:41 2008