Re: [swish-e] Swish-e prefs to ignore certain sentences

From: William M Conlon <bill(at)not-real.tothept.com>
Date: Thu Apr 17 2008 - 23:18:50 GMT
Use a callback function to strip the unwanted content from each document before it is indexed.
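
As a rough illustration of the filtering step such a callback would perform, here is a minimal Python sketch. It is not swish-e's API: the function name strip_phrases, the UNWANTED list, and the sample page are all hypothetical, and the real callback would be written for whatever program feeds documents to swish-e. The point is only that the phrases live in one central list, so no HTML file has to be edited.

```python
import re

def strip_phrases(html, phrases):
    """Return html with every listed phrase removed (hypothetical helper)."""
    for phrase in phrases:
        # Escape each word, then allow any run of whitespace between words,
        # so a phrase that wraps across source lines still matches.
        words = [re.escape(w) for w in phrase.split()]
        pattern = r"\s+".join(words)
        html = re.sub(pattern, "", html, flags=re.IGNORECASE)
    return html

# Hypothetical central list: edit this one place instead of every page.
UNWANTED = [
    "Welcome to ABC, voted the best widget company!",
    "Find out more about this subject.",
]

# Made-up sample page; the welcome sentence wraps across two lines.
page = (
    '<p class="subtitle">Welcome to ABC,\n'
    ' voted the best widget company!</p>'
    '<p>Widgets for sale.</p>'
)
cleaned = strip_phrases(page, UNWANTED)
```

The cleaned text, not the original page, is what would then be handed to the indexer. Adding a new sentence to suppress means adding one entry to the list and re-indexing.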

Bill


On Apr 17, 2008, at 2:46 PM, jp@bobschem.com wrote:
> Jordan's <!-- SwishCommand noindex --><!-- SwishCommand index -->
> suggestion is pretty cool but, as he said, it seems useful only in a
> case of template use or PHP includes where one file gets the tag (or a
> limited number of static html pages).  Is there any way to centralize
> this sentence restriction within the swish-e configuration files so
> hundreds of html files don't need to be edited (and dozens of work
> hours lost) each time a new phrase/sentence has to be eliminated from
> swish-e search results?
>
> JP
>
> On Apr 17, 2008, at 9:43 AM, Jordan Hayes wrote:
>
> I've got a website with roughly 500 html and php pages.  On roughly
> 400-425 of those pages, there are two "welcome" sentences at the top
> of the page.  Those sentences have been given CSS class="subtitle".
>
> One way to do this, especially if you've been using a templating system
> to generate these pages, is to take something like
>
>     <p>Welcome to ABC, voted the best widget company!<hr/>
>
> and make it look like this:
>
>     <!-- SwishCommand noindex -->
>     <p>Welcome to ABC, voted the best widget company!<hr/>
>     <!-- SwishCommand index -->
>
> /jordan
>
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
>
> I've got a website with roughly 500 html and php pages.  On roughly
> 400-425 of those pages, there are two "welcome" sentences at the top
> of the page.  Those sentences have been given CSS class="subtitle".
> Additionally, there's another random sentence on quite a number of
> pages, like "Find out more about this subject."  These sentences are
> not given any special font class.  An example of a welcome sentence
> would be "Welcome to ABC, voted the best widget company."
>
> Swish-e search results are pulling the 'welcome' sentences and placing
> them at the very beginning of each search result.  When someone
> searches for a key word, the first 15-25 words of each result are the
> welcome sentences.  The random 'find out more...' sentences also often
> show up in search results, making the swish-e results contaminated
> with useless data.
>
> I am familiar with configuring swish-e to ignore individual words
> (with the stopwords file) and to even ignore meta names like <h1> or
> <h3>, which I have done successfully.
>
> I didn't find any info on how to ignore a particular sentence or
> multi-word phrase, especially sentences that aren't using a tag like
> <h3>, but rather a CSS class or no class at all.

Received on Thu Apr 17 19:18:53 2008