Re: Stop words and meta tags NOTE ADDED

From: Frank Heasley <DrHeasley(at)not-real.chemistry.com>
Date: Sat Sep 16 2000 - 15:42:42 GMT
You are correct: if you list the stop words explicitly with IgnoreWords
instead of using SwishDefault, leaving out the ones you want to index
will allow you to index them.

However, it looks like the IgnoreWords directive has a size limit.  On my
system, if you try to use the original list from swish.h, it gives you a
"bad directive" error until you delete the words beyond "why".
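
A possible workaround, assuming the limit applies to a single directive
line rather than to the total number of stop words (not confirmed here):
the manual text quoted below says the words "may span multiple
directives", so the list could be split up. A sketch only; the words shown
are a short sample, not the full list from swish.h, and "yes" and "no" are
left out so that they get indexed:

```
# Sketch: explicit stop word list split across several IgnoreWords lines.
# Sample words only; extend with whatever you want to exclude.
IgnoreWords a about above after again all an and any are
IgnoreWords the then there these they this those to
IgnoreWords was we were what when where which who why
```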

Frank

At 07:21 AM 9/16/00, Bill Moseley wrote:
>At 03:20 AM 09/16/00 -0700, you wrote:
> >Agreed, one could index ALL stop words, but that would be extremely
> >inefficient, right?
>
>Stop words are words that aren't indexed.  So if you index stop words such
>as "yes" and "no" then they are not stop words anymore.
>
> >[Actually, commenting out IgnoreWords wouldn't work either.  It would just
> >cause Swish to spend an inordinate amount of time calculating the frequency
> >of all of the potential stop words, as defined in config.h, and then
> >deleting them from the index, ending up in essentially the same
> >place.]
>
>Don't use IgnoreLimit
>http://sunsite.berkeley.edu/SWISH-E/archive/1848.html
>
>
> >It would be better to allow a few specified terms, like yes/no, etc. as
> >"non" stop words.  Then you could search within meta or xml tags like a
> >real database.
>
>Sounds like you are confused about stop words.  There are two kinds of
>words: words in the index, and words that aren't in the index.  You can't
>say "yes" and "no" are stop words, but include them in the index.  They are
>no longer stop words if you do that.
>
>Again, use IgnoreWords and specify your stop words:
>IgnoreWords a an the and
>
>Now you can search for "yes" and "no", but not "a" "an" "the" and "and".
>
> >Suggested variable names: "SpecialWords" "Override_Stop_Words"
> >"StopWordsNOT" ???
> >
> >At 02:34 PM 9/15/00, you wrote:
> >>At 02:12 PM 09/15/00 -0700, Frank Heasley wrote:
> >> >Although stop words are important, there is no provision (that I'm aware
> >> >of) that can override them.
> >>
> >>http://sunsite.berkeley.edu/SWISH-E/Manual/config.user.html
> >>
> >>#IgnoreWords SwishDefault
> >># The IgnoreWords option allows you to specify words to ignore.
> >># Comment out for no stopwords; the word "SwishDefault" will
> >># include a list of default stopwords. Words should be separated by spaces
> >># and may span multiple directives.
> >>
> >>
> >>
> >>Bill Moseley
> >>mailto:moseley@hank.org
> >
> >
> >
>
>Bill Moseley
>mailto:moseley@hank.org
Received on Sat Sep 16 15:46:00 2000