Re: wildcard and stop words in properties

From: Bill Moseley <moseley(at)>
Date: Tue Sep 21 2004 - 19:23:10 GMT
On Tue, Sep 21, 2004 at 12:08:48PM -0700, Michael wrote:
> Bill Moseley wrote:
> > On Tue, Sep 21, 2004 at 11:51:17AM -0700, Michael wrote:
> > 
> >>I was wondering if wildcards (*) and stopwords were applied to properties.
> > 
> > 
> > You are confusing properties and metanames.
> I'm using the line
> 	PropertyNames category
> so doesn't this make it a property?

That does make category a property as well, but if you are using ExtractPath
and expect to search on it, you are using it as a metaname.  Metanames
are the things that are searched -- PropertyNames is used for storing
data (properties) about the file -- like file name, file size, and so
on.  (Things also called meta data, but I won't go there.)
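
A toy sketch of the distinction, in Python rather than anything from
the swish-e source.  The class and field names (ToyIndex, postings,
properties) are made up for illustration; the only point is that
metanames feed the searchable index, while properties are stored per
file and returned with results.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical model: metanames are searched, properties are stored."""

    def __init__(self):
        self.postings = defaultdict(set)  # (metaname, word) -> set of doc ids
        self.properties = {}              # doc id -> {property name: value}

    def add(self, doc_id, metanames, properties):
        # Words under a metaname go into the searchable index.
        for name, text in metanames.items():
            for word in text.lower().split():
                self.postings[(name, word)].add(doc_id)
        # Properties are only stored, never searched.
        self.properties[doc_id] = dict(properties)

    def search(self, metaname, word):
        ids = sorted(self.postings.get((metaname, word.lower()), set()))
        return [(doc_id, self.properties[doc_id]) for doc_id in ids]


idx = ToyIndex()
# Document 1 has category as both a metaname and a property.
idx.add(1, {"category": "news"}, {"category": "news", "path": "/news/a.html"})
# Document 2 has category as a property only, so it cannot be searched on.
idx.add(2, {}, {"category": "sports", "path": "/sports/b.html"})
```

Here idx.search("category", "news") finds document 1 and returns its
stored properties, while idx.search("category", "sports") finds nothing
because document 2's category was never indexed as a metaname.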

> I just saw this setting and am currently investigating it. If I use 
> WordCharacters do I also need IgnoreFirstChar, IgnoreLastChar, 
> BeginCharacters, EndCharacters?

Yes.  This is in the docs, but WordCharacters is the set of possible chars
-- BeginCharacters and EndCharacters add the restriction that the word must
begin/end with chars in their respective sets.  The Ignore chars are used to
remove chars from the start and end of a word (e.g. to remove a comma at the
end of a word but not within the word).
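
A rough Python sketch of how those settings interact.  This is not the
swish-e tokenizer: the function name, the lowercasing, and the order of
the steps (split on WordCharacters, strip the Ignore chars, then check
Begin/End) are assumptions made for illustration.

```python
import string


def tokenize(text, word_chars, begin_chars, end_chars,
             ignore_first="", ignore_last=""):
    """Hypothetical model of the five word-character settings."""
    words, current = [], []
    # Step 1: anything outside word_chars separates words.
    for ch in text.lower() + " ":
        if ch in word_chars:
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []

    kept = []
    for raw in words:
        # Step 2: Ignore chars are stripped from the ends only.
        word = raw.lstrip(ignore_first).rstrip(ignore_last)
        if not word:
            continue
        # Step 3: Begin/End chars restrict the first and last character.
        if word[0] in begin_chars and word[-1] in end_chars:
            kept.append(word)
    return kept


letters = string.ascii_lowercase
result = tokenize(
    "Rock,roll, it's 'quoted' 2004",
    word_chars=letters + string.digits + ",'",
    begin_chars=letters,
    end_chars=letters,
    ignore_first=",'",
    ignore_last=",'",
)
print(result)  # ['rock,roll', "it's", 'quoted']
```

The trailing comma on "rock,roll," is stripped but the one inside the
word stays, and "2004" is dropped because it does not begin with a
character from the begin set.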

> If I add something to WordCharacters 
> does it replace the default or add to it?

It replaces the default; there is no way to add to it.  I've wanted to
code that for a long, long time.  Just use -T index_header to see the
defaults and copy-n-paste them into your config.

> > Likely.  Maybe you don't really need a stopword list.
> No, I do need them 'cause the customer specifically asked for them.

Doesn't that drive you nuts?  It's often fun to ask "Why?" once in a
while.  It's one of my favorite things to do.  Ask them about
searching for phrases.

TODO: don't remove stopwords during indexing, then remove at search
time unless word used in phrase or with a "+" prefix.
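
A minimal Python sketch of what that search-time rule could look like.
This is only an illustration of the TODO, not existing swish-e
behavior; the stopword list, the function name, and the simplified
query syntax (double-quoted phrases, "+" to force a word) are all
assumptions.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "and", "of", "the", "to"}


def filter_query(query, stopwords=STOPWORDS):
    """Drop stopwords unless they sit in a quoted phrase or carry a '+'."""
    kept = []
    # A token is either a "quoted phrase" or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            kept.append(token)        # phrase: keep every word in it
        elif token.startswith("+") and len(token) > 1:
            kept.append(token[1:])    # forced word: keep even if a stopword
        elif token.lower() not in stopwords:
            kept.append(token)        # ordinary non-stopword
    return kept


terms = filter_query('the "to be or not to be" +the cat')
print(terms)  # ['"to be or not to be"', 'the', 'cat']
```

Because nothing is thrown away at index time, the phrase and the
"+the" term can still be matched; only the bare leading "the" is
dropped from the query.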

Bill Moseley

Received on Tue Sep 21 12:23:40 2004