Using MetaNames... which are also common words

From: Tom Malaher <tmalaher(at)not-real.netstart.com>
Date: Wed Jan 06 1999 - 15:23:29 GMT

Has anyone else run into this?

Config file says:
MetaNames subject from

HTML says:

<HTML><HEAD>
<TITLE>Re: EAI not as an Java applet -- jdunn@netscape.com Jim Dunn</TITLE>
<META NAME="date" CONTENT="Wed, 17 Dec 1997 09:31:13 -0800">
<META NAME="subject" CONTENT="Re: EAI not as an Java applet">
<META NAME="from" CONTENT="jdunn@netscape.com Jim Dunn">
</HEAD>
<BODY>
<PRE>

Date: Wed, 17 Dec 1997 09:31:13 -0800
From: jdunn@netscape.com (Jim Dunn)
Subject: Re: EAI not as an Java applet

...

That is, I'm trying to index a mailing list. I've minimally formatted the
messages so they are HTML documents with some meta tags to reflect the
date, subject and from line, and I've indexed the subject and from as
specific meta tags so I can search the from and subject lines
specifically.
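
The wrapping step described above can be sketched roughly as follows.
This is a minimal illustration, not the script actually used: the
function name message_to_html is hypothetical, and it assumes plain
single-part messages whose From line carries the display name in
parentheses, as in the sample.

```python
from email import message_from_string
from html import escape


def message_to_html(raw):
    """Wrap one raw mail message in minimal HTML with META tags (a sketch)."""
    msg = message_from_string(raw)
    date = msg.get("Date", "")
    subject = msg.get("Subject", "")
    # Drop the parentheses around the display name, as in the sample above.
    sender = msg.get("From", "").replace("(", "").replace(")", "")
    # Repeat the three header lines inside the body, as the sample does.
    headers = "\n".join(
        f"{name}: {msg.get(name, '')}" for name in ("Date", "From", "Subject")
    )
    # Assumption: single-part text messages only; multipart bodies are skipped.
    body = "" if msg.is_multipart() else msg.get_payload()
    return (
        "<HTML><HEAD>\n"
        f"<TITLE>{escape(subject)} -- {escape(sender)}</TITLE>\n"
        f'<META NAME="date" CONTENT="{escape(date)}">\n'
        f'<META NAME="subject" CONTENT="{escape(subject)}">\n'
        f'<META NAME="from" CONTENT="{escape(sender)}">\n'
        "</HEAD>\n<BODY>\n<PRE>\n\n"
        f"{escape(headers)}\n\n{escape(body)}\n"
        "</PRE>\n</BODY></HTML>\n"
    )
```

Escaping the header values keeps a stray quote or angle bracket in a
subject line from breaking the META tag it is copied into.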

However, when I try to search:

  % swish-e -f index.swish -w from=fred
  # SWISH format 1.3
  #...
  # Search words: = fred
  err: no results
  .

It looks to me like the "remove common words" code ran before the "look
for METANAME=VALUE" code, leaving me with a bogus search.  Since "From"
and "Subject" (by definition) appear in all the messages they get added
to the common word list.
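
The suspected ordering can be modelled with a toy sketch. This is not
swish-e's actual code; the function names and the contents of
COMMON_WORDS are assumptions made up for illustration. It only shows
how dropping common words before looking for NAME=VALUE would turn
"from=fred" into "= fred".

```python
# Assumed for illustration: words that occur in every indexed message.
COMMON_WORDS = {"from", "subject"}
# Taken from the config line "MetaNames subject from".
META_NAMES = {"from", "subject"}


def tokenize(query):
    """Split a query into words, treating '=' as its own token."""
    return query.replace("=", " = ").split()


def find_meta(tokens):
    """Return (metaname, value) if tokens read NAME = VALUE, else None."""
    if len(tokens) == 3 and tokens[1] == "=" and tokens[0].lower() in META_NAMES:
        return tokens[0].lower(), tokens[2]
    return None


def stopwords_first(query):
    """Suspected order: remove common words, then look for NAME = VALUE."""
    tokens = [t for t in tokenize(query) if t.lower() not in COMMON_WORDS]
    return tokens, find_meta(tokens)


def metanames_first(query):
    """Alternative order: look for NAME = VALUE before any word removal."""
    tokens = tokenize(query)
    return tokens, find_meta(tokens)
```

In this model stopwords_first("from=fred") is left with only the
tokens "=" and "fred" and finds no metaname, while
metanames_first("from=fred") still recognizes the pair.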

On the other hand:

  % swish-e -f index.swish -w bogus=fred
  # SWISH format 1.3
  #...
  # Search words: bogus = fred
  err: The metaName bogus doesn't exist in  user configfile

So the METANAME=VALUE parsing itself seems to work.

Do I have to invent bogus names for the keywords so that I can search
them (e.g. msgfrom and msgsubject)?  For the moment I guess I can use
'swish-e -t t -w word' to search the TITLE tag, which gets the effect
of (from=word or subject=word).

If swish-e is doing what I think it is doing, then I'd consider this a
bug report.

Tom - tmalaher@netstart.com
Received on Wed Jan 6 07:51:09 1999