Re: meta names not included in swishdefault?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Feb 26 2002 - 19:08:03 GMT

On Sunday 24 February 2002 06:03 pm, Fred Toth wrote:

[I seem to have failed to queue this message a few days ago -- sorry]


> It appears that when meta names are defined ("author" for example), the
> data accumulated for the "author" meta name is no longer included in the
> default meta name "swishdefault". Is this correct?

Yes.  "swishdefault" is just another metaname, the one that's used when no 
other metaname applies.


> Meaning, if I have <author>smith</author> somewhere in my input, and I
> search:
>
> 	swish-e -w author=smith
>
> I get a hit. This is good. However, if I search:
>
> 	swish-e -w smith
>
> I don't get a hit, since, presumably, "smith" does not exist in
> "swishdefault".
>
> So, to my questions: Is there any way to change this? I'd like
> "swishdefault" to be the "full text" of the input, including any and all
> meta names. Is this possible? If so, how do I express that in the
> configuration?

So you want -w author=smith to only search the author field, but -w smith 
(which is the same as swishdefault=smith) to search all fields?

    -w smith or author=smith

is the current way to do that.
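
With more than a couple of metanames that query gets long to type, so it can 
be assembled by a small wrapper.  A minimal sketch, assuming a single-word 
search term; build_query is a made-up helper name, not part of swish-e, and 
the metaname list is only an example:

```shell
#!/bin/sh
# Hypothetical helper (not part of swish-e): expands one search word into
#   word or meta1=word or meta2=word ...
# Assumes a single-word term with no quoting or operators in it.
build_query() {
    word=$1
    shift
    query=$word                      # bare word searches swishdefault
    for meta in "$@"; do
        query="$query or $meta=$word"
    done
    printf '%s\n' "$query"
}

# Print the expanded query for three example metanames.
build_query smith author abstract keywords

# Intended use (commented out so the sketch runs without an index):
# swish-e -w "$(build_query smith author abstract keywords)"
```

The list of metanames stays in one place in your script, which is a rough 
stand-in for the "ALL" list until the search syntax grows something similar.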

There has been discussion about extending the search syntax to do something 
like

    -w swishdefault,author,subject=smith

to search all those listed fields.  I've also thought about something like

    -w *=smith

to say search all metaIDs.  But for my own use I think I'd want more control -- 
that is, I'd want to specify which metaIDs are part of an "ALL" search.
That means either defining which metaIDs should also be indexed as 
"swishdefault", or allowing multiple-metaID searches as shown above.


> However, this could get very cumbersome if there are a lot of meta names:
>
> 	swish-e -w 'smith or author=smith or abstract=smith or keywords=smith'
> (etc. etc.)

One work around is to use nested metanames in your source documents:

<html>
<head>
<title>Title</title>
<group>
   <meta name="author" content="bla">
   <meta name="abstract" content="foo">
</group>

metanames author abstract group

And then use libxml2 as your HTML parser.

Now, it's not always possible to change the source.  There's a way to alias a 
collection of metanames onto one metaname.  You can say that author, abstract, 
and keywords are all aliases for the "group" metaname, but then you can't 
search for just "author".  Being able to do both might be a nice feature.
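
As a rough sketch of the aliasing, assuming the MetaNameAlias directive is 
available in your version (check the configuration docs); the file name 
alias.conf is just an example:

```shell
#!/bin/sh
# Write an example config fragment.  "group" is the only real metaname;
# author, abstract and keywords are folded into it as aliases.
# Assumption: MetaNameAlias takes the real metaname followed by its aliases.
cat > alias.conf <<'EOF'
MetaNames group
MetaNameAlias group author abstract keywords
EOF

# Intended use (commented out so the sketch runs without swish-e or input):
# swish-e -c alias.conf -i ./docs
# swish-e -w group=smith
```

With this setup a search on group=smith covers all three fields, but there is 
no separate "author" metaname left to search on its own.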

> Another thing I thought of was to have 2 indexes. One has all my meta names
> defined,
> and the other has none. A full text query then queries the index with no
> meta names.
> Any other field specific query goes against the meta name index.

That would be an easy workaround.
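
A small wrapper could route each query to the right index.  This is only a 
sketch: pick_index and the two index file names are made up for illustration, 
and it assumes any query containing "=" is a field-specific one.  It also 
assumes the no-metaname index really does place the text of those tags in 
"swishdefault", which is worth checking against your config.

```shell
#!/bin/sh
# Hypothetical router: field-specific queries (containing "=") go to the
# index built with metanames, everything else to the index built without.
# Both index file names are assumptions for this example.
pick_index() {
    case $1 in
        *=*) echo meta.index ;;
        *)   echo fulltext.index ;;
    esac
}

# Show which index each kind of query would use.
pick_index 'author=smith'
pick_index 'smith'

# Intended use (commented out so the sketch runs without swish-e):
# query='author=smith'
# swish-e -f "$(pick_index "$query")" -w "$query"
```

The cost is indexing everything twice and keeping the two indexes in sync.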
Received on Tue Feb 26 19:08:36 2002