I'm indexing a mail archive (one file per message) and searching
it with swish.cgi. (I'm running 2.4.1.) It was recently pointed
out to me that "Subject & Body" searches don't find all the
messages that "Subject" searches do. That is, if the keyword
appears only in the subject field, which becomes swishtitle, a
"Subject & Body" search doesn't find the message.

I'm guessing that my metanames aren't set up quite right, but I
haven't been able to figure out what's wrong. Help?
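
To make the symptom concrete, here is a toy Python model of it. This is not swish-e code. It assumes, as the behaviour described above suggests, that words arriving through a swishtitle meta tag are filed only under swishtitle, while swishdefault holds the body text. The message ids, word lists, and the `search` helper are made up for illustration.

```python
# Toy model only: one word set per metaname, per document.
# Assumption: the subject is filed under "swishtitle" and never under "swishdefault".
index = {
    "msg1": {
        "swishtitle": {"girls", "aloud", "year", "top"},
        "swishdefault": {"tweedy", "council", "estate", "oprah"},
    },
    "msg2": {
        "swishtitle": {"tea", "time"},
        "swishdefault": {"girls", "school", "reunion"},
    },
}


def search(metaname, word):
    """Return the sorted ids of documents whose `metaname` word set contains `word`."""
    word = word.lower()
    return sorted(
        doc for doc, fields in index.items() if word in fields.get(metaname, set())
    )


subject_hits = search("swishtitle", "aloud")
subject_and_body_hits = search("swishdefault", "aloud")
print(subject_hits)           # ['msg1']
print(subject_and_body_hits)  # []
```

In this model a word that occurs only in the subject is visible to a "Subject" search and invisible to a "Subject & Body" search, which matches the report.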

swish.conf:

    IndexDir ./index_mh.pl
    SwishProgParameters msgs
    MetaNames swishtitle from
    PropertyNames from
    PropertyNamesDate date
    DefaultContents HTML
    StoreDescription HTML <body> 10000
    UndefinedMetaTags ignore

swishcgi.conf:

    use lib '/var/www/cgi-bin/modules';
    return {
        title => 'Search the Silent-Tristero Archives',
        swish_binary => '/usr/local/bin/swish-e',
        swish_index => '/WebPages/dimebank/s-t/index.swish-e',
        description_prop => 'swishdescription',
        template => {
            package => 'stTemplateDefault',
        },
        timeout => 30,
        display_props => [qw/ from date /],
        sorts => [qw/swishrank swishtitle from date/],
        secondary_sort => [qw/date desc/],
        metanames => [qw/swishdefault swishtitle from all/],
        name_labels => {
            swishrank => 'Rank',
            all => 'Entire message',
            swishtitle => 'Subject Only',
            from => "Poster's Email",
            date => 'Message Date',
            swishdefault => 'Subject & Body',
        },
        meta_groups => {
            all => [qw/swishdefault from /],
        },
        highlight => {
            package => 'PhraseHighlight',
            show_words => 10,   # Number of swish words to show around highlighted word
            max_words => 100,   # If no words are found to highlight then show this many words
            occurrences => 6,   # Limit number of occurrences of highlighted words
            highlight_on => '<font style="background:#FFFF99">',
            highlight_off => '</font>',
            meta_to_prop_map => {   # this maps search metatags to display properties
                swishdefault => [ qw/swishtitle swishdescription/ ],
                swishtitle => [ qw/swishtitle/ ],
                from => [ qw/from/ ],
                all => [ qw/swishdefault swishtitle from/ ],
                swishdocpath => [ qw/swishdocpath/ ],
            },
        },
        date_ranges => {
            property_name => 'date',   # property name to limit by
            time_periods => [
                'All',
                'Today',
                'Yesterday',
                'This Week',
                'Last Week',
                'Last 90 Days',
                'This Month',
                'Last Month',
            ],
            line_break => 0,
            default => 'All',
            date_range => 1,
        },
    };
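
As a rough sketch of what the meta_groups entry above implies, the Python below expands a group name into one sub-query per member metaname. It assumes swish.cgi joins the members with OR in the form `name=(words)`; I have not checked that against the script, and `expand_query` is a hypothetical helper, not part of swish.cgi.

```python
# Hypothetical helper, not swish.cgi code.
# Assumption: a meta group becomes "name=(words)" clauses joined with OR.
meta_groups = {"all": ["swishdefault", "from"]}  # copied from swishcgi.conf above


def expand_query(metaname, words, groups=meta_groups):
    """Build the query string for a plain metaname or for a meta group."""
    members = groups.get(metaname, [metaname])
    return " OR ".join(f"{name}=({words})" for name in members)


group_query = expand_query("all", "girls aloud")
plain_query = expand_query("swishdefault", "aloud")
print(group_query)  # swishdefault=(girls aloud) OR from=(girls aloud)
print(plain_query)  # swishdefault=(aloud)
```

Under that assumption, neither the "Subject & Body" query nor the "Entire message" group ever names swishtitle.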

A typical message after index_mh.pl gets done with it:

    Content-Length: 975
    Last-Mtime: 1073517607
    Path-Name: msgs/msgs24001-27000/24106

    <html>
    <head>
    <title>
    </title>
    <meta name="precedence" content="list">
    <meta name="swishtitle" content="Girls Aloud's year at the top">
    <meta name="to" content="Name <your@name.here>">
    <meta name="sender" content="your@name.here">
    <meta name="date" content="1066685834">
    <meta name="from" content="Another Name <my@name.here>">
    <meta name="received" content="by wolfe.bbn.com (Postfix, from userid 13274)">
    </head><body>
    <quot>
    Tweedy's life was transformed when she joined Girls Aloud, leaving behind life on a council estate.
    </quot>
    <quot>
    "Four months earlier I was sitting in a council house drinking tea and watching Oprah Winfrey on television all day."
    </quot>
    <http://news.bbc.co.uk/1/low/entertainment/tv_and_radio/3207926.stm>
    <http://news.bbc.co.uk/1/low/england/3207822.stm>
    </body>
    </html>
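
To show where the subject text sits in a document like the one above, here is a small Python sketch that pulls out the <title> text and the meta tags from an abridged copy of the sample. It uses Python's standard html.parser, not swish-e's parser, so it says nothing certain about how swish-e treats these fields. The `FieldCollector` class is my own illustration.

```python
from html.parser import HTMLParser

# Abridged copy of the sample message (some meta tags and body text left out).
SAMPLE = """<html>
<head>
<title>
</title>
<meta name="swishtitle" content="Girls Aloud's year at the top">
<meta name="date" content="1066685834">
</head><body>
Tweedy's life was transformed when she joined Girls Aloud.
</body>
</html>"""


class FieldCollector(HTMLParser):
    """Collect the <title> text and every <meta name=... content=...> pair."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs:
                self.meta[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


collector = FieldCollector()
collector.feed(SAMPLE)
title_text = collector.title.strip()
print(repr(title_text))              # ''
print(collector.meta["swishtitle"])  # Girls Aloud's year at the top
```

The <title> element is empty, and the subject appears only in the swishtitle meta tag.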
Received on Tue May 11 12:24:57 2004