I have a list of documents to be indexed. In addition to the
document path, the list includes other attributes that should be
searchable, so they need to be included in the index, although they may
not be in the document itself.
My first thought was to use -S prog, with my external program reading
each document, generating HTML to feed swish-e, and inserting a tag
such as <meta name="language" content="english"> for each attribute
into the <head> section of the HTML.
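A minimal sketch of that first approach, assuming the documents are plain text and the attributes arrive as a hash. The helper names (escape_html, wrap_with_meta, swish_record) and the sample data are made up for illustration; only the four headers come from the swish-e documentation snippet quoted further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Wrap plain text in minimal HTML, adding one <meta> tag per attribute.
sub wrap_with_meta {
    my ($body, $attrs) = @_;
    my $meta = join '', map {
        sprintf qq{<meta name="%s" content="%s">\n},
            escape_html($_), escape_html($attrs->{$_})
    } sort keys %$attrs;
    return "<html><head>\n$meta</head><body>\n"
         . escape_html($body) . "\n</body></html>\n";
}

# Build the header block plus document that a -S prog program prints.
sub swish_record {
    my ($path, $mtime, $doc) = @_;
    my $size = length $doc;    # bytes, assuming $doc holds no wide characters
    return "Path-Name: $path\n"
         . "Content-Length: $size\n"
         . "Last-Mtime: $mtime\n"
         . "Document-Type: HTML*\n"
         . "\n"                # blank line ends the headers
         . $doc;
}

# Hypothetical sample data standing in for the real document list.
my $html = wrap_with_meta(
    "To be, or not to be",
    { language => 'english', author => 'Shakespeare' },
);
print swish_record('Example.file', time, $html);
```

For the meta tags to become searchable, the config file would presumably also need a matching line such as "MetaNames author language".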
My second thought was that swish-e needs to accept attributes that
are fed to the indexer with the document, perhaps in a *NEW*
Attribute header, a la:
------ snip from documentation -----
# Prepare the headers for swish
my $path = 'Example.file';
my $size = length $doc;
my $mtime = time;
my %attr = ( language => "english", author => "Shakespeare" );  # <----- new
# Proposed header, not something swish-e accepts today: one line per attribute
my $attr_headers = join '', map { "Attribute: $_ => $attr{$_}\n" } sort keys %attr;  # <----- new
# Output the document (to swish)
print <<EOF;
Path-Name: $path
Content-Length: $size
Last-Mtime: $mtime
${attr_headers}Document-Type: HTML*

EOF
print $doc;
----- end snip ---------------------
And my last thought was to overload the Path-Name with the attributes
and use ExtractPath to build metanames.
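A rough sketch of that last idea, assuming each attribute is folded into the path as a leading "name=value" segment. The overload_path helper and the segment layout are invented here, and the config lines in the comment are written from memory of the ExtractPath syntax, so they should be checked against the SWISH-CONFIG documentation before use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefix the real path with one "name=value" segment per attribute.
# Non-word characters in values become "_" so a simple regex can recover them.
sub overload_path {
    my ($path, $attrs) = @_;
    my @segments;
    for my $name (sort keys %$attrs) {
        (my $value = $attrs->{$name}) =~ s/\W+/_/g;
        push @segments, "$name=$value";
    }
    $path =~ s{^/+}{};    # drop leading slashes; join() adds one back
    return join '/', '', @segments, $path;
}

# Hypothetical sample data.
print overload_path(
    '/docs/hamlet.txt',
    { language => 'english', author => 'Shakespeare' },
), "\n";

# Matching config lines might look like this (unverified assumption):
#   MetaNames author language
#   ExtractPath author   regex !^.*/author=(\w+)/.*$!$1!
#   ExtractPath language regex !^.*/language=(\w+)/.*$!$1!
```

One cost of this design is that the overloaded string is what gets passed as Path-Name, so the extra segments would have to be stripped again before a result is shown or opened.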
Any other ways to skin this cat? Thanks.
Bill
William M. Conlon, P.E., Ph.D.
To the Point
2330 Bryant Street
Palo Alto, CA 94301
vox: 650.327.2175 (direct)
fax: 650.329.8335
mobile: 650.906.9929
e-mail: mailto:bill@tothept.com
web: http://www.tothept.com
Received on Fri Apr 4 18:09:58 2008