Try this in your config file:
IndexContents HTML2 .html
IndexContents HTML2 .htm
StoreDescription HTML2 <BODY> 100000
# To index PDF files as well, try something like this...
FileFilter .pdf pdftotext "'%p' -"
IndexContents TXT .pdf
StoreDescription TXT 250000
This will store the BODY tag text of all files that end in .htm and .html,
using the HTML2 parser.
If you're running a slower machine and performance is an issue, lower the
100,000 number to something smaller. If most of your HTML files are small,
this number can be lower and you still won't lose any content when the
descriptions are stored.
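To make the effect of that number concrete, here is a rough Python sketch of what a size limit on a stored description does. It is only an illustration, not how Swish-e works internally. The helper name store_description, the BodyText class, and the sample page are all made up for the example, and it assumes the limit counts characters.

```python
from html.parser import HTMLParser


class BodyText(HTMLParser):
    """Collect the text that appears between <body> and </body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BODY> matches too.
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)


def store_description(html, limit):
    """Hypothetical helper: body text, whitespace collapsed, cut at `limit` characters."""
    parser = BodyText()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())
    return text[:limit]


# Made-up sample page, just for the demonstration.
page = (
    "<html><head><title>Demo</title></head>"
    "<BODY><p>Solar  panels for\nhome use.</p></BODY></html>"
)

print(store_description(page, 100000))  # limit is larger than the body, nothing is lost
print(store_description(page, 12))      # limit is smaller than the body, text is cut short
```

With a small file, a large limit and a moderate one give the same stored text, which is why the number can usually be lowered safely.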
The command you listed looks like something you'd use to create the index.
As long as your config file is right, you don't need to do anything else to
store your descriptions. You just need the right switches when doing your
search.
Try doing a search like this once you've created the new index file:
cgi-bin/swish-e -w <your search string> -f index.swish -x '%t - %p\n%d\nlast updated %D\trank %r\tsize %l bytes\n\n'
This will actually return a lot more info than just the description. The
%d part shows the description.
Take a look at
and scroll down to the
section titled "-x formatstring (extended output format)".
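If you end up calling the search from a script rather than typing it by hand, the same command can be put together as an argument list, which avoids the shell quoting around the format string. This is only a sketch in Python: the function names build_search_command and run_search are invented for the example, the binary and index paths are just the ones from the command above, and run_search assumes swish-e is installed at that path.

```python
import subprocess

# Same format string as in the search command above. The raw string keeps
# the backslashes literal so swish-e itself receives \n and \t to expand.
FORMAT = r"%t - %p\n%d\nlast updated %D\trank %r\tsize %l bytes\n\n"


def build_search_command(query, index="index.swish", binary="cgi-bin/swish-e"):
    """Return the argument list for a search (hypothetical helper)."""
    return [binary, "-w", query, "-f", index, "-x", FORMAT]


def run_search(query):
    """Run the search and return the output as text (needs swish-e installed)."""
    result = subprocess.run(
        build_search_command(query),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only builds and prints the argument list; nothing is executed here.
print(build_search_command("solar"))
```

Because the arguments are passed as a list, no shell is involved and the format string needs no extra quoting.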
From: <dena.wolf@orcinc.
To: Multiple recipients of list <firstname.lastname@example.org>
Subject: [SWISH-E] how to get a description
Two questions. I've been reading the past archives that deal with this and
understanding a little, but I don't know if I am doing this at all right.
My indexing is working and I am getting results now. What I am trying to do
now is get a chunk of the document body onto the results page, say 40 words
of the body that include the search word.
In my config file:
#MetaNames keywords description
ReplaceRules replace "/export/home/orcsolar/html/" "http://www.orcinc.com/"
ReplaceRules remove "html/"
IgnoreLimit 50 1000
FileRules pathname contains members
IndexOnly .html .doc .xls .htm .ppt .txt .pdf
IndexContents HTML* .html .htm
StoreDescription HTML <body> 40
NoContents .gif .xbm .au .mov .mpg .ps
I added the IndexContents line & the StoreDescription line. I get a bad
directive error for both of those 2 new lines. Why? I checked that there
Also, in my index command line, how do I add something to make the
description run (assuming I get the indexing to work)?
Right now my line says:
cgi-bin/swish-e -c cgi-bin/orcsolar/config -i html -v -f index.swish
Can I put -p swishdescription somewhere in that line? If so where?
I'm sorry I am having so much trouble trying to get all this to work.
Thanks for your help.
Organization Resources Counselors, Inc.
Received on Tue Nov 19 20:04:38 2002