How to configure swish-e to index content/meta in OO.o?

From: Philip Young <philipishere(at)>
Date: Sun May 29 2005 - 14:10:36 GMT

I'm having a lot of frustration trying to get meta.xml (the document
properties) and content.xml indexed.  I would like the content to be
indexed into the "swishdefault" category (normal indexed content) and
the document properties indexed via "UndefinedMetaTags auto".

So I'm just looking for a quick and dirty way to accomplish this task.
Originally I thought of concatenating the two .xml files to be indexed,
like so:

FileFilterMatch "/usr/bin/unzip" "-p \"%p\" meta.xml content.xml"

This line is accepted and the index builds with no syntax errors.  But
the problem is that the output does not seem to be indexed properly.
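
One possible cause, stated as an assumption rather than a confirmed
diagnosis: "unzip -p" writes meta.xml and content.xml back to back, so
the filter output carries two XML declarations and two root elements,
which is not a single well-formed XML document.  A minimal sketch of a
replacement filter follows.  It strips each member's prolog and wraps
both in one root element.  The function names and the root element name
"oodoc" are hypothetical, and the regex only handles a simple one-line
DOCTYPE with no internal subset.

```python
import re
import sys
import zipfile

# Matches an optional leading XML declaration and an optional simple DOCTYPE.
_PROLOG = re.compile(r"^\s*(<\?xml[^>]*\?>)?\s*(<!DOCTYPE[^>]*>)?\s*", re.IGNORECASE)


def strip_prolog(xml_text):
    """Remove the XML declaration and DOCTYPE from the start of a document."""
    return _PROLOG.sub("", xml_text, count=1)


def merge_members(archive, members=("meta.xml", "content.xml"), root="oodoc"):
    """Return one XML document that wraps the named archive members.

    `archive` may be a path or a file-like object. Members missing from
    the archive are skipped. The root name "oodoc" is an arbitrary choice.
    """
    parts = []
    with zipfile.ZipFile(archive) as zf:
        present = set(zf.namelist())
        for member in members:
            if member in present:
                parts.append(strip_prolog(zf.read(member).decode("utf-8")))
    body = "\n".join(parts)
    return '<?xml version="1.0" encoding="UTF-8"?>\n<%s>\n%s\n</%s>\n' % (root, body, root)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: script.py /path/to/document.sxw
    sys.stdout.write(merge_members(sys.argv[1]))
```

If this were saved as an executable script, it could presumably stand
in for /usr/bin/unzip in a single FileFilterMatch line, with "%p" as its
only argument and the same /\.(sxw|sxc|sxi|odt)$/i pattern.  That has
not been tested against swish-e.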

Does anyone have any ideas on how to get both meta.xml and content.xml indexed?

My swish.conf file is located below.


Philip Young

-- swish.conf --
IndexDir	/var/www/test
IndexFile	/var/www/test/index.swish-e
IndexName	Documents
IndexOnly	.xml .htm .html .txt .doc .rtf .sxw .sxc .sxi .odt 
DefaultContents	TXT
SwishProgParameters -S fs

ReplaceRules replace /var/www/test /test
ExtractPath subject regex !^/test/([^/]+)/.*$!$1!

# Allow extra searching by title, path
metanames swishtitle swishdocpath
UndefinedMetaTags auto

IndexContents TXT* .pdf
FileFilter .pdf "/usr/bin/pdftotext" "'%p' -"
#SWISH::Filter .pdf "/usr/bin/pdftotext" "'%p' -"

IndexContents TXT* .doc
FileFilter .doc "/usr/bin/catdoc" "-s8859-1 -d8859-1 '%p'"
#SWISH::Filter .doc "/usr/bin/catdoc" "-s8859-1 -d8859-1 '%p'"

IndexContents TXT* .rtf
FileFilter .rtf "/usr/bin/catdoc" "'%p'"
#SWISH::Filter .rtf "/usr/bin/catdoc" "'%p'"

FileFilterMatch "/usr/bin/unzip" "-p \"%p\" meta.xml" /\.(sxw|sxc|sxi|odt)$/i
IndexContents XML* .sxw .sxc .sxi .odt
StoreDescription XML* <text:p>

FileFilterMatch "/usr/bin/unzip" "-p \"%p\" content.xml" /\.(sxw|sxc|sxi|odt)$/i
IndexContents XML* .sxw .sxc .sxi .odt
StoreDescription XML* <text:p>
Received on Sun May 29 07:10:50 2005