Re: [swish-e] swish.conf questions

From: Kerry Kobashi <kkobashi(at)not-real.comcast.net>
Date: Thu Aug 09 2007 - 17:52:14 GMT

Maybe I should back up and explain what I'm trying to accomplish.

I have a file hierarchy that is deeply nested. Here's a snippet:

/foobar
---index.xml
---1.xml
---2.xml
---otherfile.htm
---otherfile2.htm
---/foobarsubcategory1
------index.xml
------1.xml
------2.xml
------otherfile.htm
---/foobarsubcategory2
------index.xml
------otherfile.htm

I want swish-e to index only the index.xml files, because they contain
the meta-information I need in order to search those XML documents.
Each index.xml contains the following:

<?xml version="1.0" encoding="UTF-8"?>
<index>
<metaheader>
    <title>The title</title>
    <description>The description</description>
    <keywords>
       <keyword>kw1</keyword>
       <keyword>kw2</keyword>
    </keywords>
</metaheader>
<section>
    <title>The title</title>
    <description>The description</description>
    <body>
       Lorem ipsum blah blah <keyword>the keyword</keyword> more stuff follows.
    </body>
</section>
.
.
</index>

I want swish-e to index only the contents of the metaheader element: the
title, the description, and the keywords. I do not want it to index
anything else, including the title, description, and keyword tags that
appear in other elements such as section.
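
To make that requirement concrete, here is a minimal Python sketch (not swish-e itself, just an illustration of the selection I am after). The function name extract_metaheader and the sample values are made up for this example; the section title is changed from my real data only so the difference is visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the index.xml layout shown above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<index>
<metaheader>
    <title>The title</title>
    <description>The description</description>
    <keywords>
       <keyword>kw1</keyword>
       <keyword>kw2</keyword>
    </keywords>
</metaheader>
<section>
    <title>Section title</title>
    <description>Section description</description>
    <body>
       Lorem ipsum blah blah <keyword>the keyword</keyword> more stuff follows.
    </body>
</section>
</index>
"""


def extract_metaheader(xml_bytes):
    """Return only the fields found directly under <metaheader>.

    Anything under <section> (or any other element) is ignored,
    even when the tag names are the same.
    """
    root = ET.fromstring(xml_bytes)
    header = root.find("metaheader")  # direct child of <index> only
    if header is None:
        return None
    return {
        "title": (header.findtext("title") or "").strip(),
        "description": (header.findtext("description") or "").strip(),
        "keywords": [
            kw.text.strip()
            for kw in header.findall("keywords/keyword")
            if kw.text
        ],
    }


if __name__ == "__main__":
    print(extract_metaheader(SAMPLE))
```

The point is that the lookup is scoped to metaheader, so identically named tags inside section never reach the result.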

Here is what my swish.conf file looks like at the moment:

# Swish index is stored in same directory as swish.conf
IndexFile index.swish-e

# Walk the foobar directory and all subdirectories
IndexDir foobar

# Index only XML files
IndexOnly .xml

# Index only index.xml
FileMatch filename is index\.xml

# Store and index only the metaheader information
MetaNames title, description, keyword
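
For comparison, the file selection I intend this configuration to express can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how swish-e works internally; find_index_files is a hypothetical helper name.

```python
import os


def find_index_files(root_dir, wanted="index.xml"):
    """Walk root_dir and return paths of files named exactly `wanted`.

    Files such as 1.xml, 2.xml, or otherfile.htm are skipped.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # visit subdirectories in a stable order
        if wanted in filenames:
            matches.append(os.path.join(dirpath, wanted))
    return matches


if __name__ == "__main__":
    for path in find_index_files("foobar"):
        print(path)
```

With the hierarchy shown earlier, this should pick up one index.xml per directory and nothing else, which is the behavior I expected from the IndexOnly and FileMatch lines.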

1) I am developing this with PHP 5, XSL, and DOM. Can swish-e do the
job, or is an RDBMS + PHP solution more suitable?
2) If swish-e can do the job:
    a) Why is it indexing other XML files as well as index.xml?
    b) How do I keep swish-e from indexing the title, description, and
keyword tags inside section (and, ideally, anywhere else outside
metaheader)?
    c) I am using the PHP PECL swish extension. How do I get at the
title, description, and keywords after I query swish?

Thanks for your input.

--------------------------
Kerry Kobashi
Received on Thu Aug 9 13:51:20 2007