[swish-e] PropertyNames not being indexed

From: Matt Paine <matt(at)not-real.mattsoftware.com>
Date: Fri Feb 09 2007 - 03:51:01 GMT
Hi guys. I've been using swish-e 2.4.3 on a production site for a while 
now and it has worked fantastically, but now I'm looking at using it on 
another site and I can't seem to replicate that success.

The problem I have is with the PropertyNames directive in the config file.

I'll provide a simple example and see if my assumptions are incorrect 
(probably, because we all know why you shouldn't assume).


------------------1: test.conf  --->8-------------------
IndexFile test.index
DefaultContents HTML*

PropertyNames id type
MetaNames id type
-------------------8<-------------------------

My assumptions:
- This will save the index in a file called "test.index".
- During indexing, additional information will be stored per document, 
  namely the properties id and type.
- During searching, the user may optionally limit a search to particular 
  metadata, namely the meta names id and type.




--------------2: doc.html ------->8---------------------

        <h1>hello</h1>
        <id>1</id>
        <name>hi</name>
        <type>product</type>

------------------------8<--------------------------

Okay, not the most interesting file to search, but in my thinking the 
indexer should pick out the <id> and the <type> tags, and store their 
contents as properties.
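
To make that expectation concrete, here is a minimal Python sketch (plain Python, not swish-e code, and not a claim about how swish-e parses HTML). It pulls the text out of the <id> and <type> tags of doc.html, which is the result the properties are expected to hold. The names `TagTextCollector` and `expected_properties` are made up for illustration.

```python
from html.parser import HTMLParser

# Tag names taken from the PropertyNames line in test.conf.
WANTED = {"id", "type"}

# Contents of doc.html as shown above.
DOC = """
        <h1>hello</h1>
        <id>1</id>
        <name>hi</name>
        <type>product</type>
"""


class TagTextCollector(HTMLParser):
    """Hypothetical helper: collects the text inside each wanted tag."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.current = None  # wanted tag we are currently inside, if any
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.current = tag
            self.found[tag] = ""

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.found[self.current] += data.strip()


def expected_properties(html, wanted=WANTED):
    """Return {tag name: text} for the wanted tags found in html."""
    collector = TagTextCollector(wanted)
    collector.feed(html)
    collector.close()
    return collector.found


if __name__ == "__main__":
    print(expected_properties(DOC))  # {'id': '1', 'type': 'product'}
```

In other words, the hope is that each indexed document ends up with an id property of "1" and a type property of "product" alongside the built-in ones.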



------------3: command ------->8---------------------
[matt@test swish-test]$ /opt/swish-e-2.4.5/bin/swish-e -c test.conf -i doc.html -T indexed_words -T properties
Indexing Data Source: "File-System"
Indexing "doc.html"
    Adding:[1:swishdefault(1)]   'hello'   Pos:1  Stuct:0x21 ( HEADING FILE )
    Adding:[1:swishdefault(1)]   '1'   Pos:2  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   'hi'   Pos:3  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   'product'   Pos:4  Stuct:0x1 ( FILE )
          swishdocpath: 6 (  8) S: "doc.html"
          swishdocsize: 8 (  4) N: "71"
     swishlastmodified: 9 (  4) D: "2007-02-09 13:36:44 EST"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 4 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
4 unique words indexed.
6 properties sorted.
1 file indexed.  71 total bytes.  4 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
[matt@test swish-test]$
-------------------8<---------------

Here I see the file has been indexed, and all the words have been indexed 
and associated with the swishdefault metaname (not completely sure on that 
assumption). After the file was indexed, swishdocpath, swishdocsize and 
swishlastmodified were stored as properties of that document. This is 
where I would like the id and type properties stored as well.
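
As a quick way to check this against the debug output, the following Python sketch scans the property lines printed by -T properties and reports which of the wanted properties never show up. The line pattern is an assumption based only on the three lines shown above (a name, two numbers whose meaning is not assumed here, a one-letter code and a quoted value), and `stored_properties` and `missing_properties` are hypothetical helper names.

```python
import re

# Property lines copied from the -T properties output above.
DEBUG_OUTPUT = '''
          swishdocpath: 6 (  8) S: "doc.html"
          swishdocsize: 8 (  4) N: "71"
     swishlastmodified: 9 (  4) D: "2007-02-09 13:36:44 EST"
'''

# Assumed line shape: name, number, "(number)", one-letter code, quoted value.
PROPERTY_LINE = re.compile(
    r'^\s*(\w+):\s+\d+\s+\(\s*\d+\)\s+[A-Z]:\s+"(.*)"\s*$'
)


def stored_properties(output):
    """Return {property name: value} for every property line in output."""
    props = {}
    for line in output.splitlines():
        match = PROPERTY_LINE.match(line)
        if match:
            props[match.group(1)] = match.group(2)
    return props


def missing_properties(output, wanted=("id", "type")):
    """Return the wanted property names that do not appear in output."""
    stored = stored_properties(output)
    return [name for name in wanted if name not in stored]


if __name__ == "__main__":
    print(sorted(stored_properties(DEBUG_OUTPUT)))
    print(missing_properties(DEBUG_OUTPUT))  # ['id', 'type']
```

For the run above, only the three built-in swish* properties are present, so both id and type come back as missing.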


If someone could clear up my understanding of what's happening that would 
be fantastic. I'm not sure how I got it working on the production site; 
I've even tried installing swish-e-2.4.3, with no success in reproducing 
the properties I want to store.

Thank you in advance


Matt.



Received on Thu Feb 8 22:50:33 2007