Re: Re: Re: Re: Swish-E with incremental index building

From: <tmuetze(at)not-real.alanti.net>
Date: Sun Dec 05 2004 - 05:06:09 GMT
Hi Peter,

Comments below. The indexing was run on a small test directory.

Peter Karman <peter@peknet.com> wrote on 05.12.2004, 03:40:19:
> to create the initial index:
> swish-e -f index.idx -c swish-e.config

Done.

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 252,288 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
252,288 unique words indexed.
4 properties sorted.                                              
61 files indexed.  16,510,271 total bytes.  1,295,175 total words.
Elapsed time: 00:02:48 CPU time: 00:02:09
Indexing done!

> to update files that have changed:
> 
> swish-e -u -f index.idx -c swish-e.config

This just doesn't work. Swish-e seems to do the same as the command above: it reindexes all 61 files.

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 252,288 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
252,288 unique words indexed.
4 properties sorted.                                              
61 files indexed.  16,510,271 total bytes.  1,295,175 total words.
Elapsed time: 00:02:59 CPU time: 00:02:09
Indexing done!

> to remove specific files:
> 
> swish-e -r -i filetoremove -f index.idx -c swish-e.config

This also doesn't seem to work. Or am I doing something wrong?

**** Search for the word "demohu"; it is found:
swish-e -f index.idx -w"demohu*"
# SWISH format: 2.5.2
# Search words: demohu*
# Removed stopwords: 
# Number of hits: 1
# Search time: 0.004 seconds
# Run time: 0.009 seconds
1000 /attachments/SM/XM08.doc "XM08.doc" 19456

**** Try to remove that file from the index:
swish-e -r -i /attachments/SM/XM08.doc -f index.idx -c swish-e.config 
Indexing Data Source: "File-System"
Indexing "/attachments/SM/XM08.doc"

Checking file "/attachments/SM/XM08.doc"...
  XM08.doc - Using TXT parser -  (9 words)

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 9 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
9 unique words indexed.
4 properties sorted.                                              
1 file indexed.  74 total bytes.  9 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!

**** Search again; it's still found:
swish-e -f index.idx -w"demohu*"
# SWISH format: 2.5.2
# Search words: demohu*
# Removed stopwords: 
# Number of hits: 1
# Search time: 0.001 seconds
# Run time: 0.005 seconds
1000 /attachments/SM/XM08.doc "XM08.doc" 19456
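
The before/after check above could be scripted roughly as follows. This is only a sketch: it assumes the "# Number of hits: N" header line shown in the search output above, and the helper names hit_count and removal_verdict are made up for illustration. An empty count is treated as zero, since the transcripts above do not show what the header looks like when nothing matches.

```shell
#!/bin/sh
# Extract the hit count from swish-e search output read on stdin.
# Assumes the "# Number of hits: N" header line seen in the transcripts.
hit_count() {
    sed -n 's/^# Number of hits: \([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Hypothetical helper: compare hit counts before and after a removal.
# $1 = hits before, $2 = hits after; an empty value counts as 0.
removal_verdict() {
    before=${1:-0}
    after=${2:-0}
    if [ "$after" -lt "$before" ]; then
        echo "removed"
    else
        echo "still indexed"
    fi
}

# Intended use, with the same commands as in the transcript:
#   before=$(swish-e -f index.idx -w"demohu*" | hit_count)
#   swish-e -r -i /attachments/SM/XM08.doc -f index.idx -c swish-e.config
#   after=$(swish-e -f index.idx -w"demohu*" | hit_count)
#   removal_verdict "$before" "$after"
```

With the output shown above, both searches report 1 hit, so this would print "still indexed".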

Regards,
Tilo
Received on Sat Dec 4 21:07:08 2004