I was running swish using the http method (I bet Bill just cringed) and it
stopped on a certain file. When I ran it with -T INDEX_FILES I got:
Indexing Data Source: "File-System"
Indexing "./index.swish-e.temp"
Warning: Substituted possible embedded null character(s) in file
'./index.swish-e.temp'
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 2 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: Complete
Writing word hash: Complete
Writing word data: Complete
2 unique words indexed.
4 properties sorted.
1 file indexed. 393216 total bytes. 3 total words.
Elapsed time: 00:00:09 CPU time: 00:00:09
Indexing done!
swish.conf:
StoreDescription HTML2 <body> 100000
DefaultContents HTML2
Delay 0
MaxDepth 2
command line:
./swish-e -S http -c ./swish.conf -i http://ublin.lib.buffalo.edu/webcat/libcat/E/E/A/9/ -v9
Where it stops is:
yada...yada...(editorial comment to replace lines and lines of typing)
retrieving http://ublin.lib.buffalo.edu/webcat/bibcat/E/E/A/9/284.html
(1)...
- Using HTML2 parser - (95 words)
retrieving http://ublin.lib.buffalo.edu/webcat/bibcat/E/E/A/9/285.html
(1)...
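For anyone who wants to pin down the stalled page from a saved -v9 log rather than scrolling through it, here is a rough Python sketch. It is not part of swish-e; it only assumes the "retrieving ..." and "- Using ... parser" line shapes shown above, and the function name is made up for illustration.

```python
import re

# Matches the "retrieving <url>" lines seen in the -v9 output above.
RETRIEVING = re.compile(r"^retrieving\s+(\S+)")
# Matches lines like "- Using HTML2 parser - (95 words)" printed after a page is parsed.
PARSED = re.compile(r"^-\s+Using\s+\S+\s+parser")

def last_unfinished_url(log_text):
    """Return the final 'retrieving' URL if no parser line follows it, else None."""
    pending = None
    for line in log_text.splitlines():
        line = line.strip()
        match = RETRIEVING.match(line)
        if match:
            # A new fetch started; remember it until a parser line confirms it.
            pending = match.group(1)
        elif PARSED.match(line):
            # The most recent fetch was parsed, so it is not the stalled one.
            pending = None
    return pending
```

Feeding it the tail of the log above should point at 285.html, which can then be fetched by hand to see whether the server or the parser is the one hanging.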
My version:
# ./swish-e -V
SWISH-E 2.1-dev-25
Why would the first line of output from the -T switch say Indexing Data
Source: "File-System"?
Any ideas?
Thanx,
Rich
Received on Tue Jan 29 14:13:31 2002