Hello,
I'm using swish-e 2.4.3, and I have a duplicate URL problem when merging
two indexes. I have a "big" index and do incremental indexing of XML
files every day. Yesterday I changed one file, and after the
merge I got the same URL (swishdocpath) twice, with different dates.
Shouldn't I get only one (the newest one)?
index1:
$ swish-e -f index1 -w KEYWORD=Gregorio -x"%p -- %D\n"
# SWISH format: 2.4.3
# Search words: KEYWORD= Gregorio
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.008 seconds
http://localhost/page/bilaketadatuak?gid=332884 -- 2008-02-19 23:30:07 CET
index2:
$ swish-e -f index2 -w KEYWORD=Gregorio -x"%p -- %D\n"
# SWISH format: 2.4.3
# Search words: KEYWORD=Gregorio
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.010 seconds
http://localhost/page/bilaketadatuak?gid=332884 -- 2008-02-05 01:26:42 CET
Merged index:
$ swish-e -f index.merged -w KEYWORD=Gregorio -x"%p -- %D\n"
# SWISH format: 2.4.3
# Search words: GAKOAK=sorgortasuna
# Removed stopwords:
# Number of hits: 2
# Search time: 0.000 seconds
# Run time: 0.011 seconds
http://localhost/bilaketadatuak?gid=332884 -- 2008-02-05 01:26:42 CET
http://localhost/bilaketadatuak?gid=332884 -- 2008-02-19 23:30:07 CET
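To make clear what I expected from the merge, here is a minimal sketch in Python. It is not swish-e code, and the function name newest_per_path is made up for illustration. It only post-filters the "%p -- %D" output lines so that one entry (the newest) remains per swishdocpath, and it assumes all dates are in the same time zone.

```python
from datetime import datetime

def newest_per_path(lines):
    """Keep only the most recent hit for each swishdocpath."""
    newest = {}
    for line in lines:
        # Each line looks like "<path> -- <YYYY-MM-DD HH:MM:SS TZ>"
        path, _, stamp = line.partition(" -- ")
        # Ignore the trailing time zone abbreviation (assumed identical for all hits)
        when = datetime.strptime(stamp.strip()[:19], "%Y-%m-%d %H:%M:%S")
        if path not in newest or when > newest[path][0]:
            newest[path] = (when, line)
    return [line for _, line in newest.values()]

hits = [
    "http://localhost/bilaketadatuak?gid=332884 -- 2008-02-05 01:26:42 CET",
    "http://localhost/bilaketadatuak?gid=332884 -- 2008-02-19 23:30:07 CET",
]
for line in newest_per_path(hits):
    print(line)
```

With the two merged hits above, this leaves only the 2008-02-19 entry, which is the result I expected from the merged index itself.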
Received on Thu Feb 21 05:45:39 2008