I'm using an incremental-mode build of swish-e to index data with -S prog:
swish-e -S prog -i stdin to create the initial index, and
swish-e -u -S prog -i stdin to add to it.
The Path-Name is a GUID (not a file name), and the XML data is passed on the
swish-e stdin pipe. I also provide the Update-Mode header, set to Index and
Update respectively. This works fine.
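
For reference, here is a minimal sketch in Python of how one record on the stdin pipe might be built. It is not my actual script: the function name format_prog_record and the sample GUID in the comment are made up for illustration. It assumes the usual -S prog layout (header lines, a blank line, then exactly Content-Length bytes of content) and adds the Update-Mode header described above.

```python
def format_prog_record(path_name: str, body: str, update_mode: str) -> bytes:
    """Build one -S prog record: headers, a blank line, then the content.

    path_name   -- value for the Path-Name header (a GUID in my case)
    body        -- the XML document as text
    update_mode -- value for the Update-Mode header: Index, Update or Remove
    """
    # Content-Length must count bytes, not characters, so encode first.
    payload = body.encode("utf-8")
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(payload)}\n"
        f"Update-Mode: {update_mode}\n"
        "\n"  # a blank line ends the header block
    )
    return headers.encode("utf-8") + payload


# Hypothetical usage: write the record to stdout and pipe it into swish-e.
# import sys
# sys.stdout.buffer.write(
#     format_prog_record("hypothetical-guid-0001", "<doc>text</doc>", "Index"))
```

The same function would produce the record for any of the three modes; only the update_mode argument and the swish-e switch on the receiving end change.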
I'd then expect that
swish-e -r -S prog -i stdin
with Update-Mode: Remove and Path-Name set to the GUID would remove that
data from the index.
It doesn't; swish still reports the same number of indexed files, and
searches for strings appearing only in the supposedly removed data still
come back as matching its GUID. I don't get any swish-e errors (using -v3)
running the removal script.
When I dump my XML data to files and use the file name for Path-Name
instead of the GUID, both when I create and update the index and for the
removal, the file is removed correctly.
Does the removal functionality only work when files are indexed, as opposed
to data passed to swish-e via a pipe? So you can't remove files from an
index if you use -S prog stdin?
Another niggle: even in the case where the removal is working (i.e. when I'm
indexing files), I see that the words appearing only in the removed file
still show up when doing
Searching for those words doesn't bring back any results, so the file was
removed. But I would have thought that the indexed words would also be
removed if there are no files referring to them? Not that this makes a huge
difference; I'm just worried about the index files growing too large over time.
Received on Tue Oct 30 08:59:41 2007