For a contemplated application of open-source indexing, it is important that
the index not be reverse-engineerable; that is, it must not be possible to
re-create the intellectual content of the original document from the index
alone.
My understanding is that Swish-e, as currently configured, does not meet
this criterion. In other words, if one starts with just the Swish-e index,
it is feasible to re-create the intellectual content of the original
document. I am therefore interested in discussing whether Swish-e and/or
its index could be modified to prevent this. I realize that such a
modification might, to a degree, compromise the search capabilities
against the index.
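
To make the concern concrete, here is a minimal sketch of how a word
sequence can be rebuilt from an index that records word positions. It is a
toy model only: the dictionary layout and the function names
build_positional_index and reconstruct are assumptions made for
illustration, not Swish-e's actual index format or API.

```python
from collections import defaultdict


def build_positional_index(text):
    """Map each word to the 1-based positions at which it occurs.

    Hypothetical toy index, not the Swish-e on-disk format.
    """
    index = defaultdict(list)
    for position, word in enumerate(text.lower().split(), start=1):
        index[word].append(position)
    return dict(index)


def reconstruct(index):
    """Rebuild the word sequence by sorting (position, word) pairs."""
    pairs = [
        (position, word)
        for word, positions in index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(pairs))


original = "the quick brown fox jumps over the lazy dog"
index = build_positional_index(original)
recovered = reconstruct(index)
print(recovered)
```

In this toy model the recovered text matches the lowercased input word for
word, because every position is stored exactly. A real index may drop stop
words or store stemmed forms, so a reconstruction from it could be less
complete than this sketch suggests.
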
For example, might it be possible to obscure the word position numbers in
the Swish-e index (say, by dividing each number by two and rounding up) so
that someone viewing the index could not determine the correct word sequence
in the original document? My understanding is that such a modification would
compromise the ability to search for literal strings such as phrases, but
one might still be able to design the search feature to allow "near" searches.
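
The divide-by-two idea can be sketched as follows. This is an illustration
under stated assumptions: the index is a plain dictionary of word positions
(not the real Swish-e format), and the names obscure and near are
hypothetical helpers, not existing Swish-e functions.

```python
def obscure(index, divisor=2):
    """Replace each position p with ceil(p / divisor).

    Hypothetical transformation on a toy index; -(-p // divisor) is
    integer ceiling division.
    """
    return {
        word: [-(-p // divisor) for p in positions]
        for word, positions in index.items()
    }


def near(index, first, second, max_gap):
    """Return True if the two words occur within max_gap obscured positions."""
    return any(
        abs(a - b) <= max_gap
        for a in index.get(first, [])
        for b in index.get(second, [])
    )


# Toy positional index for "the quick brown fox jumps over the lazy dog".
exact = {
    "the": [1, 7], "quick": [2], "brown": [3], "fox": [4],
    "jumps": [5], "over": [6], "lazy": [8], "dog": [9],
}
obscured = obscure(exact)

print(obscured["brown"], obscured["fox"])
print(near(obscured, "quick", "brown", max_gap=1))
print(near(obscured, "quick", "dog", max_gap=1))
```

After the transformation, "brown" and "fox" share the same obscured
position, so their relative order can no longer be read from the index and
an exact phrase match cannot be verified, while a coarse "near" test still
works. Note that with a divisor of two each obscured position covers only
two words, so the original order remains heavily constrained in this toy
model; a larger divisor hides more order at the cost of coarser proximity
searching.
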
Please comment on the technological feasibility of modifying Swish-e and/or
the index so that the intellectual content of the original document is not
re-creatable from the index.
Received on Mon Aug 12 20:57:30 2002