From: <jmruiz(at)>
Date: Fri Nov 17 2000 - 12:00:27 GMT
Hi Roy,

On 16 Nov 2000, at 12:42, Roy Tennant wrote:

> To me, the problem with SWISH-E and XML is not the searching, but the
> results. What you would get back is that a given file matches your
> search, *not* each XML segment that matches and the URL of the file
> from which it was extracted (which is more like what I want). So
> that's why I'm looking at other things to search XML content (like
> sgrep) rather than use SWISH-E. To make SWISH-E really work the way I
> want it to, there would need to be a module that could extract
> relevant segments from files that match. Roy

Well, this problem can be solved with a workaround. In 2.1.x I 
introduced a new value in the result list, visible only in the extended 
info output (option -x): the offset of the document inside the file (one 
file can contain several documents). For now this value is always 0 and 
the size is the total length of the file. But in the future these values 
could differ, so that they delimit a document inside the file (offset + length).

So, reading only "length" bytes from the file, starting at the offset, 
will give you just the document you need. Unfortunately, this 
cannot be applied to filtered documents (Rainer's Filter option).
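
As a rough illustration of the idea, here is a minimal Python sketch of reading one document out of a file once its offset and length are known. The function name and its parameters are hypothetical, and parsing these two values out of the -x output is left out because the exact output format is not described here.

```python
def read_document(path, offset, length):
    """Return `length` bytes of the file at `path`, starting at byte `offset`.

    `offset` and `length` are assumed to be the values reported for one
    result in the extended output; this helper is illustrative only.
    """
    # Binary mode, so that offsets count bytes rather than decoded characters.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

With the values reported today (offset 0 and a size equal to the file length) this simply returns the whole file; it would only return a segment once different values are reported.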

Received on Fri Nov 17 12:02:04 2000