Re: categorizing information

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu May 13 2004 - 16:47:22 GMT
On Thu, May 13, 2004 at 04:48:51PM +0100, Jonas Wolf wrote:
> Thanks for the swift answer.
> 
> > One way around that might be to change "top=foo" into 
> > "abstract=(foo) OR descript=(foo) OR ....".
> 
> I considered this. Since I am writing a script to generate the query
> from another query anyway, it would be quite easy to do. I just
> thought that this would be quite inefficient. As far as I'm aware
> (correct me if I'm wrong), this query would perform one separate
> search for every disjunction and then merge the results? If this is
> the case, that is probably not the best way to go in terms of
> performance.

Yes, it could be slower.  I can't tell you if that would be an issue on
your index and hardware, though.
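
For reference, the rewrite itself is only a few lines in a query-generating
script. This is a minimal sketch in Python; the function name and the list
of metanames are made up for illustration, and it assumes the search term
is already a valid query fragment.

```python
def expand_metaname(term, metanames):
    """Rewrite a single-category query such as "top=foo" into an
    OR of per-metaname queries: "abstract=(foo) OR descript=(foo)".

    `metanames` is a hypothetical list of the metanames that the
    category ("top" in the example) is meant to cover.
    """
    return " OR ".join(f"{name}=({term})" for name in metanames)


if __name__ == "__main__":
    # Hypothetical metanames standing in for the real ones in the index.
    print(expand_metaname("foo", ["abstract", "descript"]))
```

Each extra metaname adds one more OR clause, so the cost grows with the
number of metanames in the category. Timing it against the real index is
the only way to know whether that matters.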


> Your other suggestion is a possibility, but I would probably have to
> translate the documents to HTML, as they are in XML currently?

Right.  You would write a program to do the conversion during indexing --
either as a filter or (faster) as a -S prog input program.  It would parse
the XML and then output the generated HTML.
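
A rough sketch of what such a -S prog program could look like, in Python.
Several things here are assumptions rather than anything stated in this
thread: that each child element of the XML root should become an HTML
<meta> tag named after the element, the function names, and the exact
header set (Path-Name, Content-Length, and Document-Type, as the -S prog
documentation describes them; check the docs for your version before
relying on them).

```python
import sys
import xml.etree.ElementTree as ET
from html import escape


def xml_to_html(xml_bytes):
    """Turn each child of the XML root into an HTML <meta> tag.

    Assumption: the element names (e.g. <abstract>, <descript>) are the
    names you want to search on. Adjust the mapping for real documents.
    """
    root = ET.fromstring(xml_bytes)
    metas = []
    for child in root:
        text = "".join(child.itertext()).strip()
        metas.append(
            f'<meta name="{escape(child.tag)}" content="{escape(text)}">'
        )
    return "<html><head>\n" + "\n".join(metas) + "\n</head><body></body></html>\n"


def prog_record(path, html_text):
    """Format one document as headers, a blank line, then the content.

    Content-Length is the size of the content in bytes, not characters.
    """
    body = html_text.encode("utf-8")
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(body)}\n"
        "Document-Type: HTML*\n"
        "\n"
    )
    return header.encode("utf-8") + body


if __name__ == "__main__":
    # Paths of the XML files to convert are taken from the command line.
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            record = prog_record(path, xml_to_html(fh.read()))
        sys.stdout.buffer.write(record)
```

The script writes one record per file to standard output, which is where
the indexer reads documents from when run with -S prog.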



-- 
Bill Moseley
moseley@hank.org
Received on Thu May 13 09:47:22 2004