RE: Document Summaries/Descriptions

From: <Rainer.Scherg(at)not-real.rexroth.de>
Date: Wed Nov 15 2000 - 14:23:14 GMT

Hi Bill,

IMO it's not that bad to store the description info.
The performance penalty might be very low.

I calculated this for our intranet site.

We have about 16000 docs and 4.5 Megs data volume, excluding databases.
16000 docs * 200 chars = 3.2 million chars, so roughly 3-4 Megs of added size.
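
As a rough sketch of that estimate (assuming one byte per character and
1 Meg = 10**6 bytes, neither of which is stated above; the variable names
are only illustrative):

```python
# Back-of-the-envelope estimate of the raw text added by stored descriptions.
# The two figures come from the message; the byte-per-char and Meg sizes
# are assumptions made for this sketch.
num_docs = 16_000
chars_per_description = 200

added_bytes = num_docs * chars_per_description  # assumes 1 char = 1 byte
added_megs = added_bytes / 10**6                # assumes 1 Meg = 10**6 bytes

print(f"raw description text: {added_megs:.1f} Megs")
```

This counts only the description text itself, not any per-record overhead
the index format might add.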

Our swish index would grow from 40 Megs to 45 Megs.
Storing the description alongside the title and path will not
slow down the search process, because this information
doesn't need another hash (or whatever).


Retrieving this info through an external process (e.g. search.cgi)
will have an impact on the server load. In our case we cannot
provide a text extract of thousands of pdf and doc files.
An online filter call per file to get this information will
IMO slow the search process down to a "non-acceptable"...
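
To illustrate why a per-file filter call adds up, here is a minimal sketch
of the cost model. The function name and both input values are hypothetical
and not measured on any real system:

```python
def added_latency_seconds(hits_shown: int, filter_seconds_per_file: float) -> float:
    """Extra time per results page if every displayed hit needs its own filter call."""
    return hits_shown * filter_seconds_per_file


# Hypothetical values, for illustration only.
extra = added_latency_seconds(hits_shown=20, filter_seconds_per_file=0.5)
print(f"extra time per results page: {extra:.1f} s")
```

The cost grows linearly with the number of hits displayed, whereas a
description stored in the index is read along with the title and path.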

cu - rainer



-----Original Message-----
From: Bill Moseley [mailto:moseley@hank.org]
Sent: Wednesday, November 15, 2000 2:58 PM
To: Multiple recipients of list
Subject: [SWISH-E] RE: Document Summaries/Descriptions


At 02:35 AM 11/15/00 -0800, jmruiz@boe.es wrote:
>So, a document may contain both title and description, right?
>I would also like the possibility that the description can be a field 
>(Metaname). What about:
>
>StoreDescription <field>|size
>
[...]

OTOH, I'm not sure that this feature can't be handled outside of swish if
Properties won't work in some cases.  It's faster to access the document
summaries if they are in the index, but it might come at the expense of
speed when searching -- and that is swish's main job.

[...]


Received on Wed Nov 15 14:24:45 2000