AW: Re: Getting Results faster? (libswish-e)

From: Gunnar Mätzler <maetzler(at)not-real.mediadynamics.de>
Date: Wed Dec 15 2004 - 15:30:21 GMT
Thanks a lot, Bill,

I actually need nothing more than the swishdocpath.

What I tried now is:

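/* Seek to the row being drawn (zero-based), then fetch that result. */
SwishSeekResult( App->results, pDispInfo->item.iItem );
SW_RESULT result = SwishNextResult( App->results );
CString string;
if ( result )
{
    /* Read only the one property that is displayed. */
    PropValue* value = getResultPropValue( result, "swishdocpath", 0 );
    if ( value )
    {
        string = value->value.v_str;
        freeResultPropValue( value );
    }
}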

It's not much faster (if at all). Did I miss something?

You are right, accessing the results list is fast. I just checked it by
moving through the list without reading any properties.

The reason I am insisting is that I might have to store the index (and
properties) on CD-ROM, and I think it will become a lot worse then. So any
idea would be appreciated.

Best regards

Gunnar Mätzler


-----Original Message-----
From: swish-e@sunsite3.berkeley.edu
[mailto:swish-e@sunsite3.berkeley.edu] On Behalf Of Bill Moseley
Sent: Wednesday, 15 December 2004 15:44
To: Multiple recipients of list
Subject: [SWISH-E] Re: Getting Results faster? (libswish-e)


On Wed, Dec 15, 2004 at 12:43:53AM -0800, Gunnar Mätzler wrote:
> 1. I use a results list, which fills itself by using a callback function.
> So it fetches only the results which have to be displayed at a given time.
> But I am a bit puzzled how to do this. What I do right now is:
>
> SwishSeekResult( results, number_of_result_to_display ) - to set the
> results pointer to the wanted result
> result = SwishNextResult( results ); - to get the result.
> string = SwishResultPropertyStr( result, "swishdocpath" ); - to get the
> docpath.

Yes, I think that looks correct.

If you don't display all properties, you might just store the result
and let your output generation code call functions on the result using
getResultPropValue() and freeing after calling with
freeResultPropValue().  That might save reading properties you don't
need.

> Is this correct? I am not so sure whether I am skipping a result or not.
> If for example I set the results pointer to result number 500 with
> "SwishSeekResult", doesn't "SwishNextResult" give me result number 501?

Yes -- but that's because SwishSeekResult() takes a zero-based offset,
but the results are numbered starting from 1.

SwishNextResult() actually returns the "current" result as set by
SwishSeekResult() and then moves the pointer to the next result.

    /* Check for a unique index file */
    if (!results->db_results->next)
    {
        if ((res = results->db_results->currentresult))
        {
            /* Increase Pointer */
            results->db_results->currentresult = res->next;
        }
    }
    /* else: a bit more complex when searching multiple indexes */
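
The "return the current result, then advance" behaviour described above can be modelled with a small stand-alone sketch. This is not the library's code: `Node`, `Cursor`, `cursor_init`, `cursor_seek` and `cursor_next` are hypothetical names invented for illustration, and the sketch only covers the single-index case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of an in-memory result list. */
typedef struct Node {
    int position;           /* 1-based number, as a user would count results */
    struct Node *next;
} Node;

/* Hypothetical cursor over the list: head plus a "current" pointer. */
typedef struct {
    Node *head;
    Node *current;
} Cursor;

/* Link `count` nodes from caller-provided storage and reset the cursor. */
void cursor_init(Cursor *c, Node *storage, int count)
{
    assert(c != NULL && storage != NULL && count > 0);
    for (int i = 0; i < count; i++) {
        storage[i].position = i + 1;
        storage[i].next = (i + 1 < count) ? &storage[i + 1] : NULL;
    }
    c->head = storage;
    c->current = storage;
}

/* Move to a zero-based offset. Returns 0 on success, -1 if out of range. */
int cursor_seek(Cursor *c, int offset)
{
    assert(c != NULL);
    if (offset < 0)
        return -1;
    Node *n = c->head;
    while (n != NULL && offset > 0) {
        n = n->next;
        offset--;
    }
    if (n == NULL)
        return -1;
    c->current = n;
    return 0;
}

/* Return the current entry, then advance the cursor past it. */
Node *cursor_next(Cursor *c)
{
    assert(c != NULL);
    Node *n = c->current;
    if (n != NULL)
        c->current = n->next;
    return n;
}
```

In this model, seeking to offset 499 and then calling `cursor_next()` yields the entry numbered 500, so no result is skipped as long as the seek offset is the 1-based result number minus one.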


> 2. This method seems a bit slow. I can actually see the list filling
> itself line by line (while hearing a lot of hard disk access). It gets
> better after I scrolled through the complete list a few times. Is there a
> faster way to get a specific result out of the results list? I will
> definitely have to speed it up somehow.

The result list is in memory, and is a linked list.  Accessing it is
fast.  Reading a result's properties is slower since it has to go to
disk.  The first time swish reads a property for a given file it reads a
table of file offsets into memory that tell swish where the individual
props are stored on disk.  Reading subsequent properties just requires
going to disk again.  All the properties are stored at the same place
on disk so you should get some OS buffering.  Larger properties are
compressed to help reduce the number of times the disk must be read.
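
Since the property read is the part that goes to disk, one option (not something proposed in this thread) is to keep each path in memory after its first read, so that redrawing a row does not touch the disk again. The sketch below is a minimal illustration under that assumption. `PathCache`, `PathLoader`, the `path_cache_*` functions and `demo_loader` are all hypothetical names; in a real application the loader would wrap getResultPropValue() and freeResultPropValue() and return a copy of the string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loader: returns a malloc'ed path for a row, or NULL. */
typedef char *(*PathLoader)(int row, void *ctx);

typedef struct {
    char **paths;       /* paths[row] stays NULL until first requested */
    int count;
    PathLoader load;
    void *ctx;
} PathCache;

/* Allocate an empty cache for `count` rows. Returns 0 on success. */
int path_cache_init(PathCache *pc, int count, PathLoader load, void *ctx)
{
    assert(pc != NULL && load != NULL && count >= 0);
    pc->paths = calloc((size_t)count, sizeof *pc->paths);
    if (pc->paths == NULL && count > 0)
        return -1;
    pc->count = count;
    pc->load = load;
    pc->ctx = ctx;
    return 0;
}

/* Return the path for a row, calling the loader only on the first request. */
const char *path_cache_get(PathCache *pc, int row)
{
    if (row < 0 || row >= pc->count)
        return NULL;
    if (pc->paths[row] == NULL)
        pc->paths[row] = pc->load(row, pc->ctx);
    return pc->paths[row];
}

/* Release every cached string and the table itself. */
void path_cache_free(PathCache *pc)
{
    for (int i = 0; i < pc->count; i++)
        free(pc->paths[i]);
    free(pc->paths);
    pc->paths = NULL;
    pc->count = 0;
}

/* Demo loader for illustration only: counts calls and makes up a path. */
char *demo_loader(int row, void *ctx)
{
    int *calls = ctx;
    (*calls)++;
    char *s = malloc(32);
    if (s != NULL)
        snprintf(s, 32, "/docs/file%d.html", row);
    return s;
}
```

The trade-off is memory: the cache holds one string per row that has been displayed, which may or may not be acceptable for very large result sets.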

It might be worth profiling your code to see where it's going slowly.
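
Short of a full profiler, a rough first measurement is to time the suspect call directly. The sketch below assumes a C11 compiler (for timespec_get) and measures wall-clock time, which is the relevant figure when the wait is on disk rather than CPU. `Work`, `wall_seconds`, `time_call` and `count_work` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical unit of work to be timed, e.g. "read one property". */
typedef void (*Work)(void *ctx);

/* Current wall-clock time in seconds (C11 timespec_get). */
double wall_seconds(void)
{
    struct timespec ts;
    int ok = timespec_get(&ts, TIME_UTC);
    assert(ok == TIME_UTC);
    (void)ok;   /* silence unused warning when asserts are disabled */
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run work(ctx) `repeat` times and return the elapsed wall-clock seconds. */
double time_call(Work work, void *ctx, int repeat)
{
    assert(work != NULL && repeat >= 0);
    double start = wall_seconds();
    for (int i = 0; i < repeat; i++)
        work(ctx);
    return wall_seconds() - start;
}

/* Placeholder work for illustration: just counts how often it ran. */
void count_work(void *ctx)
{
    int *n = ctx;
    (*n)++;
}
```

Timing the list traversal alone and then the traversal plus the property read, over the same rows, would show how much of the delay comes from each step.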

--
Bill Moseley
moseley@hank.org

Received on Wed Dec 15 07:30:22 2004