Swish-e with incremental mode still crashes sometimes

From: Uwe Dierolf <swishe(at)not-real.ubka.uni-karlsruhe.de>
Date: Fri Sep 30 2005 - 11:23:34 GMT
Dear Jose, dear Bill and dear all the other Swish-e developers,

we still have problems with Swish-e compiled with incremental mode.
It still crashes sometimes while updating an index.
We use incremental mode to delete single records from, and insert single
records into, our index of about 1 million XML records.
When a crash occurs the index still seems to work, but it is damaged.
We only noticed the damage because of strange search results.
Perhaps you have an idea how to track down this ugly bug.
We regret that we cannot give you test data to reproduce a crash.

Another aspect is performance:
Incrementally updating the index by adding just one record is extremely slow.
It takes several minutes, whereas we expected seconds because a database
(Berkeley DB) is used. We don't know this database system, but updating a
record in Oracle or PostgreSQL performs really well.

Swish-e spends most of the time sorting properties. We profiled the code to 
identify the functions:

82 % of the time in  compFileProps (pre_sort.c)
13 % of the time in  CreatePropSortArray (pre_sort.c)

Would it be feasible to:

1) Use another sorting algorithm (not quicksort) for incremental updates.
For example, insertion sort or shell sort performs much better on almost sorted
data (inserting a few records into an already sorted list). See for example:

    http://www.softpanorama.org/Algorithms/sorting.shtml

2) Take the distinction between strcoll, ignore_case, use_case (see
Compare_Properties [docprop.c], which is called from compFileProps) out of the
loop that is implicit in swish_qsort.

That is instead of:

    // sorting one property
    loop
       if (is_meta_ignore_case) then strncasecmp (val1, val2)
       if (is_meta_use_strcoll) then strcoll     (val1, val2)
    endloop

use:

    // sorting one property
    if (is_meta_ignore_case) then cmp_function = strncasecmp
    if (is_meta_use_strcoll) then cmp_function = strcoll
    loop
        call cmp_function (val1, val2)
    endloop
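
In C, the second variant could look roughly like the sketch below. It is not
the Swish-e code: pick_compare, sort_strings and the two flag parameters are
hypothetical names, plain qsort stands in for swish_qsort, and strcasecmp
stands in for the strncasecmp of the pseudocode (which would need a length).
As in the pseudocode, strcoll wins if both flags are set.

```c
#define _POSIX_C_SOURCE 200809L  /* for strcasecmp */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>             /* strcasecmp (POSIX) */

typedef int (*str_cmp_fn)(const char *, const char *);

/* Decide once per property which comparison to use (hypothetical flags). */
static str_cmp_fn pick_compare(int ignore_case, int use_strcoll)
{
    if (use_strcoll)
        return strcoll;
    if (ignore_case)
        return strcasecmp;
    return strcmp;
}

/* qsort passes no user data, so the chosen function lives in a
   file-scope variable (not thread-safe; acceptable for a sketch). */
static str_cmp_fn active_cmp;

/* Called for every comparison: no flag tests remain in here. */
static int cmp_adapter(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return active_cmp(*sa, *sb);
}

/* Sort an array of n strings; the flags are examined once, before sorting. */
static void sort_strings(const char **v, size_t n,
                         int ignore_case, int use_strcoll)
{
    assert(v != NULL || n == 0);
    if (n < 2)
        return;
    active_cmp = pick_compare(ignore_case, use_strcoll);
    qsort(v, n, sizeof *v, cmp_adapter);
}
```

Calling sort_strings(values, count, 1, 0) would sort case-insensitively. The
saving per comparison is only a few branches, so it would have to be measured
against the cost of reading the property values themselves.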

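For suggestion 1, a plain insertion sort over an array of strings might be
sketched as follows. Again this is only an illustration under assumptions:
insertion_sort is a hypothetical name, and the real pre_sort.c sorts file
property entries rather than bare strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*str_cmp_fn)(const char *, const char *);

/* Insertion sort: each element is moved left past the elements that
   compare greater. On input that is already sorted except for a few
   records, the inner loop exits almost immediately for most elements. */
static void insertion_sort(const char **v, size_t n, str_cmp_fn cmp)
{
    assert(v != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        const char *key = v[i];
        size_t j = i;
        /* Shift larger elements one slot to the right. */
        while (j > 0 && cmp(v[j - 1], key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}
```

With k new records appended to n sorted ones, this needs roughly n + k
comparisons plus one per shifted element, so a record that belongs near the
front still costs up to n shifts. A binary search for the insert position
would reduce the comparisons but not the shifts.
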
Thanks a lot in advance.

Best regards, Uwe Dierolf

------------------------------------------------------------------
Uwe Dierolf
University of Karlsruhe - University Library
P.O.Box 6920, 76049 Karlsruhe, Germany
phone(fax) : 49/721/608-6076(4886)
www        : http://www.ubka.uni-karlsruhe.de/dierolf/
------------------------------------------------------------------
Received on Fri Sep 30 04:23:37 2005