I wonder if this is an alignment issue on 64-bit machines.
By the way, here's some debugging info:
On Mon, Nov 28, 2005 at 02:50:55AM -0800, Arndt Droullier wrote:
> Is there a way to find out if the index wasn't created properly?
> (This would reduce the problem to searching)
It's a bit of work to debug, but it will be faster than I can get to it.
You can build swish like this:
and swish will write out data about where properties are stored in the
prop file while indexing. It will generate quite a bit of output and
you will likely need to look at db_native.c to understand the output.
You can then compare it with the debug output you see when searching.
moseley@bumby:~/build$ src/swish-e -i config.log -c c -v0
# In the main index is a table that points to all the properties in
# the .prop file:
InitWriteProperties: Start of property table in main index at offset: 401084
# Here's writing the individual properties to the .prop file which
# tells you the file number, the Property ID, the seek position in the
# .prop file and the number of bytes in the property
# You can see for prop 4 (swishdescription in this case) the
# property was compressed and you can see the uncompressed
# length and the compressed size of the data written to disk
Write Prop: file 1 PropIDX 0 (meta 6) seek: 4 data=[uncompressed_len: 0 (2 bytes), prop_data: (10 bytes)]
Write Prop: file 1 PropIDX 2 (meta 8) seek: 16 data=[uncompressed_len: 0 (2 bytes), prop_data: (4 bytes)]
Write Prop: file 1 PropIDX 3 (meta 9) seek: 22 data=[uncompressed_len: 0 (2 bytes), prop_data: (4 bytes)]
Write Prop: file 1 PropIDX 4 (meta 10) seek: 28 data=[uncompressed_len: 63592 (5 bytes), prop_data: (10795 bytes)]
uncompressed_len: 63592 (5 bytes) == the uncompressed size is 63592
and 5 bytes were used to store that number in the .prop file.
prop_data: (10795 bytes) == the number of bytes written to the .prop
file to store the data.
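
One quick sanity check on the "Write Prop" lines is that each property starts where the previous one ended, i.e. seek + length-field bytes + prop_data bytes equals the next seek. Here is a minimal sketch in Python, assuming the debug lines look exactly like the sample above and that properties are written back to back (that is only what this sample shows, I haven't checked it against db_native.c). The names parse_writes and check_contiguous are made up for this example.

```python
import re

# Pattern for the "Write Prop" debug lines (format assumed from the sample output).
WRITE_RE = re.compile(
    r"Write Prop: file (\d+) PropIDX (\d+) \(meta (\d+)\) seek: (\d+) "
    r"data=\[uncompressed_len: (\d+) \((\d+) bytes\), prop_data: \((\d+) bytes\)\]"
)

def parse_writes(log_text):
    """Return one dict per 'Write Prop' line found in log_text."""
    keys = ("file", "propidx", "meta", "seek", "ulen", "len_bytes", "data_bytes")
    return [dict(zip(keys, map(int, m.groups()))) for m in WRITE_RE.finditer(log_text)]

def check_contiguous(writes):
    """Return (propidx, expected_seek, actual_seek) for every record that
    does not start where the previous record ended."""
    problems = []
    for prev, cur in zip(writes, writes[1:]):
        expected = prev["seek"] + prev["len_bytes"] + prev["data_bytes"]
        if cur["seek"] != expected:
            problems.append((cur["propidx"], expected, cur["seek"]))
    return problems

SAMPLE = """\
Write Prop: file 1 PropIDX 0 (meta 6) seek: 4 data=[uncompressed_len: 0 (2 bytes), prop_data: (10 bytes)]
Write Prop: file 1 PropIDX 2 (meta 8) seek: 16 data=[uncompressed_len: 0 (2 bytes), prop_data: (4 bytes)]
Write Prop: file 1 PropIDX 3 (meta 9) seek: 22 data=[uncompressed_len: 0 (2 bytes), prop_data: (4 bytes)]
Write Prop: file 1 PropIDX 4 (meta 10) seek: 28 data=[uncompressed_len: 63592 (5 bytes), prop_data: (10795 bytes)]
"""

# An empty list means every property starts where the previous one ended.
print(check_contiguous(parse_writes(SAMPLE)))
```

For the sample the list is empty (4+2+10=16, 16+2+4=22, 22+2+4=28). A non-empty list on your machine would point at the write side.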
# Then it writes the positions of the individual properties to the
# table in the main index file. Those match up with the seek
# positions above. The file didn't have any data for propIDX 1 so
# it's entered as zero.
Writing seek positions to index for file 1
PropIDX: 0 data=[seek: 4] main index location: 401084 for 4 bytes (one print long)
PropIDX: 1 data=[seek: 0] main index location: 401088 for 4 bytes (one print long)
PropIDX: 2 data=[seek: 16] main index location: 401092 for 4 bytes (one print long)
PropIDX: 3 data=[seek: 22] main index location: 401096 for 4 bytes (one print long)
PropIDX: 4 data=[seek: 28] main index location: 401100 for 4 bytes (one print long)
# After indexing is all done swish reads back all the properties for
# creating the pre-sorted arrays used for sorting search results.
# When swish needs to read properties for a given file it first has to
# look up the seek positions in the main index file.
Fetching seek positions for file 1
property index table at 401084, this file at 401084
PropIDX: 0 data[Seek: 4] at seek 401084 read 4 bytes (one readlong)
PropIDX: 1 data[Seek: 0] at seek 401088 read 4 bytes (one readlong)
PropIDX: 2 data[Seek: 16] at seek 401092 read 4 bytes (one readlong)
PropIDX: 3 data[Seek: 22] at seek 401096 read 4 bytes (one readlong)
PropIDX: 4 data[Seek: 28] at seek 401100 read 4 bytes (one readlong)
# Those better match the same values that were written previously.
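
Comparing the two tables by eye gets tedious with many files, so here is a rough Python sketch that diffs them. It assumes the line formats shown above and handles a single file only (with several files you would also need to key on the file number from the "Writing seek positions" / "Fetching seek positions" lines). seek_table and diff_tables are names invented for this example.

```python
import re

# Line formats assumed from the sample debug output.
WRITTEN_RE = re.compile(r"PropIDX: (\d+) data=\[seek: (\d+)\] main index location: (\d+)")
FETCHED_RE = re.compile(r"PropIDX: (\d+) data\[Seek: (\d+)\] at seek (\d+)")

def seek_table(pattern, log_text):
    """Map propidx -> (seek in .prop file, location in main index)."""
    return {int(i): (int(s), int(loc)) for i, s, loc in pattern.findall(log_text)}

def diff_tables(written, fetched):
    """Return the sorted propidx values whose entries differ or exist on one side only."""
    return sorted(k for k in written.keys() | fetched.keys()
                  if written.get(k) != fetched.get(k))

INDEX_LOG = """\
PropIDX: 0 data=[seek: 4] main index location: 401084 for 4 bytes (one print long)
PropIDX: 1 data=[seek: 0] main index location: 401088 for 4 bytes (one print long)
PropIDX: 4 data=[seek: 28] main index location: 401100 for 4 bytes (one print long)
"""
READ_LOG = """\
PropIDX: 0 data[Seek: 4] at seek 401084 read 4 bytes (one readlong)
PropIDX: 1 data[Seek: 0] at seek 401088 read 4 bytes (one readlong)
PropIDX: 4 data[Seek: 28] at seek 401100 read 4 bytes (one readlong)
"""

# An empty list means the values read back match the values written.
print(diff_tables(seek_table(WRITTEN_RE, INDEX_LOG), seek_table(FETCHED_RE, READ_LOG)))
```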
# Then swish reads the .prop file using the seek positions fetched
# from the main index.
Fetching filenum: 1 propIDX: 0 at seek: 4
Fetched uncompressed length of 0 (2 bytes storage), now fetching 10 prop bytes from 6
Fetching filenum: 1 propIDX: 2 at seek: 16
Fetched uncompressed length of 0 (2 bytes storage), now fetching 4 prop bytes from 18
Fetching filenum: 1 propIDX: 3 at seek: 22
Fetched uncompressed length of 0 (2 bytes storage), now fetching 4 prop bytes from 24
Fetching filenum: 1 propIDX: 4 at seek: 28
Fetched uncompressed length of 63592 (5 bytes storage), now fetching 10795 prop bytes from 33
So for propidx 4, the property starts at seek position 28. It reads
the uncompressed size in the first five bytes. That leaves the
compressed data starting at position 33.
What seems to be happening in your case is the uncompressed length
read from the .prop file is not correct. The question is when it gets
set incorrectly: when the index is written or when it's read back.
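
To narrow that down, you can line up each "Write Prop" record with the matching "Fetching"/"Fetched" pair and see which side disagrees. Below is a minimal Python sketch, again assuming the debug line formats from the sample above. The function names are hypothetical. It also checks that the data offset equals seek plus the length-field bytes (28 + 5 = 33 for propIDX 4).

```python
import re

# Line formats assumed from the sample debug output.
WRITE_RE = re.compile(
    r"Write Prop: file (\d+) PropIDX (\d+) \(meta \d+\) seek: (\d+) "
    r"data=\[uncompressed_len: (\d+) \((\d+) bytes\), prop_data: \((\d+) bytes\)\]"
)
FETCH_RE = re.compile(
    r"Fetching filenum: (\d+) propIDX: (\d+) at seek: (\d+)\s+"
    r"Fetched uncompressed length of (\d+) \((\d+) bytes storage\), "
    r"now fetching (\d+) prop bytes from (\d+)"
)

def records(pattern, log_text):
    """Map (file, propidx) -> tuple of the remaining integer fields on the line(s)."""
    return {(int(f), int(p)): tuple(map(int, rest))
            for f, p, *rest in pattern.findall(log_text)}

def compare(written, fetched):
    """Return a list of mismatch descriptions; empty if write and read agree."""
    issues = []
    for key, (seek, ulen, len_bytes, data_bytes, data_start) in sorted(fetched.items()):
        if data_start != seek + len_bytes:
            issues.append(f"{key}: data starts at {data_start}, expected {seek + len_bytes}")
        if key not in written:
            issues.append(f"{key}: fetched but never written")
        elif written[key] != (seek, ulen, len_bytes, data_bytes):
            issues.append(f"{key}: wrote {written[key]}, "
                          f"read {(seek, ulen, len_bytes, data_bytes)}")
    return issues

INDEX_LOG = (
    "Write Prop: file 1 PropIDX 4 (meta 10) seek: 28 "
    "data=[uncompressed_len: 63592 (5 bytes), prop_data: (10795 bytes)]\n"
)
READ_LOG = (
    "Fetching filenum: 1 propIDX: 4 at seek: 28\n"
    "Fetched uncompressed length of 63592 (5 bytes storage), "
    "now fetching 10795 prop bytes from 33\n"
)

# An empty list means the read side saw exactly what the write side reported.
print(compare(records(WRITE_RE, INDEX_LOG), records(FETCH_RE, READ_LOG)))
```

If the written values look sane and only the fetched ones are off, that suggests the read path. If the written uncompressed length is already wrong, look at the indexing side first.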
Also, if you see something odd in the debug output, then don't trust the
debugging code without checking it in the source -- might be an
incorrect cast, for example.
Received on Mon Nov 28 06:37:44 2005