Re: [swish-e] portable indexes for swish-e
Follow up:
I updated the script to convert more types, and to rewrite files in
subdirs of src/. It's appended below.
I also noticed that I have to leave the prototype of main() as
    int main(int, char**)
or it has problems at startup.
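One way to keep that prototype while still using the 64-bit types everywhere else is a thin shim around the real work. The following is only a sketch under that assumption: real_main() and entry_shim() are made-up names, not swish-e functions, and in an actual program entry_shim() would simply be main().

```c
#include <stdint.h>

/* Hypothetical internal entry point; it is free to use the rewritten
   fixed-width types. */
int64_t real_main(int64_t argc, char **argv);

/* Thin shim that keeps the standard prototype. In a real program this
   function would be named main(); it is the only place that must stay
   on plain int, because the C runtime calls it with an int argc and
   expects an int back. */
int entry_shim(int argc, char **argv)
{
    return (int)real_main((int64_t)argc, argv);
}

/* Placeholder body: succeed (0) only when at least one argument follows
   the program name. */
int64_t real_main(int64_t argc, char **argv)
{
    (void)argv;
    return argc > 1 ? 0 : 1;
}
```

With this layout the rewrite script would only need to skip the one function that the runtime calls directly.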
With that change I was able to build a swish-e executable with 64-bit
integer types on a 32-bit system, and it starts up. When I try to
build an index with it, though, swish-e 2.6 from SVN gives me this:
http://rafb.net/p/xZyvpu13.html
With swish-e 2.4 from SVN with the int64 changes applied, I get:
% sman-update
...
*** glibc detected *** swish-e: realloc(): invalid next size: 0x0859c988 ***
======= Backtrace: =========
/lib/i686/nosegneg/libc.so.6[0xb7afea]
/lib/i686/nosegneg/libc.so.6(realloc+0x105)[0xb7be75]
/usr/local/lib/libswish-e.so.2(erealloc+0x2c)[0x4002c9cc]
swish-e[0x8068521]
swish-e[0x80694cd]
/usr/lib/libxml2.so.2(xmlParseCharData+0x1ad)[0x599c60d]
/usr/lib/libxml2.so.2(xmlParseChunk+0x9a0)[0x599edf0]
swish-e[0x806b894]
swish-e(parse_XML+0x43)[0x806ba83]
swish-e[0x805b237]
swish-e[0x8053c6c]
swish-e[0x805ee9a]
swish-e[0x804cec2]
/lib/i686/nosegneg/libc.so.6(__libc_start_main+0xdc)[0xb26dec]
swish-e[0x804b611]
======= Memory map: ========
00aef000-00b08000 r-xp 00000000 09:01 9633808 /lib/ld-2.5.so
00b08000-00b09000 r-xp 00019000 09:01 9633808 /lib/ld-2.5.so
00b09000-00b0a000 rwxp 0001a000 09:01 9633808 /lib/ld-2.5.so
00b11000-00c4e000 r-xp 00000000 09:01 9633802 /lib/i686/nosegneg/libc-2.5.so
00c4e000-00c50000 r-xp 0013d000 09:01 9633802 /lib/i686/nosegneg/libc-2.5.so
00c50000-00c51000 rwxp 0013f000 09:01 9633802 /lib/i686/nosegneg/libc-2.5.so
00c51000-00c54000 rwxp 00c51000 00:00 0
00c7c000-00c7e000 r-xp 00000000 09:01 9633816 /lib/libdl-2.5.so
00c7e000-00c7f000 r-xp 00001000 09:01 9633816 /lib/libdl-2.5.so
00c7f000-00c80000 rwxp 00002000 09:01 9633816 /lib/libdl-2.5.so
00c9b000-00cad000 r-xp 00000000 09:01 6547429 /usr/lib/libz.so.1.2.3
00cad000-00cae000 rwxp 00011000 09:01 6547429 /usr/lib/libz.so.1.2.3
00cfc000-00d21000 r-xp 00000000 09:01 9633842 /lib/i686/nosegneg/libm-2.5.so
00d21000-00d22000 r-xp 00024000 09:01 9633842 /lib/i686/nosegneg/libm-2.5.so
00d22000-00d23000 rwxp 00025000 09:01 9633842 /lib/i686/nosegneg/libm-2.5.so
00dad000-00db8000 r-xp 00000000 09:01 9633843 /lib/libgcc_s-4.1.2-20070626.so.1
00db8000-00db9000 rwxp 0000a000 09:01 9633843 /lib/libgcc_s-4.1.2-20070626.so.1
05968000-05a94000 r-xp 00000000 09:01 6539412 /usr/lib/libxml2.so.2.6.26
05a94000-05a99000 rwxp 0012b000 09:01 6539412 /usr/lib/libxml2.so.2.6.26
05a99000-05a9a000 rwxp 05a99000 00:00 0
08048000-0808b000 r-xp 00000000 09:01 6522803 /usr/local/bin/swish-e
0808b000-0808e000 rwxp 00042000 09:01 6522803 /usr/local/bin/swish-e
0808e000-080cf000 rwxp 0808e000 00:00 0
0857d000-085c0000 rwxp 0857d000 00:00 0
40000000-40001000 r-xp 40000000 00:00 0 [vdso]
40001000-40005000 rwxp 40001000 00:00 0
40011000-40054000 r-xp 00000000 09:01 6520839 /usr/local/lib/libswish-e.so.2.0.0
40054000-40063000 rwxp 00043000 09:01 6520839 /usr/local/lib/libswish-e.so.2.0.0
40063000-40065000 rw-p 40063000 00:00 0
40065000-40265000 r--p 00000000 09:01 6527227 /usr/lib/locale/locale-archive
40265000-405a1000 rw-p 40265000 00:00 0
40600000-40621000 rw-p 40600000 00:00 0
40621000-40700000 ---p 40621000 00:00 0
bfbbd000-bfbd3000 rw-p bfbbd000 00:00 0 [stack]
Broken pipe
At 12:48 PM -0500 12/14/07, Josh Rabinowitz wrote:
Hello, All:
So, I was talking to a swish-e developer about how nice it would be
if swish-e indexes were portable across OSes and architectures, and
he mentioned that one of the remaining barriers is that an 'int' is a
different size on different machines.
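To make the size difference concrete, here is a minimal C sketch; it is not taken from swish-e, and the function names are invented for illustration. On typical 32-bit and 64-bit Linux builds it is 'long' rather than 'int' that changes width (4 bytes vs. 8), while the <stdint.h> types are the same width wherever they exist.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sizes of the native types; these depend on the platform's data model. */
size_t native_int_bytes(void)  { return sizeof(int); }
size_t native_long_bytes(void) { return sizeof(long); }

/* Sizes of the fixed-width types: exactly 64 bits wherever they exist,
   i.e. 8 bytes on machines with 8-bit chars. */
size_t fixed_int_bytes(void)   { return sizeof(int64_t); }
size_t fixed_uint_bytes(void)  { return sizeof(uint64_t); }

/* Print a one-line report, e.g. for comparing two build hosts. */
void print_type_sizes(void)
{
    printf("int=%zu long=%zu int64_t=%zu uint64_t=%zu\n",
           native_int_bytes(), native_long_bytes(),
           fixed_int_bytes(), fixed_uint_bytes());
}
```

Running print_type_sizes() on each build host shows which native types would be written to an index at different widths.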
So I got to thinking, and wrote a script that tries to make almost
all swish-e integer types of the same size, regardless of the
platform. It's pasted below.
What I found was that the resulting swish-e worked on the 64-bit
system I tried, but not on the 32-bit system (with the caveats in the
script). It would be great to get the indexes fully portable, though!
Again, the script I used is below; hopefully my email program won't
mangle it (let me know if you need a copy directly). I'm very
interested to hear feedback from other users and developers!
Josh Rabinowitz
Author of "How To Index Anything" and
"Indexing Arbitrary Data Using Perl and Swish-e"
######### begin script swish-e-src-rewrite.pl #############
#!/usr/bin/perl -w
use strict;
use Getopt::Long;

# call main()
main();

# main()
sub main {
    # #include <stdint.h> has the uint... and int... typedefs used below.
    # Note that we don't make exceptions for main() or waitpid(),
    # but need to.
    # Order matters: longer type names must come before their prefixes.
    my @regexes = (
        q(s/ \bunsigned\s+long\s+long\s+int\b /uint64_t/gx),
        q(s/ \blong\s+long\s+unsigned\s+int\b /uint64_t/gx),
        # added: without this, "unsigned long long" ends up as
        # "uint64_t int64_t" after the later rules run
        q(s/ \bunsigned\s+long\s+long\b       /uint64_t/gx),
        q{s/ \bunsigned\s+long\s+int\b        /uint64_t/gx},
        q{s/ \bunsigned\s+long\b              /uint64_t/gx},
        q(s/ \bunsigned\s+int\b               /uint64_t/gx),
        q(s/ \blong\s+long\s+int\b            /int64_t/gx),
        q(s/ \blong\s+long\b                  /int64_t/gx),
        q(s/ \blong\s+int\b                   /int64_t/gx),
        q(s/ \blong\b                         /int64_t/gx),
        q(s/ \bint\b                          /int64_t/gx),
    );
    my @files = glob(
        "src/*.c src/*.h src/*/*.c src/*/*.h src/*/*/*.h src/*/*/*.c");
    for my $file (@files) {
        print "$file\n";
        _apply_regexes( $file, @regexes );
    }
}

#================================================================
# _apply_regexes( $file, @search_and_replace_regexes )
# backs up $file to $file.bak, and
# applies supplied regexes to the lines of a file
sub _apply_regexes {
    my ($file, @regexes) = @_;
    # changes a file by applying the supplied regexes to each line
    my $tmpfile = "$file.tmp";
    open(my $rfh, "<", $file)
        || die "$0: Can't open $file: $!";
    open(my $wfh, ">", $tmpfile)    # clobber old $file.tmp
        || die "$0: Can't open $tmpfile: $!";
    print "Applying regexes to file $file\n";
        # . join("\n", @regexes) . "\n";
    while (<$rfh>) {
        chomp();
        for my $r (@regexes) {
            # $r should operate on $_ !
            eval $r;
            die "$0: Error in regex: $r: $@" if $@;
        }
        print $wfh "$_\n";
    }
    close($rfh) || die "$0: Can't close $file: $!";
    close($wfh) || die "$0: Can't close $tmpfile: $!";
    rename( $file, "$file.bak" )
        || die "$0: Can't rename $file to $file.bak: $!";
    rename( $tmpfile, $file )
        || die "$0: Can't rename $tmpfile to $file: $!";
}
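
The script's comment says exceptions are needed for main() and waitpid(). For waitpid() the reason is that it writes its result through an 'int *', so a status variable rewritten to int64_t would no longer match the prototype. Below is a minimal POSIX C sketch of the pattern that has to survive the rewrite; child_exit_status() is a hypothetical helper, not a swish-e function.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, wait for it,
   and return its exit status widened to int64_t. Returns -1 on failure. */
int64_t child_exit_status(int code)
{
    int status = 0;     /* must stay a plain int: waitpid() takes an int * */
    pid_t pid = fork();

    if (pid < 0)
        return -1;      /* fork failed */
    if (pid == 0)
        _exit(code);    /* child: exit immediately with the requested code */

    if (waitpid(pid, &status, 0) < 0)
        return -1;      /* wait failed */
    if (!WIFEXITED(status))
        return -1;      /* child did not exit normally */

    return (int64_t)WEXITSTATUS(status);    /* widen only after the call */
}
```

Only the variable handed to the system call keeps its native type; the value can be widened once the call has returned.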
--
----------------------------------------------------------------------
-- Josh Rabinowitz                         joshr-swishe@joshr.com --
-- SkateboardDirectory.com(tm)    http://SkateboardDirectory.com/ --
-- SkateTalk Chat Systems(tm)           http://www.skatetalk.com/ --
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Fri Dec 14 14:27:45 2007