[swish-e] portable indexes for swish-e

From: Josh Rabinowitz <joshr-swishe(at)>
Date: Fri Dec 14 2007 - 17:48:06 GMT

Hello, All:

So, I was talking to a swish-e developer about how nice it would be 
if swish-e indexes were portable across OS's and architectures, and 
he mentioned how one of the remaining barriers was that an 'int' is a 
different size on different machines.
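
To make the size problem concrete, here is a minimal sketch in C of storing a value in exactly eight bytes, so that what lands on disk does not depend on the platform's sizeof(int) or sizeof(long). The helper names put_u64 and get_u64 are hypothetical and are not taken from swish-e, and the fixed byte order (most significant byte first) is an extra assumption on my part that goes beyond the integer-size issue described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from swish-e): write v into exactly 8 bytes,
 * most significant byte first, regardless of the host's native int size. */
void put_u64(unsigned char out[8], uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: read back a value written by put_u64(). */
uint64_t get_u64(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Small self-check: a value survives the round trip unchanged. */
void check_roundtrip(void)
{
    unsigned char buf[8];
    put_u64(buf, UINT64_C(0x0102030405060708));
    assert(get_u64(buf) == UINT64_C(0x0102030405060708));
}
```

A reader and a writer that both go through helpers like these agree on the layout even if one was built where long is 4 bytes and the other where it is 8.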

So I got to thinking, and wrote a script that tries to make almost 
all swish-e integer types of the same size, regardless of the 
platform. It's pasted below.

What I found was that the resulting swish-e worked on the 64-bit 
system I tried, but not on the 32-bit system (with the caveats in the 
script). It would be great to get the indexes fully portable, though!
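
One of the caveats listed in the script's comments is that printf() formats have to change along with the types. As a rough illustration only, this C sketch uses the standard PRId64 macro from <inttypes.h>; the function names format_count and check_format_count are made up for the example and do not come from swish-e.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical example (not swish-e code): format a 64-bit counter.
 * A plain "%d" no longer matches once the variable is an int64_t;
 * PRId64 expands to the correct conversion for the platform. */
int format_count(char *buf, size_t len, int64_t count)
{
    return snprintf(buf, len, "count=%" PRId64, count);
}

/* Small self-check: a value too large for 32 bits formats in full. */
void check_format_count(void)
{
    char buf[32];
    int n = format_count(buf, sizeof buf, INT64_C(5000000000));
    assert(n == 16);  /* "count=" is 6 characters, plus 10 digits */
}
```

The compiler's format warnings (the "build warnings" the script comments mention) are a reasonable way to find the call sites that need this treatment.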

Again, the script I used is below; hopefully my email program won't 
mangle it (let me know if you need a copy directly). I'm very 
interested to hear feedback from other users and developers!

  Josh Rabinowitz
  Author of "How To Index Anything" and "Indexing Arbitrary Data Using 
Perl and Swish-e"

######### begin script #############

#!/usr/bin/perl -w
use strict;

# converts swish-e source code to all 64-bit integers.
# --> in progress.
# Copyright 2007 Josh Rabinowitz

# call main()
main();

# main()
sub main {
     # #include <stdint.h> has the uint... typedefs used below
     # note that 1) stdint.h needs to be included in mem.h and swish.h
     # 2) wait() still needs a true 'int' in http
     # and 3) printf() formats need to be changed to match
     # see build warnings for more.
     # this works on 64-bit CentOS 5, but not 32-bit CentOS 5,
     # on which swish-e -h says:
     #  err: Missing switch character at ''.
     #  Use -h for options.
     my @regexes = (
         q{ s/ \bunsigned\s+long\s+int\b /uint64_t/gx },
         q{ s/ \bunsigned\s+long\b       /uint64_t/gx },
         q{ s/ \bunsigned\s+int\b        /uint64_t/gx },
         q{ s/ \blong\s+long\s+int\b     /int64_t/gx },
         q{ s/ \blong\s+long\b           /int64_t/gx },
         q{ s/ \blong\s+int\b            /int64_t/gx },
         q{ s/ \bint\b                   /int64_t/gx },
     );
     my @files = glob( "src/*.c src/*.h");
     for my $file (@files) {
         print "$file\n";
         _apply_regexes( $file, @regexes );
     }
}

# _apply_regexes( $file, @search_and_replace_regexes )
# backs up $file to $file.bak, and
# applies supplied regexes to the lines of a file.
sub _apply_regexes {
     my ($file, @regexes) = @_;
     # changes a file by applying the supplied regexes to each line
     my $tmpfile = "$file.tmp";
     open(my $rfh, "<", $file)    || die "$0: Can't open $file: $!";
     open(my $wfh, ">", $tmpfile) || die "$0: Can't open $tmpfile: $!";  # clobber old $file.tmp
     print "Applying regexes to file $file:\n" . join("\n", @regexes) . "\n";
     while(<$rfh>) {
         for my $r (@regexes) {
             # $r should operate on $_ !
             eval $r;
             die "$0: Error in regex: $r: $@" if $@;
         }
         # $_ still ends with its newline (no chomp above), so don't add another
         print $wfh $_;
     }
     close($rfh) || die "$0: Can't close $file: $!";
     close($wfh) || die "$0: Can't close $tmpfile: $!";
     rename( $file, "$file.bak" ) || die "$0: Can't rename $file to $file.bak: $!";
     rename( $tmpfile, $file ) || die "$0: Can't rename $tmpfile to $file: $!";
}

######### end script #############

-- Josh Rabinowitz                  --
-- SkateTalk Chat Systems(tm)    --
Received on Fri Dec 14 12:48:17 2007