At 09:58 AM 08/16/02 -0700, David VanHook wrote:
>OK, I fiddled around with the meta tag names, but that appeared not to make
>any difference. Still not working on multiple Meta tags. But it does sound
>like maybe using the new 2.1dev is a good idea, anyway -- should I just pick
>the latest daily snapshot?

Yes. I actually use cvs every place I use swish -- makes it really easy to
get updates or to fetch a given version.

>Once we've got it installed and config'd, we'll be using this on a website
>which gets about 1 million hits per week, and about 3,000 searches a day.
>Is the dev version ready for that kind of spotlight?

Two a minute? I sure hope so.

You can either fork every request to swish, or use the swish library and
embed swish into your application. Here's a quick benchmark of an index
with 50,000 files searching for "we or you or them":

  Benchmark: timing 3000 iterations of fork_swish, library_swish...
    fork_swish: 148 wallclock secs
      ( 4.44 usr 1.43 sys + 111.02 cusr 31.07 csys = 147.96 CPU)
      @ 511.07/s (n=3000)
    library_swish: 62 wallclock secs
      (37.51 usr + 24.53 sys = 62.04 CPU) @ 48.36/s (n=3000)

  Fork: 3000 requests, 2067000 total results
  Library: 3000 requests, 2067000 total results

Note that the 511.07/s shown for fork_swish counts only the parent's own
CPU time (4.44 usr + 1.43 sys), not the time spent in the forked swish-e
processes, so compare the wallclock figures instead: 148 seconds forking
against 62 using the library.

Even faster for a simple single keyword search, "hello":

    library_swish: 26 wallclock secs
      (13.82 usr + 11.76 sys = 25.58 CPU) @ 117.28/s (n=3000)

  Library: 3000 requests, 765000 total results
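
For anyone who wants to redo the arithmetic by elapsed time, here is a rough sketch. The numbers are copied from the benchmark output and from the 3,000 searches a day quoted in the question; the hash layout and variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Figures copied from the benchmark output above.
# The hash layout and the labels are illustrative only.
my %runs = (
    'fork, 3-word OR query'    => { n => 3000, wall => 148, results => 2_067_000 },
    'library, 3-word OR query' => { n => 3000, wall => 62,  results => 2_067_000 },
    'library, single keyword'  => { n => 3000, wall => 26,  results => 765_000 },
);

# Load quoted in the question: about 3,000 searches a day.
my $needed_per_sec = 3000 / ( 24 * 60 * 60 );

my %per_sec;
for my $name ( sort keys %runs ) {
    my $r = $runs{$name};

    # Throughput by elapsed (wallclock) time, not by CPU time
    $per_sec{$name} = $r->{n} / $r->{wall};

    printf "%-26s %6.1f searches/s, %4d results per search, %5.0fx the quoted load\n",
        $name, $per_sec{$name}, $r->{results} / $r->{n},
        $per_sec{$name} / $needed_per_sec;
}
```

By wallclock that works out to roughly 20 searches a second when forking and roughly 48 a second through the library for the three-word query, both far above the quoted load.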
>Or will there be a new
>stable release sometime soon which perhaps we should wait for?

You should see the pile on my desk. I wouldn't wait.

Disclaimer: Benchmarks are never right. Here's the code:

moseley@bumby:~/swish-e/src$ cat bench.pl
#!/usr/local/bin/perl -w
use strict;
use SWISHE;
use Symbol;
use Benchmark;

# Counters updated by the two benchmarked subs
my $fork_count_recs = 0;
my $fork_count = 0;
my $lib_count_recs = 0;
my $lib_count = 0;

# The timings quoted above used 'we or you or them' and 'hello'
my $query = 'food not hello';

# Open the index once; the library version reuses this handle
my $handle = SwishOpen( 'index.swish-e' )
    or die "Failed to open index";

timethese( 3000, {
    'library_swish' => \&library_swish,
    'fork_swish'    => \&fork_swish,
});

print "Fork: $fork_count requests, $fork_count_recs total results\n";
print "Library: $lib_count requests, $lib_count_recs total results\n";

# Search through the swish library inside this process
sub library_swish {
    my $num_results = SwishSearch($handle, $query, 1, '','' );

    if ( $num_results <= 0 ) {
        # Zero means no hits; a negative value is an error code
        print( $num_results ? SwishErrorString( $num_results ) : 'No Results' );
        my $error = SwishError( $handle );
        print "\nError number: $error\n" if $error;
        return; # or next.
    }

    # Fetch every result so the work matches reading all of swish-e's output
    my @recs;
    while ( (my @rec = SwishNext( $handle )) ) {
        push @recs, \@rec;
    }

    $lib_count_recs += @recs;
    $lib_count++;
}

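# Search by forking and exec'ing the swish-e binary, reading its output from a pipe
sub fork_swish {
    my $swish = gensym;

    # Forking open: the parent gets the child's pid, the child gets 0
    my $pid = open( $swish, '-|' );
    die "failed to fork" unless defined $pid;

    if ( !$pid ) {
        # Child: list-form exec, so the query is not passed through a shell
        exec( "./swish-e", '-w', $query, '-H0' )
            or die "failed to exec";
    }

    # Parent: read everything the child printed
    my @recs = <$swish>;
    $fork_count_recs += @recs;
    $fork_count++;
}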
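
If you want to see the fork overhead on your own machine without swish-e or an index, something like the following should do as a rough sketch. The `in_process`, `forked`, and `wallclock` subs and the five "result" lines are stand-ins invented for this example, not swish-e calls; the child is just another perl started through `$^X`. It times by elapsed seconds with Time::HiRes rather than using Benchmark.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Stand-in for a library search: returns the same lines the child prints.
# Hypothetical placeholder, not a swish-e call.
sub in_process {
    return map { "result $_\n" } 1 .. 5;
}

# Same output, but produced by a freshly forked and exec'd perl.
sub forked {
    my $pid = open( my $child, '-|' );
    die "failed to fork: $!" unless defined $pid;

    if ( !$pid ) {
        # Child: list-form exec bypasses the shell, so no quoting is needed
        exec( $^X, '-e', 'print "result $_\n" for 1 .. 5' )
            or die "failed to exec: $!";
    }

    my @lines = <$child>;
    close $child;    # waits for the child to exit
    return @lines;
}

# Run $code $n times; return elapsed wallclock seconds and total lines seen.
sub wallclock {
    my ( $n, $code ) = @_;
    my $lines = 0;
    my $start = time;
    for ( 1 .. $n ) {
        my @got = $code->();
        $lines += @got;
    }
    return ( time - $start, $lines );
}

my $n = 50;
my ( $t_lib,  $lines_lib )  = wallclock( $n, \&in_process );
my ( $t_fork, $lines_fork ) = wallclock( $n, \&forked );

printf "in-process: %.4f s for %d calls, %d lines\n", $t_lib,  $n, $lines_lib;
printf "forked:     %.4f s for %d calls, %d lines\n", $t_fork, $n, $lines_fork;
```

Both versions return the same lines, so any difference in elapsed time is the cost of starting a process per request. The absolute numbers will vary by machine and say nothing about swish-e itself.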
--
Bill Moseley
mailto:moseley@hank.org
Received on Fri Aug 16 18:09:13 2002