Re: swish-e - tuning

From: <moseley(at)not-real.hank.org>
Date: Wed Aug 27 2003 - 14:29:24 GMT
On Wed, Aug 27, 2003 at 06:07:17AM -0700, Aaron Bazar wrote:
> Good Day everyone!
> 
> I am working on tuning my swish-e setup. I am using speedycgi. Here are some
> of my observations/benchmarks. I used the apache benchmark software. I also
> tried a few different set ups. Here are the results...
> 
> perl with API simple highlight ----> 2 requests/second
> perl with swish-e binary and simple highlight ---> 4 requests/second
> 
> perl with API no highlighting ===> 3.23 requests/second
> perl with swish-e no highlighting ===> 4.74 requests/second
> 
> 
> speedycgi with highlighting and API ---> 5.5 requests/second
> speedycgi highlighting and swish-e ----> 9 requests/second
> 
> 
> speedycgi with no highlighting 10 requests/second using binary
> speedycgi with no highlighting 17 requests/second using api
> 
> 
> 
> So, here is my conclusion from these tests: Only use the API if you are
> going to use NO highlighting. If you use highlighting, it seems to me that
> using the swish-e binary is faster. This is not what I expected.

Seems unlikely.  Can you post a sample test script?  

In message http://swish-e.org/archive/5636.html I posted:

<quote>

 ab -n 2000 -c 10 http://localhost/swish.cgi?query=install

and that was returning about 200 hits.

                use_library=0             use_library=1
              --------------------  ----------------------
  mod_cgi           3.7                      3.7
  mod_perl          8.9                     30.0
  SpeedyCGI         8.6                     26.0

</quote>

But those were without highlighting enabled.
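
To put numbers on that table, here is a small sketch that computes the
speedup of use_library=1 over use_library=0 for each row.  The figures
are copied from the table above (requests/second); the names "rates"
and "speedup" are just illustrative.

```python
# Requests/second from the quoted table (no highlighting).
# Each entry is (use_library=0, use_library=1).
rates = {
    "mod_cgi": (3.7, 3.7),
    "mod_perl": (8.9, 30.0),
    "SpeedyCGI": (8.6, 26.0),
}


def speedup(binary_rps, library_rps):
    """Return how many times faster the library run was than the binary run."""
    return library_rps / binary_rps


for name, (binary_rps, library_rps) in rates.items():
    print(f"{name:10s} {speedup(binary_rps, library_rps):.1f}x")
```

Under plain mod_cgi the library gives no gain, since every request
starts a fresh process anyway; under the persistent environments it is
roughly a threefold gain.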


> Here is my theory, and I suspect one of the developers has a better
> explanation (or my set up is flawed). When using the API a search query is
> performed by the cgi script (speedycgi process) AND sorting is done by the
> script, one after the other. When the swish-e binary is used, another
> process is launched to get the data and the cgi script parses the results.
> In this case, the slower cgi script only is doing one job and the faster
> binary is doing the other job. Perhaps this is why I got the results that I
> did.

No, the same swish-e code is sorting in both cases.

One difference between using the API and running the swish-e binary is
that with the binary you have to fork and exec to run swish-e (so you
are forking the entire web server process).  These days forking is not
so expensive, but at one point (or perhaps on some systems) it was
significant.  Here's some more info:

 http://perl.apache.org/docs/1.0/guide/performance.html#Forking_and_Executing_Subprocesses_from_mod_perl
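
As a rough illustration of the per-request process cost, the sketch
below times an in-process function call against launching a child
process for the same trivial job.  It is not swish-e code: both
"search" functions are hypothetical stand-ins, and the child here is a
Python interpreter, which is much heavier to start than a small C
binary, so treat the numbers as indicative only.

```python
import subprocess
import sys
import time


def search_in_process(word):
    # Stand-in for a library call made inside the already-running process.
    return word.upper()


def search_via_child(word):
    # Stand-in for running an external binary: one new process per request.
    code = "import sys; print(sys.argv[1].upper())"
    done = subprocess.run(
        [sys.executable, "-c", code, word],
        capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()


def timed(func, word, repeats=5):
    """Return the last result and the mean wall-clock seconds per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = func(word)
    return result, (time.perf_counter() - start) / repeats


in_result, in_secs = timed(search_in_process, "install")
child_result, child_secs = timed(search_via_child, "install")
print(f"in-process:    {in_secs * 1e6:.1f} us per call")
print(f"child process: {child_secs * 1e6:.1f} us per call")
```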

The other difference is that with the API you can open the index file 
*once* and then make repeated queries on that *open* index file.  This 
does make a difference when using swish-e.
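
The open-once pattern looks roughly like this in a persistent process.
This is only a sketch: "FakeIndex", "get_index" and "handle_request"
are made-up names, not the swish-e API, and the index contents are
invented for the example.

```python
class FakeIndex:
    """Hypothetical stand-in for an index handle; not the swish-e API."""

    opens = 0  # counts how many times an index was opened

    def __init__(self, path):
        FakeIndex.opens += 1
        self.path = path
        # Invented contents, standing in for whatever the index holds.
        self.docs = {"install": ["INSTALL.html", "README.html"]}

    def query(self, word):
        return self.docs.get(word, [])


_index = None  # lives as long as the persistent process does


def get_index(path="index.swish-e"):
    """Open the index on first use and reuse the same handle afterwards."""
    global _index
    if _index is None:
        _index = FakeIndex(path)
    return _index


def handle_request(word):
    # Every request reuses the already-open index.
    return get_index().query(word)


for _ in range(100):
    handle_request("install")
print(FakeIndex.opens)
```

With the binary, by contrast, each request starts a new process that
has to open the index again.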

The highlighting code is so slow that those speedups above get lost as 
noise.

> Anyway, the performance increase in turning off the highlighting, using the
> API, and running under speedycgi is pretty good. Speedycgi is very, very
> easy to get working. I'd try mod_perl, but it does not seem worth it since
> speedy does such a good job.

I'd agree.  SpeedyCGI makes a big difference and is easy to set up and 
use.  mod_perl is not hard to get going, but is a lot more involved.  
Use mod_perl if you have other needs for it.  In benchmarks mod_perl 
will be faster than SpeedyCGI because speedy (when used as 
/usr/bin/speedy under CGI) still requires a fork/exec for each request, 
whereas mod_perl doesn't need one at all.







> 
> Best regards,
> 
> Aaron Bazar
> http://www.buyasundial.com
> 
> 
> 
> 

-- 
Bill Moseley
moseley@hank.org
Received on Wed Aug 27 14:29:39 2003