Re: swish-e - tuning

From: Aaron Bazar <aaronb(at)>
Date: Wed Aug 27 2003 - 15:47:07 GMT
Thanks for the reply.

It must have been too late for benchmark testing last night. Now, I get the
correct results. The API is faster with highlighting. However, I added some
modifications to the cgi script and I think that threw the whole thing off.

I just figured out how to get speedycgi to cache the array I added to the
script and the API is now faster. Sorry for lousy benchmarks.


Aaron Bazar

-----Original Message-----
[]On Behalf Of
Sent: Wednesday, August 27, 2003 10:30 AM
To: Multiple recipients of list
Subject: [SWISH-E] Re: swish-e - tuning

On Wed, Aug 27, 2003 at 06:07:17AM -0700, Aaron Bazar wrote:
> Good Day everyone!
> I am working on tuning my swish-e setup. I am using speedycgi. Here are
> some of my observations/benchmarks. I used the apache benchmark software. I
> tried a few different set ups. Here are the results...
> perl with API, simple highlighting ---> 2 requests/second
> perl with swish-e binary, simple highlighting ---> 4 requests/second
> perl with API, no highlighting ---> 3.23 requests/second
> perl with swish-e binary, no highlighting ---> 4.74 requests/second
> speedycgi with API, highlighting ---> 5.5 requests/second
> speedycgi with swish-e binary, highlighting ---> 9 requests/second
> speedycgi with swish-e binary, no highlighting ---> 10 requests/second
> speedycgi with API, no highlighting ---> 17 requests/second
> So, here is my conclusion from these tests: Only use the API if you are
> going to use NO highlighting. If you use highlighting, it seems to me that
> using the swish-e binary is faster. This is not what I expected.

Seems unlikely.  Can you post a sample test script?

In message I posted:


 ab -n 2000 -c 10 http://localhost/swish.cgi?query=install

and that was returning about 200 hits.

                use_library=0             use_library=1
              --------------------  ----------------------
  mod_cgi           3.7                      3.7
  mod_perl          8.9                     30.0
  SpeedyCGI         8.6                     26.0


But those were without highlighting enabled.
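
To put the table in proportion, the ratios can be worked out directly. This is a small Python sketch that uses only the figures above; the dictionary layout and the function name are just for illustration.

```python
# Figures copied from the table above: (use_library=0, use_library=1).
results = {
    "mod_cgi": (3.7, 3.7),
    "mod_perl": (8.9, 30.0),
    "SpeedyCGI": (8.6, 26.0),
}


def speedup(binary_rate, library_rate):
    """Ratio of the use_library=1 figure to the use_library=0 figure."""
    return library_rate / binary_rate


speedups = {name: speedup(*rates) for name, rates in results.items()}

for name, ratio in speedups.items():
    print(f"{name:<10} {ratio:.2f}x")
```

Under plain mod_cgi the library makes no difference in these figures, while the two persistent environments come out at roughly three times the rate of the binary.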

> Here is my theory, and I suspect one of the developers has a better
> explanation (or my set up is flawed). When using the API a search query is
> performed by the cgi script (speedycgi process) AND sorting is done by the
> script, one after the other. When the swish-e binary is used, another
> process is launched to get the data and the cgi script parses the results.
> In this case, the slower cgi script only is doing one job and the faster
> binary is doing the other job. Perhaps this is why I got the results that
> I did.

No, the same swish-e code is sorting in both cases.

The difference between running the API and the swish-e binary is that with
the binary you have to fork and exec to run swish-e (so you are forking
the entire web server process).  These days forking is not so expensive,
but it seems at one point (or perhaps on some systems) it was
significant.  Here's some more info:

The other difference is that with the API you can open the index file
*once* and then make repeated queries on that *open* index file.  This
does make a difference when using swish-e.
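
The "open once, query many times" pattern can be sketched in a few lines. This is only an illustration in Python: `FakeIndex`, `get_index`, and `handle_request` are hypothetical names, the index contents are made up, and none of this is the swish-e API. It assumes a persistent process (as with SpeedyCGI or mod_perl) where module-level state survives between requests.

```python
class FakeIndex:
    """Hypothetical stand-in for an index handle; NOT the swish-e API."""

    opens = 0  # counts how many times an index was "opened"

    def __init__(self, path):
        FakeIndex.opens += 1
        self.path = path
        # Pretend the index file was read and parsed here (made-up data).
        self.docs = {"install": ["INSTALL.html", "README.html"]}

    def query(self, word):
        return self.docs.get(word, [])


_index = None  # survives between requests in a persistent process


def get_index(path="index.swish-e"):
    """Open the index on first use, then reuse the same handle."""
    global _index
    if _index is None:
        _index = FakeIndex(path)
    return _index


def handle_request(word):
    """One search request: reuses the already-open index."""
    return get_index().query(word)


for _ in range(3):
    hits = handle_request("install")

print(FakeIndex.opens)  # number of times the index was opened
```

Under plain CGI each request starts a fresh process, so a cache like `_index` would be empty every time and the index would be reopened on every request.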

The highlighting code is so slow that those speedups above get lost as

> Anyway, the performance increase in turning off the highlighting, using
> API, and running under speedycgi is pretty good. Speedycgi is very, very
> easy to get working. I'd try mod_perl, but it does not seem worth it since
> speedy does such a good job.

I'd agree.  speedyCGI makes a big difference and is easy to set up and
use.  mod_perl is not hard to get going, but is a lot more involved.
Use mod_perl if you have other needs for it.  In benchmarks mod_perl
will be faster than speedyCGI because speedy (when used as
/usr/bin/speedy under CGI) still requires a fork/exec where mod_perl
doesn't at all.
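
For a rough feel of what a per-request fork/exec costs on a given machine, something like the following Python sketch can be used. It is not a swish-e benchmark: the child here is a whole Python interpreter (an assumption made to keep the example self-contained), which is much heavier to start than a small C binary, so the absolute numbers will not match swish-e. The function names are illustrative.

```python
import subprocess
import sys
import time


def in_process_work():
    """Trivial stand-in for work done inside the running interpreter."""
    return sum(range(100))


def child_process_work():
    """The same work, but paying for a new process on every call."""
    out = subprocess.run(
        [sys.executable, "-c", "print(sum(range(100)))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(out.stdout)


def time_calls(func, n):
    """Average wall-clock seconds per call over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        func()
    return (time.perf_counter() - start) / n


n = 10
per_call_in_process = time_calls(in_process_work, n)
per_call_child = time_calls(child_process_work, n)
print(f"in-process:    {per_call_in_process * 1e6:.1f} us per call")
print(f"child process: {per_call_child * 1e6:.1f} us per call")
```

The gap between the two numbers is the process start-up overhead that a persistent interpreter avoids.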

> Best regards,
> Aaron Bazar

Bill Moseley
Received on Wed Aug 27 15:47:36 2003