Is the PHP frontend to swish-e less of a server burden?
Thanks for the information below. I am looking into SpeedyCGI now. It looks
[mailto:firstname.lastname@example.org]On Behalf Of Bill Moseley
Sent: Tuesday, July 22, 2003 8:36 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: cgi question / core
On Tue, Jul 22, 2003 at 02:29:39PM -0700, Aaron Bazar wrote:
> Hi Everybody!
> One of the sites that I set Swish-e up on is a success! The only problem is
> that it is killing my poor server. I am also finding a large core file in
> the web directory, basically daily. Does anybody have any experience with core
> files and the cgi script? Is there any way to lessen the load on the
> of the cgi script, without using mod_perl (I could not get it to work
Well, you are going to get a longer answer than you probably wanted....
Yes. A few things.
SpeedyCGI: http://daemoninc.com/SpeedyCGI/ -- you don't need to be root to
install it. Many OSes have it as a package so you might be able to easily
install it (it was an apt-get on my Debian machine).
Yesterday I updated the swish.cgi script in CVS to work with SpeedyCGI
-- which was only changing the shebang line at the top of the program
plus I set it up to cache the configuration so it's only read upon
Really, all you need is to install SpeedyCGI and change the top line to
#!/usr/bin/speedy -w -- -t60 -M3
Let's see, where did I put that envelope.... oh ya, I ran Apache
Benchmark on localhost and doing a simple search without highlighting a
description I went from 3.7 requests/second to about 30/sec. mod_perl
was about 47/sec so it's a bit faster but not that much.
Now that was also using the SWISH::API module instead of running the
swish-e binary. Here's the swish.cgi configuration I was using:
moseley@bumby:~/apache$ cat .swishcgi.conf
return {
    title       => "This is my title $$ --",
    swish_index => '/home/moseley/apache/index.swish-e',
    use_library => 1,
};
So the numbers looked like this. I was using Apache Benchmark as:
ab -n 2000 -c 10 'http://localhost/swish.cgi?query=install'
and that was returning about 200 hits.
mod_cgi      3.7    3.7
mod_perl     8.9   30.0
SpeedyCGI    8.6   26.0
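As a rough sanity check on those figures, the gain over plain mod_cgi is just a ratio. This is only a sketch: the function name speedup is made up here, and it assumes each row's second figure is requests per second for the same test as the 3.7 mod_cgi baseline.

```shell
# Print how many times faster $2 requests/second is than the baseline $1,
# rounded to one decimal place. (speedup is a hypothetical helper name.)
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", new / base }'
}

speedup 3.7 26.0   # SpeedyCGI vs mod_cgi, prints 7.0
speedup 3.7 30.0   # mod_perl vs mod_cgi, prints 8.1
```

So on these numbers SpeedyCGI gets most of the way to mod_perl without needing a mod_perl build.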
Just running the search form page (i.e. without a query) mod_perl was
doing 76 hits per second. ;)
Again, this was without any StoreDescription setting in the config. In
previous tests the phrase highlighting has been the limiting factor.
And also this is not using a templating system, rather just the default
perl output generation.
Oh, heck, let me try with a templating system:
With the default output setup:  Requests per second: 25.95 [#/sec] (mean)
Template-Toolkit:               Requests per second: 21.58 [#/sec] (mean)
HTML::Template:                 Requests per second: 13.37 [#/sec] (mean)
I'm sure that HTML::Template can be better tuned. I'm not caching the
template object or using JIT.
Now, going back to the default output, but enabling StoreDescription,
you can see how the term highlighting kills things:
SWISH::PhraseHighlight:         Requests per second:  1.14 [#/sec] (mean)
SWISH::DefaultHighlight:        Requests per second:  1.56 [#/sec] (mean)
SWISH::SimpleHighlight:         Requests per second:  7.29 [#/sec] (mean)
NONE (showing first 100 chars): Requests per second: 15.53 [#/sec]
Need to do some caching there. Anyone want to write highlighting code
in C? Enough benchmarking.
You can specify with speedy how many "back end" processes to run, which
can help prevent spiders from hammering your script so hard the load
average goes through the roof. Probably not as effective as a tuned
mod_perl server since the requests are still being processed by Apache,
but should help.
The other thing you might want to try is changing the highlighting
module. The "PhraseHighlight" module is way slow -- it has to parse the
entire description into words and then nested loops look for phrases to
You will want to do your own benchmarking. I'm sure mine are flawed in
some way. Try various settings of the -M SpeedyCGI parameter.
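One way to try different -M values is to rewrite the shebang line between benchmark runs. A minimal sketch, assuming speedy lives at /usr/bin/speedy as in the shebang line shown earlier and keeping the same -t60 timeout; the function name set_speedy_backends is made up for illustration.

```shell
# Rewrite the first line (shebang) of a CGI script so that SpeedyCGI runs
# at most $2 backend processes (-M). set_speedy_backends is a hypothetical
# helper name; /usr/bin/speedy is an assumed install path.
set_speedy_backends() {
    script=$1
    max=$2
    tmp="$script.tmp.$$"
    sed '1s|.*|#!/usr/bin/speedy -w -- -t60 -M'"$max"'|' "$script" > "$tmp" &&
        cat "$tmp" > "$script" &&   # cat into place keeps the script's permissions
        rm -f "$tmp"
}

# Possible usage (not run here): benchmark a few settings in turn.
#   for m in 1 3 5 10; do
#       set_speedy_backends swish.cgi "$m"
#       ab -n 2000 -c 10 'http://localhost/swish.cgi?query=install'
#   done
```

Work on a copy of swish.cgi while experimenting, since the function overwrites the file it is given.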
As for the core files, I often don't have much luck with them. I think
you run gdb with something like (assuming it's swish-e that is core
gdb /usr/local/bin/swish-e /path/to/core
then use "bt" or "where" to show a backtrace. You might modify
swish.cgi to log the pid and the request when each request starts, and
write another log entry when it finishes, to get an idea of which
request is causing the core.
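Once such a log exists, finding the suspect request is a matter of pairing start and end entries. The sketch below assumes a made-up log format of one event per line, "<pid> start <request>" and "<pid> end"; swish.cgi does not write this by default, and the function name unfinished_requests is hypothetical.

```shell
# Assumed (hypothetical) log format, one event per line:
#   101 start query=install
#   101 end
#   102 start query=foo+bar
# Print each pid and request that has a "start" entry with no later "end",
# i.e. requests that may have died partway through.
unfinished_requests() {
    awk '
        $2 == "start" { req = $0; sub(/^[^ ]+ start ?/, "", req); pending[$1] = req }
        $2 == "end"   { delete pending[$1] }
        END           { for (pid in pending) print pid, pending[pid] }
    ' "$1"
}

# Possible usage (the log path is an assumption):
#   unfinished_requests /tmp/swishcgi-requests.log
```

Because a persistent process reuses its pid across requests, a later "start" for the same pid simply replaces the earlier one, so only the last unfinished request per pid is reported.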
Would hitting a resource limit (set with ulimit, for example) cause a
Received on Wed Jul 23 13:01:01 2003