I'm looking for advice on tracking down a segfault in swish.
I'm seeing segfaults from swish 2.0.1, but I think they are resource
related, as they often happen around the same time of day. This is new to
version 2; I didn't see it in 1.3.x.
For example, yesterday they happened at these times:
These were all forked from different Apache child processes, and each had
its SEGV flag set when reaped via close(). I see a few of these every day.
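For what it's worth, a wrapper script around the swish invocation can log
these deaths as they happen: a shell reports a signal death as exit status
128 + signal number, so SIGSEGV (signal 11) shows up as 139. A minimal
sketch (the kill subshell below just stands in for a crashing swish run):

```shell
# When a child dies on a signal, the shell reports 128 + signo as its
# exit status; SIGSEGV is signal 11, so status 139 means a segfault.
# The "kill -s SEGV" subshell stands in for a crashing swish run.
sh -c 'kill -s SEGV $$'
status=$?
if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
else
    echo "exited with status $status"
fi
```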
Since the times are so close together, and the swish processes were all
forked from different Apache children that continued to run, I would assume
it's a system resource issue. Perhaps the system is low on memory, a memory
allocation inside swish fails, and swish never checks for the error.
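One way to test the failed-allocation theory without waiting for the next
memory crunch is to cap the process's address space with ulimit before
running swish, so big allocations fail on demand. A sketch; the 10000 KB
cap is an arbitrary example, and on Solaris's /bin/sh the relevant flag
may be -d (data segment size) rather than -v:

```shell
# Run swish under a small virtual-memory cap so that large allocations
# fail deterministically; if it segfaults here too, an unchecked
# malloc() return inside swish is a plausible culprit.
(
    ulimit -v 10000      # cap this subshell (and its children) at ~10 MB
    ulimit -v            # confirm the limit that swish would inherit
    # ./swish -w 'some query'    # hypothetical invocation; adjust to taste
)
```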
The odd thing is that the machine has *really* run out of memory a few
times in the last few days: I get a bunch of "Failed to Fork" messages when
attempting to run swish. I would have assumed that if the swish segfaults
were due to low memory resources that I would see swish segfaults right
before and right after the "Failed to Fork" messages -- assuming that there
must be some point between enough memory and not enough memory where I
could fork swish, yet swish would segfault. But I'm guessing. My Apache
children are about 11M each, but a good chunk of that is shared. I'm not
clear how much memory is needed for the initial fork of the Apache child.
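ps can at least put numbers on the per-child question: -o vsz reports
virtual size and -o rss the resident portion, both in KB (a sketch; $$
stands in for an Apache child's PID, and column support varies slightly
between Solaris and other systems):

```shell
# Print PID, virtual size (KB), and resident set size (KB) for one
# process; substitute an Apache child's PID for $$ (this shell).
ps -o pid= -o vsz= -o rss= -p $$
```

The gap between VSZ and RSS gives a rough sense of how much of that 11M is
actually private, resident memory per child.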
So, for the programmers: are there any tricks for tracking down the cause
of the swish segfaults? Is there a way I can force it to dump core for a
gdb backtrace after the fact? Does swish need to be compiled differently
to get a good backtrace?
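On forcing a core for gdb, the usual checklist is: raise the core-file size
limit in the environment Apache starts from (the forked swish inherits it),
rebuild swish with -g and without optimization so the backtrace has symbols,
then point gdb at the binary and the core. A sketch, with hypothetical paths:

```shell
# (a) Allow core dumps for this shell and everything it starts; putting
#     this in Apache's startup script lets forked swish inherit it.
ulimit -c unlimited
ulimit -c    # verify the limit took effect

# (b) Rebuild with debugging symbols, e.g.:
#       CFLAGS="-g" ./configure && make
# (c) After the next crash, pull a backtrace from the core file:
#       gdb /usr/local/bin/swish core
#       (gdb) bt
```

One caveat: many kernels refuse to write core files for processes that have
changed UID (as Apache children started by root do), so if no core ever
appears, that may be why.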
This is running on:
SunOS 5.6 Generic_105181-17 sun4u sparc SUNW,Ultra-Enterprise
with 2GB of RAM.
top shows 3GB of swap, but swap -s shows:
total: 455904k bytes allocated + 184008k reserved = 639912k used, 4103664k
available
"Funny" how I can get 100 "Failed to Fork" messages in my log file but
/var/adm/messages doesn't note the resource low problem.
Received on Wed Sep 27 15:09:00 2000