Re: swish-e 2.4.0 seg fault followup/ resolution

From: Dave Stevens <dstevens(at)not-real.roaddog.com>
Date: Mon Nov 10 2003 - 11:44:20 GMT
dave@webaugur.com wrote:

> On Sun, 2003-11-09 at 22:55, Dave Stevens wrote:
>> There are a couple of significant issues with the current public RH9
>> glibc
>> implementation.  ...  The issues with this are well documented,
>> so I won't bother the list with them.  If anyone would like any pointers
>> email me and I'll provide the info or elaborate on the list.
>
> A pointer would be handy.  I'm curious since I've not experienced any
> problems on RH9 with glibc-2.3.2-27.9.


That certainly adds another twist.  I have a 7.3 install that runs an
identical swish-e setup without a hitch.  Some irregularities were
discovered in April, and one of the first posts I happened on was from the
GNU glibc dev list, where someone from Sun enquired about a similar
instance.  Googling
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=in+free+%28%29+from+%2Flib%2Ftls%2Flibc.so.6&btnG=Google+Search
will yield several similar circumstances.  I'll point to a few specific
references below.

This issue is not the same as the "downgrade" issue RedHat had, where users
were inadvertently downgrading i686 installs to i386 installs.  My
understanding is that previously it didn't much matter whether one used
the i686 or i386 architecture, but with the upgrade it did, particularly
with RPM installs, as there is (or was at the time) no checking to make
sure one didn't overwrite the proper install.  My understanding is that
the i686 requirement is due to a new threading model in the kernel
allowing for POSIX compliance.  If one were to upgrade via RPM without
manually checking the arch, it would install an i386 glibc over the
required i686 install.  The result turns the machine into a paperweight.
;-)  The downgrade issue is a separate boondoggle, not related to the issue
I have on this install.
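
For anyone who wants to do the manual arch check before upgrading, here is
a minimal sketch.  It assumes the rpm command is on the PATH, and the
ranking table and both function names are my own illustration, not
anything RPM itself provides.

```python
import subprocess

# Illustrative ordering of x86 package architectures (an assumption made
# for this sketch, not RPM's own compatibility logic).
ARCH_RANK = {"i386": 0, "i486": 1, "i586": 2, "i686": 3}


def installed_glibc_arch():
    """Ask rpm which architecture the installed glibc package was built for."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{ARCH}\n", "glibc"],
        capture_output=True, text=True, check=True,
    )
    # If several glibc packages are installed, only the first one is reported.
    return result.stdout.split()[0]


def is_arch_downgrade(installed, candidate):
    """Return True if `candidate` is a lower architecture than `installed`."""
    return ARCH_RANK[candidate] < ARCH_RANK[installed]
```

Something like is_arch_downgrade(installed_glibc_arch(), "i386") could then
be checked before installing an i386 glibc RPM over an existing one.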

Sun's report of an issue nearly identical to what is happening to me is
here:
http://developer.java.sun.com/developer/bugParade/bugs/4885046.html
(registration required).  Basically, anything using a JVM on RH9 with glibc
2.3.2-27.9 seg faults in the same way as my issue.


The RedHat Bugzilla entry is here:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=90301  The specific
cause, according to one of the RedHat engineers, was "due to a missing check in
unregister_atfork when freeing the dso handles. The patched
unregister_atfork does work but isn't currently available to redhat 9
users in glibc".

The Google search listed above also turns up other people experiencing
the same issue, a segfault on exiting the program, with
traceback results similar to mine.  There are also several compiling and
linking issues with various software.  As of last Wednesday an engineer at
Sun was still able to replicate the issue with the current RC.  They offered
this as the cause: "We now believe this bug and 4938816 are due to the same
gcc problem - all dynamic libraries share the same __dso_handle, so when a
DSO is unloaded, dlclose() improperly calls the clean-up method of _all_
DSO."

"In 4938816, it caused a crash in compiler thread; in this bug, exit()
will unload DSO one by one, it's very likely the clean-up method of some DSO
is called more than once, so it tries to free the same object twice, causing
a crash in __libc_free.  The fix for 4938816 is to build gcc with a newer
binutils package."
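
To make the quoted explanation a bit more concrete, here is a toy model of
a clean-up registry keyed by a per-library handle.  It is only a sketch of
the idea as I read it, not glibc's actual implementation; the class and
function names are made up for illustration.  It shows only the first half
of the explanation (unloading one library runs every library's clean-up
when the handle is shared), not the double free itself.

```python
class CleanupRegistry:
    """Toy stand-in for a table of clean-up callbacks keyed by DSO handle."""

    def __init__(self):
        self._entries = []  # list of (handle, callback) pairs

    def register(self, handle, callback):
        self._entries.append((handle, callback))

    def finalize(self, handle):
        """Run and drop every callback registered under `handle`."""
        remaining = []
        for entry_handle, callback in self._entries:
            if entry_handle == handle:
                callback()
            else:
                remaining.append((entry_handle, callback))
        self._entries = remaining


def unload_first_library(handles):
    """Register one clean-up per library, unload only the first library,
    and return the names of the libraries whose clean-up ran."""
    registry = CleanupRegistry()
    ran = []
    for name, handle in handles.items():
        registry.register(handle, lambda name=name: ran.append(name))
    first = next(iter(handles))
    registry.finalize(handles[first])
    return ran


# Each library has its own handle: only libA is cleaned up.
distinct = unload_first_library({"libA": 1, "libB": 2, "libC": 3})
# All libraries share one handle: unloading libA cleans up everything.
shared = unload_first_library({"libA": 1, "libB": 1, "libC": 1})

print(distinct)  # ['libA']
print(shared)    # ['libA', 'libB', 'libC']
```

In the shared case libB and libC have had their clean-up run while they are
still loaded, which is the situation the Sun engineer describes as leading
to an object being freed twice later on.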

I suppose my question is, what compiler version are you using to build
swish-e under RH9?  I'm using gcc 3.2 on both the 9 and 7.3 installs, and
the 7.3 install works as expected under its glibc.


> Rather than upgrade your system glibc you could install the RC RPM to
> another location and use LD_LIBRARY_PATH just for your application.

That's a good idea.  Some engineers from Sun working around the issue are
setting LD_ASSUME_KERNEL to 2.4.1, which disables NPTL.  I haven't tried
it for this issue yet.  I was planning to use a JVM, specifically the
Blackdown JVM, to support messaging on the site, so the JVM issues have some
relevance to me.  If it seems to be a real issue I'll roll back to 7.3 or
move forward to Fedora.
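
For reference, both workarounds come down to environment variables set for
a single process.  Here is a minimal sketch of a wrapper that applies them
to one application only.  The function names and the library directory are
hypothetical, and I haven't verified that either setting cures the
segfault.

```python
import os
import subprocess


def workaround_env(base=None, assume_kernel="2.4.1", lib_dir=None):
    """Return a copy of the environment with the workaround variables set.

    `lib_dir` is a hypothetical directory holding an alternate glibc.
    """
    env = dict(os.environ if base is None else base)
    # The workaround reported from Sun: disables NPTL.
    env["LD_ASSUME_KERNEL"] = assume_kernel
    if lib_dir:
        # Search the alternate library directory first, keep any existing path.
        old = env.get("LD_LIBRARY_PATH")
        env["LD_LIBRARY_PATH"] = lib_dir + (":" + old if old else "")
    return env


def run_with_workaround(argv, **kwargs):
    """Run `argv` under the modified environment and return its exit status."""
    return subprocess.run(argv, env=workaround_env(**kwargs)).returncode
```

For example, run_with_workaround(["swish-e", "-c", "swish.conf"],
lib_dir="/opt/glibc-rc/lib") would affect only that one swish-e run and
leave the system glibc alone; the path here is just a placeholder.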

As a swish user for at least the last four or five years, let me know if
there is anything I can do to assist in this, or anything else for that
matter.


Dave
Received on Mon Nov 10 11:57:56 2003