From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 14885 invoked by alias); 24 Jun 2005 21:53:21 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sources.redhat.com
Received: (qmail 14861 invoked by uid 22791); 24 Jun 2005 21:53:17 -0000
Received: from sibelius.xs4all.nl (HELO sibelius.xs4all.nl) (82.92.89.47)
    by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Fri, 24 Jun 2005 21:53:17 +0000
Received: from elgar.sibelius.xs4all.nl (root@elgar.sibelius.xs4all.nl [192.168.0.2])
    by sibelius.xs4all.nl (8.13.0/8.13.0) with ESMTP id j5OLrFsf021669
    for ; Fri, 24 Jun 2005 23:53:15 +0200 (CEST)
Received: from elgar.sibelius.xs4all.nl (kettenis@localhost.sibelius.xs4all.nl [127.0.0.1])
    by elgar.sibelius.xs4all.nl (8.13.4/8.13.3) with ESMTP id j5OLrEvJ007394
    for ; Fri, 24 Jun 2005 23:53:14 +0200 (CEST)
Received: (from kettenis@localhost)
    by elgar.sibelius.xs4all.nl (8.13.4/8.13.4/Submit) id j5OLrEPi014281;
    Fri, 24 Jun 2005 23:53:14 +0200 (CEST)
Date: Fri, 24 Jun 2005 21:53:00 -0000
Message-Id: <200506242153.j5OLrEPi014281@elgar.sibelius.xs4all.nl>
From: Mark Kettenis
To: gdb@sources.redhat.com
CC: gdb@sources.redhat.com
In-reply-to: <20050624151129.GH66895@keyslapper.net> (message from Louis
    LeBlanc on Fri, 24 Jun 2005 11:11:29 -0400)
Subject: Re: aborted thread backtrace stops at sighandler call
References: <20050624151129.GH66895@keyslapper.net>
X-SW-Source: 2005-06/txt/msg00245.txt.bz2

   Date: Fri, 24 Jun 2005 11:11:29 -0400
   From: Louis LeBlanc

   Hey everyone.  I've got an app that seems ok under some pretty heavy
   load, but once in a great while, it blows up during some network
   related operation, particularly host name lookups.  I'm having
   similar problems with other apps (even perl scripts) on the same OS
   (Solaris 8 & 9).

On sparc or i386?

   Well, often these were bus errors and gdb just couldn't nail things
   down for me.
   Finally, I decided to catch these signals (SIGBUS, SIGSEGV) and
   collect what info I could before calling abort() - which preserves
   the stack pretty well according to pstack.  When a problem arises, I
   can look at the pstack output and see quite clearly that the problem
   is this screwy network glitch I've never been able to track down.
   Problem is when it's something else, gdb doesn't seem to be able to
   see the preserved stack.  Anyone have any idea how to get this?

Sorry, but I don't understand your problem.  Is it the fact that gdb's
backtrace is different from the backtrace shown by pstack?

   This is the backtrace for the aborted thread:
   (gdb) bt
   #0  0xfea1f82c in __tbl_2_huge_digits () from /usr/lib/libc.so.1
   #1  0xfe9d0a24 in sysconf () from /usr/lib/libc.so.1
   #2  0xfe9b6ce0 in ascftime () from /usr/lib/libc.so.1
   #3  0x0003c72c in XPCSigCheck (Sig=11, Info=0xfe776ad0, Context=0xfe776818) at xpcsig.c:347
   #4  0xff365b14 in ?? ()
   #5  0xff365b18 in ?? ()
   Previous frame identical to this frame (corrupt stack?)

   But pstack shows this:
   -----------------  lwp# 3 / thread# 3  --------------------
    fea1f82c _lwp_kill (6, 0, fe7765b8, fe776630, 0, 1) + 8
    fe9b6cd8 abort    (df708, 0, 0, 0, 0, 0) + 100
    0003c724 XPCSigCheck (b, fe776ad0, fe776818, 0, 0, 0) + 2c0
    ff365b0c __sighndlr (b, fe776ad0, fe776818, 3c464, 0, 0) + c
    ff35f804 call_user_handler (b, fe776ad0, fe776818, 0, 0, 0) + 234
    ff35f9b4 sigacthandler (b, fe776ad0, fe776818, 7efefeff, 81010100, 0) + 64
    --- called from signal handler with signal 11 (SIGSEGV) ---
    fe9b44e4 strlen   (fe7778c0, 0, fe777850, 0, 0, 0) + 80
    fea08c98 vsnprintf (fe7784c0, c00, fe7778c0, fe779118, 7300, fe7778c0) + 5c
    000d3390 ERROR    (dffa8, 0, 7530, 1, 81010100, 3d740) + 48
    0003d590 make_ssl_connection (fe779368, a060164, 0, fd043b8, 7530, fe77bed0) + 57c
    000300c0 handle_check (10e800, dc290, ffffd438, 1, 0, 5b7550) + 1c08
    000d7bbc spawn    (fcaa2b0, 0, 0, 0, 0, 0) + 20
    ff3657b4 _lwp_start (0, 0, 0, 0, 0, 0)

   BTW, I am using gdb 6.3.50.20050621-cvs - it's the only one I've
   found that doesn't bonk rolling over the end of a thread stack on
   Solaris.