From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 16955 invoked by alias); 19 Nov 2001 19:30:59 -0000
Mailing-List: contact gdb-patches-help@sourceware.cygnus.com; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 16901 invoked from network); 19 Nov 2001 19:30:53 -0000
Received: from unknown (HELO cygnus.com) (205.180.230.5)
  by sourceware.cygnus.com with SMTP; 19 Nov 2001 19:30:53 -0000
Received: from cse.cygnus.com (cse.cygnus.com [205.180.230.236])
  by runyon.cygnus.com (8.8.7-cygnus/8.8.7) with ESMTP id LAA20334
  for ; Mon, 19 Nov 2001 11:30:51 -0800 (PST)
Received: (from kev@localhost)
  by cse.cygnus.com (8.9.3/8.9.3) id MAA16377
  for gdb-patches@sources.redhat.com; Mon, 19 Nov 2001 12:30:45 -0700
Date: Wed, 07 Nov 2001 15:07:00 -0000
From: Kevin Buettner
Message-Id: <1011119193045.ZM16376@ocotillo.lan>
X-Mailer: Z-Mail (4.0.1 13Jan97 Caldera)
To: gdb-patches@sources.redhat.com
Subject: [PATCH RFA] lin-lwp.c: Block SIGCHLD events when attaching
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2001-11/txt/msg00121.txt.bz2

When doing a lin_lwp_attach_lwp(), it is sometimes possible to receive
a SIGCHLD signal thus causing the waitpid() call to fail with EINTR.
This in turn causes the second assert() in lin_lwp_attach_lwp() to
fail.

Reproducing this problem can be somewhat difficult. I've only been able
to reproduce it on a dual processor Linux/x86 machine. I did manage to
reproduce it using the linux-dp program as follows:

    1) Start linux-dp (again, running on an SMP machine).
    2) Determine the process id. Let's call it PID. (Make appropriate
       substitution below.)
    3) Invoke gdb:
        gdb linux-dp
    4) Do:
        set height 0
        break print_philosopher
    5) Do:
        while 1
        attach PID
        continue
        detach
        end
    6) Go get a cup of coffee, take a nap, etc. When you come back,
       you may be fortunate enough to see the following:

        ...
        Breakpoint 1, print_philosopher (n=2, left=95 '_', right=95 '_')
            at ../../../src/gdb/testsuite/gdb.threads/linux-dp.c:105
        105 shared_printf ("%*s%c %d %c\n", (n * 4) + 2, "", left, n, right);
        ../../src/gdb/lin-lwp.c:416: gdb-internal-error: lin_lwp_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
        An internal GDB error was detected. This may make further
        debugging unreliable. Continue this debugging session? (y or n)

Jim Blandy deserves credit for arriving at the above procedure for
reproducing this bug. (The actual test case that we used for tracking
down this problem is different and is able to demonstrate it in a very
short period of time. Having *lots* of running threads greatly
increases the chances of being able to reproduce it quickly.)

The fix is below. Okay to commit?

        * lin-lwp.c (lin_lwp_attach_lwp): Make sure SIGCHLD is in set of
        blocked signals.

Index: lin-lwp.c
===================================================================
RCS file: /cvs/src/src/gdb/lin-lwp.c,v
retrieving revision 1.30
diff -u -p -r1.30 lin-lwp.c
--- lin-lwp.c 2001/10/14 11:30:37 1.30
+++ lin-lwp.c 2001/11/19 19:08:10
@@ -352,6 +352,14 @@ lin_lwp_attach_lwp (ptid_t ptid, int ver
 
   gdb_assert (is_lwp (ptid));
 
+  /* Make sure SIGCHLD is blocked. We don't want SIGCHLD events
+     to interrupt either the ptrace() or waitpid() calls below. */
+  if (! sigismember (&blocked_mask, SIGCHLD))
+    {
+      sigaddset (&blocked_mask, SIGCHLD);
+      sigprocmask (SIG_BLOCK, &blocked_mask, NULL);
+    }
+
   if (verbose)
     printf_filtered ("[New %s]\n", target_pid_to_str (ptid));
 