From: Pedro Alves
To: gdb-patches@sourceware.org
Subject: This a couple problems around PTRACE_EVENT_CLONE handling of signals in the new clone
Date: Mon, 10 Oct 2011 16:52:00 -0000
Message-Id: <201110101751.42903.pedro@codesourcery.com>

This fixes two problems around PTRACE_EVENT_CLONE handling of signals
in the new clone.

 - If some signal other than SIGSTOP comes out first on the new clone
   (because a signal with a number < SIGSTOP was pending, and the LWP
   dequeued it), we currently just pass it on to the inferior.
   GDBserver will need fixing too, and even has a comment to that
   effect:

        /* Pass the signal on.  This is what GDB does - except
           shouldn't we really report it instead?  */

 - It's possible for the core to see the new thread/lwp and ask it to
   stop, queueing it a SIGSTOP, before we have a chance to handle the
   PTRACE_EVENT_CLONE event (e.g., non-stop; info threads; interrupt
   -a; while threads are spawning; see the comment in the patch).  In
   that case, we'd always resume the new thread, while the core would
   forever expect it to report a stop.  I trigger this more easily
   with some later patches that make the core stop all threads,
   rather than the target.

Tested on x86_64-linux, and applied.

--
Pedro Alves

2011-10-10  Pedro Alves

        gdb/
        * linux-nat.c (linux_handle_extended_wait): Don't resume the new
        clone lwp if the core asked it to stop.  Don't pass on
        unexpected signals to the new clone; leave them pending instead.

---
 gdb/linux-nat.c |   64 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 43 insertions(+), 21 deletions(-)

Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c	2011-10-10 15:46:31.967659200 +0100
+++ src/gdb/linux-nat.c	2011-10-10 15:50:45.637659153 +0100
@@ -2237,7 +2237,29 @@ linux_handle_extended_wait (struct lwp_i
                   new_lp->signalled = 1;
                 }
               else
-                status = 0;
+                {
+                  struct thread_info *tp;
+
+                  /* When we stop for an event in some other thread, and
+                     pull the thread list just as this thread has cloned,
+                     we'll have seen the new thread in the thread_db list
+                     before handling the CLONE event (glibc's
+                     pthread_create adds the new thread to the thread list
+                     before clone'ing, and has the kernel fill in the
+                     thread's tid on the clone call with
+                     CLONE_PARENT_SETTID).  If that happened, and the core
+                     had requested the new thread to stop, we'll have
+                     killed it with SIGSTOP.  But since SIGSTOP is not an
+                     RT signal, it can only be queued once.  We need to be
+                     careful to not resume the LWP if we wanted it to
+                     stop.  In that case, we'll leave the SIGSTOP pending.
+                     It will later be reported as TARGET_SIGNAL_0.  */
+                  tp = find_thread_ptid (new_lp->ptid);
+                  if (tp != NULL && tp->stop_requested)
+                    new_lp->last_resume_kind = resume_stop;
+                  else
+                    status = 0;
+                }
 
               if (non_stop)
                 {
@@ -2266,39 +2288,39 @@ linux_handle_extended_wait (struct lwp_i
                     }
                 }
 
+              if (status != 0)
+                {
+                  /* We created NEW_LP so it cannot yet contain STATUS.  */
+                  gdb_assert (new_lp->status == 0);
+
+                  /* Save the wait status to report later.  */
+                  if (debug_linux_nat)
+                    fprintf_unfiltered (gdb_stdlog,
+                                        "LHEW: waitpid of new LWP %ld, "
+                                        "saving status %s\n",
+                                        (long) GET_LWP (new_lp->ptid),
+                                        status_to_str (status));
+                  new_lp->status = status;
+                }
+
               /* Note the need to use the low target ops to resume, to
                  handle resuming with PT_SYSCALL if we have syscall
                  catchpoints.  */
               if (!stopping)
                 {
-                  enum target_signal signo;
-
-                  new_lp->stopped = 0;
                   new_lp->resumed = 1;
-                  new_lp->last_resume_kind = resume_continue;
-
-                  signo = (status
-                           ? target_signal_from_host (WSTOPSIG (status))
-                           : TARGET_SIGNAL_0);
 
-                  linux_ops->to_resume (linux_ops, pid_to_ptid (new_pid),
-                                        0, signo);
-                }
-              else
-                {
-                  if (status != 0)
+                  if (status == 0)
                     {
-                      /* We created NEW_LP so it cannot yet contain STATUS.  */
-                      gdb_assert (new_lp->status == 0);
-
-                      /* Save the wait status to report later.  */
                       if (debug_linux_nat)
                         fprintf_unfiltered (gdb_stdlog,
                                             "LHEW: waitpid of new LWP %ld, "
-                                            "saving status %s\n",
+                                            "resuming %s\n",
                                             (long) GET_LWP (new_lp->ptid),
                                             status_to_str (status));
-                      new_lp->status = status;
+                      linux_ops->to_resume (linux_ops, pid_to_ptid (new_pid),
+                                            0, TARGET_SIGNAL_0);
+                      new_lp->stopped = 0;
                     }
                 }
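
For readers who want the resulting behaviour at a glance, the decision the patched code makes for a new clone can be modelled roughly as below. This is only a standalone sketch, not GDB code: struct clone_decision and decide_new_clone are hypothetical names invented for illustration, the function works on a plain signal number rather than a waitpid status, and it leaves out the non-stop thread-list bookkeeping entirely.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical summary of what happens to a new clone; not a GDB type.  */
struct clone_decision
{
  int pending_signal;	/* Signal left pending on the LWP, or 0.  */
  bool resume_now;	/* Whether the LWP is resumed right away.  */
};

/* Rough model of the post-patch logic.  STOP_SIGNAL is the signal the
   new clone first stopped with, STOP_REQUESTED says whether the core
   had already asked this thread to stop, and STOPPING says whether the
   caller is in the middle of stopping all LWPs.  */
struct clone_decision
decide_new_clone (int stop_signal, bool stop_requested, bool stopping)
{
  struct clone_decision d;

  assert (stop_signal > 0);

  if (stop_signal == SIGSTOP && !stop_requested)
    {
      /* The expected initial SIGSTOP and nobody asked for a stop:
	 discard it.  */
      d.pending_signal = 0;
    }
  else
    {
      /* Either an unexpected signal came out first, or the core wants
	 this thread stopped: keep the event pending so it is reported
	 later instead of being passed to the inferior.  */
      d.pending_signal = stop_signal;
    }

  /* Only resume when nothing is pending and we are not stopping.  */
  d.resume_now = !stopping && d.pending_signal == 0;
  return d;
}
```

Under this model, decide_new_clone (SIGSTOP, false, false) resumes the clone with nothing pending, while decide_new_clone (SIGSTOP, true, false) and decide_new_clone (SIGINT, false, false) both keep the LWP stopped with the signal pending, which corresponds to the two cases described at the top of this message.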