From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 15735 invoked by alias); 9 Aug 2008 15:34:22 -0000
Received: (qmail 15721 invoked by uid 22791); 9 Aug 2008 15:34:21 -0000
X-Spam-Check-By: sourceware.org
Received: from hiauly1.hia.nrc.ca (HELO hiauly1.hia.nrc.ca) (132.246.100.193)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Sat, 09 Aug 2008 15:33:47 +0000
Received: by hiauly1.hia.nrc.ca (Postfix, from userid 1000)
    id 994A84E2A; Sat, 9 Aug 2008 11:33:44 -0400 (EDT)
Date: Sat, 09 Aug 2008 15:34:00 -0000
From: John David Anglin
To: Pedro Alves
Cc: gdb-patches@sourceware.org
Subject: Re: ttrace: Protocal error
Message-ID: <20080809153343.GA29008@hiauly1.hia.nrc.ca>
Reply-To: John David Anglin
References: <20080808201457.3064B4EBE@hiauly1.hia.nrc.ca> <200808091551.17013.pedro@codesourcery.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="a8Wt8u1KmwUX3Y2C"
Content-Disposition: inline
In-Reply-To: <200808091551.17013.pedro@codesourcery.com>
User-Agent: Mutt/1.5.16 (2007-06-09)
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2008-08/txt/msg00244.txt.bz2

--a8Wt8u1KmwUX3Y2C
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 2010

On Sat, 09 Aug 2008, Pedro Alves wrote:

> If I'm analising the race correctly, an alternative simpler solution
> to the above, would be to just ignore a failed resume on a
> dying thread, but still try to resume it.  Not resuming dying threads
> may be a problem.

It occurred to me that my suggested change wasn't going to handle the
race associated with threads dying, although it improves the situation.

The ttrace documentation suggested that only stopped threads should
be resumed.  The thread info indicated the thread in question was
running.  I changed the code to only check for thread stopped.
However, this didn't work.
My testcase hangs when run under gdb:

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /mnt/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/libgomp/testsuite/vla6.x3g
warning: Private mapping of shared library text was not specified
by the executable; setting a breakpoint in a shared library which
is not privately mapped will not work.  See the HP-UX 11i v3 chatr
manpage for methods to privately map shared library text.
[New process 2316, lwp 7552188]
[process 2316, lwp 7552188 exited]
[New process 2316, lwp 7552189]
[process 2316, lwp 7552189 exited]
[New process 2316, lwp 7552190]
[process 2316, lwp 7552190 exited]
[New process 2316, lwp 7552191]
[process 2316, lwp 7552191 exited]
[New process 2316, lwp 7552192]
[New process 2316, lwp 7552193]
[New process 2316, lwp 7552194]
[New process 2316, lwp 7552195]
[New process 2316, lwp 7552196]

Program received signal SIGINT, Interrupt.
[Switching to process 2316, lwp 7552196]
0xc03bea58 in __lwp_sema_wait () from /usr/lib/librt.2

So, there's apparently a race in detection when a program is stopped.
It looks like we need to try your suggestion(s).

Attached same with debug infrun 1.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

--a8Wt8u1KmwUX3Y2C
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="vla6.dbg"
Content-length: 4141

(gdb) set debug infrun 1
(gdb) r
Starting program: /mnt/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/libgomp/testsuite/vla6.x3g
infrun: wait_for_inferior (treat_exec_as_sigtrap=1)
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x91d0
infrun: quietly stopped
infrun: stop_stepping
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: wait_for_inferior (treat_exec_as_sigtrap=1)
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x19f8
infrun: quietly stopped
infrun: stop_stepping
warning: Private mapping of shared library text was not specified
by the executable; setting a breakpoint in a shared library which
is not privately mapped will not work.  See the HP-UX 11i v3 chatr
manpage for methods to privately map shared library text.
infrun: proceed (addr=0xffffffff, signal=0, step=0)
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
[New process 2557, lwp 7552464]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[process 2557, lwp 7552464 exited]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552465]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[process 2557, lwp 7552465 exited]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552466]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[process 2557, lwp 7552466 exited]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552467]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[process 2557, lwp 7552467 exited]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x19e0
infrun: BPSTAT_WHAT_CHECK_SHLIBS
infrun: no stepping, continue
infrun: resume (step=1, signal=0), stepping_over_breakpoint=1
infrun: prepare_to_wait
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x19e4
infrun: no stepping, continue
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552468]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552469]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552470]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552471]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
[New process 2557, lwp 7552472]
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SPURIOUS
infrun: resume (step=0, signal=0), stepping_over_breakpoint=0
infrun: prepare_to_wait
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0xc020c878
infrun: context switch
infrun: Switching context from process 2557, lwp 7552460 to process 2557, lwp 7552472
infrun: random signal 2

Program received signal SIGINT, Interrupt.
infrun: stop_stepping
[Switching to process 2557, lwp 7552472]
0xc020c878 in __ksleep () from /usr/lib/libc.2

--a8Wt8u1KmwUX3Y2C--
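
The suggestion quoted at the top of this message (still try to resume a
dying thread, but ignore the failure if it has already gone away) can be
sketched roughly as below. This is only an illustration in Python, not
gdb's code and not the ttrace interface: the Thread record, resume_all,
and fake_resume are hypothetical names, and treating the failure as
EPROTO is an assumption based on the "Protocol error" text in the
subject line.

```python
import errno


class Thread:
    """Hypothetical stand-in for the debugger's per-thread record."""

    def __init__(self, lwp, dying=False):
        self.lwp = lwp
        self.dying = dying


def resume_all(threads, resume_one):
    """Try to resume every thread, including dying ones.

    resume_one is a hypothetical callable standing in for the real
    resume request; it raises OSError when the request fails.
    A failure is swallowed only when the thread was already marked
    dying and the error is EPROTO (assumed); anything else propagates.
    Returns the lwp ids whose resume succeeded.
    """
    resumed = []
    for t in threads:
        try:
            resume_one(t.lwp)
        except OSError as e:
            if t.dying and e.errno == errno.EPROTO:
                continue  # thread exited underneath us; not an error
            raise
        resumed.append(t.lwp)
    return resumed


def fake_resume(gone):
    """Build a fake resume call that fails for lwps that already exited."""
    def resume_one(lwp):
        if lwp in gone:
            raise OSError(errno.EPROTO, "Protocol error")
    return resume_one
```

With this shape a dying thread that is still present gets resumed like
any other, which addresses the concern that not resuming dying threads
may be a problem, while a failure on a thread that is not dying is still
reported.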