From: Pedro Alves
To: Joel Brobecker
CC: gdb-patches@sourceware.org
Date: Mon, 13 May 2013 11:22:00 -0000
Subject: Re: [RFA] gdbserver/lynx178: spurious SIG61 signal when resuming inferior.
Message-ID: <5190CCF9.3020004@redhat.com>
References: <1368441986-14478-1-git-send-email-brobecker@adacore.com>
In-Reply-To: <1368441986-14478-1-git-send-email-brobecker@adacore.com>

Hi Joel,

On 05/13/2013 11:46 AM, Joel Brobecker wrote:
> On ppc-lynx178, resuming the execution of a program after hitting
> a breakpoint sometimes triggers a spurious SIG61 event:

I'd like to understand this a little better.  Could that mean the
thread that gdbserver used for ptrace hadn't been ptrace stopped, or
doesn't exist at all?  "sometimes" makes me wonder about the latter.

>     (gdb) cont
>     Continuing.
>
>     Program received signal SIG61, Real-time event 61.
>     [Switching to Thread 39]
>     0x10002324 in a_test.task1 (<_task>=0x3ffff774) at a_test.adb:30
>     30            select  --  Task 1
>
> From this point on, continuing again lets the signal kill the program.
> Using "signal 0" or configuring GDB to discard the signal does not
> help either, as the program immediately reports the same signal again.
>
> What happens is the following:
>
>   - GDB sends a single-step order to gdbserver: $vCont;s:31
>     This tells GDBserver to do a step using thread 0x31=49.
>     GDBserver does the step, and thread 49 receives the SIGTRAP
>     indicating that the step has finished.
>
>   - GDB then sends a "continue", but this time does not specify
>     which thread to continue: $vCont;c
>     GDBserver uses an arbitrary thread's ptid to resume the program's
>     execution (the current_inferior's ptid was chosen for that).
>     See lynx-low.c:lynx_resume:

Urgh.  So does that mean scheduler locking doesn't work?  E.g.,

 (gdb) thread 2
 (gdb) si
 (gdb) thread 1
 (gdb) c

That'll single-step thread 2, and then continue just thread 1,
supposedly triggering this issue too?  If not, why not?

BTW, vCont;c means "resume all threads", so why is the current code
just resuming one?

This:

lynx_wait_1 ()
...
  if (ptid_equal (ptid, minus_one_ptid))
    pid = lynx_ptid_get_pid (thread_to_gdb_id (current_inferior));
  else
    pid = BUILDPID (lynx_ptid_get_pid (ptid), lynx_ptid_get_tid (ptid));

retry:

  ret = lynx_waitpid (pid, &wstat);

is suspicious also.  Doesn't that mean we're doing a waitpid on a
possibly not-resumed current_inferior (that may not be the main task,
if that matters)?  Could _that_ be the reason for that magic signal 61?

--
Pedro Alves
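
For reference, the difference between the two packets quoted above
($vCont;s:31 and $vCont;c) can be sketched as a small parser.  This is
only an illustration under stated assumptions: the names vcont_action,
ALL_THREADS and parse_first_vcont_action are hypothetical and are not
taken from gdbserver, and only the "s" and "c" actions with a plain hex
thread id are handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration only; not gdbserver's own types. */
#define ALL_THREADS (-1L)

struct vcont_action
{
  char kind;    /* 's' = step, 'c' = continue */
  long thread;  /* thread id, or ALL_THREADS when the action names none */
};

/* Parse the first action of a packet such as "$vCont;s:31" or "$vCont;c".
   Returns 0 on success, -1 if the packet is not understood.  Assumption:
   only the 's' and 'c' actions and the simple hex thread-id form are
   supported in this sketch.  */
int
parse_first_vcont_action (const char *packet, struct vcont_action *out)
{
  static const char prefix[] = "vCont;";
  const size_t prefix_len = sizeof prefix - 1;
  const char *p;
  char *end;

  /* Skip the '$' framing character if the caller left it in place.  */
  if (*packet == '$')
    packet++;

  if (strncmp (packet, prefix, prefix_len) != 0)
    return -1;
  p = packet + prefix_len;

  if (*p != 's' && *p != 'c')
    return -1;
  out->kind = *p++;

  /* No ":thread-id" suffix: the action applies to all threads.  */
  if (*p == '\0' || *p == ';')
    {
      out->thread = ALL_THREADS;
      return 0;
    }

  if (*p != ':')
    return -1;
  p++;

  /* Thread ids are written in hexadecimal, so "31" is thread 49.  */
  out->thread = strtol (p, &end, 16);
  if (end == p)
    return -1;
  return 0;
}
```

With this sketch, "$vCont;s:31" yields a step for thread 0x31 (49),
while "$vCont;c" carries no thread id and so is reported as applying to
all threads, which is the case the questions above are about.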