From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <5190CCF9.3020004@redhat.com>
Date: Mon, 13 May 2013 11:22:00 -0000
From: Pedro Alves <palves@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4
MIME-Version: 1.0
To: Joel Brobecker <brobecker@adacore.com>
CC: gdb-patches@sourceware.org
Subject: Re: [RFA] gdbserver/lynx178: spurious SIG61 signal when resuming inferior.
References: <1368441986-14478-1-git-send-email-brobecker@adacore.com>
In-Reply-To: <1368441986-14478-1-git-send-email-brobecker@adacore.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-SW-Source: 2013-05/txt/msg00430.txt.bz2

Hi Joel,

On 05/13/2013 11:46 AM, Joel Brobecker wrote:

> On ppc-lynx178, resuming the execution of a program after hitting
> a breakpoint sometimes triggers a spurious SIG61 event:

I'd like to understand this a little better.

Could that mean the thread that gdbserver used for ptrace hadn't
been ptrace stopped, or doesn't exist at all?  "sometimes" makes
me wonder about the latter.

>     (gdb) cont
>     Continuing.
> 
>     Program received signal SIG61, Real-time event 61.
>     [Switching to Thread 39]
>     0x10002324 in a_test.task1 (<_task>=0x3ffff774) at a_test.adb:30
>     30          select  -- Task 1
> 
> From this point on, continuing again lets the signal kill the program.
> Using "signal 0" or configuring GDB to discard the signal does not
> help either, as the program immediately reports the same signal again.
> 
> What happens is the following:
> 
>   - GDB sends a single-step order to gdbserver: $vCont;s:31
>     This tells GDBserver to do a step using thread 0x31=49.
>     GDBserver does the step, and thread 49 receives the SIGTRAP
>     indicating that the step has finished.
> 
>   - GDB then sends a "continue", but this time does not specify
>     which thread to continue: $vCont;c
>     GDBserver uses an arbitrary thread's ptid to resume the program's
>     execution (the current_inferior's ptid was chosen for that).
>     See lynx-low.c:lynx_resume:

Urgh.

So does that mean scheduler locking doesn't work?

E.g.,

(gdb) thread 2
(gdb) si
(gdb) thread 1
(gdb) c

That'll single-step thread 2, and then continue just thread 1, presumably
triggering this issue too?  If not, why not?

BTW, vCont;c means "resume all threads", so why is the current code
resuming just one?
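
To make the packet semantics concrete, here is a minimal standalone
sketch (not gdbserver code) of how vCont actions map onto threads: an
action with no thread-id matches every thread, and for each thread the
leftmost matching action applies.  All names below (parse_vcont,
action_for_thread, struct vcont_action, ALL_THREADS) are made up for
illustration, and the sketch assumes only the 'c' and 's' actions with
plain hex thread-ids.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for this sketch; not gdbserver's own structures.  */
enum resume_kind { RESUME_NONE, RESUME_CONTINUE, RESUME_STEP };

#define ALL_THREADS (-1L)

struct vcont_action
{
  enum resume_kind kind;
  long tid;			/* ALL_THREADS if the action has no thread-id.  */
};

/* Parse a packet such as "vCont;s:31;c" into at most MAX actions.
   Return the number of actions parsed, or -1 if the packet is
   malformed or uses something this sketch does not handle.  */
int
parse_vcont (const char *packet, struct vcont_action *out, int max)
{
  const char *p;
  int n = 0;

  if (strncmp (packet, "vCont", 5) != 0)
    return -1;
  p = packet + 5;

  while (*p == ';')
    {
      struct vcont_action act;

      p++;
      if (*p == 'c')
	act.kind = RESUME_CONTINUE;
      else if (*p == 's')
	act.kind = RESUME_STEP;
      else
	return -1;
      p++;

      /* No ":thread-id" suffix means the action matches all threads.  */
      act.tid = ALL_THREADS;
      if (*p == ':')
	{
	  char *end;

	  act.tid = strtol (p + 1, &end, 16);	/* Thread-ids are hex.  */
	  if (end == p + 1)
	    return -1;
	  p = end;
	}

      if (n == max)
	return -1;
      out[n++] = act;
    }

  return *p == '\0' ? n : -1;
}

/* Return what thread TID should do: the leftmost matching action wins,
   and a thread that matches no action is left as it is.  */
enum resume_kind
action_for_thread (const struct vcont_action *acts, int n, long tid)
{
  for (int i = 0; i < n; i++)
    if (acts[i].tid == ALL_THREADS || acts[i].tid == tid)
      return acts[i].kind;
  return RESUME_NONE;
}
```

With this model, "vCont;s:31" yields a step for thread 0x31=49 and
nothing for any other thread, while "vCont;c" yields a continue for
every thread, whichever one happens to be current.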

This:

lynx_wait_1 ()
...
  if (ptid_equal (ptid, minus_one_ptid))
    pid = lynx_ptid_get_pid (thread_to_gdb_id (current_inferior));
  else
    pid = BUILDPID (lynx_ptid_get_pid (ptid), lynx_ptid_get_tid (ptid));

retry:

  ret = lynx_waitpid (pid, &wstat);


is also suspicious.  Doesn't that mean we're doing a waitpid on
a possibly not-resumed current_inferior (that may not be the main task,
if that matters)?  Could _that_ be the reason for that magic signal 61?
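
To illustrate the concern, here is a toy model (again not gdbserver
code, and not a proposed patch) that contrasts the two ways of picking
the wait target when asked to wait for any thread.  Everything in it is
hypothetical: struct wait_state, ANY_THREAD and both helper functions
are invented for the example, and whether the LynxOS wait call actually
cares which thread is named is exactly the open question.

```c
/* Hypothetical stand-in for minus_one_ptid in this sketch.  */
#define ANY_THREAD (-1L)

struct wait_state
{
  long current_tid;		/* Stand-in for current_inferior.  */
  long last_resumed_tid;	/* Thread the last resume was issued on.  */
};

/* Mirrors the shape of the excerpt above: when asked to wait for any
   thread, fall back to the current thread, resumed or not.  */
long
wait_target_current (const struct wait_state *st, long requested_tid)
{
  return requested_tid == ANY_THREAD ? st->current_tid : requested_tid;
}

/* Alternative for comparison: fall back to the thread that was
   actually used for the last resume request.  */
long
wait_target_resumed (const struct wait_state *st, long requested_tid)
{
  return requested_tid == ANY_THREAD ? st->last_resumed_tid : requested_tid;
}
```

If the current thread and the last-resumed thread differ, the two
helpers name different wait targets for the same "wait for anything"
request; when a specific thread is requested they agree.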

-- 
Pedro Alves