From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-60648-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 6727 invoked by alias); 15 Dec 2008 22:50:20 -0000
Received: (qmail 6717 invoked by uid 22791); 15 Dec 2008 22:50:19 -0000
X-Spam-Check-By: sourceware.org
Received: from mtagate8.de.ibm.com (HELO mtagate8.de.ibm.com) (195.212.29.157)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 15 Dec 2008 22:49:31 +0000
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) 	by mtagate8.de.ibm.com (8.13.8/8.13.8) with ESMTP id mBFMnS2F538944 	for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 22:49:28 GMT
Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) 	by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v9.1) with ESMTP id mBFMnSlD3813616 	for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 23:49:28 +0100
Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) 	by d12av02.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id mBFMnRYK020255 	for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 23:49:28 +0100
Received: from tuxmaker.boeblingen.de.ibm.com (tuxmaker.boeblingen.de.ibm.com [9.152.85.9]) 	by d12av02.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with SMTP id mBFMnRu0020252; 	Mon, 15 Dec 2008 23:49:27 +0100
Message-Id: <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com>
Received: by tuxmaker.boeblingen.de.ibm.com (sSMTP sendmail emulation); Mon, 15 Dec 2008 23:49:27 +0100
Subject: Re: [RFA] Fix hand called function when another thread has hit a bp.
To: dje@google.com (Doug Evans)
Date: Mon, 15 Dec 2008 22:50:00 -0000
From: "Ulrich Weigand" <uweigand@de.ibm.com>
Cc: gdb-patches@sourceware.org
In-Reply-To: <e394668d0812151406r1852ce82sa6d810675efb785f@mail.gmail.com> from "Doug Evans" at Dec 15, 2008 02:06:34 PM
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2008-12/txt/msg00294.txt.bz2

Doug Evans wrote:

> Hmm, I think we need to distinguish which testcase we're talking about.
> 
> hand-call-in-threads.exp:
>   What it does:
>     - start up N threads and stop
>     - set scheduler-locking on
>     - hand-call a function in N threads, each of which hits a breakpoint
>     - manually pop pending dummy frames
>     - set scheduler-locking off
>     - continue
>   What it tests:
>     - verify a previously hit breakpoint in a different thread is NOT
> singlestepped when resuming with scheduler-locking on
>     - verify manually popping all pending dummy frames works
>     - verify that resuming after manually popping all pending dummy
> frames works (i.e. a previously hit and reported breakpoint is not
> re-reported)
> 
> multi-bp-in-threads.exp:
>   What it does:
>     - start up N threads and stop
>     - set scheduler-locking on
>     - hand-call a function in N threads, each of which hits a breakpoint
>     - set scheduler-locking off
>     - continue
>   What it tests:
>     - see what happens ... it's not clear to me what _should_ happen here
> 
> Perhaps first we should agree on what the right behavior is.
> For hand-call-in-threads.exp I think the testcase is correct as is.
> For multi-bp-in-threads.exp, one approach is to just define the
> problem away and say that GDB's current behavior is the correct
> behavior. :)

Ah, OK.  It seems to me those problems are all triggered by
scheduler locking; this is a feature that doesn't seem to be
handled cleanly by the existing "prepare_to_proceed" mechanism.

Generally, I think what *should* happen is pretty straightforward:
each instance of a breakpoint hit should be reported exactly once.

And in fact in the absence of scheduler locking, this is what the
current code should (I hope!) achieve.  In the case of the
multi-bp-in-threads test case (without scheduler locking), we'd
hit the breakpoint on all_threads_running (thread 1) first.

If the user then switches to some (any) other thread, and issues the
"continue" command, the prepare_to_proceed logic would switch back
to thread 1, single-step once across the breakpoint without reporting
it, and then continue all threads.

Some thread X would then hit the breakpoint on thread_breakpoint.
Again, if the user switches to any other thread and issues continue,
GDB would switch back to thread X, single-step once, and then continue
all threads.  The breakpoint on thread_breakpoint would then hit in
some other thread etc.

All in all, every breakpoint would be hit and reported exactly once.


Similarly, in the case of the hand-call-in-threads test case without
scheduler locking, we hit the breakpoint on all_threads_running in
thread 1 first.  If the user switches to some other thread X and issues
an inferior function call, GDB would switch back to thread 1, single-
step over the breakpoint, and then continue all threads -- this will
allow the inferior call to proceed to the breakpoint on hand_call in
thread X.

If the user then switches to yet another thread Y and issues the
inferior call there, GDB will switch back and single-step over the
breakpoint in thread X, before continuing all threads, which allows
the breakpoint on hand_call in Y to hit, etc.

This works without requiring per-thread state because at any point
there is only ever one thread in the breakpoint-reported-and-needs-
to-be-skipped state, and this state is handled/resolved as the very
first action in any case, no matter what action the user performs.
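
To make that invariant concrete, here is a toy Python model.  It is a
sketch only, not GDB code: ToyDebugger, pending_skip and the log tuples
are names invented for illustration, and thread numbers stand in for
ptids.

```python
class ToyDebugger:
    """Toy model of the behaviour described above; not GDB code."""

    def __init__(self):
        self.pending_skip = None  # thread sitting on an already-reported breakpoint
        self.log = []

    def report_breakpoint(self, thread):
        # Any resume clears the slot first, so it must be empty here.
        assert self.pending_skip is None
        self.pending_skip = thread
        self.log.append(("report", thread))

    def resume(self, selected_thread):
        # First action, whatever the user asked for: step the pending
        # thread across its breakpoint without reporting it again.
        if self.pending_skip is not None:
            self.log.append(("silent-step", self.pending_skip))
            self.pending_skip = None
        self.log.append(("continue-all", selected_thread))


dbg = ToyDebugger()
dbg.report_breakpoint(1)        # all_threads_running hit in thread 1
dbg.resume(selected_thread=2)   # user switched to thread 2, then "continue"
dbg.report_breakpoint(2)        # thread_breakpoint hit in thread 2
dbg.resume(selected_thread=3)
reports = [t for kind, t in dbg.log if kind == "report"]
```

Because resume() always empties the single slot before anything else
runs, one global variable is enough in this model and each hit shows up
in the log exactly once.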


The problem arises when scheduler locking is switched on.  Actually,
I think there are really two problems.  First of all, after we've
switched back and single-stepped over an already-hit breakpoint via
the prepare_to_proceed logic, we'll continue only a single thread
if scheduler-locking is on -- and that is the wrong thread.  The
prepare_to_proceed logic only explicitly switches *back* to the
user-selected thread if the user was *stepping* (that's the 
deferred_step_ptid logic).  For scheduler-locking, we should probably
switch back always ...
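
As a rough illustration of this first problem, the following Python
sketch models which threads end up running after the silent
single-step.  It is a simplification under stated assumptions, not the
actual infrun code: the function name and its parameters are
hypothetical, and the always_switch_back flag stands for the suggested
fix.

```python
def threads_resumed(stopped, selected, all_threads, *,
                    scheduler_locking, stepping, always_switch_back=False):
    """Toy model: which threads run after the silent single-step of `stopped`."""
    # GDB has switched to `stopped` to step it across its breakpoint.
    current = stopped
    # Described behaviour: switch back only when the user was stepping.
    # Suggested change: also switch back whenever scheduler locking is on.
    if stepping or (scheduler_locking and always_switch_back):
        current = selected
    if scheduler_locking:
        return {current}          # only one thread may run
    return set(all_threads)       # everything is resumed anyway


today = threads_resumed(1, 2, [1, 2, 3],
                        scheduler_locking=True, stepping=False)
proposed = threads_resumed(1, 2, [1, 2, 3],
                           scheduler_locking=True, stepping=False,
                           always_switch_back=True)
unlocked = threads_resumed(1, 2, [1, 2, 3],
                           scheduler_locking=False, stepping=False)
```

In this model, "today" contains only thread 1 although the user had
selected thread 2, while "proposed" contains only thread 2.  Without
scheduler locking the choice does not matter, since all threads run.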

The second problem is more a problem of definition: even if the
first problem above were fixed, we'd have to single-step the other
thread at least once to get over the breakpoint.  This would seem
to violate the definition of scheduler locking if interpreted
absolutely strictly.  Now you could argue that as the user should
never be aware of that single step, it doesn't really matter.

If we really wanted to avoid that completely, I don't see any other
way but adding a new per-thread flag that says "we've just stopped
at and reported a breakpoint".  Whenever we finally switch back and
start executing that thread, and we're still on that breakpoint,
we need to perform the single-step at this point.
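
A minimal Python sketch of what such a per-thread flag might look like
follows.  Everything in it is hypothetical (ToyThread, the
stopped_at_reported_bp field, resume_thread, and the toy "pc += 1"
single-step); it only illustrates the idea and is not a proposal for
the actual data structures.

```python
from dataclasses import dataclass


@dataclass
class ToyThread:
    tid: int
    pc: int
    stopped_at_reported_bp: bool = False  # hypothetical per-thread flag


def resume_thread(thread, breakpoints, log):
    """Resume one thread, stepping over an already-reported breakpoint first."""
    if thread.stopped_at_reported_bp and thread.pc in breakpoints:
        log.append(("silent-step", thread.tid))
        thread.pc += 1            # toy stand-in for one single-step
    thread.stopped_at_reported_bp = False
    log.append(("run", thread.tid))


breakpoints = {0x100}
log = []
t1 = ToyThread(tid=1, pc=0x100, stopped_at_reported_bp=True)
t2 = ToyThread(tid=2, pc=0x100, stopped_at_reported_bp=True)

resume_thread(t2, breakpoints, log)   # scheduler locking: only thread 2 moves
t1.pc = 0x200                         # e.g. thread 1's dummy frame was popped
resume_thread(t1, breakpoints, log)   # no longer on the breakpoint: no step
```

The point of the sketch is that thread 1 is never touched while thread
2 runs, and that the single-step is skipped if the thread is no longer
sitting on the breakpoint by the time it is resumed.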

Bye,
Ulrich


-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com