From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-60650-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 14629 invoked by alias); 15 Dec 2008 23:15:39 -0000
Received: (qmail 14617 invoked by uid 22791); 15 Dec 2008 23:15:38 -0000
X-Spam-Check-By: sourceware.org
Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.45.13)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 15 Dec 2008 23:15:03 +0000
Received: from wpaz5.hot.corp.google.com (wpaz5.hot.corp.google.com [172.24.198.69]) 	by smtp-out.google.com with ESMTP id mBFNF1nZ024346 	for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 15:15:01 -0800
Received: from rv-out-0506.google.com (rvbf6.prod.google.com [10.140.82.6]) 	by wpaz5.hot.corp.google.com with ESMTP id mBFNExk0018527 	for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 15:14:59 -0800
Received: by rv-out-0506.google.com with SMTP id f6so3385123rvb.9         for <gdb-patches@sourceware.org>; Mon, 15 Dec 2008 15:14:59 -0800 (PST)
MIME-Version: 1.0
Received: by 10.141.89.13 with SMTP id r13mr3979148rvl.76.1229382898994; Mon,  	15 Dec 2008 15:14:58 -0800 (PST)
In-Reply-To: <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com>
References: <e394668d0812151406r1852ce82sa6d810675efb785f@mail.gmail.com> 	 <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com>
Date: Mon, 15 Dec 2008 23:15:00 -0000
Message-ID: <e394668d0812151514j531e5be7g8f8f716908a9e9ea@mail.gmail.com>
Subject: Re: [RFA] Fix hand called function when another thread has hit a bp.
From: Doug Evans <dje@google.com>
To: Ulrich Weigand <uweigand@de.ibm.com>
Cc: gdb-patches@sourceware.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2008-12/txt/msg00296.txt.bz2

On Mon, Dec 15, 2008 at 2:49 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Doug Evans wrote:
>
>> Hmm, I think we need to distinguish which testcase we're talking about.
>>
>> hand-call-in-threads.exp:
>>   What it does:
>>     - start up N threads and stop
>>     - set scheduler-locking on
>>     - hand-call a function in N threads, each of which hits a breakpoint
>>     - manually pop pending dummy frames
>>     - set scheduler-locking off
>>     - continue
>>   What it tests:
>>     - verify a previously hit breakpoint in a different thread is NOT
>> singlestepped when resuming with scheduler-locking on
>>     - verify manually popping all pending dummy frames works
>>     - verify that resuming after manually popping all pending dummy
>> frames works (i.e. a previously hit and reported breakpoint is not
>> re-reported)
>>
>> multi-bp-in-threads.exp:
>>   What it does:
>>     - start up N threads and stop
>>     - set scheduler-locking on
>>     - hand-call a function in N threads, each of which hits a breakpoint
>>     - set scheduler-locking off
>>     - continue
>>   What it tests:
>>     - see what happens ... it's not clear to me what _should_ happen here
>>
>> Perhaps first we should agree on what the right behavior is.
>> For hand-call-in-threads.exp I think the testcase is correct as is.
>> For multi-bp-in-threads.exp, one approach is to just define the
>> problem away and say that GDB's current behavior is the correct
>> behavior. :)
>
> Ah, OK.  It seems to be those problems are all triggered by
> scheduler locking; this is a feature that doesn't seem to be
> handled cleanly by the existing "prepare_to_proceed" mechanism.
>
> Generally, I think what *should* happen is pretty straightforward:
> each instance of a breakpoint hit should be reported exactly once.
>
> And in fact in the absence of scheduler locking, this is what the
> current code should (I hope!) achieve.  In the case of the
> multi-bp-in-threads test case (without scheduler locking), we'd
> hit the breakpoint on all_threads_running (thread 1) first.
>
> If the user then switch to some (any) other thread, and issues the
> "continue" command, the prepare_to_proceed logic would switch back
> to thread 1, single-step once across the breakpoint without reporting
> it, and then continue all threads.
>
> Some thread X would then hit the breakpoint on thread_breakpoint.
> Again, if the user switches to any other thread and issues continue,
> GDB would switch back to thread X, single-step once, and then continue
> all threads.  The breakpoint on thread_breakpoint would then hit in
> some other thread etc.
>
> All in all, every breakpoint would be hit and reported exactly once.
>
>
> Similarly, in the case of the hand-call-in-threads test case without
> scheduler locking, we hit the breakpoint on all_threads_running in
> thread 1 first.  If the user switches to some other thread X and issues
> an inferior function call, GDB would switch back to thread 1, single-
> step over the breakpoint, and then continue all threads -- this will
> allow the inferior call to proceed to breakpoint on hand_call in
> thread X.
>
> If the user then switches to yet another thread Y and issues the
> inferior call there, GDB will switch back and single-step over the
> breakpoint in thread X, before continuing all threads allowing the
> breakpoint on hand_call in Y to hit, etc.
>
> This works without requiring per-thread state because at any point
> there is only ever one thread in the breakpoint-reported-and-needs-
> to-be-skipped state, and this state is handled/resolved as the very
> first action in any case, no matter what action the user performs.
>
>
> The problem arises when scheduler locking is switched on.  Actually,
> I think there are really two problems.  First of all, after we've
> switched back and single-stepped over an already-hit breakpoint via
> the prepare_to_proceed logic, we'll continue only a single thread
> if scheduler-locking is on -- and that is the wrong thread.  The
> prepare_to_proceed logic only explicitly switches *back* to the
> user-selected thread if the user was *stepping* (that's the
> deferred_step_ptid logic).  For scheduler-locking, we should probably
> switch back always ...

If scheduler locking is on, why is there any switching at all?  If
scheduler-locking is on and I switch threads I'd want gdb to defer
single-stepping the other thread over its breakpoint until the point
when I make that other thread runnable.

Also, I think removing the notion of one previously stopped thread and
generalizing it to not caring, i.e. checking the status of every
stopped thread before resuming will simplify things and fix a few bugs
along the way.  IOW, make deferred_ptid go away.

>
> The second problem is more a problem of definition: even if the
> first problem above were fixed, we've have to single-step the other
> thread at least once to get over the breakpoint.  This would seem
> to violate the definition of scheduler locking if interpreted
> absolutely strictly.  Now you could argue that as the user should
> never be aware of that single step, it doesn't really matter.

I'm not sure how we necessarily have a violation of the definition of
scheduler locking.

> If we really wanted to avoid that completely, I don't see any other
> way but adding a new per-thread flag that says "we've just stopped
> at and reported a breakpoint".  Whenever we finally switch back and
> start executing that thread, and we're still on that breakpoint,
> we need to perform the single-step at this point.

I agree with wanting to singlestep the thread at the point it's made
runnable, and not before.