From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14629 invoked by alias); 15 Dec 2008 23:15:39 -0000 Received: (qmail 14617 invoked by uid 22791); 15 Dec 2008 23:15:38 -0000 X-Spam-Check-By: sourceware.org Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.45.13) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 15 Dec 2008 23:15:03 +0000 Received: from wpaz5.hot.corp.google.com (wpaz5.hot.corp.google.com [172.24.198.69]) by smtp-out.google.com with ESMTP id mBFNF1nZ024346 for ; Mon, 15 Dec 2008 15:15:01 -0800 Received: from rv-out-0506.google.com (rvbf6.prod.google.com [10.140.82.6]) by wpaz5.hot.corp.google.com with ESMTP id mBFNExk0018527 for ; Mon, 15 Dec 2008 15:14:59 -0800 Received: by rv-out-0506.google.com with SMTP id f6so3385123rvb.9 for ; Mon, 15 Dec 2008 15:14:59 -0800 (PST) MIME-Version: 1.0 Received: by 10.141.89.13 with SMTP id r13mr3979148rvl.76.1229382898994; Mon, 15 Dec 2008 15:14:58 -0800 (PST) In-Reply-To: <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com> References: <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com> Date: Mon, 15 Dec 2008 23:15:00 -0000 Message-ID: Subject: Re: [RFA] Fix hand called function when another thread has hit a bp. From: Doug Evans To: Ulrich Weigand Cc: gdb-patches@sourceware.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2008-12/txt/msg00296.txt.bz2 On Mon, Dec 15, 2008 at 2:49 PM, Ulrich Weigand wrote: > Doug Evans wrote: > >> Hmm, I think we need to distinguish which testcase we're talking about. >> >> hand-call-in-threads.exp: >> What it does: >> - start up N threads and stop >> - set scheduler-locking on >> - hand-call a function in N threads, each of which hits a breakpoint >> - manually pop pending dummy frames >> - set scheduler-locking off >> - continue >> What it tests: >> - verify a previously hit breakpoint in a different thread is NOT >> singlestepped when resuming with scheduler-locking on >> - verify manually popping all pending dummy frames works >> - verify that resuming after manually popping all pending dummy >> frames works (i.e. a previously hit and reported breakpoint is not >> re-reported) >> >> multi-bp-in-threads.exp: >> What it does: >> - start up N threads and stop >> - set scheduler-locking on >> - hand-call a function in N threads, each of which hits a breakpoint >> - set scheduler-locking off >> - continue >> What it tests: >> - see what happens ... it's not clear to me what _should_ happen here >> >> Perhaps first we should agree on what the right behavior is. >> For hand-call-in-threads.exp I think the testcase is correct as is. >> For multi-bp-in-threads.exp, one approach is to just define the >> problem away and say that GDB's current behavior is the correct >> behavior. :) > > Ah, OK. It seems to be those problems are all triggered by > scheduler locking; this is a feature that doesn't seem to be > handled cleanly by the existing "prepare_to_proceed" mechanism. > > Generally, I think what *should* happen is pretty straightforward: > each instance of a breakpoint hit should be reported exactly once. > > And in fact in the absence of scheduler locking, this is what the > current code should (I hope!) achieve. In the case of the > multi-bp-in-threads test case (without scheduler locking), we'd > hit the breakpoint on all_threads_running (thread 1) first. > > If the user then switch to some (any) other thread, and issues the > "continue" command, the prepare_to_proceed logic would switch back > to thread 1, single-step once across the breakpoint without reporting > it, and then continue all threads. > > Some thread X would then hit the breakpoint on thread_breakpoint. > Again, if the user switches to any other thread and issues continue, > GDB would switch back to thread X, single-step once, and then continue > all threads. The breakpoint on thread_breakpoint would then hit in > some other thread etc. > > All in all, every breakpoint would be hit and reported exactly once. > > > Similarly, in the case of the hand-call-in-threads test case without > scheduler locking, we hit the breakpoint on all_threads_running in > thread 1 first. If the user switches to some other thread X and issues > an inferior function call, GDB would switch back to thread 1, single- > step over the breakpoint, and then continue all threads -- this will > allow the inferior call to proceed to breakpoint on hand_call in > thread X. > > If the user then switches to yet another thread Y and issues the > inferior call there, GDB will switch back and single-step over the > breakpoint in thread X, before continuing all threads allowing the > breakpoint on hand_call in Y to hit, etc. > > This works without requiring per-thread state because at any point > there is only ever one thread in the breakpoint-reported-and-needs- > to-be-skipped state, and this state is handled/resolved as the very > first action in any case, no matter what action the user performs. > > > The problem arises when scheduler locking is switched on. Actually, > I think there are really two problems. First of all, after we've > switched back and single-stepped over an already-hit breakpoint via > the prepare_to_proceed logic, we'll continue only a single thread > if scheduler-locking is on -- and that is the wrong thread. The > prepare_to_proceed logic only explicitly switches *back* to the > user-selected thread if the user was *stepping* (that's the > deferred_step_ptid logic). For scheduler-locking, we should probably > switch back always ... If scheduler locking is on, why is there any switching at all? If scheduler-locking is on and I switch threads I'd want gdb to defer single-stepping the other thread over its breakpoint until the point when I make that other thread runnable. Also, I think removing the notion of one previously stopped thread and generalizing it to not caring, i.e. checking the status of every stopped thread before resuming will simplify things and fix a few bugs along the way. IOW, make deferred_ptid go away. > > The second problem is more a problem of definition: even if the > first problem above were fixed, we've have to single-step the other > thread at least once to get over the breakpoint. This would seem > to violate the definition of scheduler locking if interpreted > absolutely strictly. Now you could argue that as the user should > never be aware of that single step, it doesn't really matter. I'm not sure how we necessarily have a violation of the definition of scheduler locking. > If we really wanted to avoid that completely, I don't see any other > way but adding a new per-thread flag that says "we've just stopped > at and reported a breakpoint". Whenever we finally switch back and > start executing that thread, and we're still on that breakpoint, > we need to perform the single-step at this point. I agree with wanting to singlestep the thread at the point it's made runnable, and not before.