Message-Id: <200812152249.mBFMnRu0020252@d12av02.megacenter.de.ibm.com>
Subject: Re: [RFA] Fix hand called function when another thread has hit a bp.
To: dje@google.com (Doug Evans)
Date: Mon, 15 Dec 2008 22:50:00 -0000
From: "Ulrich Weigand"
Cc: gdb-patches@sourceware.org
In-Reply-To: from "Doug Evans" at Dec 15, 2008 02:06:34 PM

Doug Evans wrote:

> Hmm, I think we need to distinguish which testcase we're talking about.
>
> hand-call-in-threads.exp:
> What it does:
> - start up N threads and stop
> - set scheduler-locking on
> - hand-call a function in N threads, each of which hits a breakpoint
> - manually pop pending dummy frames
> - set scheduler-locking off
> - continue
> What it tests:
> - verify a previously hit breakpoint in a different thread is NOT
>   singlestepped when resuming with scheduler-locking on
> - verify manually popping all pending dummy frames works
> - verify that resuming after manually popping all pending dummy
>   frames works (i.e. a previously hit and reported breakpoint is not
>   re-reported)
>
> multi-bp-in-threads.exp:
> What it does:
> - start up N threads and stop
> - set scheduler-locking on
> - hand-call a function in N threads, each of which hits a breakpoint
> - set scheduler-locking off
> - continue
> What it tests:
> - see what happens ... it's not clear to me what _should_ happen here
>
> Perhaps first we should agree on what the right behavior is.
> For hand-call-in-threads.exp I think the testcase is correct as is.
> For multi-bp-in-threads.exp, one approach is to just define the
> problem away and say that GDB's current behavior is the correct
> behavior. :)

Ah, OK. It seems to me those problems are all triggered by scheduler
locking; this is a feature that doesn't seem to be handled cleanly
by the existing "prepare_to_proceed" mechanism.

Generally, I think what *should* happen is pretty straightforward:
each instance of a breakpoint hit should be reported exactly once.
And in fact, in the absence of scheduler locking, this is what the
current code should (I hope!) achieve.

In the case of the multi-bp-in-threads test case (without scheduler
locking), we'd hit the breakpoint on all_threads_running (thread 1)
first. If the user then switches to some (any) other thread and
issues the "continue" command, the prepare_to_proceed logic would
switch back to thread 1, single-step once across the breakpoint
without reporting it, and then continue all threads. Some thread X
would then hit the breakpoint on thread_breakpoint. Again, if the
user switches to any other thread and issues continue, GDB would
switch back to thread X, single-step once, and then continue all
threads. The breakpoint on thread_breakpoint would then hit in some
other thread, etc. All in all, every breakpoint would be hit and
reported exactly once.

Similarly, in the case of the hand-call-in-threads test case without
scheduler locking, we hit the breakpoint on all_threads_running in
thread 1 first. If the user switches to some other thread X and
issues an inferior function call, GDB would switch back to thread 1,
single-step over the breakpoint, and then continue all threads --
this will allow the inferior call to proceed to the breakpoint on
hand_call in thread X. If the user then switches to yet another
thread Y and issues the inferior call there, GDB will switch back and
single-step over the breakpoint in thread X, before continuing all
threads, allowing the breakpoint on hand_call in Y to hit, etc.

This works without requiring per-thread state because at any point
there is only ever one thread in the
breakpoint-reported-and-needs-to-be-skipped state, and this state is
handled/resolved as the very first action in any case, no matter what
action the user performs.

The problem arises when scheduler locking is switched on.
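
As an illustration of the scheme described above (scheduler locking
off), here is a toy model in C. It is only a sketch: the names
toy_debugger, toy_report_breakpoint, toy_select_thread and
toy_resume_all are hypothetical and are not GDB functions. The model
only captures the invariant that at most one thread is ever waiting to
be stepped over an already-reported breakpoint, and that any resume
resolves that state first.

```c
#include <assert.h>

#define TOY_NO_THREAD (-1)

/* Hypothetical toy state; none of these names exist in GDB.  */
struct toy_debugger {
    int selected_thread;     /* thread the user has currently selected */
    int pending_skip_thread; /* thread sitting on an already-reported
                                breakpoint, or TOY_NO_THREAD */
    int reports;             /* breakpoint hits reported to the user */
    int silent_steps;        /* unreported single-steps performed */
};

void toy_init(struct toy_debugger *d, int initial_thread)
{
    d->selected_thread = initial_thread;
    d->pending_skip_thread = TOY_NO_THREAD;
    d->reports = 0;
    d->silent_steps = 0;
}

/* A breakpoint hit in THREAD is reported; THREAD becomes selected.  */
void toy_report_breakpoint(struct toy_debugger *d, int thread)
{
    /* Every resume clears the pending state before anything else runs,
       so no second thread can be pending here.  */
    assert(d->pending_skip_thread == TOY_NO_THREAD);
    d->pending_skip_thread = thread;
    d->selected_thread = thread;
    d->reports++;
}

/* The user switches to another thread; nothing is resumed.  */
void toy_select_thread(struct toy_debugger *d, int thread)
{
    d->selected_thread = thread;
}

/* Any resume ("continue" or an inferior call) with scheduler locking
   off: first silently step the pending thread over its breakpoint,
   then let all threads run.  Returns the thread that was stepped, or
   TOY_NO_THREAD if there was none.  */
int toy_resume_all(struct toy_debugger *d)
{
    int stepped = d->pending_skip_thread;

    if (stepped != TOY_NO_THREAD) {
        d->silent_steps++;
        d->pending_skip_thread = TOY_NO_THREAD;
    }
    return stepped;
}
```

In this model, reporting a hit in thread 1, selecting thread 3 and
resuming steps thread 1 silently; a later hit in thread 2 is handled
the same way, so each hit is reported once and skipped once.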
Actually, I think there are really two problems.

First of all, after we've switched back and single-stepped over an
already-hit breakpoint via the prepare_to_proceed logic, we'll
continue only a single thread if scheduler-locking is on -- and that
is the wrong thread. The prepare_to_proceed logic only explicitly
switches *back* to the user-selected thread if the user was
*stepping* (that's the deferred_step_ptid logic). For
scheduler-locking, we should probably switch back always ...

The second problem is more a problem of definition: even if the
first problem above were fixed, we'd have to single-step the other
thread at least once to get over the breakpoint. This would seem to
violate the definition of scheduler locking if interpreted absolutely
strictly. Now you could argue that as the user should never be aware
of that single step, it doesn't really matter.

If we really wanted to avoid that completely, I don't see any other
way but adding a new per-thread flag that says "we've just stopped at
and reported a breakpoint". Whenever we finally switch back and
start executing that thread, and we're still on that breakpoint, we
need to perform the single-step at this point.

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com
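
For illustration, the per-thread flag idea from the message above
could be modelled roughly as follows. This is a toy sketch under
stated assumptions, not GDB code: struct toy_thread, its fields, and
the functions toy_mark_reported and toy_resume_locked are hypothetical
names, and "single-step" is stood in for by advancing a fake program
counter by one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state; not a GDB data structure.  */
struct toy_thread {
    unsigned long pc;            /* fake program counter */
    bool stopped_at_reported_bp; /* "we've just stopped at and
                                    reported a breakpoint" */
    unsigned long reported_pc;   /* where that breakpoint was */
};

/* Record that T stopped at a breakpoint and the stop was reported.  */
void toy_mark_reported(struct toy_thread *t)
{
    t->stopped_at_reported_bp = true;
    t->reported_pc = t->pc;
}

/* Resume only THREADS[INDEX], as with scheduler locking on.  No other
   thread is touched, so a flag set on another thread simply stays set
   until that thread is itself resumed.  Returns the number of silent
   single-steps performed (0 or 1).  */
int toy_resume_locked(struct toy_thread *threads, int index)
{
    struct toy_thread *t = &threads[index];

    /* Step over only if the thread is still on the breakpoint it
       reported; the flag is consumed either way.  */
    bool still_there = t->stopped_at_reported_bp
                       && t->pc == t->reported_pc;
    t->stopped_at_reported_bp = false;

    if (still_there) {
        t->pc += 1; /* stand-in for one single-step */
        return 1;
    }
    return 0;
}
```

The point of the sketch is that the step-over is deferred until the
flagged thread itself is resumed, so resuming a different thread under
scheduler locking never moves the flagged one.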