From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
To: gdb-patches@sourceware.org
References: <20210108041734.3873826-1-simon.marchi@polymtl.ca> <20210108041734.3873826-5-simon.marchi@polymtl.ca>
Date: Tue, 12 Jan 2021 12:14:06 -0500
In-Reply-To: <20210108041734.3873826-5-simon.marchi@polymtl.ca>
From: Simon Marchi via Gdb-patches
Reply-To: Simon Marchi

On 2021-01-07 11:17 p.m., Simon Marchi via Gdb-patches wrote:
> From: Simon Marchi
>
> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the target
> when doing successive operations.  I'll start with explaining the
> rationale and then go over the implementation.  In the ROCm / GPU world,
> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
> read it from a GPU standpoint, just s/thread/wave/.
>
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are controlled
> as a group.  To stop one thread in a group of running ones, the state of
> all threads is retrieved from the GPU, all threads are destroyed, and all
> threads but the one we want to stop are re-created from the saved state.
> The net result, from the point of view of GDB, is that the library
> stopped one thread.  The same thing goes if we want to resume one thread
> while others are running: the state of all running threads is retrieved
> from the GPU, they are all destroyed, and they are all re-created,
> including the thread we want to resume.
>
> This leads to some inefficiencies when combined with how GDB works;
> here are two examples:
>
> - Stopping all threads: because the target operates in non-stop mode,
>   when the user interface mode is all-stop, GDB must stop all threads
>   individually when presenting a stop.  Let's suppose we have 1000
>   threads and the user does ^C.  GDB asks the target to stop one
>   thread.  Behind the scenes, the library retrieves 1000 thread states
>   and restores the 999 other still-running ones.  GDB asks the target
>   to stop another one.  The target retrieves 999 thread states and
>   restores the 998 remaining ones.  That means that to stop 1000
>   threads, we did 1000 back and forths with the GPU.  It would have
>   been much better to just retrieve the states once and stop there.
>
> - Resuming with pending events: suppose the 1000 threads hit a
>   breakpoint at the same time.  The breakpoint is conditional and
>   evaluates to true for the first thread, to false for all others.  GDB
>   pulls one event (for the first thread) from the target, decides that
>   it should present a stop, so stops all threads using
>   stop_all_threads.
  All these other threads have a breakpoint event to
>   report, which is saved in `thread_info::suspend::waitstatus` for
>   later.  When the user does "continue", GDB resumes that one thread
>   that did hit the breakpoint.  It then processes the pending events
>   one by one as if they just arrived.  It picks one, evaluates the
>   condition to false, and resumes the thread.  It picks another one,
>   evaluates the condition to false, and resumes the thread.  And so on.
>   In between each resumption, there is a full state retrieval and
>   re-creation.  It would be much nicer if we could wait a little bit
>   before sending those threads to the GPU, until GDB has processed all
>   those pending events.
>
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that allows
> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
> hold off putting the threads on the GPU until I'm done" (the "forward
> progress not required" state).  Turning forward progress back on
> indicates to the library that all threads that are supposed to be
> running should now really be running on the GPU.
>
> It turns out that GDB has a similar concept, though not as general:
> commit_resume.  One difference is that commit_resume is not stateful:
> the target can't look up "does the core need me to schedule resumed
> threads for execution right now".  It is also specifically linked to the
> resume method; it is not used in other contexts.  The target accumulates
> resumption requests through target_ops::resume calls, and then commits
> those resumptions when target_ops::commit_resume is called.  The target
> has no way to check whether it's ok to leave resumed threads stopped in
> other target methods.
>
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".
I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".
>
> In the new version, we have two things in process_stratum_target:
>
> - the commit_resumed_state field: indicates whether GDB requires this
>   target to have resumed threads committed to the execution
>   target/device.  If false, the target is allowed to leave resumed
>   threads un-committed at the end of whatever method it is executing.
>
> - the commit_resumed method: called when commit_resumed_state
>   transitions from false to true.  While commit_resumed_state was
>   false, the target may have left some resumed threads un-committed.
>   This method being called tells it that it should commit them back to
>   the execution device.
>
> Let's take the "Stopping all threads" scenario from above and see how it
> would work with the ROCm target with this change.  Before stopping all
> threads, GDB would set the target's commit_resumed_state field to false.
> It would then ask the target to stop the first thread.  The target would
> retrieve all threads' state from the GPU and mark that one as stopped.
> Since commit_resumed_state is false, it leaves all the other threads
> (still resumed) stopped.  GDB would then proceed to call target_stop for
> all the other threads.  Since resumed threads are not committed, this
> doesn't do any back and forth with the GPU.
>
> To simplify the implementation of targets, I made it so that when
> calling certain target methods, the contract between the core and the
> targets guarantees that commit_resumed_state is false.  This way, the
> target doesn't need two paths, one for commit_resumed_state == true and
> one for commit_resumed_state == false.  It can just assert that
> commit_resumed_state is false and work with that assumption.  This also
> helps catch places where we forgot to disable commit_resumed_state
> before calling the method, which represents a probable optimization
> opportunity.
>
> To have some confidence that this contract between the core and the
> targets is respected, I added assertions in the linux-nat target
> methods, even though the linux-nat target doesn't actually use that
> feature.  Since linux-nat is tested much more than other targets, this
> will help catch these issues more quickly.
>
> To ensure that commit_resumed_state is always turned back on (only if
> necessary, see below) and the commit_resumed method is called when doing
> so, I introduced the scoped_disable_commit_resumed RAII object, which
> replaces make_scoped_defer_process_target_commit_resume.  On
> construction, it clears the commit_resumed_state flag of all process
> targets.  On destruction, it turns it back on (if necessary) and calls
> the commit_resumed method.  The nested case is handled by having a
> "nesting" counter: only when the counter goes back to 0 is
> commit_resumed_state turned back on.
>
> On destruction, commit-resumed is not re-enabled for a given target if:
>
> 1. this target has no threads resumed, or
> 2. this target has at least one thread with a pending status known to
>    the core (saved in thread_info::suspend::waitstatus).
>
> The first point is not technically necessary, because a proper
> commit_resumed implementation would be a no-op if the target has no
> resumed threads.  But since we have a flag to do a quick check, I think
> it doesn't hurt.
>
> The second point is more important: together with the
> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
> makes it so that the "Resuming with pending events" scenario described
> above is handled efficiently.  Here's what happens in that case:
>
> 1. The user types "continue".
> 2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>    function does not enable commit-resumed, as it sees other threads
>    have pending statuses.
> 3. fetch_inferior_event is called to handle another event, one thread
>    is resumed.
   Because there are still more threads with pending
>    statuses, the destructor of scoped_disable_commit_resumed in
>    fetch_inferior_event still doesn't enable commit-resumed.
> 4. Rinse and repeat step 3, until the last pending status is handled by
>    fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>    destructor sees there are no more threads with pending statuses, so
>    it asks the target to commit resumed threads.
>
> This allows us to avoid all unnecessary back and forths: there is a
> single commit_resumed call.
>
> This change required remote_target::remote_stop_ns to learn how to
> handle stopping threads that were resumed but pending vCont.  The
> simplest example where that happens is when using the remote target in
> all-stop, but with "maint set target-non-stop on", to force it to
> operate in non-stop mode under the hood.  If two threads hit a
> breakpoint at the same time, GDB will receive two stop replies.  It
> will present the stop for one thread and save the other one in
> thread_info::suspend::waitstatus.
>
> Before this patch, when doing "continue", GDB first resumes the thread
> without a pending status:
>
>   Sending packet: $vCont;c:p172651.172676#f3
>
> It then consumes the pending status in the next fetch_inferior_event
> call:
>
>   [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>   [infrun] target_wait (-1.0.0, status) =
>   [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>   [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
>
> It then realizes it needs to stop all threads to present the stop, so
> stops the thread it just resumed:
>
>   [infrun] stop_all_threads: Thread 1517137.1517137 not executing
>   [infrun] stop_all_threads: Thread 1517137.1517174 executing, need stop
>   remote_stop called
>   Sending packet: $vCont;t:p172651.172676#04
>
> This is an unnecessary resume/stop.
With this patch, we don't commit
> resumed threads after proceeding, because of the pending status:
>
>   [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
>
> When GDB handles the pending status and stop_all_threads runs, we stop
> a resumed but pending vCont thread:
>
>   remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
>
> That thread was never actually resumed on the remote stub / gdbserver.
> This is why remote_stop_ns needed to learn this new trick of enqueueing
> phony stop replies.
>
> Note that this patch only considers pending statuses known to the core
> of GDB, that is, the events that were pulled out of the target and
> stored in `thread_info::suspend::waitstatus`.  In some cases, we could
> also avoid unnecessary back and forth when the target has events that
> it has not yet reported to the core.  I plan to implement this as a
> subsequent patch, once this series has settled.

I think this patch introduces some regressions.  When running

  $ while make check TESTS="gdb.threads/interrupt-while-step-over.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"; do done

I'll sometimes get:

  /home/smarchi/src/binutils-gdb/gdb/inline-frame.c:383: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n)

Simon