From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id bvssGdas+F9ZNwAAWB0awg (envelope-from ) for ; Fri, 08 Jan 2021 14:04:54 -0500 Received: by simark.ca (Postfix, from userid 112) id 5965B1E99A; Fri, 8 Jan 2021 14:04:54 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 404AF1E590 for ; Fri, 8 Jan 2021 14:04:53 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E9B5D38438A3; Fri, 8 Jan 2021 19:04:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E9B5D38438A3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1610132693; bh=OG/eFqD/A/JFjmZTHRmEF4Yae7/08lmbBh1FFrmZkUQ=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=jQbXuWsv4IagYMvsjXe4qiw++33jLqrGGF5qdKDZC5FhRGroYQWdmBi39VtitGu1r lNO/gRbbA6cHpwglEM0iFZ+bmLgdq8mttU2X7l1PKsZ0HffJn8UGtvQlKXvnyXjn62 dnx30yTM0LwkWU8hvhJcX+ZJBZjYfNXohk606mCA= Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id 5442E38438A3 for ; Fri, 8 Jan 2021 19:04:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5442E38438A3 Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 108J4ftf001045 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 8 Jan 2021 14:04:46 -0500 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 108J4ftf001045 Received: from [10.0.0.213] (192-222-157-6.qc.cable.ebox.net [192.222.157.6]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 1CEBD1E590; Fri, 8 Jan 2021 14:04:41 -0500 (EST) Subject: Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses To: Andrew Burgess References: <20210108041734.3873826-1-simon.marchi@polymtl.ca> <20210108041734.3873826-5-simon.marchi@polymtl.ca> <20210108183459.GD2945@embecosm.com> Message-ID: <998e1449-f74c-d9ac-242a-d956196fbbe7@polymtl.ca> Date: Fri, 8 Jan 2021 14:04:40 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <20210108183459.GD2945@embecosm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Fri, 8 Jan 2021 19:04:41 +0000 X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Simon Marchi via Gdb-patches Reply-To: Simon Marchi Cc: Simon Marchi , gdb-patches@sourceware.org Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" On 2021-01-08 1:34 p.m., Andrew Burgess wrote: > * Simon Marchi [2021-01-07 23:17:33 -0500]: > >> From: Simon Marchi >> >> The rationale for this patch comes from the ROCm port [1], the goal >> being to reduce the number of back and forths between GDB and the target >> when doing successive operations. I'll start with explaining the >> rationale and then go over the implementation. In the ROCm / GPU world, >> the term "wave" is somewhat equivalent to a "thread" in GDB. So if you >> read if from a GPU stand point, just s/thread/wave/. >> >> ROCdbgapi, the library used by GDB [2] to communicate with the GPU >> target, gives the illusion that it's possible for the debugger to >> control (start and stop) individual threads. But in reality, this is >> not how it works. Under the hood, all threads of a queue are controlled >> as a group. To stop one thread in a group of running ones, the state of >> all threads is retrieved from the GPU, all threads are destroyed, and all >> threads but the one we want to stop are re-created from the saved state. >> The net result, from the point of view of GDB, is that the library >> stopped one thread. The same thing goes if we want to resume one thread >> while others are running: the state of all running threads is retrieved >> from the GPU, they are all destroyed, and they are all re-created, >> including the thread we want to resume. >> >> This leads to some inefficiencies when combined with how GDB works, here >> are two examples: >> >> - Stopping all threads: because the target operates in non-stop mode, >> when the user interface mode is all-stop, GDB must stop all threads >> individually when presenting a stop. Let's suppose we have 1000 >> threads and the user does ^C. GDB asks the target to stop one >> thread. Behind the scenes, the library retrieves 1000 thread states >> and restores the 999 others still running ones. GDB asks the target >> to stop another one. The target retrieves 999 thread states and >> restores the 998 remaining ones. That means that to stop 1000 >> threads, we did 1000 back and forths with the GPU. It would have >> been much better to just retrieve the states once and stop there. >> >> - Resuming with pending events: suppose the 1000 threads hit a >> breakpoint at the same time. The breakpoint is conditional and >> evaluates to true for the first thread, to false for all others. GDB >> pulls one event (for the first thread) from the target, decides that >> it should present a stop, so stops all threads using >> stop_all_threads. All these other threads have a breakpoint event to >> report, which is saved in `thread_info::suspend::waitstatus` for >> later. When the user does "continue", GDB resumes that one thread >> that did hit the breakpoint. It then processes the pending events >> one by one as if they just arrived. It picks one, evaluates the >> condition to false, and resumes the thread. It picks another one, >> evaluates the condition to false, and resumes the thread. And so on. >> In between each resumption, there is a full state retrieval and >> re-creation. It would be much nicer if we could wait a little bit >> before sending those threads on the GPU, until it processed all those >> pending events. >> >> To address this kind of performance issue, ROCdbgapi has a concept >> called "forward progress required", which is a boolean state that allows >> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can >> hold off putting the threads on the GPU until I'm done" (the "forward >> progress not required" state). Turning forward progress back on >> indicates to the library that all threads that are supposed to be >> running should now be really running on the GPU. >> >> It turns out that GDB has a similar concept, though not as general, >> commit_resume. On difference is that commit_resume is not stateful: the > > typo: 'On difference' ? Fixed to "One difference". > >> target can't look up "does the core need me to schedule resumed threads >> for execution right now". It is also specifically linked to the resume >> method, it is not used in other contexts. The target accumulates >> resumption requests through target_ops::resume calls, and then commits >> those resumptions when target_ops::commit_resume is called. The target >> has no way to check if it's ok to leave resumed threads stopped in other >> target methods. >> >> To bridge the gap, this patch generalizes the commit_resume concept in >> GDB to match the forward progress concept of ROCdbgapi. The current >> name (commit_resume) can be interpreted as "commit the previous resume >> calls". I renamed the concept to "commit_resumed", as in "commit the >> threads that are resumed". >> >> In the new version, we have two things in process_stratum_target: >> >> - the commit_resumed_state field: indicates whether GDB requires this >> target to have resumed threads committed to the execution >> target/device. If false, the target is allowed to leave resumed >> threads un-committed at the end of whatever method it is executing. >> >> - the commit_resumed method: called when commit_resumed_state >> transitions from false to true. While commit_resumed_state was >> false, the target may have left some resumed threads un-committed. >> This method being called tells it that it should commit them back to >> the execution device. >> >> Let's take the "Stopping all threads" scenario from above and see how it >> would work with the ROCm target with this change. Before stopping all >> threads, GDB would set the target's commit_resumed_state field to false. >> It would then ask the target to stop the first thread. The target would >> retrieve all threads' state from the GPU and mark that one as stopped. >> Since commit_resumed_state is false, it leaves all the other threads >> (still resumed) stopped. GDB would then proceed to call target_stop for >> all the other threads. Since resumed threads are not committed, this >> doesn't do any back and forth with the GPU. >> >> To simplify the implementation of targets, I made it so that when >> calling certain target methods, the contract between the core and the >> targets guarantees that commit_resumed_state is false. This way, the >> target doesn't need two paths, one commit_resumed_state == true and one >> for commit_resumed_state == false. It can just assert that >> commit_resumed_state is false and work with that assumption. This also >> helps catch places where we forgot to disable commit_resumed_state >> before calling the method, which represents a probable optimization >> opportunity. >> >> To have some confidence that this contract between the core and the >> targets is respected, I added assertions in the linux-nat target >> methods, even though the linux-nat target doesn't actually use that >> feature. Since linux-nat is tested much more than other targets, this >> will help catch these issues quicker. >> >> To ensure that commit_resumed_state is always turned back on (only if >> necessary, see below) and the commit_resumed method is called when doing >> so, I introduced the scoped_disabled_commit_resumed RAII object, which >> replaces make_scoped_defer_process_target_commit_resume. On >> construction, it clears the commit_resumed_state flag of all process >> targets. On destruction, it turns it back on (if necessary) and calls >> the commit_resumed method. The nested case is handled by having a >> "nesting" counter: only when the counter goes back to 0 is >> commit_resumed_state turned back on. >> >> On destruction, commit-resumed is not re-enabled for a given target if: >> >> 1. this target has no threads resumed, or >> 2. this target at least one thread with a pending status known to the > > Missing word: 'this target at least'? Fixed to "this target has at least". > I read through the commit message and convinced myself that it made > sense. I ran out of time to look at the actual code. That's already very nice, thanks! Simon