From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id gEqRJpZVxV+NMQAAWB0awg (envelope-from ) for ; Mon, 30 Nov 2020 15:27:02 -0500 Received: by simark.ca (Postfix, from userid 112) id 9B57F1F0AB; Mon, 30 Nov 2020 15:27:02 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 6B11F1E552 for ; Mon, 30 Nov 2020 15:27:01 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AFC143896C18; Mon, 30 Nov 2020 20:27:00 +0000 (GMT) Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 545F13896C18 for ; Mon, 30 Nov 2020 20:26:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 545F13896C18 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id B740B1E552; Mon, 30 Nov 2020 15:26:56 -0500 (EST) Subject: Re: [PATCH 09/12] gdb: move displaced stepping logic to gdbarch, allow starting concurrent displaced steps To: Pedro Alves , Simon Marchi , gdb-patches@sourceware.org References: <20201110214614.2842615-1-simon.marchi@efficios.com> <20201110214614.2842615-10-simon.marchi@efficios.com> <72a75576-a54b-bcc8-e8d3-5a57571fe234@palves.net> <1e9654db-de61-a5d4-48c3-a493b8598f0d@efficios.com> <598ee38c-b1d7-63d8-b8d0-36093e06f51f@palves.net> From: Simon Marchi Message-ID: Date: Mon, 30 Nov 2020 15:26:55 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <598ee38c-b1d7-63d8-b8d0-36093e06f51f@palves.net> Content-Type: text/plain; charset=utf-8 Content-Language: fr Content-Transfer-Encoding: 7bit X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" On 2020-11-26 9:24 a.m., Pedro Alves wrote: > The function still throws errors anyway, e.g. a bit above: > > int status = target_read_memory (m_addr, m_saved_copy.data (), len); > if (status != 0) > throw_error (MEMORY_ERROR, > _("Error accessing memory address %s (%s) for " > "displaced-stepping scratch space."), > paddress (arch, m_addr), safe_strerror (status)); > > That's then caught by displaced_step_prepare, which results in > the "disabling displaced stepping" warning, and GDB disabling > displaced stepping for the inferior. > > Also, NOT_SUPPORTED_ERROR is handled the same way. See: > > commit 16b4184277c4ad5b4a20278060fd3f6259d1ed49 > Author: Pedro Alves > AuthorDate: Tue Mar 15 16:33:04 2016 +0000 > > Fix PR gdb/19676: Disable displaced stepping if /proc not mounted > > Any other error is just propagated out, likely all the way to > the prompt: > > static displaced_step_prepare_status > displaced_step_prepare (thread_info *thread) > { > ... > catch (const gdb_exception_error &ex) > { > struct displaced_step_inferior_state *displaced_state; > > if (ex.error != MEMORY_ERROR > && ex.error != NOT_SUPPORTED_ERROR) > throw; > >> Would it make sense to remove the _ERROR status code and report all >> errors using exceptions then? > > I don't think so, because there are 3 kinds of "errors": > > #1 - MEMORY and NOT_SUPPORTED errors result in disabling displaced stepping > for the inferior permanently. > > #2 - other errors/exceptions propagate outwards. > > #3 - "prepare" returning "Can't do displace step for this instruction" for > some reason. Like the backend not supporting displaced stepping > that particular instruction, or the breakpoint address falling > within the scratch pad. I.e., these cases: > > ... > if (breakpoint_in_range_p (aspace, m_addr, len)) > { > /* There's a breakpoint set in the scratch pad location range > (which is usually around the entry point). We'd either > install it before resuming, which would overwrite/corrupt the > scratch pad, or if it was already inserted, this displaced > step would overwrite it. The latter is OK in the sense that > we already assume that no thread is going to execute the code > in the scratch pad range (after initial startup) anyway, but > the former is unacceptable. Simply punt and fallback to > stepping over this breakpoint in-line. */ > displaced_debug_printf ("breakpoint set in scratch pad. " > "Stepping over breakpoint in-line instead."); > > return DISPLACED_STEP_PREPARE_STATUS_ERROR; > } > ... > m_copy_insn_closure = gdbarch_displaced_step_copy_insn (arch, > m_original_pc, > m_addr, > regcache); > if (m_copy_insn_closure == nullptr) > { > /* The architecture doesn't know how or want to displaced step > this instruction or instruction sequence. Fallback to > stepping over the breakpoint in-line. */ > return DISPLACED_STEP_PREPARE_STATUS_ERROR; > } > > The "failed to write pc" try/catch case seems debatable, but I'd > lean on it being case #2. > > I think that case #3 calls for renaming DISPLACED_STEP_PREPARE_STATUS_ERROR > to something else to avoid confusion. Note that in master, -1 wasn't > documented in terms of error: > > ... > Returns 1 if preparing was successful -- this thread is going to be > stepped now; 0 if displaced stepping this thread got queued; or -1 > if this instruction can't be displaced stepped. */ > > static int > displaced_step_prepare_throw (thread_info *tp) Ok, thanks for this explanation. I didn't understand well the different error modes (can't displaced-step this particular instruction vs some other unexpected error) and how they were handled. I'll rename DISPLACED_STEP_PREPARE_STATUS_ERROR to DISPLACED_STEP_PREPARE_STATUS_CANT. It fits well with the comments in the code where it is used, that all say variants of "can't displaced step this instruction, fall back on in-line stepping". >>>> - return false; >>>> + /* If there are threads left in the THREADS_TO_STEP list, but we have >>>> + detected that we can't start anything more, put back these threads >>>> + in the global list. */ >>> >>> Do we also need to do this if an exception happens to escape the function? >>> We might end up pretty bonkers anyhow if that happens, though... >> >> Yes, I realized that when playing with this code again yesterday. A >> scope_exit would help for this I think. >> >> But I'm also wondering if we should enqueue back the thread TP that was >> dequeued, that caused an error. This one has already been dequeued from >> THREADS_TO_STEP. For example, if thread_still_needs_step_over throws >> (it reads the PC, so that could happen), should we put back the thread >> in the global chain? If we do, and the same error happens over and >> over, the thread will never leave the step over chain and be the first >> in line, preventing any other displaced step to happen. >> >> If, on the other hand, the error was just an intermittent one and we >> don't put it back in the queue, the next time we'll resume the thread >> we'll realize it needs a step over and enqueue it again. So it sounds >> less risky to me to just not enqueue back the thread on error. I'm >> really not sure, I find it difficult to reason about all the possible >> cases. > > Yeah, not enqueuing sounds OK. And indeed, error handling in all of > infrun is very hard to reason about, and likely to end up with things > messed up, e.g., with a thread marked as running when it is really > stopped. Ok, did that. >> Also, re-reading that code, I noticed that we only return "true" when an >> inline step over is started, or if a displaced step on an all-stop >> target was started. That's the case even before the patch. I suppose >> that's on purpose, because the caller wants to know whether it can >> resume more stuff or not? > > Yes. If it returns true, then the caller shouldn't resume anything > and go straight to waiting. ISTR having some trouble coming up with > the right conditions at the callers. Maybe it could be simplified. > I no longer remember why start_step_over returns false here: > > /* Don't start a new step-over if we already have an in-line > step-over operation ongoing. */ > if (step_over_info_valid_p ()) > return false; > > Off hand, it seems like that should return true. But it may > be that the intention was to only return true if anything > that was stopped before was started. I'd need to experiment > to know better now. :-/ Ok, I'll leave it as-is for now and treat is as a separete issue. >>>> @@ -5293,25 +5226,23 @@ handle_inferior_event (struct execution_control_state *ecs) >>>> { >>>> struct regcache *regcache = get_thread_regcache (ecs->event_thread); >>>> struct gdbarch *gdbarch = regcache->arch (); >>>> + inferior *parent_inf = find_inferior_ptid (ecs->target, ecs->ptid); >>>> + >>>> + if (ecs->ws.kind == TARGET_WAITKIND_FORKED) >>>> + { >>>> + /* Restore in the child process any displaced stepping buffers that >>>> + were in use at the time of the fork. */ >>>> + gdbarch_displaced_step_restore_all_in_ptid >>>> + (gdbarch, parent_inf, ecs->ws.value.related_pid); >>>> + } >>> >>> Why was this moved out of the displaced_step_in_progress_thread check >>> below? >> >> If: >> >> 1. thread 1 is doing a displaced step >> 2. thread 2 does a fork, not letting time for thread 1 to complete >> >> event_thread is thread 2, and it is not doing a displaced step, so we >> don't enter the >> >> if (displaced_step_in_progress_thread (ecs->event_thread)) >> >> But we still want to restore the bytes in the child used as the >> displaced stepping buffer for thread 1. Does that make sense? Is it >> possible that the current code is not correct in that regard? > > Indeed, looks like it. Good catch. How about fixing that > in a separate patch, then, with its own commit log? Will do. I didn't do a separate patch at first because I initially thought: "now that we can do multiple displaced steps in parallel, it can be that the other displaced steps does a fork!". I didn't think right away of the case where it's a non-displaced-stepping thread who does a fork. Simon