From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-bounces@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id gEqRJpZVxV+NMQAAWB0awg
	(envelope-from <gdb-patches-bounces@sourceware.org>)
	for <public-inbox@simark.ca>; Mon, 30 Nov 2020 15:27:02 -0500
Received: by simark.ca (Postfix, from userid 112)
	id 9B57F1F0AB; Mon, 30 Nov 2020 15:27:02 -0500 (EST)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI
	autolearn=ham autolearn_force=no version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id 6B11F1E552
	for <public-inbox@simark.ca>; Mon, 30 Nov 2020 15:27:01 -0500 (EST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id AFC143896C18;
	Mon, 30 Nov 2020 20:27:00 +0000 (GMT)
Received: from simark.ca (simark.ca [158.69.221.121])
 by sourceware.org (Postfix) with ESMTPS id 545F13896C18
 for <gdb-patches@sourceware.org>; Mon, 30 Nov 2020 20:26:57 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 545F13896C18
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca
Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by simark.ca (Postfix) with ESMTPSA id B740B1E552;
 Mon, 30 Nov 2020 15:26:56 -0500 (EST)
Subject: Re: [PATCH 09/12] gdb: move displaced stepping logic to gdbarch,
 allow starting concurrent displaced steps
To: Pedro Alves <pedro@palves.net>, Simon Marchi <simon.marchi@efficios.com>, 
 gdb-patches@sourceware.org
References: <20201110214614.2842615-1-simon.marchi@efficios.com>
 <20201110214614.2842615-10-simon.marchi@efficios.com>
 <72a75576-a54b-bcc8-e8d3-5a57571fe234@palves.net>
 <1e9654db-de61-a5d4-48c3-a493b8598f0d@efficios.com>
 <598ee38c-b1d7-63d8-b8d0-36093e06f51f@palves.net>
From: Simon Marchi <simark@simark.ca>
Message-ID: <da72a806-696f-e526-53b5-81ef9cca4188@simark.ca>
Date: Mon, 30 Nov 2020 15:26:55 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.12.0
MIME-Version: 1.0
In-Reply-To: <598ee38c-b1d7-63d8-b8d0-36093e06f51f@palves.net>
Content-Type: text/plain; charset=utf-8
Content-Language: fr
Content-Transfer-Encoding: 7bit
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Errors-To: gdb-patches-bounces@sourceware.org
Sender: "Gdb-patches" <gdb-patches-bounces@sourceware.org>

On 2020-11-26 9:24 a.m., Pedro Alves wrote:
> The function still throws errors anyway, e.g. a bit above:
>
>   int status = target_read_memory (m_addr, m_saved_copy.data (), len);
>   if (status != 0)
>     throw_error (MEMORY_ERROR,
> 		 _("Error accessing memory address %s (%s) for "
> 		   "displaced-stepping scratch space."),
> 		 paddress (arch, m_addr), safe_strerror (status));
>
> That's then caught by displaced_step_prepare, which results in
> the "disabling displaced stepping" warning, and GDB disabling
> displaced stepping for the inferior.
>
> Also, NOT_SUPPORTED_ERROR is handled the same way.  See:
>
>  commit 16b4184277c4ad5b4a20278060fd3f6259d1ed49
>  Author:     Pedro Alves <palves@redhat.com>
>  AuthorDate: Tue Mar 15 16:33:04 2016 +0000
>
>     Fix PR gdb/19676: Disable displaced stepping if /proc not mounted
>
> Any other error is just propagated out, likely all the way to
> the prompt:
>
>  static displaced_step_prepare_status
>  displaced_step_prepare (thread_info *thread)
>  {
>  ...
>    catch (const gdb_exception_error &ex)
>      {
>       struct displaced_step_inferior_state *displaced_state;
>
>       if (ex.error != MEMORY_ERROR
> 	  && ex.error != NOT_SUPPORTED_ERROR)
> 	throw;
>
>> Would it make sense to remove the _ERROR status code and report all
>> errors using exceptions then?
>
> I don't think so, because there are 3 kinds of "errors":
>
>  #1 - MEMORY and NOT_SUPPORTED errors result in disabling displaced stepping
>       for the inferior permanently.
>
>  #2 - other errors/exceptions propagate outwards.
>
>  #3 - "prepare" returning "Can't do displace step for this instruction" for
>       some reason.  Like the backend not supporting displaced stepping
>       that particular instruction, or the breakpoint address falling
>       within the scratch pad.  I.e., these cases:
>
> ...
>    if (breakpoint_in_range_p (aspace, m_addr, len))
>     {
>       /* There's a breakpoint set in the scratch pad location range
> 	 (which is usually around the entry point).  We'd either
> 	 install it before resuming, which would overwrite/corrupt the
> 	 scratch pad, or if it was already inserted, this displaced
> 	 step would overwrite it.  The latter is OK in the sense that
> 	 we already assume that no thread is going to execute the code
> 	 in the scratch pad range (after initial startup) anyway, but
> 	 the former is unacceptable.  Simply punt and fallback to
> 	 stepping over this breakpoint in-line.  */
>       displaced_debug_printf ("breakpoint set in scratch pad.  "
> 			      "Stepping over breakpoint in-line instead.");
>
>       return DISPLACED_STEP_PREPARE_STATUS_ERROR;
>     }
> ...
>    m_copy_insn_closure = gdbarch_displaced_step_copy_insn (arch,
>  							  m_original_pc,
>  							  m_addr,
>  							  regcache);
>    if (m_copy_insn_closure == nullptr)
>      {
>        /* The architecture doesn't know how or want to displaced step
>          this instruction or instruction sequence.  Fallback to
>          stepping over the breakpoint in-line.  */
>        return DISPLACED_STEP_PREPARE_STATUS_ERROR;
>      }
>
> The "failed to write pc" try/catch case seems debatable, but I'd
> lean on it being case #2.
>
> I think that case #3 calls for renaming DISPLACED_STEP_PREPARE_STATUS_ERROR
> to something else to avoid confusion.  Note that in master, -1 wasn't
> documented in terms of error:
>
> ...
>    Returns 1 if preparing was successful -- this thread is going to be
>    stepped now; 0 if displaced stepping this thread got queued; or -1
>    if this instruction can't be displaced stepped.  */
>
>  static int
>  displaced_step_prepare_throw (thread_info *tp)

Ok, thanks for this explanation.  I didn't understand well the different
error modes (can't displaced-step this particular instruction vs some
other unexpected error) and how they were handled.

I'll rename DISPLACED_STEP_PREPARE_STATUS_ERROR to
DISPLACED_STEP_PREPARE_STATUS_CANT.  It fits well with the comments in
the code where it is used, that all say variants of "can't displaced
step this instruction, fall back on in-line stepping".

>>>> -  return false;
>>>> +    /* If there are threads left in the THREADS_TO_STEP list, but we have
>>>> +       detected that we can't start anything more, put back these threads
>>>> +       in the global list.  */
>>>
>>> Do we also need to do this if an exception happens to escape the function?
>>> We might end up pretty bonkers anyhow if that happens, though...
>>
>> Yes, I realized that when playing with this code again yesterday.  A
>> scope_exit would help for this I think.
>>
>> But I'm also wondering if we should enqueue back the thread TP that was
>> dequeued, that caused an error.  This one has already been dequeued from
>> THREADS_TO_STEP.  For example, if thread_still_needs_step_over throws
>> (it reads the PC, so that could happen), should we put back the thread
>> in the global chain?  If we do, and the same error happens over and
>> over, the thread will never leave the step over chain and be the first
>> in line, preventing any other displaced step to happen.
>>
>> If, on the other hand, the error was just an intermittent one and we
>> don't put it back in the queue, the next time we'll resume the thread
>> we'll realize it needs a step over and enqueue it again.  So it sounds
>> less risky to me to just not enqueue back the thread on error.  I'm
>> really not sure, I find it difficult to reason about all the possible
>> cases.
>
> Yeah, not enqueuing sounds OK.  And indeed, error handling in all of
> infrun is very hard to reason about, and likely to end up with things
> messed up, e.g., with a thread marked as running when it is really
> stopped.

Ok, did that.

>> Also, re-reading that code, I noticed that we only return "true" when an
>> inline step over is started, or if a displaced step on an all-stop
>> target was started.  That's the case even before the patch.  I suppose
>> that's on purpose, because the caller wants to know whether it can
>> resume more stuff or not?
>
> Yes.  If it returns true, then the caller shouldn't resume anything
> and go straight to waiting.  ISTR having some trouble coming up with
> the right conditions at the callers.  Maybe it could be simplified.
> I no longer remember why start_step_over returns false here:
>
>   /* Don't start a new step-over if we already have an in-line
>      step-over operation ongoing.  */
>   if (step_over_info_valid_p ())
>     return false;
>
> Off hand, it seems like that should return true.  But it may
> be that the intention was to only return true if anything
> that was stopped before was started.  I'd need to experiment
> to know better now.  :-/

Ok, I'll leave it as-is for now and treat is as a separete issue.

>>>> @@ -5293,25 +5226,23 @@ handle_inferior_event (struct execution_control_state *ecs)
>>>>        {
>>>>  	struct regcache *regcache = get_thread_regcache (ecs->event_thread);
>>>>  	struct gdbarch *gdbarch = regcache->arch ();
>>>> +	inferior *parent_inf = find_inferior_ptid (ecs->target, ecs->ptid);
>>>> +
>>>> +	if (ecs->ws.kind == TARGET_WAITKIND_FORKED)
>>>> +	  {
>>>> +	    /* Restore in the child process any displaced stepping buffers that
>>>> +	       were in use at the time of the fork.  */
>>>> +	    gdbarch_displaced_step_restore_all_in_ptid
>>>> +	      (gdbarch, parent_inf, ecs->ws.value.related_pid);
>>>> +	  }
>>>
>>> Why was this moved out of the displaced_step_in_progress_thread check
>>> below?
>>
>> If:
>>
>> 1. thread 1 is doing a displaced step
>> 2. thread 2 does a fork, not letting time for thread 1 to complete
>>
>> event_thread is thread 2, and it is not doing a displaced step, so we
>> don't enter the
>>
>>   if (displaced_step_in_progress_thread (ecs->event_thread))
>>
>> But we still want to restore the bytes in the child used as the
>> displaced stepping buffer for thread 1.  Does that make sense?  Is it
>> possible that the current code is not correct in that regard?
>
> Indeed, looks like it.  Good catch.  How about fixing that
> in a separate patch, then, with its own commit log?

Will do.  I didn't do a separate patch at first because I initially
thought: "now that we can do multiple displaced steps in parallel, it
can be that the other displaced steps does a fork!".  I didn't think
right away of the case where it's a non-displaced-stepping thread who
does a fork.

Simon