On 2021-01-09 3:34 p.m., Pedro Alves wrote:
> On 08/01/21 04:17, Simon Marchi wrote:
>>  - Resuming with pending events: suppose the 1000 threads hit a
>>    breakpoint at the same time.  The breakpoint is conditional and
>>    evaluates to true for the first thread, to false for all others.  GDB
>>    pulls one event (for the first thread) from the target, decides that
>>    it should present a stop, so stops all threads using
>>    stop_all_threads.  All these other threads have a breakpoint event to
>>    report, which is saved in `thread_info::suspend::waitstatus` for
>>    later.  When the user does "continue", GDB resumes that one thread
>>    that did hit the breakpoint.  It then processes the pending events
>>    one by one as if they just arrived.  It picks one, evaluates the
>>    condition to false, and resumes the thread.  It picks another one,
>>    evaluates the condition to false, and resumes the thread.  And so on.
>>    In between each resumption, there is a full state retrieval and
>>    re-creation.  It would be much nicer if we could wait a little bit
>>    before sending those threads on the GPU, until it processed all those
>>    pending events.
> 
> A potential downside of holding on in this latter scenario, with regular
> host debugging, is that currently, threads are resumed immediately, thus potentially
> the inferior process's threads spend less time paused, at least with the native target
> if we implemented commit_resume there.  With remote, the trade off is
> probably more in favor of deferring, given the higher latency.
> 
> However, since we don't implement commit_resume for native target
> currently, it shouldn't have any effect there.

Indeed.  If there was a way to ptrace-resume multiple threads in one
ptrace call, we could think of implementing it for linux-nat, for
example.  But even then it would be a trade-off: not implementing it
get the first thread back and running faster, implementing it reduces
the number of syscalls done.

> To confirm this, I tried the testcase we used when debugging the
> displaced stepping buffers series, with 100 threads continuously
> stepping over a breakpoint, for 10 seconds.  Suprisingly, when
> native target, I see a consistent ~3% slowdown caused by this series.
> 
> I don't see any material difference with gdbserver.
> 
> (higher is better)
> 
> native, pristine
> 
>  avg              440.240000
>  avg              436.670000
>  avg              451.310000
>  avg              432.840000
>  avg              437.060000
>  ===========================
>  avg of avg       439.624000
> 
> native, patched
> 
>  avg              420.940000
>  avg              428.130000
>  avg              425.230000
>  avg              428.080000
>  avg              424.880000
>  ===========================
>  avg of avg       425.452000
> 
> 
> 
> gdbserver, pristine:
> 
>  avg              633.490000
>  avg              639.910000
>  avg              642.300000
>  avg              626.160000
>  avg              626.460000
>  ===========================
>  avg of avg       633.664000
> 
> 
> gdbserver, patched
> 
>  avg              630.970000
>  avg              628.960000
>  avg              638.340000
>  avg              627.030000
>  avg              638.390000
>  ===========================
>  avg of avg       632.738000
> 
> tests run like this:
> 
>   $ gcc disp-step-buffers-test.c -o disp-step-buffers-test -g3 -O2 -pthread
>   $ g="./gdb -data-directory=data-directory"
>   $ time $g -q --batch disp-step-buffers-test -ex "b 16 if 0" -ex "r"
>   $ time $g -q --batch disp-step-buffers-test -ex "set sysroot" -ex "target remote | ../gdbserver/gdbserver - disp-step-buffers-test" -ex "b 16 if 0" -ex "c" 
> 
> I've attached disp-step-buffers-test.c.
> 
> I'm surprised that native debugging is quite slower here, compared to
> gdbserver.  I don't recall observing that earlier.  Maybe I just missed
> it then.

Well, remember that GDBserver is doing the condition evaluation, so it all
happens GDBserver-side.  And given all the state machines and everything in
GDBserver is simpler than GDB, I could imagine it could explain why GDBserver
is faster.

I tried it on my side, I also see the GDBserver-based test doing about 1.33
more steps.

When adding "set breakpoint condition-evaluation host", then the GDBserver
test becomes slower, doing about 0.55 times the number of steps than the GDB
baseline.

> 
> I wouldn't have thought we would be doing that much work that it
> would be noticeable with the native target (pristive vs patched, the 3%
> slowdown).  I wonder whether that is caused by the constant std::set allocation
> in all_process_targets.  But then it's strange that we don't see that
> same slowdown when remote debugging.  I'm surprised.

I see more of ~1.33 % slowdown, but it's consistent too.

It could be the std::set allocation.  I changed all_process_targets to make it
return an array of 1 element, just to see what happens, it didn't seem to help.
See attached patch "0001-Test-returning-something-else-than-std-set-in-all_pr.patch"
if you want to try it.

It's maybe due to the fact that we now iterate on all threads at every handled
event? fetch_inferior_event calls ~scoped_disable_commit_resumed, which calls
maybe_commit_resumed_all_process_targets, which iterates on all threads.  The
loop actually breaks when it finds a thread with a pending status, but that
still makes this function O(number of threads).

>>
>> In the new version, we have two things in process_stratum_target:
>>
>>  - the commit_resumed_state field: indicates whether GDB requires this
>>    target to have resumed threads committed to the execution
>>    target/device.  If false, the target is allowed to leave resumed
>>    threads un-committed at the end of whatever method it is executing.
>>
>>  - the commit_resumed method: called when commit_resumed_state
>>    transitions from false to true.  While commit_resumed_state was
>>    false, the target may have left some resumed threads un-committed.
>>    This method being called tells it that it should commit them back to
>>    the execution device.
>>
>> Let's take the "Stopping all threads" scenario from above and see how it
>> would work with the ROCm target with this change.  Before stopping all
>> threads, GDB would set the target's commit_resumed_state field to false.
>> It would then ask the target to stop the first thread.  The target would
>> retrieve all threads' state from the GPU and mark that one as stopped.
>> Since commit_resumed_state is false, it leaves all the other threads
>> (still resumed) stopped.  GDB would then proceed to call target_stop for
>> all the other threads.  Since resumed threads are not committed, this
>> doesn't do any back and forth with the GPU.
>>
>> To simplify the implementation of targets, I made it so that when
>> calling certain target methods, the contract between the core and the
>> targets guarantees that commit_resumed_state is false.  This way, the
>> target doesn't need two paths, one commit_resumed_state == true and one
>> for commit_resumed_state == false.  It can just assert that
>> commit_resumed_state is false and work with that assumption.  This also
>> helps catch places where we forgot to disable commit_resumed_state
>> before calling the method, which represents a probable optimization
>> opportunity.
>>
>> To have some confidence that this contract between the core and the
>> targets is respected, I added assertions in the linux-nat target
>> methods, even though the linux-nat target doesn't actually use that
>> feature.  Since linux-nat is tested much more than other targets, this
>> will help catch these issues quicker.
> 
> Did you consider adding the assertions to target.c instead, in the
> target_resume/target_wait/target_stop wrapper methods?  That would
> cover all targets.

No, but it would be a good idea.  That wouldn't cover the cases where
target_ops::<method> is called directly, not through target_<method>.
From what I can see, this is only ever done by target methods to call
the corresponding method in the beneath target.  So presumably, up the
call somewhere is target_<method>, which will have already made the
assertion. The only exception to that is record_full_wait_1 which calls
beneath ()->resume.  But since both wait and resume are guaranteed to
be called with commit_resumed_state false, we're fine.

We'd be in trouble if, for example, a target's fetch_registers called
the beneath target's wait method (for some weird reason), as
fetch_registers could be called with commit_resumed_state true.  So,
just something to keep in mind in the future.

>> To ensure that commit_resumed_state is always turned back on (only if
>> necessary, see below) and the commit_resumed method is called when doing
>> so, I introduced the scoped_disabled_commit_resumed RAII object, which
>> replaces make_scoped_defer_process_target_commit_resume.  On
>> construction, it clears the commit_resumed_state flag of all process
>> targets.  On destruction, it turns it back on (if necessary) and calls
>> the commit_resumed method.  
> 
> This part makes me nervous and I think will cause us problems.  I'm
> really not sure it's a good idea.  The issue is that the commit_resumed method can
> throw, and we'll be in a dtor, which means that we will need to swallow the
> error, there's no way to propagate it out aborting the current function.
> That's why we currently have explicit commit calls, and the scoped object just
> tweaks the "defer commit" flag.  Would it work to build on the current
> design instead of moving the commit to the dtor?

Hmm, I'll give it a try.  The reason why I made it RAII is that I wanted
to be absolutely sure commit_resumed_state was turned back to true, even
in case of error.  Perhaps indeed the RAII can just flip back
commit_resumed_state to true (like the defer commit flag is today) and
the call to the commit_resumed can be made by hand after the scope.

>> The nested case is handled by having a
>> "nesting" counter: only when the counter goes back to 0 is
>> commit_resumed_state turned back on.
> 
> It wasn't obvious to me from the description why do we need both commit_resumed_state
> and a counter.  As in, wouldn't just the counter work?  Like, if the count is 0,
> the state is on, if >0, it is off.  

The counter just counts how many scoped_disable_commit_resumed
instances are active right now, up the stack, to make sure only
the outermost actually tries to re-enable commit-resumed.  This
is explained (perhaps poorly) in the comment in infrun.h:

   In addition, track creation of nested scoped_disable_commit_resumed objects,
   for cases like this:

     void
     inner_func ()
     {
       scoped_disable_commit_resumed disable;
       // do stuff
     }

     void
     outer_func ()
     {
       scoped_disable_commit_resumed disable;

       for (... each thread ...)
         inner_func ();
     }

   In this case, we don't want the `disable` in `inner_func` to require targets
   to commit resumed threads in its destructor.  */


When a scoped_disable_commit_resumed gets destroyed and
the counter goes down to 0, it means it's the outermost
instance, and that means it will try to re-enable
commit-resumed on all targets.  But that doesn't mean
that commit-resumed will be re-enabled on all targets:
if a target has a pending status, commit-resumed will
stay disabled.

The counter basically just replaces the scoped_restore on
defer_process_target_commit_resume.  Using a counter,
where each object decrements the counter on destruction,
is a slightly more general solution to this problem than
using scoped_restore, where each object restores the
value it saw when it got constructed.

The scoped_restore solution works when the lifetime of
all instances of the object are perfectly nested.  The
counter solution works when they are not.  But since our
scoped_disable_commit_resumed objects are all stack
allocated, their lifetimes should be perfectly nested,
so just using a scoped_restore should work.  I'll try to
replace the counter with just a boolean and a
scoped_restore, as we have now, that may simplify things
a bit.

But hopefully that clarifies why commit_resumed_state
and that counter thing are not the same thing.

> Can different targets ever have different commit resumed states?
> The only spot I see that tweaks the flag outside of the scoped object,
> is record-full.c, but I think that's only to avoid hitting the assertion?
> Do you plan on adding more spots that would override the state even if
> a scoped_disable_commit_resumed object is live?

No, I don't plan to add more such spots.  The record one is just an
annoying exception.

And yes, different targets can have different commit resumed states.

When a scoped_disable_commit_resumed object is created, it disables
commit-resumed for all targets.  When it is destructed, commit-resumed
is conditionally re-enabled for targets which have resumed threads
(from the point of view of infrun) and no pending status.

So if you have two targets with threads resumed, one of which has a
pending status, then that one will have commit-resumed off and the
other one will have commit-resumed on.  Or if you have two targets,
all threads stopped, and single step one thread.  Only the target
with the single-stepped thread will have its commit-resumed state
momentarily turned on.

It would be possible to implement different schemes, like a single
commit-resumed state for all targets.  If one target somewhere has
a pending status, we don't commit-resumed anybody.  But it seemed
to me like having things a bit more granular from the start would
help in the long run.

>> On destruction, commit-resumed is not re-enabled for a given target if:
>>
>>  1. this target has no threads resumed, or
>>  2. this target at least one thread with a pending status known to the
>>     core (saved in thread_info::suspend::waitstatus).
> 
> Should also check whether the thread with the pending status is resumed.
> /me reads patch, oh, did you that.  Good.  Please mention it here:
> ... one resumed thread ...

Will do.  I did not add this check from the start, I only realized it
was needed when debugging some testsuite regression.

>> The first point is not technically necessary, because a proper
>> commit_resumed implementation would be a no-op if the target has no
>> resumed threads.  But since we have a flag do to a quick check, I think
>> it doesn't hurt.
>>
>> The second point is more important: together with the
>> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
>> makes it so the "Resuming with pending events" described above is
>> handled efficiently.  Here's what happens in that case:
>>
>>  1. The user types "continue".
>>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>>     function does not enable commit-resumed, as it sees other threads
>>     have pending statuses.
>>  3. fetch_inferior_event is called to handle another event, one thread
>>     is resumed.  Because there are still more threads with pending
>>     statuses, the destructor of scoped_disable_commit_resumed in
>>     fetch_inferior_event still doesn't enable commit-resumed.
>>  4. Rinse and repeat step 3, until the last pending status is handled by
>>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>>     destructor sees there are no more threads with pending statues, so
>>     it asks the target to commit resumed threads.
>>
>> This allows us to avoid all unnecessary back and forths, there is a
>> single commit_resumed call.
>>
>> This change required remote_target::remote_stop_ns to learn how to
>> handle stopping threads that were resumed but pending vCont.  The
>> simplest example where that happens is when using the remote target in
>> all-stop, but with "maint set target-non-stop on", to force it to
>> operate in non-stop mode under the hood.  If two threads hit a
>> breakpoint at the same time, GDB will receive two stop replies.  It will
>> present the stop for one thread and save the other one in
>> thread_info::suspend::waitstatus.
>>
>> Before this patch, when doing "continue", GDB first resumes the thread
>> without a pending status:
>>
>>     Sending packet: $vCont;c:p172651.172676#f3
>>
>> It then consumes the pending status in the next fetch_inferior_event
>> call:
>>
>>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>>     [infrun] target_wait (-1.0.0, status) =
>>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
>>
>> It then realizes it needs to stop all threads to present the stop, so
>> stops the thread it just resumed:
>>
>>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>>     remote_stop called
>>     Sending packet: $vCont;t:p172651.172676#04
>>
>> This is an unnecessary resume/stop.  With this patch, we don't commit
>> resumed threads after proceeding, because of the pending status:
>>
>>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
>>
>> When GDB handles the pending status and stop_all_threads runs, we stop a
>> resumed but pending vCont thread:
>>
>>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
>>
>> That thread was never actually resumed on the remote stub / gdbserver.
>> This is why remote_stop_ns needed to learn this new trick of enqueueing
>> phony stop replies.
>>
>> Note that this patch only considers pending statuses known to the core
>> of GDB, that is the events that were pulled out of the target and stored
>> in `thread_info::suspend::waitstatus`.  In some cases, we could also
>> avoid unnecessary back and forth when the target has events that it has
>> not yet reported the core.  I plan to implement this as a subsequent
>> patch, once this series has settled.
>>
>> gdb/ChangeLog:
>>
>> 	* infrun.h (struct scoped_disable_commit_resumed): New.
>> 	* infrun.c (do_target_resume): Remove
>> 	maybe_commit_resume_process_target call.
>> 	(maybe_commit_resume_all_process_targets): Rename to...
>> 	(maybe_commit_resumed_all_process_targets): ... this.  Skip
>> 	targets that have no executing threads or resumed threads with
>> 	a pending status.
>> 	(scoped_disable_commit_resumed_depth): New.
>> 	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
>> 	New.
>> 	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
>> 	New.
>> 	(proceed): Use scoped_disable_commit_resumed.
>> 	(fetch_inferior_event): Use scoped_disable_commit_resumed.
>> 	* process-stratum-target.h (class process_stratum_target):
>> 	<commit_resume>: Rename to...
>> 	<commit_resumed>: ... this.
>> 	<commit_resumed_state>: New.
>> 	(all_process_targets): New.
>> 	(maybe_commit_resume_process_target): Remove.
>> 	(make_scoped_defer_process_target_commit_resume): Remove.
>> 	* process-stratum-target.c (all_process_targets): New.
>> 	(defer_process_target_commit_resume): Remove.
>> 	(maybe_commit_resume_process_target): Remove.
>> 	(make_scoped_defer_process_target_commit_resume): Remove.
>> 	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
>> 	(linux_nat_target::wait): Add gdb_assert.
>> 	(linux_nat_target::stop): Add gdb_assert.
>> 	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
>> 	(attach_command): Use scoped_disable_commit_resumed.
>> 	(detach_command): Use scoped_disable_commit_resumed.
>> 	(interrupt_target_1): Use scoped_disable_commit_resumed.
>> 	* mi/mi-main.c (exec_continue): Use
>> 	scoped_disable_commit_resumed.
>> 	* record-full.c (record_full_wait_1): Change
>> 	commit_resumed_state around calling commit_resumed.
>> 	* remote.c (class remote_target) <commit_resume>: Rename to...
>> 	<commit_resumed>: ... this.
>> 	(remote_target::resume): Add gdb_assert.
>> 	(remote_target::commit_resume): Rename to...
>> 	(remote_target::commit_resumed): ... this.  Check if there is
>> 	any thread pending vCont resume.
>> 	(struct stop_reply): Move up.
>> 	(remote_target::remote_stop_ns): Generate stop replies for
>> 	resumed but pending vCont threads.
>> 	(remote_target::wait_ns): Add gdb_assert.
>>
>> [1] https://github.com/ROCm-Developer-Tools/ROCgdb/
>> [2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
>>
>> Change-Id: I836135531a29214b21695736deb0a81acf8cf566
>> ---
>>  gdb/infcmd.c                 |   8 +++
>>  gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
>>  gdb/infrun.h                 |  41 +++++++++++++
>>  gdb/linux-nat.c              |   5 ++
>>  gdb/mi/mi-main.c             |   2 +
>>  gdb/process-stratum-target.c |  37 +++++------
>>  gdb/process-stratum-target.h |  63 +++++++++++--------
>>  gdb/record-full.c            |   4 +-
>>  gdb/remote.c                 | 111 +++++++++++++++++++++++----------
>>  9 files changed, 292 insertions(+), 95 deletions(-)
>>
>> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
>> index 6f0ed952de67..b7595e42e265 100644
>> --- a/gdb/infcmd.c
>> +++ b/gdb/infcmd.c
>> @@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
>>        uiout->flush ();
>>      }
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("running");
>> +
>>    /* We call get_inferior_args() because we might need to compute
>>       the value now.  */
>>    run_target->create_inferior (exec_file,
>> @@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
>>    if (non_stop && !attach_target->supports_non_stop ())
>>      error (_("Cannot attach to this target in non-stop mode"));
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
>> +
>>    attach_target->attach (args, from_tty);
>>    /* to_attach should push the target, so after this point we
>>       shouldn't refer to attach_target again.  */
>> @@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
>>    if (inferior_ptid == null_ptid)
>>      error (_("The program is not being run."));
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
>> +
> 
> This one looks incorrect -- target_detach -> prepare_for_detach
> may need to finish off displaced steps, and resume the target
> in the process.  This here will inhibit it.  I have some WIP patches
> that will stop prepare_for_detach from doing that though, so it'll
> end up being correct after.

Ok, I will re-check that.

> 
>>    query_if_trace_running (from_tty);
>>  
>>    disconnect_tracing ();
>> @@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
>>  void
>>  interrupt_target_1 (bool all_threads)
>>  {
>> +  scoped_disable_commit_resumed inhibit ("interrupting");
>> +
>>    if (non_stop)
>>      {
>>        if (all_threads)
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 1a27af51b7e9..92a1102cb595 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>>  
>>    target_resume (resume_ptid, step, sig);
>>  
>> -  maybe_commit_resume_process_target (tp->inf->process_target ());
>> -
>>    if (target_can_async_p ())
>>      target_async (1);
>>  }
>> @@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
>>  					    execution_direction)));
>>  }
>>  
>> -/* Calls maybe_commit_resume_process_target on all process targets.  */
>> +/* Maybe require all process stratum targets to commit their resumed threads.
>> +
>> +   A specific process stratum target is not required to do so if:
>> +
>> +   - it has no resumed threads
>> +   - it has a thread with a pending status  */
>>  
>>  static void
>> -maybe_commit_resume_all_process_targets ()
>> +maybe_commit_resumed_all_process_targets ()
>>  {
>> -  scoped_restore_current_thread restore_thread;
>> +  /* This is an optional to avoid unnecessary thread switches. */
> 
> Missing double space after period.
> 
> But, just scoped_restore_current_thread itself doesn't switch the
> thread.  Is this trying to save something else?  It seems pointless
> to me offhand.

IIRC that was to avoid some regressions in annotation tests, where
we would suddenly generate some additional "registers changed" events
or something like that after the prompt.

> 
>> +  gdb::optional<scoped_restore_current_thread> restore_thread;
>>  
>>    for (process_stratum_target *target : all_non_exited_process_targets ())
>>      {
>> +      gdb_assert (!target->commit_resumed_state);
> 
> Not sure I understand this assertion.  Isn't this another thing
> showing that the per-target state isn't really necessary, and we
> could just use the global state?

I don't think so.  maybe_commit_resumed_all_process_targets is called
when destructing the last / outermost scoped_disable_commit_resumed
object.  When constructing that scoped_disable_commit_resumed object,
we explicitly set commit_resumed_state for all targets to false, to
disable commit-resumed for all targets for the duration of the scope.
So this verifies that it's still the case.  We don't expect any other
inner code to change to change that value.  Or if some code does
(like the record-full target's wait method), then it must make sure
to turn it back to false.

After maybe_commit_resumed_all_process_targets has ran, the targets
that have threads resumed and no pending status will have their
commit-resumed state turned back to true, while other targets will
have theirs still false.

>> diff --git a/gdb/infrun.h b/gdb/infrun.h
>> index 7160b60f1368..5c32c0c97f6e 100644
>> --- a/gdb/infrun.h
>> +++ b/gdb/infrun.h
> 
>> +
>> +struct scoped_disable_commit_resumed
>> +{
>> +  scoped_disable_commit_resumed (const char *reason);
> 
> explicit

Fixed.

> 
>> index 1436a550ac04..9877f0d81931 100644
>> --- a/gdb/process-stratum-target.c
>> +++ b/gdb/process-stratum-target.c
>> @@ -99,6 +99,20 @@ all_non_exited_process_targets ()
>>  
>>  /* See process-stratum-target.h.  */
>>  
>> +std::set<process_stratum_target *>
>> +all_process_targets ()
>> +{
>> +  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
>> +  std::set<process_stratum_target *> targets;
>> +  for (inferior *inf : all_inferiors ())
>> +    if (inf->process_target () != nullptr)
>> +      targets.insert (inf->process_target ());
>> +
>> +  return targets;
>> +}
> 
> An alternative that would avoid creating this temporary std::set
> (along with its internal heap allocations) on every call would be to expose
> target-connection.c:process_targets.

That would make sense.

>> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>>  
>>    /* The connection number.  Visible in "info connections".  */
>>    int connection_number = 0;
>> +
>> +  /* Whether resumed threads must be committed to the target.
>> +
>> +     When true, resumed threads must be committed to the execution target.
>> +
>> +     When false, the process stratum target may leave resumed threads stopped
>> +     when it's convenient or efficient to do so.  When the core requires resumed
>> +     threads to be committed again, this is set back to true and calls the
>> +     `commit_resumed` method to allow the target to do so.
>> +
>> +     To simplify the implementation of process stratum targets, the following
>> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
>> +     false:
>> +
>> +       - resume
>> +       - stop
>> +       - wait
> 
> Should we mention this in the documentation of each of these methods?

Yeah that would be nice.  Would you mention it in both places
or just in those methods' documentation?

>> @@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
>>  	  continue;
>>  	}
>>  
>> +      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
>> +	any_pending_vcont_resume = true;
>> +
>>        /* If a thread is the parent of an unfollowed fork, then we
>>  	 can't do a global wildcard, as that would resume the fork
>>  	 child.  */
>> @@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
>>  	may_global_wildcard_vcont = 0;
>>      }
>>  
>> +  /* We didn't have any resumed thread pending a vCont resume, so nothing to
>> +     do.  */
>> +  if (!any_pending_vcont_resume)
>> +    return;
> 
> Is this just an optimization you noticed, or something more related to
> this patch?

Damn, I knew you would ask :P.  I honestly can't remember.  I think
it's just an obvious-ish optimization.  With the scoped_disable_commit_resumed
in fetch_inferior_event, I am under the impression that we potentially call
commit_resumed more often when nothing actually requires commit-resuming.

Let's say you debug a remote program and a native program (so, one remote
target and the native target), both are running.  When the native target
generates an event, we will end up calling commit_resumed on the remote
target, although nothing needs to be done.  So an early exit sounds
beneficial to avoid the extra work.  But I think it's not necessary, the
remote target's commit_resumed function would otherwise do the right thing
(send nothing) if nothing needs to be done.

That means I could keep this change for later or make a preparatory patch
for it.

And to further optimize things (to avoid iterating on all of the target's
threads), we could maintain a flag in the target that indicates whether
any thread is in the RESUMED_PENDING_VCONT state.  If that flag is false,
we can early-return without doing any work.

Simon