[PATCH] gdb: better handling of 'S' packets

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [PATCH] gdb: better handling of 'S' packets
@ 2020-11-11 15:35 Andrew Burgess
  2020-12-10 16:29 ` Andrew Burgess
  2021-01-08  0:51 ` [PATCH] " Pedro Alves
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Burgess @ 2020-11-11 15:35 UTC (permalink / raw)
  To: gdb-patches

This commit builds on work started in the following two commits:

  commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
  Date:   Thu Jan 30 14:35:40 2020 +0000

      gdb/remote: Restore support for 'S' stop reply packet

  commit cada5fc921e39a1945c422eea055c8b326d8d353
  Date:   Wed Mar 11 12:30:13 2020 +0000

      gdb: Handle W and X remote packets without giving a warning

This is related to how GDB handles remote targets that send back 'S'
packets.

In the first of the above commits we fixed GDB's ability to handle a
single process, single threaded target that sends back 'S' packets.
Although the 'T' packets would always be preferred to 'S' these days,
there's nothing really wrong with 'S' for this situation.

The second commit above fixed an oversight in the first commit, a
single-process, multi-threaded target can send back a process wide
event, for example the process exited event 'W' without including a
process-id, this also is fine as there is no ambiguity in this case.

In PR gdb/26819 however we start to move towards "better" handling of
more ambiguous cases.  In this bug openocd is used to drive the spike
RISC-V simulator.  In this particular case a multi-core system is
being simulated and presented to GDB as two threads.  GDB then is
seeing a single process, two thread system.  Unfortunately the
target (openocd) is still sending back 'S' packets, these are the
packets that _don't_ include a thread-id.

It is my opinion that this target, in this particular configuration,
is broken.  Even though it is possible, by being very careful with how
GDB is configured to ensure that GDB only ever tries to run one thread
at a time, I feel that any target that presents multiple threads to
GDB should be making use of the 'T' stop packet, combined with sending
a thread-id.

However, with that caveat out of the way, I think this bug report does
reveal a number of ways that GDB could be improved.

Firstly, the main issue reported in the bug was that GDB would exit
with this assertion:

  infrun.c:5690: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

I think it's fair to say that having a target send back 'S' packets
when it should use 'T' is _not_ an excuse for GDB to throw an
assertion.

What's happening is that GDB connects to the 2 core system.  Core 2 is
selected, and a program loaded.  A breakpoint is placed in main and we
continue, this results in this packet exchange:

 Sending packet: $vCont;c#a8...Packet received: T05thread:2;

That's good, all cores ran, and the remote told GDB that we stopped in
thread 2.  Next the user does `stepi` and this results in this packet
exchange:

  Sending packet: $vCont;s:2#24...Packet received: S05

Here GDB is trying to step only thread 2, and the target replies with
an 'S' packet.  Though it feels like sending back 'T05thread:2;' would
be so much simpler here, there's nothing fundamentally wrong ambiguous
about the exchange.

Inside GDB the problem we're running into is within the function
remote_target::process_stop_reply.  When a stop reply doesn't include
a thread-id (or process-id) it is this function that is responsible
for looking one up.  Currently we always just select the first
non-exited thread, so in this case thread 1.

As the step was issued as part of the step over a breakpoint logic,
which was specifically run for thread-2 GDB is expecting the event to
be reported in thread-2, and hence when we try to handle thread-1 we
trigger the above assertion.

My proposal is to improve the logic used in process_stop_reply to make
the thread selection smarter.

My first thought was that each thread has an 'executing' flag, instead
of picking the first non-exited thread, we should pick a non-exited
thread that is currently executing.  The logic being that stop events
shouldn't arrive for threads that are no executing.

The problem with this is the very first initial connection.

When GDB first connects to the remote target it is told about all the
existing threads.  These are all created by GDB in the non-executing
state.  Another part of the connecting logic is to send the remote the
'?' packet, which asks why the target halted.  This sends back a stop
packet which is then processed.  At this point non of the threads are
marked executing so we would end up with no suitably matching threads.

This left me with two rules:

  1. Select the first non-exited thread that is marked executing, or
  2. If no threads match rule 1, select the first non-exited thread
  whether it is executing or not.

This seemed fine, and certainly resolved the issue seen in the
original bug report.  So then I tried to create a test for this using
a multi-threaded test program and `gdbserver --disable-packet=T`.

I wasn't able to get anything that exactly reproduced the original
bug, but I was able to hit similar issues where GDB would try to step
one thread but GDB would handle the step (from the step) in a
different thread.  In some of these cases there was genuine ambiguity
in the reply from the target, however, it still felt like GDB could do
a better job at guessing which thread to select for the stop event.

I wondered if we could make use of the 'continue_thread' and/or the
'general_thread' to help guide the choice of thread.

In the end I settled on these rules for thread selection:

  [ NOTE: For all the following rules, only non-exited threads are
    considered.  ]

  1. If the continue_thread is set to a single specific thread, and
  that thread is executing, then assume this is where the event
  occurred.

  2. If nothing matches rule 1, then if the general_thread is set to a
  single specific thread, and that thread is executing, assume this is
  where the event occurred.

  3. If nothing matches rule 2 then select the first thread that is
  marked as executing.

  4. If nothing matches rule 3 then select the first thread.

This works fine except for one small problem, when GDB is using the
vcont packets we don't need to send 'Hc' packets to the target and so
the 'continue_thread' is never set.

In this commit I add a new record_continue_thread function, this sets
the continue_thread without sending a 'Hc' packet.  This effectively
serves as a cache for which thread did we set running.

The only slight "wart" here is that when GDB steps a thread the
continue_thread is not set to a specific single thread-id, rather it
gets set to either minus_one_ptid or to a specific processes ptid.  In
this case (when a step is requested) I store the ptid of the stepping
thread.

gdb/ChangeLog:

	PR gdb/26819
	* remote.c (remote_target::guess_thread_for_ambiguous_stop_reply):
	New member function.
	(record_continue_thread): New function.
	(remote_target::remote_resume_with_vcont): Call record_continue_thread.
	(remote_target::process_stop_reply): Call
	guess_thread_for_ambiguous_stop_reply.

gdb/testsuite/ChangeLog:

	PR gdb/26819
	* gdb.server/stop-reply-no-thread-multi.c: New file.
	* gdb.server/stop-reply-no-thread-multi.exp: New file.
---
 gdb/ChangeLog                                 |  10 +
 gdb/remote.c                                  | 261 ++++++++++++++----
 gdb/testsuite/ChangeLog                       |   6 +
 .../gdb.server/stop-reply-no-thread-multi.c   |  77 ++++++
 .../gdb.server/stop-reply-no-thread-multi.exp | 139 ++++++++++
 5 files changed, 441 insertions(+), 52 deletions(-)
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp

diff --git a/gdb/remote.c b/gdb/remote.c
index 71f814efb36..0020a1ee3c5 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
   ptid_t process_stop_reply (struct stop_reply *stop_reply,
 			     target_waitstatus *status);
 
+  ptid_t guess_thread_for_ambiguous_stop_reply
+    (const struct target_waitstatus *status);
+
   void remote_notice_new_inferior (ptid_t currthread, int executing);
 
   void process_initial_stop_replies (int from_tty);
@@ -2576,6 +2579,22 @@ record_currthread (struct remote_state *rs, ptid_t currthread)
   rs->general_thread = currthread;
 }
 
+/* Called from the vcont packet generation code.  Unlike the old thread
+   control packets, which rely on sending a Hc packet before sending the
+   continue/step packet, with vcont no Hc packet is sent.
+
+   As a result the remote state's continue_thread field is never updated.
+
+   Sometime though it can be useful if we do have some information about
+   which thread(s) the vcont tried to continue/step as this can be used to
+   guide the choice of thread in the case were a miss-behaving remote
+   doesn't include a thread-id in its stop packet.  */
+static void
+record_continue_thread (struct remote_state *rs, ptid_t thr)
+{
+  rs->continue_thread = thr;
+}
+
 /* If 'QPassSignals' is supported, tell the remote stub what signals
    it can simply pass through to the inferior without reporting.  */
 
@@ -6227,6 +6246,8 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
   char *p;
   char *endp;
 
+  record_continue_thread (get_remote_state (), ptid);
+
   /* No reverse execution actions defined for vCont.  */
   if (::execution_direction == EXEC_REVERSE)
     return 0;
@@ -6264,6 +6285,7 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
 	{
 	  /* Step inferior_ptid, with or without signal.  */
 	  p = append_resumption (p, endp, inferior_ptid, step, siggnal);
+	  record_continue_thread (get_remote_state (), inferior_ptid);
 	}
 
       /* Also pass down any pending signaled resumption for other
@@ -7671,6 +7693,191 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
   remote->remote_notif_get_pending_events (nc);
 }
 
+/* Called from process_stop_reply when the stop packet we are responding
+   too didn't include a process-id or thread-id.  STATUS is the stop event
+   we are responding too.
+
+   It is the task of this function to find (guess) a suitable thread and
+   return its ptid, this is the thread we will assume the stop event came
+   from.
+
+   In some cases there really isn't any guessing going on, a basic remote
+   with a single process containing a single thread might choose not to
+   send any process-id or thread-id in its stop packets, this function will
+   select and return the one and only thread.
+
+   However, there are targets out there which are.... not great, and in
+   some cases will support multiple threads but still don't include a
+   thread-id.  In these cases we try to do the best we can when selecting a
+   thread, but in the general case we can never know for sure we have
+   picked the correct thread.  As a result this function can issue a
+   warning to the user if it detects that there is the possibility that we
+   really are guessing at which thread to report.  */
+
+ptid_t
+remote_target::guess_thread_for_ambiguous_stop_reply
+	(const struct target_waitstatus *status)
+{
+  /* Some stop events apply to all threads in an inferior, while others
+     only apply to a single thread.  */
+  bool is_stop_for_all_threads
+    = (status->kind == TARGET_WAITKIND_EXITED
+       || status->kind == TARGET_WAITKIND_SIGNALLED);
+
+  struct remote_state *rs = get_remote_state ();
+
+  /* Track the possible threads in this structure.  */
+  struct thread_choices
+  {
+    /* Constructor.  */
+    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
+      : m_rs (rs),
+	m_is_stop_for_all_threads (is_stop_for_all_threads)
+    { /* Nothing.  */ }
+
+    /* Disable/delete these.  */
+    thread_choices () = delete;
+    DISABLE_COPY_AND_ASSIGN (thread_choices);
+
+    /* Consider thread THR setting the internal thread tracking variables
+       as appropriate.  */
+    void consider_thread (thread_info *thr)
+    {
+      /* Record this as the first thread, or mark that we have multiple
+	 possible threads.  We set the m_multiple flag even if there is
+	 only one thread executing.  This means we possibly issue warnings
+	 to the user when there is no ambiguity... but there's really no
+	 reason why the remote target couldn't include a thread-id so it
+	 doesn't seem to bad to point this out.  */
+      if (m_first_thread == nullptr)
+	m_first_thread = thr;
+      else if (!m_is_stop_for_all_threads
+	       || m_first_thread->ptid.pid () != thr->ptid.pid ())
+	m_multiple = true;
+
+      /* If this is an executing thread then it might be a more appropriate
+	 match than just picking the first non-exited thread.  */
+      if (thr->executing)
+	{
+	  /* These are checked and updated in the same order that
+	     best_thread will check them.  This allows us to minimise the
+	     number of ptid comparisons we do here.  */
+	  if (thr->ptid == m_rs->continue_thread)
+	    m_continue_thread = thr;
+	  else if (m_executing_thread == nullptr)
+	    m_executing_thread = thr;
+	  else if (thr->ptid == m_rs->general_thread)
+	    m_general_thread = thr;
+	}
+    }
+
+    /* Return a pointer to the best possible thread.  */
+    thread_info *best_thread () const
+    {
+      /* Best is a thread that was explicitly told to continue or step.
+	 This will only contain a match if the remote state's continue
+	 thread holds an exact thread-id (so not something like
+	 minus_one_ptid).  */
+      thread_info *thr = m_continue_thread;
+      /* If the continue thread didn't contain a match then check the
+	 general thread.  As with the continue thread we will only find a
+	 match here if the remote state's general thread is set to a
+	 specific thread-id.  This ensures GDB is more likely to report
+	 events as occurring in the currently selected thread.  */
+      if (thr == nullptr)
+	thr = m_general_thread;
+      /* If neither of the above helped then look for the first executing
+	 thread.  If through careful adjustment of GDB's options only a
+	 single thread was set running then this should give us the correct
+	 thread.  */
+      if (thr == nullptr)
+	thr = m_executing_thread;
+      /* This final case should only be needed during the initial attach to
+	 a remote target.  At this point all threads are in a non-executing
+	 state, but we still get a stop packet that we process.  In this
+	 case we just report the event against the very first thread.  */
+      if (thr == nullptr)
+	thr = m_first_thread;
+      return thr;
+    }
+
+    /* Return true if there were multiple possible thread/processes and we
+       had to just pick one.  This indicates that a warning probably should
+       be issued to the user.  */
+    bool multiple_possible_threads_p () const
+    { return m_multiple; }
+
+  private:
+
+    /* The remote state we are examining threads for.  */
+    struct remote_state *m_rs = nullptr;
+
+    /* Is this stop event one for all threads in a process (e.g.  process
+       exited), or an event for a single thread (e.g. thread stopped).  */
+    bool m_is_stop_for_all_threads;
+
+    /* A thread matching the continue_thread within M_RS.  */
+    thread_info *m_continue_thread = nullptr;
+
+    /* A thread matching the general_thread within M_RS.  */
+    thread_info *m_general_thread = nullptr;
+
+    /* The first thread whose executing flag is true.  */
+    thread_info *m_executing_thread = nullptr;
+
+    /* The first non-exited thread.  */
+    thread_info *m_first_thread = nullptr;
+
+    /* Is set true if we have multiple threads or processes that could
+       have matched and we should give a warning to the user to indicate
+       that their remote target is not being helpful.  */
+    bool m_multiple = false;
+  } choices (rs, is_stop_for_all_threads);
+
+  /* Consider all non-exited threads to see which is the best match.  */
+  for (thread_info *thr : all_non_exited_threads (this))
+    choices.consider_thread (thr);
+
+  /* Select the best possible thread.  */
+  thread_info *thr = choices.best_thread ();
+  gdb_assert (thr != nullptr);
+
+  /* Warn if the remote target is sending ambiguous stop replies.  */
+  if (choices.multiple_possible_threads_p ())
+    {
+      static bool warned = false;
+
+      if (!warned)
+	{
+	  /* If you are seeing this warning then the remote target has
+	     stopped without specifying a thread-id, but the target
+	     does have multiple threads (or inferiors), and so GDB is
+	     having to guess which thread stopped.
+
+	     Examples of what might cause this are the target sending
+	     and 'S' stop packet, or a 'T' stop packet and not
+	     including a thread-id.
+
+	     Additionally, the target might send a 'W' or 'X packet
+	     without including a process-id, when the target has
+	     multiple running inferiors.  */
+	  if (is_stop_for_all_threads)
+	    warning (_("multi-inferior target stopped without "
+		       "sending a process-id, using first "
+		       "non-exited inferior"));
+	  else
+	    warning (_("multi-threaded target stopped without "
+		       "sending a thread-id, using first "
+		       "non-exited thread"));
+	  warned = true;
+	}
+    }
+
+  /* If this is a stop for all threads then don't use a particular threads
+     ptid, instead create a new ptid where only the pid field is set.  */
+  return ((is_stop_for_all_threads) ? ptid_t (thr->ptid.pid ()) : thr->ptid);
+}
+
 /* Called when it is decided that STOP_REPLY holds the info of the
    event that is to be returned to the core.  This function always
    destroys STOP_REPLY.  */
@@ -7687,58 +7894,8 @@ remote_target::process_stop_reply (struct stop_reply *stop_reply,
   /* If no thread/process was reported by the stub then use the first
      non-exited thread in the current target.  */
   if (ptid == null_ptid)
-    {
-      /* Some stop events apply to all threads in an inferior, while others
-	 only apply to a single thread.  */
-      bool is_stop_for_all_threads
-	= (status->kind == TARGET_WAITKIND_EXITED
-	   || status->kind == TARGET_WAITKIND_SIGNALLED);
-
-      for (thread_info *thr : all_non_exited_threads (this))
-	{
-	  if (ptid != null_ptid
-	      && (!is_stop_for_all_threads
-		  || ptid.pid () != thr->ptid.pid ()))
-	    {
-	      static bool warned = false;
-
-	      if (!warned)
-		{
-		  /* If you are seeing this warning then the remote target
-		     has stopped without specifying a thread-id, but the
-		     target does have multiple threads (or inferiors), and
-		     so GDB is having to guess which thread stopped.
-
-		     Examples of what might cause this are the target
-		     sending and 'S' stop packet, or a 'T' stop packet and
-		     not including a thread-id.
-
-		     Additionally, the target might send a 'W' or 'X
-		     packet without including a process-id, when the target
-		     has multiple running inferiors.  */
-		  if (is_stop_for_all_threads)
-		    warning (_("multi-inferior target stopped without "
-			       "sending a process-id, using first "
-			       "non-exited inferior"));
-		  else
-		    warning (_("multi-threaded target stopped without "
-			       "sending a thread-id, using first "
-			       "non-exited thread"));
-		  warned = true;
-		}
-	      break;
-	    }
-
-	  /* If this is a stop for all threads then don't use a particular
-	     threads ptid, instead create a new ptid where only the pid
-	     field is set.  */
-	  if (is_stop_for_all_threads)
-	    ptid = ptid_t (thr->ptid.pid ());
-	  else
-	    ptid = thr->ptid;
-	}
-      gdb_assert (ptid != null_ptid);
-    }
+    ptid = guess_thread_for_ambiguous_stop_reply (status);
+  gdb_assert (ptid != null_ptid);
 
   if (status->kind != TARGET_WAITKIND_EXITED
       && status->kind != TARGET_WAITKIND_SIGNALLED
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
new file mode 100644
index 00000000000..f0d86bd13c1
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
@@ -0,0 +1,77 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include <unistd.h>
+
+volatile int worker_blocked = 1;
+volatile int main_blocked = 1;
+
+void
+unlock_worker ()
+{
+  worker_blocked = 0;
+}
+
+void
+unlock_main ()
+{
+  main_blocked = 0;
+}
+
+void
+breakpt ()
+{
+  /* Nothing.  */
+}
+
+static void*
+worker (void *data)
+{
+  unlock_main ();
+
+  while (worker_blocked)
+    ;
+
+  breakpt ();
+
+  return NULL;
+}
+
+int
+main ()
+{
+  pthread_t thr;
+  void *retval;
+
+  /* Ensure the test doesn't run forever.  */
+  alarm (99);
+
+  if (pthread_create (&thr, NULL, worker, NULL) != 0)
+    abort ();
+
+  while (main_blocked)
+    ;
+
+  unlock_worker ();
+
+  if (pthread_join (thr, &retval) != 0)
+    abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
new file mode 100644
index 00000000000..b4ab03471e8
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
@@ -0,0 +1,139 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2020 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test how GDB handles the case where a target either doesn't use 'T'
+# packets at all or doesn't include a thread-id in a 'T' packet, AND,
+# where the test program contains multiple threads.
+#
+# In general this is a broken situation and GDB can never do the
+# "right" thing is all cases.  If two threads are running and when a
+# stop occurs, the remote does not tell GDB which thread stopped, then
+# GDB can never be sure it has attributed the stop to the correct
+# thread.
+#
+# However, we can ensure some reasonably sane default behaviours which
+# can make some broken targets appear a little less broken.
+
+load_lib gdbserver-support.exp
+
+if { [skip_gdbserver_tests] } {
+    verbose "skipping gdbserver tests"
+    return -1
+}
+
+standard_testfile
+if [prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] {
+    return -1
+}
+
+# Run the tests with different features of GDBserver disabled.
+proc run_test { disable_feature } {
+    global binfile gdb_prompt decimal hex
+
+    clean_restart ${binfile}
+
+    # Make sure we're disconnected, in case we're testing with an
+    # extended-remote board, therefore already connected.
+    gdb_test "disconnect" ".*"
+
+    set packet_arg ""
+    if { $disable_feature != "" } {
+	set packet_arg "--disable-packet=${disable_feature}"
+    }
+    set res [gdbserver_start $packet_arg $binfile]
+    set gdbserver_protocol [lindex $res 0]
+    set gdbserver_gdbport [lindex $res 1]
+
+    # Disable XML-based thread listing, and multi-process extensions.
+    gdb_test_no_output "set remote threads-packet off"
+    gdb_test_no_output "set remote multiprocess-feature-packet off"
+
+    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
+    if ![gdb_assert {$res == 0} "connect"] {
+	return
+    }
+
+    # There should be only one thread listed at this point.
+    gdb_test_multiple "info threads" "" {
+	-re "2 Thread.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "has terminated.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
+	    pass $gdb_test_name
+	}
+    }
+
+    gdb_breakpoint "unlock_worker"
+    gdb_continue_to_breakpoint "run to unlock_worker"
+
+    # There should be two threads at this point with thread 1 selected.
+    gdb_test "info threads" \
+	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should now exist"
+
+    # Switch threads.
+    gdb_test "thread 2" ".*" "switch to second thread"
+
+    # Single step.  This will set all threads running but as there's
+    # no reason for the first thread to report a stop we expect to
+    # finish the step with thread 2 still selected.
+    gdb_test_multiple "stepi" "" {
+	-re "Thread 1 received signal SIGTRAP" {
+	    fail $gdb_test_name
+	}
+	-re "$hex.*$decimal.*while \\(worker_blocked\\).*$gdb_prompt" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Double check that thread 2 is still selected.
+    gdb_test "info threads" \
+	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should still be selected after stepi"
+
+    # Now "continue" thread 2.  Again there's no reason for thread 1
+    # to report a stop so we should finish with thread 2 still
+    # selected.
+    gdb_breakpoint "breakpt"
+    gdb_continue_to_breakpoint "run to breakpt"
+
+    # Again, double check that thread 2 is still selected.
+    gdb_test "info threads" \
+	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should still be selected at breakpt"
+
+    # Continue until exit.  The server sends a 'W' with no PID.
+    # Bad GDB gave an error like below when target is nonstop:
+    #  (gdb) c
+    #  Continuing.
+    #  No process or thread specified in stop reply: W00
+    gdb_continue_to_end "" continue 1
+}
+
+# Disable different features within gdbserver:
+#
+# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
+#          emulating old gdbservers when debugging single-threaded programs.
+#
+# T: Start GDBserver with the entire 'T' stop reply packet disabled,
+#    GDBserver will instead send the 'S' stop reply.
+foreach_with_prefix to_disable { "" Tthread T } {
+    run_test $to_disable
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: better handling of 'S' packets
  2020-11-11 15:35 [PATCH] gdb: better handling of 'S' packets Andrew Burgess
@ 2020-12-10 16:29 ` Andrew Burgess
  2020-12-23 23:09   ` [PATCHv2] " Andrew Burgess
  2021-01-08  0:51 ` [PATCH] " Pedro Alves
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2020-12-10 16:29 UTC (permalink / raw)
  To: gdb-patches

Ping!

I could potentially split this patch into two, one that does the
minimum required to address PR gdb/26819, and a follow up to add the
more (possibly controversial) ambiguous packet handling.

I'm still hopeful someone might take a look through and give an
opinion on this as it though...

Thanks,
Andrew



* Andrew Burgess <andrew.burgess@embecosm.com> [2020-11-11 15:35:48 +0000]:

> This commit builds on work started in the following two commits:
> 
>   commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
>   Date:   Thu Jan 30 14:35:40 2020 +0000
> 
>       gdb/remote: Restore support for 'S' stop reply packet
> 
>   commit cada5fc921e39a1945c422eea055c8b326d8d353
>   Date:   Wed Mar 11 12:30:13 2020 +0000
> 
>       gdb: Handle W and X remote packets without giving a warning
> 
> This is related to how GDB handles remote targets that send back 'S'
> packets.
> 
> In the first of the above commits we fixed GDB's ability to handle a
> single process, single threaded target that sends back 'S' packets.
> Although the 'T' packets would always be preferred to 'S' these days,
> there's nothing really wrong with 'S' for this situation.
> 
> The second commit above fixed an oversight in the first commit, a
> single-process, multi-threaded target can send back a process wide
> event, for example the process exited event 'W' without including a
> process-id, this also is fine as there is no ambiguity in this case.
> 
> In PR gdb/26819 however we start to move towards "better" handling of
> more ambiguous cases.  In this bug openocd is used to drive the spike
> RISC-V simulator.  In this particular case a multi-core system is
> being simulated and presented to GDB as two threads.  GDB then is
> seeing a single process, two thread system.  Unfortunately the
> target (openocd) is still sending back 'S' packets, these are the
> packets that _don't_ include a thread-id.
> 
> It is my opinion that this target, in this particular configuration,
> is broken.  Even though it is possible, by being very careful with how
> GDB is configured to ensure that GDB only ever tries to run one thread
> at a time, I feel that any target that presents multiple threads to
> GDB should be making use of the 'T' stop packet, combined with sending
> a thread-id.
> 
> However, with that caveat out of the way, I think this bug report does
> reveal a number of ways that GDB could be improved.
> 
> Firstly, the main issue reported in the bug was that GDB would exit
> with this assertion:
> 
>   infrun.c:5690: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> 
> I think it's fair to say that having a target send back 'S' packets
> when it should use 'T' is _not_ an excuse for GDB to throw an
> assertion.
> 
> What's happening is that GDB connects to the 2 core system.  Core 2 is
> selected, and a program loaded.  A breakpoint is placed in main and we
> continue, this results in this packet exchange:
> 
>  Sending packet: $vCont;c#a8...Packet received: T05thread:2;
> 
> That's good, all cores ran, and the remote told GDB that we stopped in
> thread 2.  Next the user does `stepi` and this results in this packet
> exchange:
> 
>   Sending packet: $vCont;s:2#24...Packet received: S05
> 
> Here GDB is trying to step only thread 2, and the target replies with
> an 'S' packet.  Though it feels like sending back 'T05thread:2;' would
> be so much simpler here, there's nothing fundamentally wrong ambiguous
> about the exchange.
> 
> Inside GDB the problem we're running into is within the function
> remote_target::process_stop_reply.  When a stop reply doesn't include
> a thread-id (or process-id) it is this function that is responsible
> for looking one up.  Currently we always just select the first
> non-exited thread, so in this case thread 1.
> 
> As the step was issued as part of the step over a breakpoint logic,
> which was specifically run for thread-2 GDB is expecting the event to
> be reported in thread-2, and hence when we try to handle thread-1 we
> trigger the above assertion.
> 
> My proposal is to improve the logic used in process_stop_reply to make
> the thread selection smarter.
> 
> My first thought was that each thread has an 'executing' flag, instead
> of picking the first non-exited thread, we should pick a non-exited
> thread that is currently executing.  The logic being that stop events
> shouldn't arrive for threads that are no executing.
> 
> The problem with this is the very first initial connection.
> 
> When GDB first connects to the remote target it is told about all the
> existing threads.  These are all created by GDB in the non-executing
> state.  Another part of the connecting logic is to send the remote the
> '?' packet, which asks why the target halted.  This sends back a stop
> packet which is then processed.  At this point non of the threads are
> marked executing so we would end up with no suitably matching threads.
> 
> This left me with two rules:
> 
>   1. Select the first non-exited thread that is marked executing, or
>   2. If no threads match rule 1, select the first non-exited thread
>   whether it is executing or not.
> 
> This seemed fine, and certainly resolved the issue seen in the
> original bug report.  So then I tried to create a test for this using
> a multi-threaded test program and `gdbserver --disable-packet=T`.
> 
> I wasn't able to get anything that exactly reproduced the original
> bug, but I was able to hit similar issues where GDB would try to step
> one thread but GDB would handle the step (from the step) in a
> different thread.  In some of these cases there was genuine ambiguity
> in the reply from the target, however, it still felt like GDB could do
> a better job at guessing which thread to select for the stop event.
> 
> I wondered if we could make use of the 'continue_thread' and/or the
> 'general_thread' to help guide the choice of thread.
> 
> In the end I settled on these rules for thread selection:
> 
>   [ NOTE: For all the following rules, only non-exited threads are
>     considered.  ]
> 
>   1. If the continue_thread is set to a single specific thread, and
>   that thread is executing, then assume this is where the event
>   occurred.
> 
>   2. If nothing matches rule 1, then if the general_thread is set to a
>   single specific thread, and that thread is executing, assume this is
>   where the event occurred.
> 
>   3. If nothing matches rule 2 then select the first thread that is
>   marked as executing.
> 
>   4. If nothing matches rule 3 then select the first thread.
> 
> This works fine except for one small problem, when GDB is using the
> vcont packets we don't need to send 'Hc' packets to the target and so
> the 'continue_thread' is never set.
> 
> In this commit I add a new record_continue_thread function, this sets
> the continue_thread without sending a 'Hc' packet.  This effectively
> serves as a cache for which thread did we set running.
> 
> The only slight "wart" here is that when GDB steps a thread the
> continue_thread is not set to a specific single thread-id, rather it
> gets set to either minus_one_ptid or to a specific processes ptid.  In
> this case (when a step is requested) I store the ptid of the stepping
> thread.
> 
> gdb/ChangeLog:
> 
> 	PR gdb/26819
> 	* remote.c (remote_target::guess_thread_for_ambiguous_stop_reply):
> 	New member function.
> 	(record_continue_thread): New function.
> 	(remote_target::remote_resume_with_vcont): Call record_continue_thread.
> 	(remote_target::process_stop_reply): Call
> 	guess_thread_for_ambiguous_stop_reply.
> 
> gdb/testsuite/ChangeLog:
> 
> 	PR gdb/26819
> 	* gdb.server/stop-reply-no-thread-multi.c: New file.
> 	* gdb.server/stop-reply-no-thread-multi.exp: New file.
> ---
>  gdb/ChangeLog                                 |  10 +
>  gdb/remote.c                                  | 261 ++++++++++++++----
>  gdb/testsuite/ChangeLog                       |   6 +
>  .../gdb.server/stop-reply-no-thread-multi.c   |  77 ++++++
>  .../gdb.server/stop-reply-no-thread-multi.exp | 139 ++++++++++
>  5 files changed, 441 insertions(+), 52 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index 71f814efb36..0020a1ee3c5 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
>    ptid_t process_stop_reply (struct stop_reply *stop_reply,
>  			     target_waitstatus *status);
>  
> +  ptid_t guess_thread_for_ambiguous_stop_reply
> +    (const struct target_waitstatus *status);
> +
>    void remote_notice_new_inferior (ptid_t currthread, int executing);
>  
>    void process_initial_stop_replies (int from_tty);
> @@ -2576,6 +2579,22 @@ record_currthread (struct remote_state *rs, ptid_t currthread)
>    rs->general_thread = currthread;
>  }
>  
> +/* Called from the vcont packet generation code.  Unlike the old thread
> +   control packets, which rely on sending a Hc packet before sending the
> +   continue/step packet, with vcont no Hc packet is sent.
> +
> +   As a result the remote state's continue_thread field is never updated.
> +
> +   Sometime though it can be useful if we do have some information about
> +   which thread(s) the vcont tried to continue/step as this can be used to
> +   guide the choice of thread in the case were a miss-behaving remote
> +   doesn't include a thread-id in its stop packet.  */
> +static void
> +record_continue_thread (struct remote_state *rs, ptid_t thr)
> +{
> +  rs->continue_thread = thr;
> +}
> +
>  /* If 'QPassSignals' is supported, tell the remote stub what signals
>     it can simply pass through to the inferior without reporting.  */
>  
> @@ -6227,6 +6246,8 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
>    char *p;
>    char *endp;
>  
> +  record_continue_thread (get_remote_state (), ptid);
> +
>    /* No reverse execution actions defined for vCont.  */
>    if (::execution_direction == EXEC_REVERSE)
>      return 0;
> @@ -6264,6 +6285,7 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
>  	{
>  	  /* Step inferior_ptid, with or without signal.  */
>  	  p = append_resumption (p, endp, inferior_ptid, step, siggnal);
> +	  record_continue_thread (get_remote_state (), inferior_ptid);
>  	}
>  
>        /* Also pass down any pending signaled resumption for other
> @@ -7671,6 +7693,191 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>    remote->remote_notif_get_pending_events (nc);
>  }
>  
> +/* Called from process_stop_reply when the stop packet we are responding
> +   too didn't include a process-id or thread-id.  STATUS is the stop event
> +   we are responding too.
> +
> +   It is the task of this function to find (guess) a suitable thread and
> +   return its ptid, this is the thread we will assume the stop event came
> +   from.
> +
> +   In some cases there really isn't any guessing going on, a basic remote
> +   with a single process containing a single thread might choose not to
> +   send any process-id or thread-id in its stop packets, this function will
> +   select and return the one and only thread.
> +
> +   However, there are targets out there which are.... not great, and in
> +   some cases will support multiple threads but still don't include a
> +   thread-id.  In these cases we try to do the best we can when selecting a
> +   thread, but in the general case we can never know for sure we have
> +   picked the correct thread.  As a result this function can issue a
> +   warning to the user if it detects that there is the possibility that we
> +   really are guessing at which thread to report.  */
> +
> +ptid_t
> +remote_target::guess_thread_for_ambiguous_stop_reply
> +	(const struct target_waitstatus *status)
> +{
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
> +
> +  struct remote_state *rs = get_remote_state ();
> +
> +  /* Track the possible threads in this structure.  */
> +  struct thread_choices
> +  {
> +    /* Constructor.  */
> +    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
> +      : m_rs (rs),
> +	m_is_stop_for_all_threads (is_stop_for_all_threads)
> +    { /* Nothing.  */ }
> +
> +    /* Disable/delete these.  */
> +    thread_choices () = delete;
> +    DISABLE_COPY_AND_ASSIGN (thread_choices);
> +
> +    /* Consider thread THR setting the internal thread tracking variables
> +       as appropriate.  */
> +    void consider_thread (thread_info *thr)
> +    {
> +      /* Record this as the first thread, or mark that we have multiple
> +	 possible threads.  We set the m_multiple flag even if there is
> +	 only one thread executing.  This means we possibly issue warnings
> +	 to the user when there is no ambiguity... but there's really no
> +	 reason why the remote target couldn't include a thread-id so it
> +	 doesn't seem to bad to point this out.  */
> +      if (m_first_thread == nullptr)
> +	m_first_thread = thr;
> +      else if (!m_is_stop_for_all_threads
> +	       || m_first_thread->ptid.pid () != thr->ptid.pid ())
> +	m_multiple = true;
> +
> +      /* If this is an executing thread then it might be a more appropriate
> +	 match than just picking the first non-exited thread.  */
> +      if (thr->executing)
> +	{
> +	  /* These are checked and updated in the same order that
> +	     best_thread will check them.  This allows us to minimise the
> +	     number of ptid comparisons we do here.  */
> +	  if (thr->ptid == m_rs->continue_thread)
> +	    m_continue_thread = thr;
> +	  else if (m_executing_thread == nullptr)
> +	    m_executing_thread = thr;
> +	  else if (thr->ptid == m_rs->general_thread)
> +	    m_general_thread = thr;
> +	}
> +    }
> +
> +    /* Return a pointer to the best possible thread.  */
> +    thread_info *best_thread () const
> +    {
> +      /* Best is a thread that was explicitly told to continue or step.
> +	 This will only contain a match if the remote state's continue
> +	 thread holds an exact thread-id (so not something like
> +	 minus_one_ptid).  */
> +      thread_info *thr = m_continue_thread;
> +      /* If the continue thread didn't contain a match then check the
> +	 general thread.  As with the continue thread we will only find a
> +	 match here if the remote state's general thread is set to a
> +	 specific thread-id.  This ensures GDB is more likely to report
> +	 events as occurring in the currently selected thread.  */
> +      if (thr == nullptr)
> +	thr = m_general_thread;
> +      /* If neither of the above helped then look for the first executing
> +	 thread.  If through careful adjustment of GDB's options only a
> +	 single thread was set running then this should give us the correct
> +	 thread.  */
> +      if (thr == nullptr)
> +	thr = m_executing_thread;
> +      /* This final case should only be needed during the initial attach to
> +	 a remote target.  At this point all threads are in a non-executing
> +	 state, but we still get a stop packet that we process.  In this
> +	 case we just report the event against the very first thread.  */
> +      if (thr == nullptr)
> +	thr = m_first_thread;
> +      return thr;
> +    }
> +
> +    /* Return true if there were multiple possible thread/processes and we
> +       had to just pick one.  This indicates that a warning probably should
> +       be issued to the user.  */
> +    bool multiple_possible_threads_p () const
> +    { return m_multiple; }
> +
> +  private:
> +
> +    /* The remote state we are examining threads for.  */
> +    struct remote_state *m_rs = nullptr;
> +
> +    /* Is this stop event one for all threads in a process (e.g.  process
> +       exited), or an event for a single thread (e.g. thread stopped).  */
> +    bool m_is_stop_for_all_threads;
> +
> +    /* A thread matching the continue_thread within M_RS.  */
> +    thread_info *m_continue_thread = nullptr;
> +
> +    /* A thread matching the general_thread within M_RS.  */
> +    thread_info *m_general_thread = nullptr;
> +
> +    /* The first thread whose executing flag is true.  */
> +    thread_info *m_executing_thread = nullptr;
> +
> +    /* The first non-exited thread.  */
> +    thread_info *m_first_thread = nullptr;
> +
> +    /* Is set true if we have multiple threads or processes that could
> +       have matched and we should give a warning to the user to indicate
> +       that their remote target is not being helpful.  */
> +    bool m_multiple = false;
> +  } choices (rs, is_stop_for_all_threads);
> +
> +  /* Consider all non-exited threads to see which is the best match.  */
> +  for (thread_info *thr : all_non_exited_threads (this))
> +    choices.consider_thread (thr);
> +
> +  /* Select the best possible thread.  */
> +  thread_info *thr = choices.best_thread ();
> +  gdb_assert (thr != nullptr);
> +
> +  /* Warn if the remote target is sending ambiguous stop replies.  */
> +  if (choices.multiple_possible_threads_p ())
> +    {
> +      static bool warned = false;
> +
> +      if (!warned)
> +	{
> +	  /* If you are seeing this warning then the remote target has
> +	     stopped without specifying a thread-id, but the target
> +	     does have multiple threads (or inferiors), and so GDB is
> +	     having to guess which thread stopped.
> +
> +	     Examples of what might cause this are the target sending
> +	     and 'S' stop packet, or a 'T' stop packet and not
> +	     including a thread-id.
> +
> +	     Additionally, the target might send a 'W' or 'X packet
> +	     without including a process-id, when the target has
> +	     multiple running inferiors.  */
> +	  if (is_stop_for_all_threads)
> +	    warning (_("multi-inferior target stopped without "
> +		       "sending a process-id, using first "
> +		       "non-exited inferior"));
> +	  else
> +	    warning (_("multi-threaded target stopped without "
> +		       "sending a thread-id, using first "
> +		       "non-exited thread"));
> +	  warned = true;
> +	}
> +    }
> +
> +  /* If this is a stop for all threads then don't use a particular threads
> +     ptid, instead create a new ptid where only the pid field is set.  */
> +  return ((is_stop_for_all_threads) ? ptid_t (thr->ptid.pid ()) : thr->ptid);
> +}
> +
>  /* Called when it is decided that STOP_REPLY holds the info of the
>     event that is to be returned to the core.  This function always
>     destroys STOP_REPLY.  */
> @@ -7687,58 +7894,8 @@ remote_target::process_stop_reply (struct stop_reply *stop_reply,
>    /* If no thread/process was reported by the stub then use the first
>       non-exited thread in the current target.  */
>    if (ptid == null_ptid)
> -    {
> -      /* Some stop events apply to all threads in an inferior, while others
> -	 only apply to a single thread.  */
> -      bool is_stop_for_all_threads
> -	= (status->kind == TARGET_WAITKIND_EXITED
> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
> -
> -      for (thread_info *thr : all_non_exited_threads (this))
> -	{
> -	  if (ptid != null_ptid
> -	      && (!is_stop_for_all_threads
> -		  || ptid.pid () != thr->ptid.pid ()))
> -	    {
> -	      static bool warned = false;
> -
> -	      if (!warned)
> -		{
> -		  /* If you are seeing this warning then the remote target
> -		     has stopped without specifying a thread-id, but the
> -		     target does have multiple threads (or inferiors), and
> -		     so GDB is having to guess which thread stopped.
> -
> -		     Examples of what might cause this are the target
> -		     sending and 'S' stop packet, or a 'T' stop packet and
> -		     not including a thread-id.
> -
> -		     Additionally, the target might send a 'W' or 'X
> -		     packet without including a process-id, when the target
> -		     has multiple running inferiors.  */
> -		  if (is_stop_for_all_threads)
> -		    warning (_("multi-inferior target stopped without "
> -			       "sending a process-id, using first "
> -			       "non-exited inferior"));
> -		  else
> -		    warning (_("multi-threaded target stopped without "
> -			       "sending a thread-id, using first "
> -			       "non-exited thread"));
> -		  warned = true;
> -		}
> -	      break;
> -	    }
> -
> -	  /* If this is a stop for all threads then don't use a particular
> -	     threads ptid, instead create a new ptid where only the pid
> -	     field is set.  */
> -	  if (is_stop_for_all_threads)
> -	    ptid = ptid_t (thr->ptid.pid ());
> -	  else
> -	    ptid = thr->ptid;
> -	}
> -      gdb_assert (ptid != null_ptid);
> -    }
> +    ptid = guess_thread_for_ambiguous_stop_reply (status);
> +  gdb_assert (ptid != null_ptid);
>  
>    if (status->kind != TARGET_WAITKIND_EXITED
>        && status->kind != TARGET_WAITKIND_SIGNALLED
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> new file mode 100644
> index 00000000000..f0d86bd13c1
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> @@ -0,0 +1,77 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdlib.h>
> +#include <pthread.h>
> +#include <unistd.h>
> +
> +volatile int worker_blocked = 1;
> +volatile int main_blocked = 1;
> +
> +void
> +unlock_worker ()
> +{
> +  worker_blocked = 0;
> +}
> +
> +void
> +unlock_main ()
> +{
> +  main_blocked = 0;
> +}
> +
> +void
> +breakpt ()
> +{
> +  /* Nothing.  */
> +}
> +
> +static void*
> +worker (void *data)
> +{
> +  unlock_main ();
> +
> +  while (worker_blocked)
> +    ;
> +
> +  breakpt ();
> +
> +  return NULL;
> +}
> +
> +int
> +main ()
> +{
> +  pthread_t thr;
> +  void *retval;
> +
> +  /* Ensure the test doesn't run forever.  */
> +  alarm (99);
> +
> +  if (pthread_create (&thr, NULL, worker, NULL) != 0)
> +    abort ();
> +
> +  while (main_blocked)
> +    ;
> +
> +  unlock_worker ();
> +
> +  if (pthread_join (thr, &retval) != 0)
> +    abort ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> new file mode 100644
> index 00000000000..b4ab03471e8
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> @@ -0,0 +1,139 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2020 Free Software Foundation, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test how GDB handles the case where a target either doesn't use 'T'
> +# packets at all or doesn't include a thread-id in a 'T' packet, AND,
> +# where the test program contains multiple threads.
> +#
> +# In general this is a broken situation and GDB can never do the
> +# "right" thing is all cases.  If two threads are running and when a
> +# stop occurs, the remote does not tell GDB which thread stopped, then
> +# GDB can never be sure it has attributed the stop to the correct
> +# thread.
> +#
> +# However, we can ensure some reasonably sane default behaviours which
> +# can make some broken targets appear a little less broken.
> +
> +load_lib gdbserver-support.exp
> +
> +if { [skip_gdbserver_tests] } {
> +    verbose "skipping gdbserver tests"
> +    return -1
> +}
> +
> +standard_testfile
> +if [prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] {
> +    return -1
> +}
> +
> +# Run the tests with different features of GDBserver disabled.
> +proc run_test { disable_feature } {
> +    global binfile gdb_prompt decimal hex
> +
> +    clean_restart ${binfile}
> +
> +    # Make sure we're disconnected, in case we're testing with an
> +    # extended-remote board, therefore already connected.
> +    gdb_test "disconnect" ".*"
> +
> +    set packet_arg ""
> +    if { $disable_feature != "" } {
> +	set packet_arg "--disable-packet=${disable_feature}"
> +    }
> +    set res [gdbserver_start $packet_arg $binfile]
> +    set gdbserver_protocol [lindex $res 0]
> +    set gdbserver_gdbport [lindex $res 1]
> +
> +    # Disable XML-based thread listing, and multi-process extensions.
> +    gdb_test_no_output "set remote threads-packet off"
> +    gdb_test_no_output "set remote multiprocess-feature-packet off"
> +
> +    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
> +    if ![gdb_assert {$res == 0} "connect"] {
> +	return
> +    }
> +
> +    # There should be only one thread listed at this point.
> +    gdb_test_multiple "info threads" "" {
> +	-re "2 Thread.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "has terminated.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    gdb_breakpoint "unlock_worker"
> +    gdb_continue_to_breakpoint "run to unlock_worker"
> +
> +    # There should be two threads at this point with thread 1 selected.
> +    gdb_test "info threads" \
> +	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should now exist"
> +
> +    # Switch threads.
> +    gdb_test "thread 2" ".*" "switch to second thread"
> +
> +    # Single step.  This will set all threads running but as there's
> +    # no reason for the first thread to report a stop we expect to
> +    # finish the step with thread 2 still selected.
> +    gdb_test_multiple "stepi" "" {
> +	-re "Thread 1 received signal SIGTRAP" {
> +	    fail $gdb_test_name
> +	}
> +	-re "$hex.*$decimal.*while \\(worker_blocked\\).*$gdb_prompt" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Double check that thread 2 is still selected.
> +    gdb_test "info threads" \
> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should still be selected after stepi"
> +
> +    # Now "continue" thread 2.  Again there's no reason for thread 1
> +    # to report a stop so we should finish with thread 2 still
> +    # selected.
> +    gdb_breakpoint "breakpt"
> +    gdb_continue_to_breakpoint "run to breakpt"
> +
> +    # Again, double check that thread 2 is still selected.
> +    gdb_test "info threads" \
> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should still be selected at breakpt"
> +
> +    # Continue until exit.  The server sends a 'W' with no PID.
> +    # Bad GDB gave an error like below when target is nonstop:
> +    #  (gdb) c
> +    #  Continuing.
> +    #  No process or thread specified in stop reply: W00
> +    gdb_continue_to_end "" continue 1
> +}
> +
> +# Disable different features within gdbserver:
> +#
> +# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
> +#          emulating old gdbservers when debugging single-threaded programs.
> +#
> +# T: Start GDBserver with the entire 'T' stop reply packet disabled,
> +#    GDBserver will instead send the 'S' stop reply.
> +foreach_with_prefix to_disable { "" Tthread T } {
> +    run_test $to_disable
> +}
> -- 
> 2.25.4
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCHv2] gdb: better handling of 'S' packets
  2020-12-10 16:29 ` Andrew Burgess
@ 2020-12-23 23:09   ` Andrew Burgess
  2021-01-06 21:19     ` Simon Marchi via Gdb-patches
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2020-12-23 23:09 UTC (permalink / raw)
  To: gdb-patches

In V2 I've reduced the patch to focus only on the minimum changes
required to fix the original bug (PR gdb/26819).

I think that this new patch is much more obviously the right thing to
do.

Once this is merged I'll put together a new patch with the extra
functionality from V1, but that can come later.

All feedback welcome.

Thanks,
Andrew

---

commit 00190f2b3958e2e822b3d6b078148175f995486c
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Tue Nov 10 17:54:11 2020 +0000

    gdb: better handling of 'S' packets
    
    This commit builds on work started in the following two commits:
    
      commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
      Date:   Thu Jan 30 14:35:40 2020 +0000
    
          gdb/remote: Restore support for 'S' stop reply packet
    
      commit cada5fc921e39a1945c422eea055c8b326d8d353
      Date:   Wed Mar 11 12:30:13 2020 +0000
    
          gdb: Handle W and X remote packets without giving a warning
    
    This is related to how GDB handles remote targets that send back 'S'
    packets.
    
    In the first of the above commits we fixed GDB's ability to handle a
    single process, single threaded target that sends back 'S' packets.
    Although the 'T' packet would always be preferred to 'S' these days,
    there's nothing really wrong with 'S' for this situation.
    
    The second commit above fixed an oversight in the first commit, a
    single-process, multi-threaded target can send back a process wide
    event, for example the process exited event 'W' without including a
    process-id, this also is fine as there is no ambiguity in this case.
    
    In PR gdb/26819 we run into yet another problem with the above
    commits.  In this case we have a single process with two threads, GDB
    hits a breakpoint in thread 2 and then performs a stepi:
    
      (gdb) b main
      Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
      (gdb) c
      Continuing.
    
      Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
      10    in infinite_loop.S
      (gdb) set debug remote 1
      (gdb) stepi
      Sending packet: $vCont;s:2#24...Packet received: S05
      ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
    
    What happens in this case is that on the RISC-V target displaced
    stepping is not supported, so when the stepi is issued GDB steps just
    thread 2.  As only a single thread was set running the target decides
    that is can get away with sending back an 'S' packet without a
    thread-id.  GDB then associates the stop with thread 1 (the first
    non-exited thread), but as thread 1 was not previously set executing
    the assertion seen above triggers.
    
    As an aside I am surprised that the target sends pack 'S' in this
    situation.  The target is happy to send back 'T' (including thread-id)
    when multiple threads are set running, so (to me) it would seem easier
    to just always use the 'T' packet when multiple threads are in use.
    However, the target only uses 'T' when multiple threads are actually
    executing, otherwise an 'S' packet it used.
    
    Still, when looking at the above situation we can see that GDB should
    be able to understand which thread the 'S' reply is referring too.
    
    The problem is that is that in commit 24ed6739b699 (above) when a stop
    reply comes in with no thread-id we look for the first non-exited
    thread and select that as the thread the stop applies too.
    
    What we should really do is check the threads executing flag too, and
    so find the first non-exited, executing thread, and associate the stop
    event with this thread.  In the above example both thread 1 and 2 are
    non-exited, but only thread 2 is executing, so this is what we should
    use.
    
    Initially I planned to always look for the first non-exited, executing
    thread, however, this doesn't always work.
    
    When GDB initially connects to a target it queries the target for a
    list of threads.  These threads are created within GDB in the
    non-executing state.  GDB then asks the target for the last stop
    reason with the '?'  packet.  If the reply to '?' doesn't include a
    thread-id then GDB needs to look through all the threads and find a
    suitable candidate.  At this point no threads will be marked
    executing, so all we can do is find the first non-exited thread (as we
    currently do).
    
    There's a test for this issue included which works with stock
    gdbserver by disabling use of the 'T' packet, and enabling
    'scheduler-locking' within GDB so only one thread is set running.
    
    gdb/ChangeLog:
    
            PR gdb/26819
            * remote.c
            (remote_target::select_thread_for_ambiguous_stop_reply): New
            member function.
            (remote_target::process_stop_reply): Call
            select_thread_for_ambiguous_stop_reply.
    
    gdb/testsuite/ChangeLog:
    
            PR gdb/26819
            * gdb.server/stop-reply-no-thread-multi.c: New file.
            * gdb.server/stop-reply-no-thread-multi.exp: New file.

diff --git a/gdb/remote.c b/gdb/remote.c
index 71f814efb36..2dc2dc59da2 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
   ptid_t process_stop_reply (struct stop_reply *stop_reply,
 			     target_waitstatus *status);
 
+  ptid_t select_thread_for_ambiguous_stop_reply
+    (const struct target_waitstatus *status);
+
   void remote_notice_new_inferior (ptid_t currthread, int executing);
 
   void process_initial_stop_replies (int from_tty);
@@ -7671,75 +7674,201 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
   remote->remote_notif_get_pending_events (nc);
 }
 
-/* Called when it is decided that STOP_REPLY holds the info of the
-   event that is to be returned to the core.  This function always
-   destroys STOP_REPLY.  */
+/* Called from process_stop_reply when the stop packet we are responding
+   too didn't include a process-id or thread-id.  STATUS is the stop event
+   we are responding too.
+
+   It is the task of this function to select a suitable thread (or process)
+   and return its ptid, this is the thread (or process) we will assume the
+   stop event came from.
+
+   In some cases there isn't really any choice about which thread (or
+   process) is selected, a basic remote with a single process containing a
+   single thread might choose not to send any process-id or thread-id in
+   its stop packets, this function will select and return the one and only
+   thread.
+
+   However, if a target supports multiple threads (or processes) and still
+   doesn't include a thread-id (or process-id) in its stop packet then
+   first, this is a badly behaving target, and second, we're going to have
+   to select a thread (or process) at random and use that.  This function
+   will print a warning to the user if it detects that there is the
+   possibility that GDB is guessing which thread (or process) to
+   report.  */
 
 ptid_t
-remote_target::process_stop_reply (struct stop_reply *stop_reply,
-				   struct target_waitstatus *status)
+remote_target::select_thread_for_ambiguous_stop_reply
+	(const struct target_waitstatus *status)
 {
-  ptid_t ptid;
+  /* Some stop events apply to all threads in an inferior, while others
+     only apply to a single thread.  */
+  bool is_stop_for_all_threads
+    = (status->kind == TARGET_WAITKIND_EXITED
+       || status->kind == TARGET_WAITKIND_SIGNALLED);
 
-  *status = stop_reply->ws;
-  ptid = stop_reply->ptid;
+  struct remote_state *rs = get_remote_state ();
 
-  /* If no thread/process was reported by the stub then use the first
-     non-exited thread in the current target.  */
-  if (ptid == null_ptid)
+  /* Track the possible threads in this structure.  */
+  struct thread_choices
+  {
+    /* Constructor.  */
+    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
+      : m_rs (rs),
+	m_is_stop_for_all_threads (is_stop_for_all_threads)
+    { /* Nothing.  */ }
+
+    /* Disable/delete these.  */
+    thread_choices () = delete;
+    DISABLE_COPY_AND_ASSIGN (thread_choices);
+
+    /* Consider thread THR setting the internal thread tracking variables
+       as appropriate.  */
+    void consider_thread (thread_info *thr)
+    {
+      /* Record the first non-exited thread as a fall-back response.  This
+	 is only every used during the initial connection to the target.  */
+      if (m_non_exited_thread == nullptr)
+	m_non_exited_thread = thr;
+      else if (!m_is_stop_for_all_threads
+	       || m_non_exited_thread->ptid.pid () != thr->ptid.pid ())
+	m_non_exited_thread_multiple = true;
+
+      /* If this is an executing thread then it might be a more appropriate
+	 match than just picking the first non-exited thread.  */
+      if (thr->executing)
+	{
+	  if (m_executing_thread == nullptr)
+	    m_executing_thread = thr;
+	  else if (!m_is_stop_for_all_threads
+		   || m_executing_thread->ptid.pid () != thr->ptid.pid ())
+	    m_executing_thread_multiple = true;
+	}
+    }
+
+    /* Return a pointer to the best possible thread.  */
+    thread_info *best_thread () const
+    { return best_thread_inner ().first; }
+
+    /* Return true if there were multiple possible thread/processes and we
+       had to just pick one.  We only worry about seeing multiple
+       candidates in the category from which the BEST_THREAD was selected.  */
+    bool multiple_possible_threads_p () const
+    { return best_thread_inner ().second; }
+
+  private:
+
+    /* Return a pair.  The first item is the best thread that was found,
+       the second is a bool which is true if there were multiple suitable
+       threads seen.  If multiple suitable threads were seen then the best
+       thread that was returned was actually just the first, best, thread.
+       If the bool is false then only one suitable thread was seen.  The
+       best thread will never be nullptr.  */
+    std::pair<thread_info *,bool> best_thread_inner () const
+    {
+      /* The best thread is the first non-exited thread that was marked as
+	 executing.  */
+      std::pair <thread_info *, bool> result (m_executing_thread,
+					      m_executing_thread_multiple);
+
+      /* If there was no thread marked as executing then select a
+	 non-exited thread.  This should only happen during the initial
+	 connection to the target when all threads are marked
+	 non-executing but we still see the very first stop packet.  */
+      if (result.first == nullptr)
+	result = {m_non_exited_thread, m_non_exited_thread_multiple};
+
+      return result;
+    }
+
+    /* The remote state we are examining threads for.  */
+    struct remote_state *m_rs = nullptr;
+
+    /* Is this stop event one for all threads in a process (e.g.  process
+       exited), or an event for a single thread (e.g. thread stopped).  */
+    bool m_is_stop_for_all_threads;
+
+    /* The first thread whose executing flag is true.  */
+    thread_info *m_executing_thread = nullptr;
+
+    /* Set to true if we see multiple possible threads that could be
+       stored into M_EXECUTING_THREAD.  The variable M_EXECUTING_THREAD is
+       set to the first suitable thread seen, then on the second suitable
+       thread this flag is set to true.  */
+    unsigned int m_executing_thread_multiple = false;
+
+    /* The first non-exited thread.  */
+    thread_info *m_non_exited_thread = nullptr;
+
+    /* Set to true if we see multiple possible threads that could be
+       stored into M_NON_EXITED_THREAD.  The variable M_NON_EXITED_THREAD
+       is set to the first suitable thread seen, then on the second suitable
+       thread this flag is set to true.  */
+    unsigned int m_non_exited_thread_multiple = false;
+  } choices (rs, is_stop_for_all_threads);
+
+  /* Consider all non-exited threads to see which is the best match.  */
+  for (thread_info *thr : all_non_exited_threads (this))
+    choices.consider_thread (thr);
+
+  /* Select the best possible thread.  */
+  thread_info *thr = choices.best_thread ();
+  gdb_assert (thr != nullptr);
+
+  /* Warn if the remote target is sending ambiguous stop replies.  */
+  if (choices.multiple_possible_threads_p ())
     {
-      /* Some stop events apply to all threads in an inferior, while others
-	 only apply to a single thread.  */
-      bool is_stop_for_all_threads
-	= (status->kind == TARGET_WAITKIND_EXITED
-	   || status->kind == TARGET_WAITKIND_SIGNALLED);
+      static bool warned = false;
 
-      for (thread_info *thr : all_non_exited_threads (this))
+      if (!warned)
 	{
-	  if (ptid != null_ptid
-	      && (!is_stop_for_all_threads
-		  || ptid.pid () != thr->ptid.pid ()))
-	    {
-	      static bool warned = false;
+	  /* If you are seeing this warning then the remote target has
+	     stopped without specifying a thread-id, but the target
+	     does have multiple threads (or inferiors), and so GDB is
+	     having to guess which thread stopped.
 
-	      if (!warned)
-		{
-		  /* If you are seeing this warning then the remote target
-		     has stopped without specifying a thread-id, but the
-		     target does have multiple threads (or inferiors), and
-		     so GDB is having to guess which thread stopped.
-
-		     Examples of what might cause this are the target
-		     sending and 'S' stop packet, or a 'T' stop packet and
-		     not including a thread-id.
-
-		     Additionally, the target might send a 'W' or 'X
-		     packet without including a process-id, when the target
-		     has multiple running inferiors.  */
-		  if (is_stop_for_all_threads)
-		    warning (_("multi-inferior target stopped without "
-			       "sending a process-id, using first "
-			       "non-exited inferior"));
-		  else
-		    warning (_("multi-threaded target stopped without "
-			       "sending a thread-id, using first "
-			       "non-exited thread"));
-		  warned = true;
-		}
-	      break;
-	    }
+	     Examples of what might cause this are the target sending
+	     and 'S' stop packet, or a 'T' stop packet and not
+	     including a thread-id.
 
-	  /* If this is a stop for all threads then don't use a particular
-	     threads ptid, instead create a new ptid where only the pid
-	     field is set.  */
+	     Additionally, the target might send a 'W' or 'X packet
+	     without including a process-id, when the target has
+	     multiple running inferiors.  */
 	  if (is_stop_for_all_threads)
-	    ptid = ptid_t (thr->ptid.pid ());
+	    warning (_("multi-inferior target stopped without "
+		       "sending a process-id, using first "
+		       "non-exited inferior"));
 	  else
-	    ptid = thr->ptid;
+	    warning (_("multi-threaded target stopped without "
+		       "sending a thread-id, using first "
+		       "non-exited thread"));
+	  warned = true;
 	}
-      gdb_assert (ptid != null_ptid);
     }
 
+  /* If this is a stop for all threads then don't use a particular threads
+     ptid, instead create a new ptid where only the pid field is set.  */
+  return ((is_stop_for_all_threads) ? ptid_t (thr->ptid.pid ()) : thr->ptid);
+}
+
+/* Called when it is decided that STOP_REPLY holds the info of the
+   event that is to be returned to the core.  This function always
+   destroys STOP_REPLY.  */
+
+ptid_t
+remote_target::process_stop_reply (struct stop_reply *stop_reply,
+				   struct target_waitstatus *status)
+{
+  ptid_t ptid;
+
+  *status = stop_reply->ws;
+  ptid = stop_reply->ptid;
+
+  /* If no thread/process was reported by the stub then select a suitable
+     thread/process.  */
+  if (ptid == null_ptid)
+    ptid = select_thread_for_ambiguous_stop_reply (status);
+  gdb_assert (ptid != null_ptid);
+
   if (status->kind != TARGET_WAITKIND_EXITED
       && status->kind != TARGET_WAITKIND_SIGNALLED
       && status->kind != TARGET_WAITKIND_NO_RESUMED)
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
new file mode 100644
index 00000000000..f0d86bd13c1
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
@@ -0,0 +1,77 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include <unistd.h>
+
+volatile int worker_blocked = 1;
+volatile int main_blocked = 1;
+
+void
+unlock_worker ()
+{
+  worker_blocked = 0;
+}
+
+void
+unlock_main ()
+{
+  main_blocked = 0;
+}
+
+void
+breakpt ()
+{
+  /* Nothing.  */
+}
+
+static void*
+worker (void *data)
+{
+  unlock_main ();
+
+  while (worker_blocked)
+    ;
+
+  breakpt ();
+
+  return NULL;
+}
+
+int
+main ()
+{
+  pthread_t thr;
+  void *retval;
+
+  /* Ensure the test doesn't run forever.  */
+  alarm (99);
+
+  if (pthread_create (&thr, NULL, worker, NULL) != 0)
+    abort ();
+
+  while (main_blocked)
+    ;
+
+  unlock_worker ();
+
+  if (pthread_join (thr, &retval) != 0)
+    abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
new file mode 100644
index 00000000000..ff80f10243f
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
@@ -0,0 +1,136 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2020 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test how GDB handles the case where a target either doesn't use 'T'
+# packets at all or doesn't include a thread-id in a 'T' packet, AND,
+# where the test program contains multiple threads.
+#
+# In general if multiple threads are executing and the target doesn't
+# include a thread-id in its stop response then GDB will not be able
+# to correctly figure out which thread the stop applies to.
+#
+# However, this test covers a very specific case, there are multiple
+# threads but only a single thread is actually executing.  So, when
+# the stop comes from the target, without a thread-id, GDB should be
+# able to correctly figure out which thread has stopped.
+
+load_lib gdbserver-support.exp
+
+if { [skip_gdbserver_tests] } {
+    verbose "skipping gdbserver tests"
+    return -1
+}
+
+standard_testfile
+if [prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] {
+    return -1
+}
+
+# Run the tests with different features of GDBserver disabled.
+proc run_test { disable_feature } {
+    global binfile gdb_prompt decimal hex
+
+    clean_restart ${binfile}
+
+    # Make sure we're disconnected, in case we're testing with an
+    # extended-remote board, therefore already connected.
+    gdb_test "disconnect" ".*"
+
+    set packet_arg ""
+    if { $disable_feature != "" } {
+	set packet_arg "--disable-packet=${disable_feature}"
+    }
+    set res [gdbserver_start $packet_arg $binfile]
+    set gdbserver_protocol [lindex $res 0]
+    set gdbserver_gdbport [lindex $res 1]
+
+    # Disable XML-based thread listing, and multi-process extensions.
+    gdb_test_no_output "set remote threads-packet off"
+    gdb_test_no_output "set remote multiprocess-feature-packet off"
+
+    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
+    if ![gdb_assert {$res == 0} "connect"] {
+	return
+    }
+
+    # There should be only one thread listed at this point.
+    gdb_test_multiple "info threads" "" {
+	-re "2 Thread.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "has terminated.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
+	    pass $gdb_test_name
+	}
+    }
+
+    gdb_breakpoint "unlock_worker"
+    gdb_continue_to_breakpoint "run to unlock_worker"
+
+    # There should be two threads at this point with thread 1 selected.
+    gdb_test "info threads" \
+	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should now exist"
+
+    # Switch threads.
+    gdb_test "thread 2" ".*" "switch to second thread"
+
+    # Now turn on scheduler-locking so that when we step thread 2 only
+    # that one thread will be set running.
+    gdb_test_no_output "set scheduler-locking on"
+
+    # Single step thread 2.  Only the one thread will step.  When the
+    # thread stops, if the stop packet doesn't include a thread-id
+    # then GDB should still understand which thread stopped.
+    gdb_test_multiple "stepi" "" {
+	-re "Thread 1 received signal SIGTRAP" {
+	    fail $gdb_test_name
+	}
+	-re "$hex.*$decimal.*while \\(worker_blocked\\).*$gdb_prompt" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check that thread 2 is still selected.
+    gdb_test "info threads" \
+	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should still be selected after stepi"
+
+    # Turn scheduler locking off again so that when we continue all
+    # threads will be set running.
+    gdb_test_no_output "set scheduler-locking off"
+
+    # Continue until exit.  The server sends a 'W' with no PID.
+    # Bad GDB gave an error like below when target is nonstop:
+    #  (gdb) c
+    #  Continuing.
+    #  No process or thread specified in stop reply: W00
+    gdb_continue_to_end "" continue 1
+}
+
+# Disable different features within gdbserver:
+#
+# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
+#          emulating old gdbservers when debugging single-threaded programs.
+#
+# T: Start GDBserver with the entire 'T' stop reply packet disabled,
+#    GDBserver will instead send the 'S' stop reply.
+foreach_with_prefix to_disable { "" Tthread T } {
+    run_test $to_disable
+}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2] gdb: better handling of 'S' packets
  2020-12-23 23:09   ` [PATCHv2] " Andrew Burgess
@ 2021-01-06 21:19     ` Simon Marchi via Gdb-patches
  2021-01-07  9:57       ` Andrew Burgess
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Marchi via Gdb-patches @ 2021-01-06 21:19 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 2020-12-23 6:09 p.m., Andrew Burgess wrote:
> In V2 I've reduced the patch to focus only on the minimum changes
> required to fix the original bug (PR gdb/26819).
> 
> I think that this new patch is much more obviously the right thing to
> do.
> 
> Once this is merged I'll put together a new patch with the extra
> functionality from V1, but that can come later.
> 
> All feedback welcome.
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit 00190f2b3958e2e822b3d6b078148175f995486c
> Author: Andrew Burgess <andrew.burgess@embecosm.com>
> Date:   Tue Nov 10 17:54:11 2020 +0000
> 
>     gdb: better handling of 'S' packets
>     
>     This commit builds on work started in the following two commits:
>     
>       commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
>       Date:   Thu Jan 30 14:35:40 2020 +0000
>     
>           gdb/remote: Restore support for 'S' stop reply packet
>     
>       commit cada5fc921e39a1945c422eea055c8b326d8d353
>       Date:   Wed Mar 11 12:30:13 2020 +0000
>     
>           gdb: Handle W and X remote packets without giving a warning
>     
>     This is related to how GDB handles remote targets that send back 'S'
>     packets.
>     
>     In the first of the above commits we fixed GDB's ability to handle a
>     single process, single threaded target that sends back 'S' packets.
>     Although the 'T' packet would always be preferred to 'S' these days,
>     there's nothing really wrong with 'S' for this situation.
>     
>     The second commit above fixed an oversight in the first commit, a
>     single-process, multi-threaded target can send back a process wide
>     event, for example the process exited event 'W' without including a
>     process-id, this also is fine as there is no ambiguity in this case.
>     
>     In PR gdb/26819 we run into yet another problem with the above
>     commits.  In this case we have a single process with two threads, GDB
>     hits a breakpoint in thread 2 and then performs a stepi:
>     
>       (gdb) b main
>       Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
>       (gdb) c
>       Continuing.
>     
>       Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
>       10    in infinite_loop.S
>       (gdb) set debug remote 1
>       (gdb) stepi
>       Sending packet: $vCont;s:2#24...Packet received: S05
>       ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
>     
>     What happens in this case is that on the RISC-V target displaced
>     stepping is not supported, so when the stepi is issued GDB steps just
>     thread 2.  As only a single thread was set running the target decides
>     that is can get away with sending back an 'S' packet without a
>     thread-id.  GDB then associates the stop with thread 1 (the first
>     non-exited thread), but as thread 1 was not previously set executing
>     the assertion seen above triggers.
>     
>     As an aside I am surprised that the target sends pack 'S' in this
>     situation.  The target is happy to send back 'T' (including thread-id)
>     when multiple threads are set running, so (to me) it would seem easier
>     to just always use the 'T' packet when multiple threads are in use.
>     However, the target only uses 'T' when multiple threads are actually
>     executing, otherwise an 'S' packet it used.
>     
>     Still, when looking at the above situation we can see that GDB should
>     be able to understand which thread the 'S' reply is referring too.
>     
>     The problem is that is that in commit 24ed6739b699 (above) when a stop
>     reply comes in with no thread-id we look for the first non-exited
>     thread and select that as the thread the stop applies too.
>     
>     What we should really do is check the threads executing flag too, and
>     so find the first non-exited, executing thread, and associate the stop
>     event with this thread.  In the above example both thread 1 and 2 are
>     non-exited, but only thread 2 is executing, so this is what we should
>     use.
>     
>     Initially I planned to always look for the first non-exited, executing
>     thread, however, this doesn't always work.
>     
>     When GDB initially connects to a target it queries the target for a
>     list of threads.  These threads are created within GDB in the
>     non-executing state.  GDB then asks the target for the last stop
>     reason with the '?'  packet.  If the reply to '?' doesn't include a
>     thread-id then GDB needs to look through all the threads and find a
>     suitable candidate.  At this point no threads will be marked
>     executing, so all we can do is find the first non-exited thread (as we
>     currently do).

I'm thinking about how this could work with my patch that makes the remote
target track its own resume state:

https://sourceware.org/pipermail/gdb-patches/2020-December/174274.html

I'm trying to update that patch to make the remote target track the resume
state even in all-stop, as discussed in that thread.  In all-stop, when we
receive a stop-reply indicating that a thread stopped, all the threads of
the target are assumed to be stopped, so I marked all of them as NOT_RESUMED.

I'm thinking that the remote target could assume that all threads are initially
RESUMED.  In all-stop, we process the stop reply we got in response to '?', and
that will mark all the threads as NOT_RESUMED.

So I think your code could use that new state and implement what you initially
wanted: look for the first non-exited, resumed thread (resumed from the
point of view of the remote target).  And that would simplify things a bit.

>  ptid_t
> -remote_target::process_stop_reply (struct stop_reply *stop_reply,
> -				   struct target_waitstatus *status)
> +remote_target::select_thread_for_ambiguous_stop_reply
> +	(const struct target_waitstatus *status)
>  {
> -  ptid_t ptid;
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
>  
> -  *status = stop_reply->ws;
> -  ptid = stop_reply->ptid;
> +  struct remote_state *rs = get_remote_state ();
>  
> -  /* If no thread/process was reported by the stub then use the first
> -     non-exited thread in the current target.  */
> -  if (ptid == null_ptid)
> +  /* Track the possible threads in this structure.  */
> +  struct thread_choices
> +  {
> +    /* Constructor.  */
> +    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
> +      : m_rs (rs),
> +	m_is_stop_for_all_threads (is_stop_for_all_threads)
> +    { /* Nothing.  */ }
> +
> +    /* Disable/delete these.  */
> +    thread_choices () = delete;

I don't think it's necessary to delete the default constructor if you have
defined a non-default constructor.

> +    DISABLE_COPY_AND_ASSIGN (thread_choices);
> +
> +    /* Consider thread THR setting the internal thread tracking variables
> +       as appropriate.  */
> +    void consider_thread (thread_info *thr)
> +    {
> +      /* Record the first non-exited thread as a fall-back response.  This
> +	 is only every used during the initial connection to the target.  */

every -> ever

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2] gdb: better handling of 'S' packets
  2021-01-06 21:19     ` Simon Marchi via Gdb-patches
@ 2021-01-07  9:57       ` Andrew Burgess
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Burgess @ 2021-01-07  9:57 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-06 16:19:54 -0500]:

> On 2020-12-23 6:09 p.m., Andrew Burgess wrote:
> > In V2 I've reduced the patch to focus only on the minimum changes
> > required to fix the original bug (PR gdb/26819).
> > 
> > I think that this new patch is much more obviously the right thing to
> > do.
> > 
> > Once this is merged I'll put together a new patch with the extra
> > functionality from V1, but that can come later.
> > 
> > All feedback welcome.
> > 
> > Thanks,
> > Andrew
> > 
> > ---
> > 
> > commit 00190f2b3958e2e822b3d6b078148175f995486c
> > Author: Andrew Burgess <andrew.burgess@embecosm.com>
> > Date:   Tue Nov 10 17:54:11 2020 +0000
> > 
> >     gdb: better handling of 'S' packets
> >     
> >     This commit builds on work started in the following two commits:
> >     
> >       commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
> >       Date:   Thu Jan 30 14:35:40 2020 +0000
> >     
> >           gdb/remote: Restore support for 'S' stop reply packet
> >     
> >       commit cada5fc921e39a1945c422eea055c8b326d8d353
> >       Date:   Wed Mar 11 12:30:13 2020 +0000
> >     
> >           gdb: Handle W and X remote packets without giving a warning
> >     
> >     This is related to how GDB handles remote targets that send back 'S'
> >     packets.
> >     
> >     In the first of the above commits we fixed GDB's ability to handle a
> >     single process, single threaded target that sends back 'S' packets.
> >     Although the 'T' packet would always be preferred to 'S' these days,
> >     there's nothing really wrong with 'S' for this situation.
> >     
> >     The second commit above fixed an oversight in the first commit, a
> >     single-process, multi-threaded target can send back a process wide
> >     event, for example the process exited event 'W' without including a
> >     process-id, this also is fine as there is no ambiguity in this case.
> >     
> >     In PR gdb/26819 we run into yet another problem with the above
> >     commits.  In this case we have a single process with two threads, GDB
> >     hits a breakpoint in thread 2 and then performs a stepi:
> >     
> >       (gdb) b main
> >       Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
> >       (gdb) c
> >       Continuing.
> >     
> >       Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
> >       10    in infinite_loop.S
> >       (gdb) set debug remote 1
> >       (gdb) stepi
> >       Sending packet: $vCont;s:2#24...Packet received: S05
> >       ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> >     
> >     What happens in this case is that on the RISC-V target displaced
> >     stepping is not supported, so when the stepi is issued GDB steps just
> >     thread 2.  As only a single thread was set running the target decides
> >     that is can get away with sending back an 'S' packet without a
> >     thread-id.  GDB then associates the stop with thread 1 (the first
> >     non-exited thread), but as thread 1 was not previously set executing
> >     the assertion seen above triggers.
> >     
> >     As an aside I am surprised that the target sends pack 'S' in this
> >     situation.  The target is happy to send back 'T' (including thread-id)
> >     when multiple threads are set running, so (to me) it would seem easier
> >     to just always use the 'T' packet when multiple threads are in use.
> >     However, the target only uses 'T' when multiple threads are actually
> >     executing, otherwise an 'S' packet it used.
> >     
> >     Still, when looking at the above situation we can see that GDB should
> >     be able to understand which thread the 'S' reply is referring too.
> >     
> >     The problem is that is that in commit 24ed6739b699 (above) when a stop
> >     reply comes in with no thread-id we look for the first non-exited
> >     thread and select that as the thread the stop applies too.
> >     
> >     What we should really do is check the threads executing flag too, and
> >     so find the first non-exited, executing thread, and associate the stop
> >     event with this thread.  In the above example both thread 1 and 2 are
> >     non-exited, but only thread 2 is executing, so this is what we should
> >     use.
> >     
> >     Initially I planned to always look for the first non-exited, executing
> >     thread, however, this doesn't always work.
> >     
> >     When GDB initially connects to a target it queries the target for a
> >     list of threads.  These threads are created within GDB in the
> >     non-executing state.  GDB then asks the target for the last stop
> >     reason with the '?'  packet.  If the reply to '?' doesn't include a
> >     thread-id then GDB needs to look through all the threads and find a
> >     suitable candidate.  At this point no threads will be marked
> >     executing, so all we can do is find the first non-exited thread (as we
> >     currently do).
> 
> I'm thinking about how this could work with my patch that makes the remote
> target track its own resume state:
> 
> https://sourceware.org/pipermail/gdb-patches/2020-December/174274.html
> 
> I'm trying to update that patch to make the remote target track the resume
> state even in all-stop, as discussed in that thread.  In all-stop, when we
> receive a stop-reply indicating that a thread stopped, all the threads of
> the target are assumed to be stopped, so I marked all of them as NOT_RESUMED.
> 
> I'm thinking that the remote target could assume that all threads are initially
> RESUMED.  In all-stop, we process the stop reply we got in response to '?', and
> that will mark all the threads as NOT_RESUMED.
> 
> So I think your code could use that new state and implement what you initially
> wanted: look for the first non-exited, resumed thread (resumed from the
> point of view of the remote target).  And that would simplify things a bit.

OK, I'll wait for a while to see if you post a revised version of the
above patch, then update this to match.

Thanks,
Andrew


> 
> >  ptid_t
> > -remote_target::process_stop_reply (struct stop_reply *stop_reply,
> > -				   struct target_waitstatus *status)
> > +remote_target::select_thread_for_ambiguous_stop_reply
> > +	(const struct target_waitstatus *status)
> >  {
> > -  ptid_t ptid;
> > +  /* Some stop events apply to all threads in an inferior, while others
> > +     only apply to a single thread.  */
> > +  bool is_stop_for_all_threads
> > +    = (status->kind == TARGET_WAITKIND_EXITED
> > +       || status->kind == TARGET_WAITKIND_SIGNALLED);
> >  
> > -  *status = stop_reply->ws;
> > -  ptid = stop_reply->ptid;
> > +  struct remote_state *rs = get_remote_state ();
> >  
> > -  /* If no thread/process was reported by the stub then use the first
> > -     non-exited thread in the current target.  */
> > -  if (ptid == null_ptid)
> > +  /* Track the possible threads in this structure.  */
> > +  struct thread_choices
> > +  {
> > +    /* Constructor.  */
> > +    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
> > +      : m_rs (rs),
> > +	m_is_stop_for_all_threads (is_stop_for_all_threads)
> > +    { /* Nothing.  */ }
> > +
> > +    /* Disable/delete these.  */
> > +    thread_choices () = delete;
> 
> I don't think it's necessary to delete the default constructor if you have
> defined a non-default constructor.
> 
> > +    DISABLE_COPY_AND_ASSIGN (thread_choices);
> > +
> > +    /* Consider thread THR setting the internal thread tracking variables
> > +       as appropriate.  */
> > +    void consider_thread (thread_info *thr)
> > +    {
> > +      /* Record the first non-exited thread as a fall-back response.  This
> > +	 is only every used during the initial connection to the target.  */
> 
> every -> ever
> 
> Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: better handling of 'S' packets
  2020-11-11 15:35 [PATCH] gdb: better handling of 'S' packets Andrew Burgess
  2020-12-10 16:29 ` Andrew Burgess
@ 2021-01-08  0:51 ` Pedro Alves
  2021-01-08  3:00   ` Simon Marchi via Gdb-patches
  2021-01-08  3:58   ` Simon Marchi via Gdb-patches
  1 sibling, 2 replies; 9+ messages in thread
From: Pedro Alves @ 2021-01-08  0:51 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

Hi Andrew,

I think I spotted one bug below.  Otherwise, just minor comments.

On 11/11/20 15:35, Andrew Burgess wrote:
> This commit builds on work started in the following two commits:
> 
>   commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
>   Date:   Thu Jan 30 14:35:40 2020 +0000
> 
>       gdb/remote: Restore support for 'S' stop reply packet
> 
>   commit cada5fc921e39a1945c422eea055c8b326d8d353
>   Date:   Wed Mar 11 12:30:13 2020 +0000
> 
>       gdb: Handle W and X remote packets without giving a warning
> 
> This is related to how GDB handles remote targets that send back 'S'
> packets.
> 
> In the first of the above commits we fixed GDB's ability to handle a
> single process, single threaded target that sends back 'S' packets.
> Although the 'T' packets would always be preferred to 'S' these days,
> there's nothing really wrong with 'S' for this situation.
> 
> The second commit above fixed an oversight in the first commit, a
> single-process, multi-threaded target can send back a process wide
> event, for example the process exited event 'W' without including a
> process-id, this also is fine as there is no ambiguity in this case.
> 
> In PR gdb/26819 however we start to move towards "better" handling of
> more ambiguous cases.  In this bug openocd is used to drive the spike
> RISC-V simulator.  In this particular case a multi-core system is
> being simulated and presented to GDB as two threads.  GDB then is
> seeing a single process, two thread system.  Unfortunately the
> target (openocd) is still sending back 'S' packets, these are the
> packets that _don't_ include a thread-id.
> 
> It is my opinion that this target, in this particular configuration,
> is broken.  Even though it is possible, by being very careful with how
> GDB is configured to ensure that GDB only ever tries to run one thread
> at a time, I feel that any target that presents multiple threads to
> GDB should be making use of the 'T' stop packet, combined with sending
> a thread-id.
> 
> However, with that caveat out of the way, I think this bug report does
> reveal a number of ways that GDB could be improved.
> 
> Firstly, the main issue reported in the bug was that GDB would exit
> with this assertion:
> 
>   infrun.c:5690: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> 
> I think it's fair to say that having a target send back 'S' packets
> when it should use 'T' is _not_ an excuse for GDB to throw an
> assertion.
> 

Yeah, assertions aren't ideal.

> What's happening is that GDB connects to the 2 core system.  Core 2 is
> selected, and a program loaded.  A breakpoint is placed in main and we
> continue, this results in this packet exchange:
> 
>  Sending packet: $vCont;c#a8...Packet received: T05thread:2;
> 
> That's good, all cores ran, and the remote told GDB that we stopped in
> thread 2.  Next the user does `stepi` and this results in this packet
> exchange:
> 
>   Sending packet: $vCont;s:2#24...Packet received: S05
> 
> Here GDB is trying to step only thread 2, and the target replies with
> an 'S' packet.  Though it feels like sending back 'T05thread:2;' would
> be so much simpler here, there's nothing fundamentally wrong ambiguous
> about the exchange.
> 
> Inside GDB the problem we're running into is within the function
> remote_target::process_stop_reply.  When a stop reply doesn't include
> a thread-id (or process-id) it is this function that is responsible
> for looking one up.  Currently we always just select the first
> non-exited thread, so in this case thread 1.
> 
> As the step was issued as part of the step over a breakpoint logic,
> which was specifically run for thread-2 GDB is expecting the event to
> be reported in thread-2, and hence when we try to handle thread-1 we
> trigger the above assertion.
> 
> My proposal is to improve the logic used in process_stop_reply to make
> the thread selection smarter.

I don't object to this.  As guiding principle, I'd aim at preferring
simplicity over being super tolerant of "bad" stubs, especially if the
workaround complicates the code or some gets in the way in the future.
This seems to be still in tolerable levels.  And we do get the warning.

Is the stub getting fixed as well?

> 
> My first thought was that each thread has an 'executing' flag, instead
> of picking the first non-exited thread, we should pick a non-exited
> thread that is currently executing.  The logic being that stop events
> shouldn't arrive for threads that are no executing.
> 
> The problem with this is the very first initial connection.
> 
> When GDB first connects to the remote target it is told about all the
> existing threads.  These are all created by GDB in the non-executing
> state.  Another part of the connecting logic is to send the remote the
> '?' packet, which asks why the target halted.  This sends back a stop
> packet which is then processed.  At this point non of the threads are

"non" -> "none"

> marked executing so we would end up with no suitably matching threads.
> 
> This left me with two rules:
> 
>   1. Select the first non-exited thread that is marked executing, or
>   2. If no threads match rule 1, select the first non-exited thread
>   whether it is executing or not.
> 
> This seemed fine, and certainly resolved the issue seen in the
> original bug report.  So then I tried to create a test for this using
> a multi-threaded test program and `gdbserver --disable-packet=T`.
> 
> I wasn't able to get anything that exactly reproduced the original
> bug, but I was able to hit similar issues where GDB would try to step
> one thread but GDB would handle the step (from the step) in a
> different thread.  In some of these cases there was genuine ambiguity
> in the reply from the target, however, it still felt like GDB could do
> a better job at guessing which thread to select for the stop event.
> 
> I wondered if we could make use of the 'continue_thread' and/or the
> 'general_thread' to help guide the choice of thread.
> 
> In the end I settled on these rules for thread selection:
> 
>   [ NOTE: For all the following rules, only non-exited threads are
>     considered.  ]
> 
>   1. If the continue_thread is set to a single specific thread, and
>   that thread is executing, then assume this is where the event
>   occurred.
> 
>   2. If nothing matches rule 1, then if the general_thread is set to a
>   single specific thread, and that thread is executing, assume this is
>   where the event occurred.
> 
>   3. If nothing matches rule 2 then select the first thread that is
>   marked as executing.
> 
>   4. If nothing matches rule 3 then select the first thread.
> 
> This works fine except for one small problem, when GDB is using the
> vcont packets we don't need to send 'Hc' packets to the target and so

vcont -> vCont

> the 'continue_thread' is never set.
> 
> In this commit I add a new record_continue_thread function, this sets
> the continue_thread without sending a 'Hc' packet.  This effectively
> serves as a cache for which thread did we set running.
> 
> The only slight "wart" here is that when GDB steps a thread the
> continue_thread is not set to a specific single thread-id, rather it
> gets set to either minus_one_ptid or to a specific processes ptid.  In
> this case (when a step is requested) I store the ptid of the stepping
> thread.
> 



> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index 71f814efb36..0020a1ee3c5 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
>    ptid_t process_stop_reply (struct stop_reply *stop_reply,
>  			     target_waitstatus *status);
>  
> +  ptid_t guess_thread_for_ambiguous_stop_reply
> +    (const struct target_waitstatus *status);
> +
>    void remote_notice_new_inferior (ptid_t currthread, int executing);
>  
>    void process_initial_stop_replies (int from_tty);
> @@ -2576,6 +2579,22 @@ record_currthread (struct remote_state *rs, ptid_t currthread)
>    rs->general_thread = currthread;
>  }
>  
> +/* Called from the vcont packet generation code.  Unlike the old thread

vcont -> vCont

> +   control packets, which rely on sending a Hc packet before sending the
> +   continue/step packet, with vcont no Hc packet is sent.

vcont -> vCont

> +
> +   As a result the remote state's continue_thread field is never updated.
> +
> +   Sometime though it can be useful if we do have some information about
> +   which thread(s) the vcont tried to continue/step as this can be used to

vcont -> vCont

> +   guide the choice of thread in the case were a miss-behaving remote
> +   doesn't include a thread-id in its stop packet.  */
> +static void

Missing empty line between comment and function.

Note that a single vCont packet can include multiple resumptions,
like e.g.:

  vCont s:1; s:2; c:3

though currently only in non-stop RSP.  Maybe just mention that 

> +record_continue_thread (struct remote_state *rs, ptid_t thr)
> +{
> +  rs->continue_thread = thr;
> +}
> +
>  /* If 'QPassSignals' is supported, tell the remote stub what signals
>     it can simply pass through to the inferior without reporting.  */
>  
> @@ -6227,6 +6246,8 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
>    char *p;
>    char *endp;
>  
> +  record_continue_thread (get_remote_state (), ptid);

You're calling this before the check below that checks whether vCont is supported
at all.  That means that if vCont isn't supported, when we get to remote_resume_with_hc,
to the set_thread call via set_continue_thread, the continue_thread is already updated
in gdb, but it was never updated in the remote side:

void
remote_target::set_thread (ptid_t ptid, int gen)
{
  struct remote_state *rs = get_remote_state ();
  ptid_t state = gen ? rs->general_thread : rs->continue_thread;
  char *buf = rs->buf.data ();
  char *endbuf = buf + get_remote_packet_size ();

  if (state == ptid)
    return;
  ...

I.e., it seems to me that gdb and the remote get out of sync?


> +
>    /* No reverse execution actions defined for vCont.  */
>    if (::execution_direction == EXEC_REVERSE)
>      return 0;
> @@ -6264,6 +6285,7 @@ remote_target::remote_resume_with_vcont (ptid_t ptid, int step,
>  	{
>  	  /* Step inferior_ptid, with or without signal.  */
>  	  p = append_resumption (p, endp, inferior_ptid, step, siggnal);
> +	  record_continue_thread (get_remote_state (), inferior_ptid);
>  	}
>  
>        /* Also pass down any pending signaled resumption for other
> @@ -7671,6 +7693,191 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>    remote->remote_notif_get_pending_events (nc);
>  }
>  
> +/* Called from process_stop_reply when the stop packet we are responding
> +   too didn't include a process-id or thread-id.  STATUS is the stop event


"responding too" -> "responding to".


> +   we are responding too.

Ditto.

> +
> +   It is the task of this function to find (guess) a suitable thread and
> +   return its ptid, this is the thread we will assume the stop event came
> +   from.
> +
> +   In some cases there really isn't any guessing going on, a basic remote
> +   with a single process containing a single thread might choose not to
> +   send any process-id or thread-id in its stop packets, this function will
> +   select and return the one and only thread.
> +
> +   However, there are targets out there which are.... not great, and in
> +   some cases will support multiple threads but still don't include a
> +   thread-id.  In these cases we try to do the best we can when selecting a
> +   thread, but in the general case we can never know for sure we have
> +   picked the correct thread.  As a result this function can issue a
> +   warning to the user if it detects that there is the possibility that we
> +   really are guessing at which thread to report.  */
> +
> +ptid_t
> +remote_target::guess_thread_for_ambiguous_stop_reply
> +	(const struct target_waitstatus *status)

Should be indented with 2 spaces instead of a tab.

> +{
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
> +
> +  struct remote_state *rs = get_remote_state ();
> +
> +  /* Track the possible threads in this structure.  */
> +  struct thread_choices
> +  {
> +    /* Constructor.  */
> +    thread_choices (struct remote_state *rs, bool is_stop_for_all_threads)
> +      : m_rs (rs),
> +	m_is_stop_for_all_threads (is_stop_for_all_threads)
> +    { /* Nothing.  */ }
> +
> +    /* Disable/delete these.  */
> +    thread_choices () = delete;

This one is not necessary, since you provide a non-default ctor.

> +    DISABLE_COPY_AND_ASSIGN (thread_choices);
> +
> +    /* Consider thread THR setting the internal thread tracking variables
> +       as appropriate.  */
> +    void consider_thread (thread_info *thr)
> +    {
> +      /* Record this as the first thread, or mark that we have multiple
> +	 possible threads.  We set the m_multiple flag even if there is

Uppercase m_multiple.

> +	 only one thread executing.  This means we possibly issue warnings
> +	 to the user when there is no ambiguity... but there's really no
> +	 reason why the remote target couldn't include a thread-id so it
> +	 doesn't seem to bad to point this out.  */
> +      if (m_first_thread == nullptr)
> +	m_first_thread = thr;
> +      else if (!m_is_stop_for_all_threads
> +	       || m_first_thread->ptid.pid () != thr->ptid.pid ())
> +	m_multiple = true;
> +

...


> +
> +    /* Return true if there were multiple possible thread/processes and we
> +       had to just pick one.  This indicates that a warning probably should
> +       be issued to the user.  */
> +    bool multiple_possible_threads_p () const
> +    { return m_multiple; }
> +
> +  private:
> +
> +    /* The remote state we are examining threads for.  */
> +    struct remote_state *m_rs = nullptr;

This one's already initialized in the ctor.

> +
> +    /* Is this stop event one for all threads in a process (e.g.  process

Spurious double space after e.g.

> +       exited), or an event for a single thread (e.g. thread stopped).  */
> +    bool m_is_stop_for_all_threads;
> +

...

> +
> +  /* If this is a stop for all threads then don't use a particular threads
> +     ptid, instead create a new ptid where only the pid field is set.  */
> +  return ((is_stop_for_all_threads) ? ptid_t (thr->ptid.pid ()) : thr->ptid);

Parens around is_stop_for_all_threads redundant.

> +}
> +


> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> new file mode 100644
> index 00000000000..b4ab03471e8
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> @@ -0,0 +1,139 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2020 Free Software Foundation, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test how GDB handles the case where a target either doesn't use 'T'
> +# packets at all or doesn't include a thread-id in a 'T' packet, AND,
> +# where the test program contains multiple threads.
> +#
> +# In general this is a broken situation and GDB can never do the
> +# "right" thing is all cases.  If two threads are running and when a
> +# stop occurs, the remote does not tell GDB which thread stopped, then
> +# GDB can never be sure it has attributed the stop to the correct
> +# thread.
> +#
> +# However, we can ensure some reasonably sane default behaviours which
> +# can make some broken targets appear a little less broken.
> +
> +load_lib gdbserver-support.exp
> +
> +if { [skip_gdbserver_tests] } {
> +    verbose "skipping gdbserver tests"
> +    return -1
> +}
> +
> +standard_testfile
> +if [prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] {

Below in run_test, you use clean_restart so here you can use build_executable
instead of prepare_for_testing.  Saves spawning gdb here (and creating unnecessary
gdb.cmd.1/gdb.in.1 files in the output dir) only to restart it again
immediately after.

> +    return -1
> +}
> +

...

> +
> +    gdb_breakpoint "unlock_worker"
> +    gdb_continue_to_breakpoint "run to unlock_worker"
> +
> +    # There should be two threads at this point with thread 1 selected.
> +    gdb_test "info threads" \
> +	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should now exist"
> +
> +    # Switch threads.
> +    gdb_test "thread 2" ".*" "switch to second thread"
> +
> +    # Single step.  This will set all threads running but as there's
> +    # no reason for the first thread to report a stop we expect to
> +    # finish the step with thread 2 still selected.

I think GDB will first switch to thread 1 to step over the breakpoint thread 1
is stopped at and only after will it step thread 2 while letting thread 1 run free.
I think that with your patch GDB will do the "right" thing and figure out the
right thread for the first step stop of thread 1 correctly, since at that point
no other thread is executing.  It's just that the comment seems a bit off.

> +    gdb_test_multiple "stepi" "" {
> +	-re "Thread 1 received signal SIGTRAP" {

Shouldn't this consume the prompt?

> +	    fail $gdb_test_name
> +	}
> +	-re "$hex.*$decimal.*while \\(worker_blocked\\).*$gdb_prompt" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Double check that thread 2 is still selected.
> +    gdb_test "info threads" \
> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should still be selected after stepi"
> +
> +    # Now "continue" thread 2.  Again there's no reason for thread 1
> +    # to report a stop so we should finish with thread 2 still
> +    # selected.

Ditto here.


I'll look at Simon's patch next, and your thread/frame patches after.
Hopefully tomorrow.

Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: better handling of 'S' packets
  2021-01-08  0:51 ` [PATCH] " Pedro Alves
@ 2021-01-08  3:00   ` Simon Marchi via Gdb-patches
  2021-01-08 10:15     ` Andrew Burgess
  2021-01-08  3:58   ` Simon Marchi via Gdb-patches
  1 sibling, 1 reply; 9+ messages in thread
From: Simon Marchi via Gdb-patches @ 2021-01-08  3:00 UTC (permalink / raw)
  To: Pedro Alves, Andrew Burgess, gdb-patches

Hi Pedro,

It looks like you reviewed v1, but Andrew had posted v2:

https://sourceware.org/pipermail/gdb-patches/2020-December/174277.html

I don't know if that was on purpose.

On 2021-01-07 7:51 p.m., Pedro Alves wrote:
> I'll look at Simon's patch next, and your thread/frame patches after.
> Hopefully tomorrow.

I adapted Andrew's v2 to work on top of my patch that makes the remote
target track the threads' resume state, and so far it looks good (Andrew's
test still passes).  I'll give it a test run and send a combined series.

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: better handling of 'S' packets
  2021-01-08  0:51 ` [PATCH] " Pedro Alves
  2021-01-08  3:00   ` Simon Marchi via Gdb-patches
@ 2021-01-08  3:58   ` Simon Marchi via Gdb-patches
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Marchi via Gdb-patches @ 2021-01-08  3:58 UTC (permalink / raw)
  To: Pedro Alves, Andrew Burgess, gdb-patches

On 2021-01-07 7:51 p.m., Pedro Alves wrote:
>> +
>> +    gdb_breakpoint "unlock_worker"
>> +    gdb_continue_to_breakpoint "run to unlock_worker"
>> +
>> +    # There should be two threads at this point with thread 1 selected.
>> +    gdb_test "info threads" \
>> +	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
>> +	"second thread should now exist"
>> +
>> +    # Switch threads.
>> +    gdb_test "thread 2" ".*" "switch to second thread"
>> +
>> +    # Single step.  This will set all threads running but as there's
>> +    # no reason for the first thread to report a stop we expect to
>> +    # finish the step with thread 2 still selected.
> 
> I think GDB will first switch to thread 1 to step over the breakpoint thread 1
> is stopped at and only after will it step thread 2 while letting thread 1 run free.
> I think that with your patch GDB will do the "right" thing and figure out the
> right thread for the first step stop of thread 1 correctly, since at that point
> no other thread is executing.  It's just that the comment seems a bit off.
> 
v2 uses scheduler-locking, I suppose to address that.


>> +    gdb_test_multiple "stepi" "" {
>> +	-re "Thread 1 received signal SIGTRAP" {
> 
> Shouldn't this consume the prompt?

Is it important?  We could probably just omit it and let gdb_test_multiple
time out if it doesn't see what we expect?

> 
>> +	    fail $gdb_test_name
>> +	}
>> +	-re "$hex.*$decimal.*while \\(worker_blocked\\).*$gdb_prompt" {
>> +	    pass $gdb_test_name
>> +	}
>> +    }
>> +
>> +    # Double check that thread 2 is still selected.
>> +    gdb_test "info threads" \
>> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
>> +	"second thread should still be selected after stepi"
>> +
>> +    # Now "continue" thread 2.  Again there's no reason for thread 1
>> +    # to report a stop so we should finish with thread 2 still
>> +    # selected.
> 
> Ditto here.

This also changed in v2, the test now lets both threads and the program
run until the end.

I tried to address all your comments that still apply in the version
I'll send.

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: better handling of 'S' packets
  2021-01-08  3:00   ` Simon Marchi via Gdb-patches
@ 2021-01-08 10:15     ` Andrew Burgess
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Burgess @ 2021-01-08 10:15 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 22:00:14 -0500]:

> Hi Pedro,
> 
> It looks like you reviewed v1, but Andrew had posted v2:
> 
> https://sourceware.org/pipermail/gdb-patches/2020-December/174277.html
> 
> I don't know if that was on purpose.
> 
> On 2021-01-07 7:51 p.m., Pedro Alves wrote:
> > I'll look at Simon's patch next, and your thread/frame patches after.
> > Hopefully tomorrow.
> 
> I adapted Andrew's v2 to work on top of my patch that makes the remote
> target track the threads' resume state, and so far it looks good (Andrew's
> test still passes).  I'll give it a test run and send a combined series.

Thanks Simon.

I think this is the best approach.  The V2 patch addresses the core
issue that was raised in the bug only, while the V1 patch goes beyond
this.

The reason the V1 patch came about is that originally I struggled to
find a way to reproduce exactly the issue that was reported in the
bug.  To reproduce I was testing gdbserver with 'T' and 'Tthread'
disabled and I kept running into other bugs in cases where I would
tell myself, yes the remote is wrong not to include a thread-id, but
GDB _should_ be able to figure out what I'm doing here.

As Pedro points out in his V1 patch review, we should probably go for
simplicity over complexity in the face of a badly behaving remote
target.  So long as GDB doesn't crash/assert, just throwing a warning
at the user is fine.

NOTE: The V2 patch does address a _real_ bug in GDB, so absolutely
should still go in.

Once V2 is merged I'll rebase the V1 patch and see how complex it
actually looks.  I might repost it if it doesn't add too much
complexity, but I might just drop it.

Thanks.

Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-01-08 10:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11 15:35 [PATCH] gdb: better handling of 'S' packets Andrew Burgess
2020-12-10 16:29 ` Andrew Burgess
2020-12-23 23:09   ` [PATCHv2] " Andrew Burgess
2021-01-06 21:19     ` Simon Marchi via Gdb-patches
2021-01-07  9:57       ` Andrew Burgess
2021-01-08  0:51 ` [PATCH] " Pedro Alves
2021-01-08  3:00   ` Simon Marchi via Gdb-patches
2021-01-08 10:15     ` Andrew Burgess
2021-01-08  3:58   ` Simon Marchi via Gdb-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox