Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
To: gdb-patches@sourceware.org
Subject: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
Date: Thu, 21 Mar 2024 20:11:49 -0300	[thread overview]
Message-ID: <20240321231149.519549-4-thiago.bauermann@linaro.org> (raw)
In-Reply-To: <20240321231149.519549-1-thiago.bauermann@linaro.org>

When GDB attaches to a multi-threaded process, it calls
linux_proc_attach_tgid_threads () to go through all threads found in
/proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
them.  If it does that twice without the callback reporting that a new
thread was found, then it considers that all inferior threads have been
found and returns.

The problem is that the callback considers any thread that it hasn't
attached to yet as new.  This causes problems if the process has one or
more zombie threads, because GDB can't attach to it and the loop will
always "find" a new thread (the zombie one), and get stuck in an
infinite loop.

This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
its test program constantly creates and finishes joinable threads so the
chance of having zombie threads is high.

This problem causes the following failures:

FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted on (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the inferior (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted off (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints, watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
ERROR: breakpoints not deleted

The iteration number is random, and all tests in the subsequent iterations
fail too, because GDB is stuck in the attach command at the beginning of
the iteration.

The solution is to make linux_proc_attach_tgid_threads () remember when it
has already processed a given LWP and skip it in the subsequent iterations.

PR testsuite/31312
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312
---
 gdb/nat/linux-osdata.c | 22 ++++++++++++++++++++++
 gdb/nat/linux-osdata.h |  4 ++++
 gdb/nat/linux-procfs.c | 19 +++++++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
index c254f2e4f05b..998279377433 100644
--- a/gdb/nat/linux-osdata.c
+++ b/gdb/nat/linux-osdata.c
@@ -112,6 +112,28 @@ linux_common_core_of_thread (ptid_t ptid)
   return core;
 }
 
+/* See linux-osdata.h.  */
+
+std::optional<ULONGEST>
+linux_get_starttime (ptid_t ptid)
+{
+  std::optional<std::string> field = linux_find_proc_stat_field (ptid, 22);
+
+  if (!field.has_value ())
+    return {};
+
+  errno = 0;
+  const char *trailer;
+  ULONGEST starttime = strtoulst (field->c_str (), &trailer, 10);
+  if (starttime == ULONGEST_MAX && errno == ERANGE)
+    return {};
+  else if (*trailer != '\0')
+    /* There were unexpected characters.  */
+    return {};
+
+  return starttime;
+}
+
 /* Finds the command-line of process PID and copies it into COMMAND.
    At most MAXLEN characters are copied.  If the command-line cannot
    be found, PID is copied into command in text-form.  */
diff --git a/gdb/nat/linux-osdata.h b/gdb/nat/linux-osdata.h
index a82fb08b998e..1cdc687aa9cf 100644
--- a/gdb/nat/linux-osdata.h
+++ b/gdb/nat/linux-osdata.h
@@ -27,4 +27,8 @@ extern int linux_common_core_of_thread (ptid_t ptid);
 extern LONGEST linux_common_xfer_osdata (const char *annex, gdb_byte *readbuf,
 					 ULONGEST offset, ULONGEST len);
 
+/* Get the start time of thread PTID.  */
+
+extern std::optional<ULONGEST> linux_get_starttime (ptid_t ptid);
+
 #endif /* NAT_LINUX_OSDATA_H */
diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
index b17e3120792e..b01bf36c0b53 100644
--- a/gdb/nat/linux-procfs.c
+++ b/gdb/nat/linux-procfs.c
@@ -17,10 +17,13 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
 #include "gdbsupport/common-defs.h"
+#include "linux-osdata.h"
 #include "linux-procfs.h"
 #include "gdbsupport/filestuff.h"
 #include <dirent.h>
 #include <sys/stat.h>
+#include <set>
+#include <utility>
 
 /* Return the TGID of LWPID from /proc/pid/status.  Returns -1 if not
    found.  */
@@ -290,6 +293,10 @@ linux_proc_attach_tgid_threads (pid_t pid,
       return;
     }
 
+  /* Keeps track of the LWPs we have already visited in /proc,
+     identified by their PID and starttime to detect PID reuse.  */
+  std::set<std::pair<unsigned long,ULONGEST>> visited_lwps;
+
   /* Scan the task list for existing threads.  While we go through the
      threads, new threads may be spawned.  Cycle through the list of
      threads until we have done two iterations without finding new
@@ -308,6 +315,18 @@ linux_proc_attach_tgid_threads (pid_t pid,
 	  if (lwp != 0)
 	    {
 	      ptid_t ptid = ptid_t (pid, lwp);
+	      std::optional<ULONGEST> starttime = linux_get_starttime (ptid);
+
+	      if (starttime.has_value ())
+		{
+		  std::pair<unsigned long,ULONGEST> key (lwp, *starttime);
+
+		  /* If we already visited this LWP, skip it this time.  */
+		  if (visited_lwps.find (key) != visited_lwps.cend ())
+		    continue;
+
+		  visited_lwps.insert (key);
+		}
 
 	      if (attach_lwp (ptid))
 		new_threads_found = 1;

  parent reply	other threads:[~2024-03-21 23:12 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-21 23:11 [RFC PATCH 0/3] " Thiago Jung Bauermann
2024-03-21 23:11 ` [RFC PATCH 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
2024-03-22 17:33   ` Luis Machado
2024-04-17 15:55     ` Pedro Alves
2024-04-20  5:15       ` Thiago Jung Bauermann
2024-03-21 23:11 ` [RFC PATCH 2/3] gdb/nat: Factor linux_find_proc_stat_field out of linux_common_core_of_thread Thiago Jung Bauermann
2024-03-22 16:12   ` Luis Machado
2024-04-17 16:06   ` Pedro Alves
2024-04-20  5:16     ` Thiago Jung Bauermann
2024-03-21 23:11 ` Thiago Jung Bauermann [this message]
2024-03-22 16:19   ` [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Luis Machado
2024-03-22 16:52   ` Pedro Alves
2024-04-16  4:48     ` Thiago Jung Bauermann
2024-04-17 15:32       ` Pedro Alves
2024-04-20  5:00         ` Thiago Jung Bauermann
2024-04-26 15:35           ` Pedro Alves
2024-04-17 16:28   ` Pedro Alves
2024-04-20  5:28     ` Thiago Jung Bauermann
2024-03-22 10:17 ` [RFC PATCH 0/3] " Christophe Lyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240321231149.519549-4-thiago.bauermann@linaro.org \
    --to=thiago.bauermann@linaro.org \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox