From: Pedro Alves <pedro@palves.net>
To: Thiago Jung Bauermann <thiago.bauermann@linaro.org>,
gdb-patches@sourceware.org
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
Date: Wed, 17 Apr 2024 17:28:12 +0100
Message-ID: <29215aa0-8387-4dee-8b8d-3cbf64e6abe3@palves.net>
In-Reply-To: <20240321231149.519549-4-thiago.bauermann@linaro.org>

On 2024-03-21 23:11, Thiago Jung Bauermann wrote:
> When GDB attaches to a multi-threaded process, it calls
> linux_proc_attach_tgid_threads () to go through all threads found in
> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
> them. If it does that twice without the callback reporting that a new
> thread was found, then it considers that all inferior threads have been
> found and returns.
>
> The problem is that the callback considers any thread that it hasn't
> attached to yet as new. This causes problems if the process has one or
> more zombie threads, because GDB can't attach to it and the loop will
> always "find" a new thread (the zombie one), and get stuck in an
> infinite loop.
>
> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
> its test program constantly creates and finishes joinable threads so the
> chance of having zombie threads is high.
>
> This problem causes the following failures:
>
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted on (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the inferior (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted off (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints, watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
> ERROR: breakpoints not deleted
>
> The iteration number is random, and all tests in the subsequent iterations
> fail too, because GDB is stuck in the attach command at the beginning of
> the iteration.
>
> The solution is to make linux_proc_attach_tgid_threads () remember when it
> has already processed a given LWP and skip it in the subsequent iterations.
>
> PR testsuite/31312
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312
> ---
> gdb/nat/linux-osdata.c | 22 ++++++++++++++++++++++
> gdb/nat/linux-osdata.h | 4 ++++
> gdb/nat/linux-procfs.c | 19 +++++++++++++++++++
> 3 files changed, 45 insertions(+)
>
> diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
> index c254f2e4f05b..998279377433 100644
> --- a/gdb/nat/linux-osdata.c
> +++ b/gdb/nat/linux-osdata.c
> @@ -112,6 +112,28 @@ linux_common_core_of_thread (ptid_t ptid)
> return core;
> }
>
> +/* See linux-osdata.h. */
> +
> +std::optional<ULONGEST>
> +linux_get_starttime (ptid_t ptid)
Ditto re. moving this to linux-procfs. This has nothing to do with "info osdata".
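For reference, here is a minimal standalone sketch of what such a helper has to do. It assumes the start time comes from field 22 (starttime) of /proc/PID/task/TID/stat, as documented in proc(5). parse_starttime is a hypothetical name, unsigned long long stands in for ULONGEST, and none of this is the code from the patch. The comm field (field 2) may itself contain spaces and parentheses, so the sketch counts fields from the last ')'.

```cpp
#include <cstddef>
#include <exception>
#include <optional>
#include <sstream>
#include <string>

/* Hypothetical helper (not GDB code): extract field 22 (starttime)
   from the contents of a /proc/PID/task/TID/stat line.  Returns an
   empty optional if the line is malformed.  */

std::optional<unsigned long long>
parse_starttime (const std::string &stat_line)
{
  /* Field 2 (comm) is wrapped in parentheses and may contain spaces
     or ')' itself, so resume parsing after the last ')'.  */
  std::string::size_type pos = stat_line.rfind (')');
  if (pos == std::string::npos)
    return std::nullopt;

  std::istringstream fields (stat_line.substr (pos + 1));
  std::string field;

  /* The first field after comm is field 3 (state); read up to and
     including field 22.  */
  for (int n = 3; n <= 22; n++)
    if (!(fields >> field))
      return std::nullopt;

  try
    {
      std::size_t consumed = 0;
      unsigned long long value = std::stoull (field, &consumed);
      if (consumed != field.size ())
	return std::nullopt;
      return value;
    }
  catch (const std::exception &)
    {
      /* Not a number, or out of range.  */
      return std::nullopt;
    }
}
```

An empty result is what lets the caller fall back to the old behaviour when the stat file has already disappeared or cannot be parsed.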
> index a82fb08b998e..1cdc687aa9cf 100644
> --- a/gdb/nat/linux-osdata.h
> +++ b/gdb/nat/linux-osdata.h
> @@ -27,4 +27,8 @@ extern int linux_common_core_of_thread (ptid_t ptid);
> extern LONGEST linux_common_xfer_osdata (const char *annex, gdb_byte *readbuf,
> ULONGEST offset, ULONGEST len);
>
> +/* Get the start time of thread PTID. */
> +
> +extern std::optional<ULONGEST> linux_get_starttime (ptid_t ptid);
> +
> #endif /* NAT_LINUX_OSDATA_H */
> diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
> index b17e3120792e..b01bf36c0b53 100644
> --- a/gdb/nat/linux-procfs.c
> +++ b/gdb/nat/linux-procfs.c
> @@ -17,10 +17,13 @@
> along with this program. If not, see <http://www.gnu.org/licenses/>. */
>
> #include "gdbsupport/common-defs.h"
> +#include "linux-osdata.h"
> #include "linux-procfs.h"
linux-procfs.h is the main header of this source file, so it should be included first.
But then again, I think this file shouldn't be consuming things from linux-osdata;
it should be the other way around.
> #include "gdbsupport/filestuff.h"
> #include <dirent.h>
> #include <sys/stat.h>
> +#include <set>
> +#include <utility>
>
> /* Return the TGID of LWPID from /proc/pid/status. Returns -1 if not
> found. */
> @@ -290,6 +293,10 @@ linux_proc_attach_tgid_threads (pid_t pid,
> return;
> }
>
> + /* Keeps track of the LWPs we have already visited in /proc,
> + identified by their PID and starttime to detect PID reuse. */
> + std::set<std::pair<unsigned long,ULONGEST>> visited_lwps;
Missing space after the comma, before ULONGEST.
AFAICT, you don't rely on ordering, so this could be a std::unordered_set?
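One caveat with that switch: the standard library provides no std::hash specialization for std::pair, so an unordered_set keyed on (LWP, starttime) needs a hash functor. A minimal sketch, where lwp_key, lwp_key_hash, visited_lwp_set and mark_visited are all hypothetical names, unsigned long long stands in for ULONGEST, and the hash-combining step is just one common idiom rather than anything GDB prescribes:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

/* Hypothetical key type: LWP id plus start time.  */
using lwp_key = std::pair<unsigned long, unsigned long long>;

/* std::pair has no std::hash specialization, so combine the hashes
   of the two members by hand.  */
struct lwp_key_hash
{
  std::size_t operator() (const lwp_key &key) const noexcept
  {
    std::size_t h1 = std::hash<unsigned long> () (key.first);
    std::size_t h2 = std::hash<unsigned long long> () (key.second);
    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
  }
};

using visited_lwp_set = std::unordered_set<lwp_key, lwp_key_hash>;

/* Record (LWP, STARTTIME) in VISITED.  Returns true the first time
   the pair is seen and false on every later call.  */

bool
mark_visited (visited_lwp_set &visited, unsigned long lwp,
	      unsigned long long starttime)
{
  return visited.emplace (lwp, starttime).second;
}
```

Using the result of emplace also folds the find/insert pair from the patch into a single lookup.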
> +
> /* Scan the task list for existing threads. While we go through the
> threads, new threads may be spawned. Cycle through the list of
> threads until we have done two iterations without finding new
> @@ -308,6 +315,18 @@ linux_proc_attach_tgid_threads (pid_t pid,
> if (lwp != 0)
> {
> ptid_t ptid = ptid_t (pid, lwp);
> + std::optional<ULONGEST> starttime = linux_get_starttime (ptid);
> +
> + if (starttime.has_value ())
> + {
> + std::pair<unsigned long,ULONGEST> key (lwp, *starttime);
Missing space before ULONGEST here too.
> +
> + /* If we already visited this LWP, skip it this time. */
> + if (visited_lwps.find (key) != visited_lwps.cend ())
> + continue;
> +
> + visited_lwps.insert (key);
> + }
>
> if (attach_lwp (ptid))
> new_threads_found = 1;
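To illustrate why remembering visited LWPs lets the scan terminate, here is a toy model of the loop described in the commit message. It is not GDB code: task, scan_tasks and max_passes are hypothetical, the task list is a fixed vector rather than /proc/PID/task/, and the stand-in for the attach callback simply reports any not-yet-attached LWP as new and never manages to attach to a zombie.

```cpp
#include <set>
#include <utility>
#include <vector>

/* Hypothetical model of one entry in /proc/PID/task/.  */
struct task
{
  unsigned long lwp;
  unsigned long long starttime;
  bool zombie;
};

/* Scan TASKS until two consecutive passes find nothing new, or until
   MAX_PASSES is reached.  Returns the number of passes made.  */

int
scan_tasks (const std::vector<task> &tasks, bool remember_visited,
	    int max_passes)
{
  std::set<std::pair<unsigned long, unsigned long long>> visited;
  std::set<unsigned long> attached;
  int passes = 0;

  for (int quiet = 0; quiet < 2 && passes < max_passes; passes++)
    {
      bool new_threads_found = false;

      for (const task &t : tasks)
	{
	  /* With the fix: skip LWPs already processed in an earlier
	     pass, identified by (LWP, starttime).  */
	  if (remember_visited
	      && !visited.emplace (t.lwp, t.starttime).second)
	    continue;

	  /* Stand-in for the attach callback: anything not yet
	     attached counts as new, and zombies never get attached.  */
	  if (attached.count (t.lwp) == 0)
	    {
	      new_threads_found = true;
	      if (!t.zombie)
		attached.insert (t.lwp);
	    }
	}

      quiet = new_threads_found ? 0 : quiet + 1;
    }

  return passes;
}
```

In this model a single zombie keeps the scan going until max_passes when remember_visited is false, and the scan settles after three passes when it is true.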