Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
To: Pedro Alves <pedro@palves.net>
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
Date: Tue, 16 Apr 2024 01:48:53 -0300	[thread overview]
Message-ID: <87msptgbey.fsf@linaro.org> (raw)
In-Reply-To: <f7dac3e3-a9c5-43c6-8696-3ea683beddff@palves.net> (Pedro Alves's message of "Fri, 22 Mar 2024 16:52:09 +0000")


Hello Pedro,

[ I just now noticed that I replied to you in private last time. Sorry
  about that, I thought I had cc'd the list. ]

Pedro Alves <pedro@palves.net> writes:

> On 2024-03-21 23:11, Thiago Jung Bauermann wrote:
>> When GDB attaches to a multi-threaded process, it calls
>> linux_proc_attach_tgid_threads () to go through all threads found in
>> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
>> them.  If it does that twice without the callback reporting that a new
>> thread was found, then it considers that all inferior threads have been
>> found and returns.
>>
>> The problem is that the callback considers any thread that it hasn't
>> attached to yet as new.  This causes problems if the process has one or
>> more zombie threads, because GDB can't attach to it and the loop will
>> always "find" a new thread (the zombie one), and get stuck in an
>> infinite loop.
>>
>> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
>> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
>> its test program constantly creates and finishes joinable threads so the
>> chance of having zombie threads is high.
>
> Hmm.  How about simply not restarting the loop if attach_lwp tries to attach to
> a zombie lwp (and silently fails)?
>
> I.e., in gdb, make attach_proc_task_lwp_callback return false/0 here:
>
>       if (ptrace (PTRACE_ATTACH, lwpid, 0, 0) < 0)
> 	{
> 	  int err = errno;
>
> 	  /* Be quiet if we simply raced with the thread exiting.
> 	     EPERM is returned if the thread's task still exists, and
> 	     is marked as exited or zombie, as well as other
> 	     conditions, so in that case, confirm the status in
> 	     /proc/PID/status.  */
> 	  if (err == ESRCH
> 	      || (err == EPERM && linux_proc_pid_is_gone (lwpid)))
> 	    {
> 	      linux_nat_debug_printf
> 		("Cannot attach to lwp %d: thread is gone (%d: %s)",
> 		 lwpid, err, safe_strerror (err));
>
>               return 0;        <<<< NEW RETURN
> 	    }
>
> Similar thing for gdbserver, of course.

Thank you for the suggestion. I tried doing that, and in fact I attached
a patch with that change in comment #17 of PR 31312 when I was
investigating a fix¹. I called it a workaround because I also had to
increase the number of iterations in linux_proc_attach_tgid_threads from
2 to 100, otherwise GDB gives up on waiting for new inferior threads too
early and the inferior dies with a SIGTRAP because some new unnoticed
thread tripped into the breakpoint.

Because of the need to increase the number of iterations, I didn't
consider it a good solution and went with the approach in this patch
series instead. Now I finally understand why I had to increase the
number of iterations (though I still don't see a way around it other
than what this patch series does):

With the approach in this patch series, even if a new thread becomes
zombie by the time GDB tries to attach to it, it still causes
new_threads_found to be set the first time GDB notices it, and the loop
in linux_proc_attach_tgid_threads starts over.

With the approach above, a new thread that becomes zombie before GDB has
a chance to attach to it never causes the loop to start over, and so it
exits earlier.

I think it's a matter of opinion whether one approach or the other can
be considered the better one.

Even with this patch series, it's not guaranteed that two iterations
without finding new threads is enough to ensure that GDB has found all
threads in the inferior. I left the test running in a loop overnight
with the patch series applied and it failed after about 2500 runs.

--
Thiago

¹ https://sourceware.org/bugzilla/attachment.cgi?id=15405&action=diff

  reply	other threads:[~2024-04-16  4:49 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-21 23:11 [RFC PATCH 0/3] " Thiago Jung Bauermann
2024-03-21 23:11 ` [RFC PATCH 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
2024-03-22 17:33   ` Luis Machado
2024-04-17 15:55     ` Pedro Alves
2024-04-20  5:15       ` Thiago Jung Bauermann
2024-03-21 23:11 ` [RFC PATCH 2/3] gdb/nat: Factor linux_find_proc_stat_field out of linux_common_core_of_thread Thiago Jung Bauermann
2024-03-22 16:12   ` Luis Machado
2024-04-17 16:06   ` Pedro Alves
2024-04-20  5:16     ` Thiago Jung Bauermann
2024-03-21 23:11 ` [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Thiago Jung Bauermann
2024-03-22 16:19   ` Luis Machado
2024-03-22 16:52   ` Pedro Alves
2024-04-16  4:48     ` Thiago Jung Bauermann [this message]
2024-04-17 15:32       ` Pedro Alves
2024-04-20  5:00         ` Thiago Jung Bauermann
2024-04-26 15:35           ` Pedro Alves
2024-04-17 16:28   ` Pedro Alves
2024-04-20  5:28     ` Thiago Jung Bauermann
2024-03-22 10:17 ` [RFC PATCH 0/3] " Christophe Lyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87msptgbey.fsf@linaro.org \
    --to=thiago.bauermann@linaro.org \
    --cc=gdb-patches@sourceware.org \
    --cc=pedro@palves.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox