Mirror of the gdb-patches mailing list
From: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
To: Pedro Alves <pedro@palves.net>
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
Date: Sat, 20 Apr 2024 02:00:39 -0300
Message-ID: <87jzksk4qw.fsf@linaro.org>
In-Reply-To: <9680e3cf-b8ad-4329-a51c-2aafb98d9476@palves.net> (Pedro Alves's message of "Wed, 17 Apr 2024 16:32:56 +0100")


Hello Pedro,

Pedro Alves <pedro@palves.net> writes:

> On 2024-04-16 05:48, Thiago Jung Bauermann wrote:
>>
>> Pedro Alves <pedro@palves.net> writes:
>>
>>> Hmm.  How about simply not restarting the loop if attach_lwp tries to attach to
>>> a zombie lwp (and silently fails)?

<snip>

>>> Similar thing for gdbserver, of course.
>>
>> Thank you for the suggestion. I tried doing that, and in fact I attached
>> a patch with that change in comment #17 of PR 31312 when I was
>> investigating a fix¹. I called it a workaround because I also had to
>> increase the number of iterations in linux_proc_attach_tgid_threads from
>> 2 to 100, otherwise GDB gives up on waiting for new inferior threads too
>> early and the inferior dies with a SIGTRAP because some new unnoticed
>> thread tripped into the breakpoint.
>>
>> Because of the need to increase the number of iterations, I didn't
>> consider it a good solution and went with the approach in this patch
>> series instead. Now I finally understand why I had to increase the
>> number of iterations (though I still don't see a way around it other
>> than what this patch series does):
>>
>> With the approach in this patch series, even if a new thread becomes
>> zombie by the time GDB tries to attach to it, it still causes
>> new_threads_found to be set the first time GDB notices it, and the loop
>> in linux_proc_attach_tgid_threads starts over.
>>
>> With the approach above, a new thread that becomes zombie before GDB has
>> a chance to attach to it never causes the loop to start over, and so it
>> exits earlier.
>>
>> I think it's a matter of opinion whether one approach or the other can
>> be considered the better one.
>>
>> Even with this patch series, it's not guaranteed that two iterations
>> without finding new threads is enough to ensure that GDB has found all
>> threads in the inferior. I left the test running in a loop overnight
>> with the patch series applied and it failed after about 2500 runs.
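
To make the difference between the two approaches quoted above concrete,
here is a toy model of the retry loop. It is not GDB's code: the function
name count_passes, the "script" of which threads get noticed on which pass,
and the "live"/"zombie" strings are all made up for illustration, and the
model assumes the same threads show up on the same passes under both
policies, which real timing won't guarantee.

```python
def count_passes(script, restart_on_zombies, quiet_passes_needed=2):
    """Return (passes, attached) for a scripted sequence of task-list scans.

    script[i] lists the threads first noticed on pass i, each as the
    string "live" or "zombie"; passes beyond the script find nothing new.
    """
    passes = 0
    attached = 0
    iterations = 0
    while iterations < quiet_passes_needed:
        new_threads_found = False
        noticed = script[passes] if passes < len(script) else []
        for state in noticed:
            if state == "live":
                attached += 1
                new_threads_found = True
            elif restart_on_zombies:
                # Zombie: nothing gets attached, but the pass still
                # counts as having found something new.
                new_threads_found = True
        passes += 1
        if new_threads_found:
            iterations = 0      # start over
        else:
            iterations += 1
    return passes, attached


# Hypothetical timeline: a live thread, two short-lived zombies, then
# another live thread that only shows up on the fourth pass.
script = [["live"], ["zombie"], ["zombie"], ["live"]]
print(count_passes(script, restart_on_zombies=True))    # (6, 2)
print(count_passes(script, restart_on_zombies=False))   # (3, 1)
```

In this timeline the zombies keep the first variant scanning long enough to
see the late thread, while the second variant stops after three passes and
misses it. Neither variant is guaranteed to catch everything, since a thread
can always appear after the last quiet pass.
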
>
> Thanks for all the investigations.  I hadn't seen the bugzilla before,
> I didn't notice there was one identified in the patch.
>
> Let me just start by saying "bleh"...

Agreed.

> This is all of course bad ptrace/kernel design...  The only way to plug this race
> completely is with kernel help, I believe.

Indeed. Having a ptrace request to tell the kernel to stop all threads in
the process, and having a "way to use ptrace without signals or wait"
(as mentioned in the LinuxKernelWishList wiki page) would be a big
improvement. In the past few years the kernel has been introducing more
and more syscalls that accept a pidfd¹. Perhaps the time is ripe for a
pidfd_ptrace syscall?
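
For reference, this is roughly what using the existing pidfd interfaces
looks like today. It is a minimal sketch using Python's os.pidfd_open and
signal.pidfd_send_signal wrappers (Python 3.9 or later, on a kernel that
provides pidfd_open(2) and pidfd_send_signal(2)). The helper name
process_alive_via_pidfd is made up, and pidfd_ptrace itself is purely
hypothetical: no such syscall exists.

```python
import os
import signal


def process_alive_via_pidfd(pid):
    """Return True if PID names a live process, probing it through a pidfd."""
    try:
        # The pidfd refers to this specific process, so it can't end up
        # pointing at an unrelated process if the PID number is recycled.
        pidfd = os.pidfd_open(pid)
    except ProcessLookupError:
        return False
    try:
        # Signal 0 delivers nothing; it only performs the existence and
        # permission checks.
        signal.pidfd_send_signal(pidfd, 0)
        return True
    except ProcessLookupError:
        return False
    finally:
        os.close(pidfd)
```

A ptrace variant taking such a descriptor would presumably follow the same
pattern, but that is speculation on my part.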

> The only way that I know of today that we can make the kernel pause all threads for
> us, is with a group-stop.  I.e., pass a SIGSTOP to the inferior, and the kernel stops
> _all_ threads with SIGSTOP.
>
> I prototyped it today, and while far from perfect (would need to handle corner
> scenarios like attaching and getting back something != SIGSTOP, and I still see
> some spurious SIGSTOPs), it kind of works ...

This is amazing. Thank you for this exploration and the patch.

> ... except for what I think is a deal breaker that I don't know how to work around:
>
> If you attach to a process that is running on a shell, you will see that the group-stop
> makes it go to the background.  GDB resumes the process, but it won't go back to the
> foreground without an explicit "fg" in the shell...  Like:
>
>  $ ./test_program
>  [1]+  Stopped                 ./test_program
>  $

Couldn't this be a shell behaviour rather than a kernel one? I don't see
anything in the ptrace man page that I could associate with this effect
(though I could easily have missed it). If it is the shell's doing, then
the fix would belong in the shell too.
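
As background for that question, here is a minimal sketch of how a parent
process can observe a child's stop, assuming the shell waits on its jobs
with WUNTRACED as job-control shells conventionally do. The function name
observe_child_stop is made up, and the sketch doesn't involve ptrace at
all; it only shows the parent-side mechanism that would let a shell print
"Stopped".

```python
import os
import signal


def observe_child_stop():
    """Fork a child that stops itself; report what the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: stop ourselves, standing in for a SIGSTOP sent by
        # someone else.
        os.kill(os.getpid(), signal.SIGSTOP)
        os._exit(0)     # reached only after SIGCONT

    # Parent, playing the shell: WUNTRACED makes waitpid report stopped
    # children, not just terminated ones.
    _, status = os.waitpid(pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)
    stop_signal = os.WSTOPSIG(status) if stopped else None

    # Resume the child (part of what "fg" does) and reap it.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    return stopped, stop_signal, os.WIFEXITED(status)
```

What the parent does with that notification (marking the job as stopped
and taking the terminal back, in a shell's case) is up to the parent.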

> Thankfully this is a contrived test, no real program should be spawning threads
> like this.  One would hope.

Yes, it's a particularly egregious program. :-)

> I will take a new look at your series.

Thank you!

--
Thiago

¹ https://lwn.net/Articles/794707/

