From: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
To: Pedro Alves <pedro@palves.net>
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
Date: Sat, 20 Apr 2024 02:00:39 -0300
Message-ID: <87jzksk4qw.fsf@linaro.org>
In-Reply-To: <9680e3cf-b8ad-4329-a51c-2aafb98d9476@palves.net> (Pedro Alves's message of "Wed, 17 Apr 2024 16:32:56 +0100")

Hello Pedro,

Pedro Alves <pedro@palves.net> writes:

> On 2024-04-16 05:48, Thiago Jung Bauermann wrote:
>>
>> Pedro Alves <pedro@palves.net> writes:
>>
>>> Hmm. How about simply not restarting the loop if attach_lwp tries to attach to
>>> a zombie lwp (and silently fails)?
<snip>
>>> Similar thing for gdbserver, of course.
>>
>> Thank you for the suggestion. I tried doing that, and in fact I attached
>> a patch with that change in comment #17 of PR 31312 when I was
>> investigating a fix¹. I called it a workaround because I also had to
>> increase the number of iterations in linux_proc_attach_tgid_threads from
>> 2 to 100, otherwise GDB gives up on waiting for new inferior threads too
>> early and the inferior dies with a SIGTRAP because some new unnoticed
>> thread tripped into the breakpoint.
>>
>> Because of the need to increase the number of iterations, I didn't
>> consider it a good solution and went with the approach in this patch
>> series instead. Now I finally understand why I had to increase the
>> number of iterations (though I still don't see a way around it other
>> than what this patch series does):
>>
>> With the approach in this patch series, even if a new thread becomes
>> zombie by the time GDB tries to attach to it, it still causes
>> new_threads_found to be set the first time GDB notices it, and the loop
>> in linux_proc_attach_tgid_threads starts over.
>>
>> With the approach above, a new thread that becomes zombie before GDB has
>> a chance to attach to it never causes the loop to start over, and so it
>> exits earlier.
>>
>> I think it's a matter of opinion whether one approach or the other can
>> be considered the better one.
>>
>> Even with this patch series, it's not guaranteed that two iterations
>> without finding new threads is enough to ensure that GDB has found all
>> threads in the inferior. I left the test running in a loop overnight
>> with the patch series applied and it failed after about 2500 runs.
>
> Thanks for all the investigations. I hadn't seen the bugzilla before,
> I didn't notice there was one identified in the patch.
>
> Let me just start by saying "bleh"...

Agreed.

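The difference between the two approaches discussed in the quoted text above can be illustrated with a toy model. This is only a sketch and not GDB's code: the function name `attach_passes`, the `snapshots` data, and the `zombie_restarts_loop` flag are all hypothetical, and each snapshot stands in for one scan of the inferior's thread list. It assumes the loop exits after two consecutive passes that find nothing new, as described above.

```python
def attach_passes(snapshots, zombie_restarts_loop, quiet_passes_needed=2):
    """Toy model of a scan-until-stable attach loop (hypothetical, not GDB code).

    snapshots: one dict per pass, mapping tid -> True if the thread is
    already a zombie when the scan sees it.
    zombie_restarts_loop: True models "a newly noticed zombie still counts
    as a new thread"; False models "a failed attach to a zombie is ignored".
    """
    noticed = set()   # every tid seen so far, zombie or not
    attached = set()  # tids that were live when first seen
    quiet = 0         # consecutive passes without anything new
    passes = 0
    for snapshot in snapshots:
        passes += 1
        new_threads_found = False
        for tid, is_zombie in snapshot.items():
            if tid in noticed:
                continue
            noticed.add(tid)
            if not is_zombie:
                attached.add(tid)
                new_threads_found = True
            elif zombie_restarts_loop:
                new_threads_found = True
        quiet = 0 if new_threads_found else quiet + 1
        if quiet >= quiet_passes_needed:
            break
    return passes, attached


# Hypothetical inferior: short-lived threads 2 and 3 are already zombies
# when first seen, and a live thread 4 only shows up on the fourth pass.
snapshots = [
    {1: False},
    {1: False, 2: True},
    {1: False, 2: True, 3: True},
    {1: False, 3: True, 4: False},
    {1: False, 4: False},
    {1: False, 4: False},
]

print(attach_passes(snapshots, zombie_restarts_loop=True))
print(attach_passes(snapshots, zombie_restarts_loop=False))
```

In this model the first policy keeps scanning long enough to attach to thread 4, while the second one stops after three passes and misses it, which matches the "exits earlier" behaviour described above. Real timing is of course not this tidy.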
> This is all of course bad ptrace/kernel design... The only way to plug this race
> completely is with kernel help, I believe.

Indeed. Having a ptrace request to tell the kernel to stop all threads in
the process, and having a "way to use ptrace without signals or wait"
(as mentioned in the LinuxKernelWishList wiki page) would be a big
improvement. In the past few years the kernel has been introducing more
and more syscalls that accept a pidfd¹. Perhaps the time is ripe for a
pidfd_ptrace syscall?

> The only way that I know of today that we can make the kernel pause all threads for
> us, is with a group-stop. I.e., pass a SIGSTOP to the inferior, and the kernel stops
> _all_ threads with SIGSTOP.
>
> I prototyped it today, and while far from perfect (would need to handle corner
> scenarios like attaching and getting back something != SIGSTOP, and I still see
> some spurious SIGSTOPs), it kind of works ...

This is amazing. Thank you for this exploration and the patch.

> ... except for what I think is a deal breaker that I don't know how to work around:
>
> If you attach to a process that is running on a shell, you will see that the group-stop
> makes it go to the background. GDB resumes the process, but it won't go back to the
> foreground without an explicit "fg" in the shell... Like:
>
> $ ./test_program
> [1]+ Stopped ./test_program
> $

Couldn't this be a shell behaviour rather than a kernel one? I don't see
anything in the ptrace man page that I could associate with this effect
(though I could easily have missed it). If it is shell behaviour, then
the fix would belong in the shell too.

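One way to see why the shell could be involved: a job-control shell typically waits on its foreground job with waitpid() and WUNTRACED, so it is told when the job stops and can then report it as "Stopped". The sketch below only demonstrates that mechanism with a plain parent and child. The function name `observe_stop` is hypothetical, and whether this is exactly what happens in the attach case, where GDB is the tracer, is an assumption.

```python
import os
import signal
import time


def observe_stop():
    """Fork a child, stop it with SIGSTOP, and report what the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: idle forever; the parent kills it at the end.
        while True:
            time.sleep(1)

    # Parent plays the role of the shell waiting on its foreground job.
    os.kill(pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid() also return when the child stops,
    # not only when it exits.
    _, status = os.waitpid(pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)
    stop_signal = os.WSTOPSIG(status) if stopped else None

    # Clean up: SIGKILL works on a stopped process; then reap it.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return stopped, stop_signal


if __name__ == "__main__":
    print(observe_stop())
```

If the shell gets a notification like this for the group-stop, then taking the job out of the foreground would be a decision made by the shell on top of what the kernel reports.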
> Thankfully this is a contrived test, no real program should be spawning threads
> like this. One would hope.

Yes, it's a particularly egregious program. :-)

> I will take a new look at your series.

Thank you!

--
Thiago

¹ https://lwn.net/Articles/794707/