From: Pedro Alves <palves@redhat.com>
To: Don Breazeal <donb@codesourcery.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 11/16 v3] Extended-remote Linux exit events
Date: Thu, 13 Nov 2014 19:18:00 -0000 [thread overview]
Message-ID: <54650415.9020705@redhat.com> (raw)
In-Reply-To: <1414798134-11536-9-git-send-email-donb@codesourcery.com>
On 10/31/2014 11:28 PM, Don Breazeal wrote:
>
> Now for the long answer: in my testing I ran into a race condition in
> check_zombie_leaders, which detects when a thread group leader has exited
> and other threads still exist. On the Linux kernel, ptrace/waitpid don't
> allow reaping the leader thread until all other threads in the group are
> reaped. When the leader exits, it goes zombie, but waitpid will not return
> an exit status until the other threads are gone. When a non-leader thread
> calls exec, all other non-leader threads are destroyed, the leader becomes
> a zombie, and once the "other" threads have been reaped, the execing thread
> takes over the leader's pid (tgid) and appears to vanish. In order to
> handle this situation in the current implementation, check_zombie_leaders
> polls the process state in /proc and deletes thread group leaders that are
> in a zombie state. The replacement is added to the lwp list when the exec
> event is reported.
>
> See https://sourceware.org/ml/gdb-patches/2011-10/msg00704.html for a more
> detailed explanation of how this works.
>
> Here is the relevant part of check_zombie_leaders:
>
> if (leader_lp != NULL
> /* Check if there are other threads in the group, as we may
> have raced with the inferior simply exiting. */
> && !last_thread_of_process_p (leader_pid)
> && linux_proc_pid_is_zombie (leader_pid))
> {
> /* ...large informative comment block... */
> delete_lwp (leader_lp);
>
> The race occurred when there were two threads in the program, and the
> non-leader thread called exec. In this case the leader thread passed
> through a very brief zombie state before being replaced by the exec'ing
> thread as the thread group leader. This state transition was asynchronous,
> with no dependency on anything gdbserver did. Because there were no other
> threads, there were no thread exit events, and thus there was no
> synchronization with the leader passing through the zombie state and the
> exec completing. If there had been more threads, the leader would remain
> in the zombie state until they were waited for. In the two-thread case,
> sometimes the leader exit was detected and sometimes it wasn't. (Recall
> that check_zombie_leaders is polling the state, via
> linux_proc_pid_is_zombie. The race is between the leader thread passing
> through the zombie state and check_zombie_leaders testing for zombie
> state.) If leader exit wasn't detected, gdbserver would end up with a
> dangling lwp entry that didn't correspond to any real lwp, and would hang
> waiting for that lwp to stop. Using PTRACE_EVENT_EXIT guarantees that the
> leader exit will be detected.
>
> Note that check_zombie_leaders works just fine for the scenarios where the
> leader thread exits and the other threads continue to run, with no exec
> calls. It is required for systems that don't support the extended ptrace
> events.
>
> The sequence of events resulting in the race condition was this:
>
> 1) In the program, a CLONE event for a new thread occurs.
>
> 2) In the program, both threads are resumed once gdbserver has
> completed the new thread processing.
>
> 3) In gdbserver, the function linux_wait_for_event_filtered loops until
> waitpid returns "no more events" for the SIGCHLD generated by the
> CLONE event. Then linux_wait_for_event_filtered calls
> check_zombie_leaders.
>
> 4) In the program, the new thread is doing the exec. During the exec
> the leader thread will pass through a transitory zombie state. If
> there were more than two threads, the leader thread would remain a
> zombie until all the non-leader, non-exec'ing threads were reaped by
> gdbserver. Since there are no such threads to reap, the leader just
> becomes a zombie and is replaced by the exec'ing thread on-the-fly.
> (Note that it appears that the leader thread is a zombie just for a
> very brief instant.)
>
> 5) In gdbserver, check_zombie_leaders checks whether an lwp entry
> corresponds to a zombie leader thread, and if so, deletes it. Here
> is the race: in (4) above, the leader may or may not be in the
> transitory zombie state. In the case where a zombie isn't detected,
> delete_lwp is not called.
>
> 6) In gdbserver, an EXEC event is detected and processed. When it gets
> ready to report the event to GDB, it calls stop_all_lwps, which sends
> a SIGSTOP to each lwp in the list and the waits until all the lwps in
> the list have reported a stop event. If the zombie leader wasn't
> detected and processed in step (5), gdbserver blocks forever in
> linux_wait_for_event_filtered, waiting for the undeleted lwp to be
> stopped, which will never happen.
This is easy to see with thread-execl.exp, if we revert the PTRACE_EVENT_EXIT
parts:
linux_low_filter_event: pc is 0x3b37209237
pc is 0x3b37209237
stop pc is 0x3b37209237
HEW: Got exec event from LWP 2721
Hit a non-gdbserver trap event.
>>>> entering stop_all_lwps
stop_all_lwps (stop, except=none)
Sending sigstop to lwp 23192
wait_for_sigstop: pulling events
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(0), -1
LWFE: waitpid(-1, ...) returned -1, No child processes
leader_pid=2721, leader_lp!=NULL=1, num_lwps=2, zombie=0
sigsuspend'ing
*** hang ***
But I'm having trouble convincing myself that we do need
the PTRACE_EVENT_EXIT handling.
This seems to work for me just as well. The fix is in
making check_zombie_leaders ignore stopped threads.
@@ -1629,7 +1597,7 @@ check_zombie_leaders (void)
leader_pid, leader_lp!= NULL, num_lwps (leader_pid),
linux_proc_pid_is_zombie (leader_pid));
- if (leader_lp != NULL
+ if (leader_lp != NULL && !leader_lp->stopped
But I'm in a hurry to leave now. Maybe I'm missing something else.
From ab07ee8afc3c2358cd6b23474655fde087e2b36e Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Thu, 13 Nov 2014 17:53:43 +0000
Subject: [PATCH] revert PTRACE_EVENT_EXIT bits
---
gdb/gdbserver/linux-low.c | 56 +++--------------------------------------------
1 file changed, 3 insertions(+), 53 deletions(-)
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index f308333..316b302 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -538,38 +538,6 @@ handle_extended_wait (struct lwp_info *event_child, int *wstatp)
event_child->waitstatus.kind = TARGET_WAITKIND_VFORK_DONE;
return 0;
}
- else if (event == PTRACE_EVENT_EXIT)
- {
- unsigned long exit_status;
- unsigned long lwpid = lwpid_of (event_thr);
- int ret;
-
- if (debug_threads)
- debug_printf ("HEW: Got exit event from LWP %ld\n", lwpid );
-
- ptrace (PTRACE_GETEVENTMSG, lwpid, (PTRACE_TYPE_ARG3) 0, &exit_status);
-
- if (num_lwps (pid_of (event_thr)) > 1)
- {
- /* If there is at least one more LWP, then the program still
- exists and the exit should not be reported to GDB. */
- delete_lwp (event_child);
- ret = 1;
- }
- else
- {
- /* Set the exit status to the actual exit status, so normal
- WIFEXITED/WIFSIGNALLED processing and reporting for the
- last lwp in the process can proceed from here. */
- *wstatp = exit_status;
- ret = 0;
- }
-
- /* Resume the thread so that it actually exits. Subsequent exit
- events for LWPs that were deleted above will be ignored. */
- ptrace (PTRACE_CONT, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
- return ret;
- }
else if (event == PTRACE_EVENT_EXEC)
{
if (debug_threads)
@@ -1629,7 +1597,7 @@ check_zombie_leaders (void)
leader_pid, leader_lp!= NULL, num_lwps (leader_pid),
linux_proc_pid_is_zombie (leader_pid));
- if (leader_lp != NULL
+ if (leader_lp != NULL && !leader_lp->stopped
/* Check if there are other threads in the group, as we may
have raced with the inferior simply exiting. */
&& !last_thread_of_process_p (leader_pid)
@@ -2107,17 +2075,6 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int *wstatp)
child = add_lwp (child_ptid);
child->stopped = 1;
current_thread = child->thread;
-
- if (non_stop && stopping_threads == NOT_STOPPING_THREADS)
- {
- /* Make sure we delete the lwp entry for the exec'ing thread,
- which will have vanished. We do this by sending a signal
- to all the other threads in the lwp list, deleting any
- that are not found. Note that in all-stop mode this will
- happen before reporting the event. */
- stop_all_lwps (0, child);
- unstop_all_lwps (0, child);
- }
}
/* If we didn't find a process, one of two things presumably happened:
@@ -2166,8 +2123,7 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int *wstatp)
architecture being defined already (so that CHILD has a valid
regcache), and on LAST_STATUS being set (to check for SIGTRAP or
not). */
- if (WIFSTOPPED (wstat)
- && (linux_ptrace_get_extended_event (wstat) != PTRACE_EVENT_EXIT))
+ if (WIFSTOPPED (wstat))
{
if (debug_threads
&& the_low_target.get_pc != NULL)
@@ -3486,8 +3442,6 @@ send_sigstop (struct lwp_info *lwp)
calling exec. See comments in linux_low_filter_event regarding
PTRACE_EVENT_EXEC. */
delete_lwp (lwp);
- set_desired_thread (0);
-
if (debug_threads)
debug_printf ("send_sigstop: lwp %d has vanished\n", pid);
}
@@ -5565,10 +5519,7 @@ linux_supports_vfork_events (void)
static int
linux_supports_exec_events (void)
{
- /* Check for PTRACE_O_TRACEEXIT, since our implementation of follow
- exec depends on this option, which was implemented in a later
- kernel version than PTRACE_O_TRACEFORK et al. */
- return linux_supports_traceexit ();
+ return linux_supports_tracefork ();
}
static int
@@ -6588,7 +6539,6 @@ initialize_low (void)
linux_ptrace_set_requested_options (PTRACE_O_TRACEFORK
| PTRACE_O_TRACEVFORK
| PTRACE_O_TRACEVFORKDONE
- | PTRACE_O_TRACEEXIT
| PTRACE_O_TRACEEXEC);
linux_ptrace_check_options ();
}
--
1.9.3
next prev parent reply other threads:[~2014-11-13 19:18 UTC|newest]
Thread overview: 109+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-07 18:00 [PATCH 00/10] Linux extended-remote fork events Don Breazeal
2014-08-07 18:00 ` [PATCH 06/10] Extended-remote follow fork Don Breazeal
2014-08-07 18:00 ` [PATCH 04/10] Enhance extended ptrace event setup Don Breazeal
2014-08-13 17:50 ` Breazeal, Don
2014-08-07 18:00 ` [PATCH 05/10] GDBserver clone breakpoint list Don Breazeal
2014-08-07 18:00 ` [PATCH 01/10] Refactor native follow-fork Don Breazeal
2014-08-07 18:00 ` [PATCH 07/10] Extended-remote arch-specific follow fork Don Breazeal
2014-08-07 18:00 ` [PATCH 02/10] Refactor follow-fork message printing Don Breazeal
2014-08-07 18:00 ` [PATCH 03/10] Refactor extended ptrace event status Don Breazeal
2014-08-07 18:01 ` [PATCH 08/10] Extended-remote follow vfork Don Breazeal
2014-08-07 18:01 ` [PATCH 09/10] Extended-remote fork catchpoints Don Breazeal
2014-08-07 18:01 ` [PATCH 10/10] Extended-remote fork event documentation Don Breazeal
[not found] ` <83tx5om6zs.fsf@gnu.org>
2014-08-08 15:35 ` Breazeal, Don
2014-08-21 0:29 ` [PATCH 01/16 v2] Refactor native follow-fork Don Breazeal
2014-09-05 14:20 ` Pedro Alves
2014-09-05 18:56 ` Breazeal, Don
2014-09-05 20:20 ` Breazeal, Don
2014-09-09 10:57 ` Pedro Alves
2014-09-08 23:54 ` Breazeal, Don
2014-09-09 11:09 ` Pedro Alves
2014-09-12 16:50 ` Breazeal, Don
2014-09-22 15:53 ` Breazeal, Don
2014-09-26 18:13 ` Pedro Alves
2014-09-29 18:08 ` Breazeal, Don
2014-09-30 10:56 ` Pedro Alves
2014-09-30 18:43 ` Breazeal, Don
2014-08-21 0:29 ` [Patch 00/16 v2] Linux extended-remote fork and exec events Don Breazeal
2014-09-04 20:57 ` Breazeal, Don
2014-10-31 23:29 ` [PATCH 06/16 v3] Extended-remote Linux follow fork Don Breazeal
2014-11-13 13:00 ` Pedro Alves
2014-11-13 18:53 ` Breazeal, Don
2014-11-13 18:59 ` Pedro Alves
2014-11-13 19:06 ` Breazeal, Don
2014-12-06 0:31 ` Breazeal, Don
2015-01-23 12:53 ` Pedro Alves
2015-01-23 17:18 ` Breazeal, Don
[not found] ` <1422222420-25421-1-git-send-email-donb@codesourcery.com>
2015-01-25 21:49 ` [PATCH v4 5/7] Arch-specific remote " Don Breazeal
2015-02-10 16:37 ` Pedro Alves
2015-01-25 21:49 ` [PATCH v4 6/7] Remote follow vfork Don Breazeal
2015-02-10 16:39 ` Pedro Alves
2015-01-25 21:50 ` [PATCH v4 2/7] Clone remote breakpoints Don Breazeal
2015-01-25 21:50 ` [PATCH v4 1/7] Identify remote fork event support Don Breazeal
2015-02-10 16:34 ` Pedro Alves
2015-01-25 21:58 ` [PATCH v4 7/7] Remote fork catch Don Breazeal
2015-01-26 0:07 ` [PATCH v4 3/7 v3] Extended-remote Linux follow fork Don Breazeal
2015-02-10 16:36 ` Pedro Alves
2015-01-26 0:20 ` [PATCH v4 4/7] Target remote " Don Breazeal
2015-01-12 22:39 ` [PATCH 06/16 v3] Extended-remote Linux " Don Breazeal
2015-01-12 22:49 ` Breazeal, Don
2014-10-31 23:29 ` [PATCH 05/16 v3] GDBserver clone breakpoint list Don Breazeal
2014-10-31 23:29 ` [PATCH 08/16 v3] Extended-remote follow vfork Don Breazeal
2014-10-31 23:29 ` [PATCH 00/16 v3] Linux extended-remote fork and exec events Don Breazeal
2014-11-12 15:54 ` Pedro Alves
2014-11-13 13:41 ` Pedro Alves
2014-11-13 13:51 ` Pedro Alves
2014-11-13 14:58 ` Pedro Alves
2014-11-13 19:14 ` Pedro Alves
2014-10-31 23:29 ` [PATCH 04/16 v3] Determine supported extended-remote features Don Breazeal
2014-11-13 12:59 ` Pedro Alves
2014-11-13 18:28 ` Breazeal, Don
2014-11-13 18:33 ` Pedro Alves
2014-11-13 19:08 ` Pedro Alves
2014-11-13 18:37 ` Breazeal, Don
2014-11-13 18:48 ` Pedro Alves
2014-12-06 0:30 ` Breazeal, Don
2015-01-12 22:36 ` Don Breazeal
2015-01-21 21:02 ` Breazeal, Don
2014-10-31 23:29 ` [PATCH 07/16 v3] Extended-remote arch-specific follow fork Don Breazeal
2014-10-31 23:30 ` [PATCH 12/16 v3] Extended-remote follow exec Don Breazeal
2014-10-31 23:30 ` [PATCH 09/16 v3] Extended-remote fork catchpoints Don Breazeal
2014-10-31 23:30 ` [PATCH 13/16 v3] Extended-remote exec catchpoints Don Breazeal
2014-10-31 23:30 ` [PATCH 10/16 v3] Extended-remote fork event documentation Don Breazeal
2014-10-31 23:30 ` [PATCH 11/16 v3] Extended-remote Linux exit events Don Breazeal
2014-11-13 19:18 ` Pedro Alves [this message]
2014-10-31 23:31 ` [PATCH 16/16 v3] Non-stop follow exec tests Don Breazeal
2014-10-31 23:31 ` [PATCH 14/16 v3] Suppress spurious warnings with extended-remote follow exec Don Breazeal
2014-10-31 23:31 ` [PATCH 15/16 v3] Extended-remote exec event documentation Don Breazeal
2014-08-21 0:30 ` [PATCH 02/16 v2] Refactor follow-fork message printing Don Breazeal
2014-09-26 19:52 ` Pedro Alves
2014-09-26 20:14 ` Breazeal, Don
2014-10-03 23:51 ` Breazeal, Don
2014-10-15 16:08 ` Pedro Alves
2014-10-22 23:47 ` Breazeal, Don
2014-10-24 12:35 ` Pedro Alves
2014-10-24 18:45 ` Breazeal, Don
2014-08-21 0:30 ` [PATCH 03/16 v2] Refactor ptrace extended event status Don Breazeal
2014-09-09 11:31 ` Pedro Alves
2014-09-19 18:14 ` [pushed] " Breazeal, Don
2014-08-21 0:30 ` [PATCH 04/16 v2] Determine supported extended-remote features Don Breazeal
2014-10-15 16:17 ` Pedro Alves
2014-10-21 23:23 ` Breazeal, Don
2014-10-22 21:48 ` Pedro Alves
2014-10-31 23:38 ` Breazeal, Don
2014-08-21 0:31 ` [PATCH 07/16 v2] Extended-remote arch-specific follow fork Don Breazeal
2014-09-19 21:26 ` Breazeal, Don
2014-08-21 0:31 ` [PATCH 06/16 v2] Extended-remote Linux " Don Breazeal
2014-09-19 20:57 ` Breazeal, Don
2014-08-21 0:31 ` [PATCH 05/16 v2] GDBserver clone breakpoint list Don Breazeal
2014-10-15 17:40 ` Pedro Alves
2014-10-31 23:44 ` Breazeal, Don
2014-08-21 0:32 ` [PATCH 08/16 v2] Extended-remote follow vfork Don Breazeal
2014-08-21 0:33 ` [PATCH 10/16 v2] Extended-remote fork event documentation Don Breazeal
2014-08-21 0:33 ` [PATCH 11/16 v2] Extended-remote Linux exit events Don Breazeal
2014-08-21 0:33 ` [PATCH 09/16 v2] Extended-remote fork catchpoints Don Breazeal
2014-08-21 0:34 ` [PATCH 12/16 v2] Extended-remote follow exec Don Breazeal
2014-08-21 0:34 ` [PATCH 13/16 v2] Extended-remote exec catchpoints Don Breazeal
2014-08-21 0:35 ` [PATCH 14/16 v2] Suppress spurious warnings with extended-remote follow exec Don Breazeal
2014-08-21 0:36 ` [PATCH 15/16 v2] Extended-remote exec event documentation Don Breazeal
2014-08-21 0:36 ` [PATCH 16/16 v2] Non-stop follow exec tests Don Breazeal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54650415.9020705@redhat.com \
--to=palves@redhat.com \
--cc=donb@codesourcery.com \
--cc=gdb-patches@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox