Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Pedro Alves <palves@redhat.com>
To: Don Breazeal <donb@codesourcery.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 11/16 v3] Extended-remote Linux exit events
Date: Thu, 13 Nov 2014 19:18:00 -0000	[thread overview]
Message-ID: <54650415.9020705@redhat.com> (raw)
In-Reply-To: <1414798134-11536-9-git-send-email-donb@codesourcery.com>

On 10/31/2014 11:28 PM, Don Breazeal wrote:

> 
> Now for the long answer: in my testing I ran into a race condition in
> check_zombie_leaders, which detects when a thread group leader has exited
> and other threads still exist.  On the Linux kernel, ptrace/waitpid don't
> allow reaping the leader thread until all other threads in the group are
> reaped.  When the leader exits, it goes zombie, but waitpid will not return
> an exit status until the other threads are gone.  When a non-leader thread
> calls exec, all other non-leader threads are destroyed, the leader becomes
> a zombie, and once the "other" threads have been reaped, the execing thread
> takes over the leader's pid (tgid) and appears to vanish.  In order to
> handle this situation in the current implementation, check_zombie_leaders
> polls the process state in /proc and deletes thread group leaders that are
> in a zombie state.  The replacement is added to the lwp list when the exec
> event is reported.
> 
> See https://sourceware.org/ml/gdb-patches/2011-10/msg00704.html for a more
> detailed explanation of how this works.
> 
> Here is the relevant part of check_zombie_leaders:
> 
> if (leader_lp != NULL
>           /* Check if there are other threads in the group, as we may
>              have raced with the inferior simply exiting.  */
>           && !last_thread_of_process_p (leader_pid)
>           && linux_proc_pid_is_zombie (leader_pid))
>         {
>           /* ...large informative comment block... */
>           delete_lwp (leader_lp);
> 
> The race occurred when there were two threads in the program, and the
> non-leader thread called exec.  In this case the leader thread passed
> through a very brief zombie state before being replaced by the exec'ing
> thread as the thread group leader.  This state transition was asynchronous,
> with no dependency on anything gdbserver did.  Because there were no other
> threads, there were no thread exit events, and thus there was no
> synchronization with the leader passing through the zombie state and the
> exec completing.  If there had been more threads, the leader would remain
> in the zombie state until they were waited for.  In the two-thread case,
> sometimes the leader exit was detected and sometimes it wasn't. (Recall
> that check_zombie_leaders is polling the state, via
> linux_proc_pid_is_zombie.  The race is between the leader thread passing
> through the zombie state and check_zombie_leaders testing for zombie
> state.)  If leader exit wasn't detected, gdbserver would end up with a
> dangling lwp entry that didn't correspond to any real lwp, and would hang
> waiting for that lwp to stop.  Using PTRACE_EVENT_EXIT guarantees that the
> leader exit will be detected.
> 
> Note that check_zombie_leaders works just fine for the scenarios where the
> leader thread exits and the other threads continue to run, with no exec
> calls.  It is required for systems that don't support the extended ptrace
> events.
> 
> The sequence of events resulting in the race condition was this:
> 
>     1) In the program, a CLONE event for a new thread occurs.
> 
>     2) In the program, both threads are resumed once gdbserver has
>        completed the new thread processing.
> 
>     3) In gdbserver, the function linux_wait_for_event_filtered loops until
>        waitpid returns "no more events" for the SIGCHLD generated by the
>        CLONE event.  Then linux_wait_for_event_filtered calls
>        check_zombie_leaders.
> 
>     4) In the program, the new thread is doing the exec.  During the exec
>        the leader thread will pass through a transitory zombie state.  If
>        there were more than two threads, the leader thread would remain a
>        zombie until all the non-leader, non-exec'ing threads were reaped by
>        gdbserver.  Since there are no such threads to reap, the leader just
>        becomes a zombie and is replaced by the exec'ing thread on-the-fly.
>        (Note that it appears that the leader thread is a zombie just for a
>        very brief instant.)
> 
>     5) In gdbserver, check_zombie_leaders checks whether an lwp entry
>        corresponds to a zombie leader thread, and if so, deletes it.  Here
>        is the race: in (4) above, the leader may or may not be in the
>        transitory zombie state.  In the case where a zombie isn't detected,
>        delete_lwp is not called.
> 
>     6) In gdbserver, an EXEC event is detected and processed.  When it gets
>        ready to report the event to GDB, it calls stop_all_lwps, which sends
>        a SIGSTOP to each lwp in the list and the waits until all the lwps in
>        the list have reported a stop event.  If the zombie leader wasn't
>        detected and processed in step (5), gdbserver blocks forever in
>        linux_wait_for_event_filtered, waiting for the undeleted lwp to be
>        stopped, which will never happen.

This is easy to see with thread-execl.exp, if we revert the PTRACE_EVENT_EXIT
parts:

linux_low_filter_event: pc is 0x3b37209237
pc is 0x3b37209237
stop pc is 0x3b37209237
HEW: Got exec event from LWP 2721
Hit a non-gdbserver trap event.
>>>> entering stop_all_lwps
stop_all_lwps (stop, except=none)
Sending sigstop to lwp 23192
wait_for_sigstop: pulling events
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(0), -1
LWFE: waitpid(-1, ...) returned -1, No child processes
leader_pid=2721, leader_lp!=NULL=1, num_lwps=2, zombie=0
sigsuspend'ing
*** hang ***

But I'm having trouble convincing myself that we do need
the PTRACE_EVENT_EXIT handling.

This seems to work for me just as well.  The fix is in
making check_zombie_leaders ignore stopped threads.

  @@ -1629,7 +1597,7 @@ check_zombie_leaders (void)
 		      leader_pid, leader_lp!= NULL, num_lwps (leader_pid),
 		      linux_proc_pid_is_zombie (leader_pid));

  -      if (leader_lp != NULL
  +      if (leader_lp != NULL && !leader_lp->stopped

But I'm in a hurry to leave now.  Maybe I'm missing something else.

From ab07ee8afc3c2358cd6b23474655fde087e2b36e Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Thu, 13 Nov 2014 17:53:43 +0000
Subject: [PATCH] revert PTRACE_EVENT_EXIT bits

---
 gdb/gdbserver/linux-low.c | 56 +++--------------------------------------------
 1 file changed, 3 insertions(+), 53 deletions(-)

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index f308333..316b302 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -538,38 +538,6 @@ handle_extended_wait (struct lwp_info *event_child, int *wstatp)
       event_child->waitstatus.kind = TARGET_WAITKIND_VFORK_DONE;
       return 0;
     }
-  else if (event == PTRACE_EVENT_EXIT)
-    {
-      unsigned long exit_status;
-      unsigned long lwpid = lwpid_of (event_thr);
-      int ret;
-
-      if (debug_threads)
-	debug_printf ("HEW: Got exit event from LWP %ld\n", lwpid );
-
-      ptrace (PTRACE_GETEVENTMSG, lwpid, (PTRACE_TYPE_ARG3) 0, &exit_status);
-
-      if (num_lwps (pid_of (event_thr)) > 1)
-	{
-	  /* If there is at least one more LWP, then the program still
-	     exists and the exit should not be reported to GDB.  */
-	  delete_lwp (event_child);
-	  ret = 1;
-	}
-      else
-	{
-	  /* Set the exit status to the actual exit status, so normal
-	     WIFEXITED/WIFSIGNALLED processing and reporting for the
-	     last lwp in the process can proceed from here.  */
-	  *wstatp = exit_status;
-	  ret = 0;
-	}
-
-      /* Resume the thread so that it actually exits.  Subsequent exit
-	 events for LWPs that were deleted above will be ignored.  */
-      ptrace (PTRACE_CONT, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
-      return ret;
-    }
   else if (event == PTRACE_EVENT_EXEC)
     {
       if (debug_threads)
@@ -1629,7 +1597,7 @@ check_zombie_leaders (void)
 		      leader_pid, leader_lp!= NULL, num_lwps (leader_pid),
 		      linux_proc_pid_is_zombie (leader_pid));

-      if (leader_lp != NULL
+      if (leader_lp != NULL && !leader_lp->stopped
 	  /* Check if there are other threads in the group, as we may
 	     have raced with the inferior simply exiting.  */
 	  && !last_thread_of_process_p (leader_pid)
@@ -2107,17 +2075,6 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int *wstatp)
       child = add_lwp (child_ptid);
       child->stopped = 1;
       current_thread = child->thread;
-
-      if (non_stop && stopping_threads == NOT_STOPPING_THREADS)
-	{
-	  /* Make sure we delete the lwp entry for the exec'ing thread,
-	     which will have vanished.  We do this by sending a signal
-	     to all the other threads in the lwp list, deleting any
-	     that are not found.  Note that in all-stop mode this will
-	     happen before reporting the event.  */
-	  stop_all_lwps (0, child);
-	  unstop_all_lwps (0, child);
-	}
     }

   /* If we didn't find a process, one of two things presumably happened:
@@ -2166,8 +2123,7 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int *wstatp)
      architecture being defined already (so that CHILD has a valid
      regcache), and on LAST_STATUS being set (to check for SIGTRAP or
      not).  */
-  if (WIFSTOPPED (wstat)
-      && (linux_ptrace_get_extended_event (wstat) != PTRACE_EVENT_EXIT))
+  if (WIFSTOPPED (wstat))
     {
       if (debug_threads
 	  && the_low_target.get_pc != NULL)
@@ -3486,8 +3442,6 @@ send_sigstop (struct lwp_info *lwp)
 	 calling exec.  See comments in linux_low_filter_event regarding
 	 PTRACE_EVENT_EXEC.  */
       delete_lwp (lwp);
-      set_desired_thread (0);
-
       if (debug_threads)
 	debug_printf ("send_sigstop: lwp %d has vanished\n", pid);
     }
@@ -5565,10 +5519,7 @@ linux_supports_vfork_events (void)
 static int
 linux_supports_exec_events (void)
 {
-  /* Check for PTRACE_O_TRACEEXIT, since our implementation of follow
-     exec depends on this option, which was implemented in a later
-     kernel version than PTRACE_O_TRACEFORK et al.  */
-  return linux_supports_traceexit ();
+  return linux_supports_tracefork ();
 }

 static int
@@ -6588,7 +6539,6 @@ initialize_low (void)
   linux_ptrace_set_requested_options (PTRACE_O_TRACEFORK
 				      | PTRACE_O_TRACEVFORK
 				      | PTRACE_O_TRACEVFORKDONE
-				      | PTRACE_O_TRACEEXIT
 				      | PTRACE_O_TRACEEXEC);
   linux_ptrace_check_options ();
 }
-- 
1.9.3



  reply	other threads:[~2014-11-13 19:18 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-07 18:00 [PATCH 00/10] Linux extended-remote fork events Don Breazeal
2014-08-07 18:00 ` [PATCH 06/10] Extended-remote follow fork Don Breazeal
2014-08-07 18:00 ` [PATCH 04/10] Enhance extended ptrace event setup Don Breazeal
2014-08-13 17:50   ` Breazeal, Don
2014-08-07 18:00 ` [PATCH 05/10] GDBserver clone breakpoint list Don Breazeal
2014-08-07 18:00 ` [PATCH 01/10] Refactor native follow-fork Don Breazeal
2014-08-07 18:00 ` [PATCH 07/10] Extended-remote arch-specific follow fork Don Breazeal
2014-08-07 18:00 ` [PATCH 02/10] Refactor follow-fork message printing Don Breazeal
2014-08-07 18:00 ` [PATCH 03/10] Refactor extended ptrace event status Don Breazeal
2014-08-07 18:01 ` [PATCH 08/10] Extended-remote follow vfork Don Breazeal
2014-08-07 18:01 ` [PATCH 09/10] Extended-remote fork catchpoints Don Breazeal
2014-08-07 18:01 ` [PATCH 10/10] Extended-remote fork event documentation Don Breazeal
     [not found]   ` <83tx5om6zs.fsf@gnu.org>
2014-08-08 15:35     ` Breazeal, Don
2014-08-21  0:29 ` [PATCH 01/16 v2] Refactor native follow-fork Don Breazeal
2014-09-05 14:20   ` Pedro Alves
2014-09-05 18:56     ` Breazeal, Don
2014-09-05 20:20       ` Breazeal, Don
2014-09-09 10:57       ` Pedro Alves
2014-09-08 23:54     ` Breazeal, Don
2014-09-09 11:09       ` Pedro Alves
2014-09-12 16:50         ` Breazeal, Don
2014-09-22 15:53           ` Breazeal, Don
2014-09-26 18:13           ` Pedro Alves
2014-09-29 18:08             ` Breazeal, Don
2014-09-30 10:56               ` Pedro Alves
2014-09-30 18:43                 ` Breazeal, Don
2014-08-21  0:29 ` [Patch 00/16 v2] Linux extended-remote fork and exec events Don Breazeal
2014-09-04 20:57   ` Breazeal, Don
2014-10-31 23:29   ` [PATCH 06/16 v3] Extended-remote Linux follow fork Don Breazeal
2014-11-13 13:00     ` Pedro Alves
2014-11-13 18:53       ` Breazeal, Don
2014-11-13 18:59         ` Pedro Alves
2014-11-13 19:06           ` Breazeal, Don
2014-12-06  0:31             ` Breazeal, Don
2015-01-23 12:53               ` Pedro Alves
2015-01-23 17:18                 ` Breazeal, Don
     [not found]                 ` <1422222420-25421-1-git-send-email-donb@codesourcery.com>
2015-01-25 21:49                   ` [PATCH v4 5/7] Arch-specific remote " Don Breazeal
2015-02-10 16:37                     ` Pedro Alves
2015-01-25 21:49                   ` [PATCH v4 6/7] Remote follow vfork Don Breazeal
2015-02-10 16:39                     ` Pedro Alves
2015-01-25 21:50                   ` [PATCH v4 2/7] Clone remote breakpoints Don Breazeal
2015-01-25 21:50                   ` [PATCH v4 1/7] Identify remote fork event support Don Breazeal
2015-02-10 16:34                     ` Pedro Alves
2015-01-25 21:58                   ` [PATCH v4 7/7] Remote fork catch Don Breazeal
2015-01-26  0:07                   ` [PATCH v4 3/7 v3] Extended-remote Linux follow fork Don Breazeal
2015-02-10 16:36                     ` Pedro Alves
2015-01-26  0:20                   ` [PATCH v4 4/7] Target remote " Don Breazeal
2015-01-12 22:39             ` [PATCH 06/16 v3] Extended-remote Linux " Don Breazeal
2015-01-12 22:49               ` Breazeal, Don
2014-10-31 23:29   ` [PATCH 05/16 v3] GDBserver clone breakpoint list Don Breazeal
2014-10-31 23:29   ` [PATCH 08/16 v3] Extended-remote follow vfork Don Breazeal
2014-10-31 23:29   ` [PATCH 00/16 v3] Linux extended-remote fork and exec events Don Breazeal
2014-11-12 15:54     ` Pedro Alves
2014-11-13 13:41     ` Pedro Alves
2014-11-13 13:51       ` Pedro Alves
2014-11-13 14:58         ` Pedro Alves
2014-11-13 19:14     ` Pedro Alves
2014-10-31 23:29   ` [PATCH 04/16 v3] Determine supported extended-remote features Don Breazeal
2014-11-13 12:59     ` Pedro Alves
2014-11-13 18:28       ` Breazeal, Don
2014-11-13 18:33         ` Pedro Alves
2014-11-13 19:08           ` Pedro Alves
2014-11-13 18:37         ` Breazeal, Don
2014-11-13 18:48           ` Pedro Alves
2014-12-06  0:30             ` Breazeal, Don
2015-01-12 22:36           ` Don Breazeal
2015-01-21 21:02             ` Breazeal, Don
2014-10-31 23:29   ` [PATCH 07/16 v3] Extended-remote arch-specific follow fork Don Breazeal
2014-10-31 23:30   ` [PATCH 12/16 v3] Extended-remote follow exec Don Breazeal
2014-10-31 23:30   ` [PATCH 09/16 v3] Extended-remote fork catchpoints Don Breazeal
2014-10-31 23:30   ` [PATCH 13/16 v3] Extended-remote exec catchpoints Don Breazeal
2014-10-31 23:30   ` [PATCH 10/16 v3] Extended-remote fork event documentation Don Breazeal
2014-10-31 23:30   ` [PATCH 11/16 v3] Extended-remote Linux exit events Don Breazeal
2014-11-13 19:18     ` Pedro Alves [this message]
2014-10-31 23:31   ` [PATCH 16/16 v3] Non-stop follow exec tests Don Breazeal
2014-10-31 23:31   ` [PATCH 14/16 v3] Suppress spurious warnings with extended-remote follow exec Don Breazeal
2014-10-31 23:31   ` [PATCH 15/16 v3] Extended-remote exec event documentation Don Breazeal
2014-08-21  0:30 ` [PATCH 02/16 v2] Refactor follow-fork message printing Don Breazeal
2014-09-26 19:52   ` Pedro Alves
2014-09-26 20:14     ` Breazeal, Don
2014-10-03 23:51       ` Breazeal, Don
2014-10-15 16:08       ` Pedro Alves
2014-10-22 23:47         ` Breazeal, Don
2014-10-24 12:35           ` Pedro Alves
2014-10-24 18:45             ` Breazeal, Don
2014-08-21  0:30 ` [PATCH 03/16 v2] Refactor ptrace extended event status Don Breazeal
2014-09-09 11:31   ` Pedro Alves
2014-09-19 18:14     ` [pushed] " Breazeal, Don
2014-08-21  0:30 ` [PATCH 04/16 v2] Determine supported extended-remote features Don Breazeal
2014-10-15 16:17   ` Pedro Alves
2014-10-21 23:23     ` Breazeal, Don
2014-10-22 21:48       ` Pedro Alves
2014-10-31 23:38         ` Breazeal, Don
2014-08-21  0:31 ` [PATCH 07/16 v2] Extended-remote arch-specific follow fork Don Breazeal
2014-09-19 21:26   ` Breazeal, Don
2014-08-21  0:31 ` [PATCH 06/16 v2] Extended-remote Linux " Don Breazeal
2014-09-19 20:57   ` Breazeal, Don
2014-08-21  0:31 ` [PATCH 05/16 v2] GDBserver clone breakpoint list Don Breazeal
2014-10-15 17:40   ` Pedro Alves
2014-10-31 23:44     ` Breazeal, Don
2014-08-21  0:32 ` [PATCH 08/16 v2] Extended-remote follow vfork Don Breazeal
2014-08-21  0:33 ` [PATCH 10/16 v2] Extended-remote fork event documentation Don Breazeal
2014-08-21  0:33 ` [PATCH 11/16 v2] Extended-remote Linux exit events Don Breazeal
2014-08-21  0:33 ` [PATCH 09/16 v2] Extended-remote fork catchpoints Don Breazeal
2014-08-21  0:34 ` [PATCH 12/16 v2] Extended-remote follow exec Don Breazeal
2014-08-21  0:34 ` [PATCH 13/16 v2] Extended-remote exec catchpoints Don Breazeal
2014-08-21  0:35 ` [PATCH 14/16 v2] Suppress spurious warnings with extended-remote follow exec Don Breazeal
2014-08-21  0:36 ` [PATCH 15/16 v2] Extended-remote exec event documentation Don Breazeal
2014-08-21  0:36 ` [PATCH 16/16 v2] Non-stop follow exec tests Don Breazeal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54650415.9020705@redhat.com \
    --to=palves@redhat.com \
    --cc=donb@codesourcery.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox