Re: [PATCH] linux-nat.c: no need to block child signals so aggressively.

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Luis Machado <lgustavo@codesourcery.com>
To: Pedro Alves <palves@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] linux-nat.c: no need to block child signals so aggressively.
Date: Wed, 14 Aug 2013 18:50:00 -0000	[thread overview]
Message-ID: <520BD162.1020105@codesourcery.com> (raw)
In-Reply-To: <20130814182653.7136.72780.stgit@brno.lan>

On 08/14/2013 03:26 PM, Pedro Alves wrote:
> In http://sourceware.org/ml/gdb-patches/2013-08/msg00174.html , the
> issue of child signal handling around ptrace option support discovery
> being different between GDB and GDBserver came up.
>
> I recalled adding these block_child_signals calls, and the "We don't
> want those ptrace calls to be interrupted" comment, but not exactly
> why.  So I looked into it.  My first guess is that I got confused.
> The patch that added this
> <http://sourceware.org/ml/gdb-patches/2009-04/msg00125.html> rewrote
> the linux native async support completely, and the old async support
> code had the SIGCHLD handler itself do waitpid, so in places that we'd
> want a blocking waitpid, we'd have to have the signal handler blocked.
> That was probably the mindset I had at the time.  Anyway, whatever the
> case, looks like I was wrong on the need for this blocking.
> Given GDBserver doesn't block like this, I investigated why this is
> currently needed on GDB but not on GDBserver.
>
> I removed the block_child_signals (and restore) calls, and hacked
> linux-nat.c to call linux_test_for_tracefork in a loop.  Running the
> resulting GDB, I then saw bad things happening.  Specifically, I'd end
> up with a bunch of zombies, and eventually, the machine would grind to
> a half with insufficient resources.
>
> The issue is that linux_test_for_tracefork test forks, and has the
> child fork again.  If we don't block SIGCHLD on entry to the function,
> the children will inherit SIGCHLD's action/disposition (meaning,
> SIGCHLD will be unblocked in the child).  When the first child forks
> again a second child, and that child exits, the first child gets a
> SIGCHLD.  Now, when we try to wrap up for the whole options test, we
> kill the first child, and collect the waitstatus.  Here, when SIGCHLD
> isn't blocked, GDB will first see the child reporting a stop with
> SIGCHLD.  gdbserver's option test does a PTRACE_KILL loop at the end,
> which catches the SIGCHLD, and retries the kill.  The GDB version did
> not do that.  So the GDB version would proceed, leaving the child
> zombie, as nothing collected its final waitstatus.
>
> So this patch makes the GDB version of linux_test_for_tracefork do the
> exact same as the GDBserver version, removes all this unnecessary
> blocking throughout, and adds a couple comments at places that do need
> it -- namely: places where we'll use sleep with sigsuspend; and
> linux_async_pipe, as that destroys the pipe the signal handler
> touches.
>
> Tested on x86_64 Fedora 17, sync and async.
>
> gdb/
> 2013-08-14  Pedro Alves  <palves@redhat.com>
>
> 	* linux-nat.c (linux_test_for_tracefork)
> 	(linux_test_for_tracesysgood, linux_child_follow_fork)
> 	(lin_lwp_attach_lwp, linux_nat_resume): Don't block child signals.
> 	(linux_nat_wait_1): Extend comment.
> 	(linux_async_pipe): Add comment.
> ---
>   gdb/linux-nat.c |   59 +++++++++++++------------------------------------------
>   1 file changed, 14 insertions(+), 45 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 45a6e5f..db23433 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -395,20 +395,13 @@ linux_test_for_tracefork (int original_pid)
>   {
>     int child_pid, ret, status;
>     long second_pid;
> -  sigset_t prev_mask;
> -
> -  /* We don't want those ptrace calls to be interrupted.  */
> -  block_child_signals (&prev_mask);
>
>     linux_supports_tracefork_flag = 0;
>     linux_supports_tracevforkdone_flag = 0;
>
>     ret = ptrace (PTRACE_SETOPTIONS, original_pid, 0, PTRACE_O_TRACEFORK);
>     if (ret != 0)
> -    {
> -      restore_child_signals_mask (&prev_mask);
> -      return;
> -    }
> +    return;
>
>     child_pid = fork ();
>     if (child_pid == -1)
> @@ -433,7 +426,6 @@ linux_test_for_tracefork (int original_pid)
>         if (ret != 0)
>   	{
>   	  warning (_("linux_test_for_tracefork: failed to kill child"));
> -	  restore_child_signals_mask (&prev_mask);
>   	  return;
>   	}
>
> @@ -445,7 +437,6 @@ linux_test_for_tracefork (int original_pid)
>   	warning (_("linux_test_for_tracefork: unexpected "
>   		   "wait status 0x%x from killed child"), status);
>
> -      restore_child_signals_mask (&prev_mask);
>         return;
>       }
>
> @@ -482,12 +473,14 @@ linux_test_for_tracefork (int original_pid)
>       warning (_("linux_test_for_tracefork: unexpected result from waitpid "
>   	     "(%d, status 0x%x)"), ret, status);
>
> -  ret = ptrace (PTRACE_KILL, child_pid, 0, 0);
> -  if (ret != 0)
> -    warning (_("linux_test_for_tracefork: failed to kill child"));
> -  my_waitpid (child_pid, &status, 0);
> -
> -  restore_child_signals_mask (&prev_mask);
> +  do
> +    {
> +      ret = ptrace (PTRACE_KILL, child_pid, 0, 0);
> +      if (ret != 0)
> +	warning ("linux_test_for_tracefork: failed to kill child");
> +      my_waitpid (child_pid, &status, 0);
> +    }
> +  while (WIFSTOPPED (status));
>   }
>
>   /* Determine if PTRACE_O_TRACESYSGOOD can be used to follow syscalls.
> @@ -500,20 +493,14 @@ static void
>   linux_test_for_tracesysgood (int original_pid)
>   {
>     int ret;
> -  sigset_t prev_mask;
> -
> -  /* We don't want those ptrace calls to be interrupted.  */
> -  block_child_signals (&prev_mask);
>
>     linux_supports_tracesysgood_flag = 0;
>
>     ret = ptrace (PTRACE_SETOPTIONS, original_pid, 0, PTRACE_O_TRACESYSGOOD);
>     if (ret != 0)
> -    goto out;
> +    return;
>
>     linux_supports_tracesysgood_flag = 1;
> -out:
> -  restore_child_signals_mask (&prev_mask);
>   }
>
>   /* Determine wether we support PTRACE_O_TRACESYSGOOD option available.
> @@ -630,12 +617,9 @@ delete_lwp_cleanup (void *lp_voidp)
>   static int
>   linux_child_follow_fork (struct target_ops *ops, int follow_child)
>   {
> -  sigset_t prev_mask;
>     int has_vforked;
>     int parent_pid, child_pid;
>
> -  block_child_signals (&prev_mask);
> -
>     has_vforked = (inferior_thread ()->pending_follow.kind
>   		 == TARGET_WAITKIND_VFORKED);
>     parent_pid = ptid_get_lwp (inferior_ptid);
> @@ -954,7 +938,6 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>         check_for_thread_db ();
>       }
>
> -  restore_child_signals_mask (&prev_mask);
>     return 0;
>   }
>
> @@ -1427,13 +1410,10 @@ int
>   lin_lwp_attach_lwp (ptid_t ptid)
>   {
>     struct lwp_info *lp;
> -  sigset_t prev_mask;
>     int lwpid;
>
>     gdb_assert (is_lwp (ptid));
>
> -  block_child_signals (&prev_mask);
> -
>     lp = find_lwp_pid (ptid);
>     lwpid = GET_LWP (ptid);
>
> @@ -1461,7 +1441,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
>   		  /* We've already seen this thread stop, but we
>   		     haven't seen the PTRACE_EVENT_CLONE extended
>   		     event yet.  */
> -		  restore_child_signals_mask (&prev_mask);
>   		  return 0;
>   		}
>   	      else
> @@ -1478,8 +1457,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
>   		    {
>   		      if (WIFSTOPPED (status))
>   			add_to_pid_list (&stopped_pids, lwpid, status);
> -
> -		      restore_child_signals_mask (&prev_mask);
>   		      return 1;
>   		    }
>   		}
> @@ -1492,7 +1469,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
>   	     to create them.  */
>   	  warning (_("Can't attach %s: %s"), target_pid_to_str (ptid),
>   		   safe_strerror (errno));
> -	  restore_child_signals_mask (&prev_mask);
>   	  return -1;
>   	}
>
> @@ -1503,10 +1479,7 @@ lin_lwp_attach_lwp (ptid_t ptid)
>
>         status = linux_nat_post_attach_wait (ptid, 0, &cloned, &signalled);
>         if (!WIFSTOPPED (status))
> -	{
> -	  restore_child_signals_mask (&prev_mask);
> -	  return 1;
> -	}
> +	return 1;
>
>         lp = add_lwp (ptid);
>         lp->stopped = 1;
> @@ -1542,7 +1515,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
>       }
>
>     lp->last_resume_kind = resume_stop;
> -  restore_child_signals_mask (&prev_mask);
>     return 0;
>   }
>
> @@ -1975,7 +1947,6 @@ static void
>   linux_nat_resume (struct target_ops *ops,
>   		  ptid_t ptid, int step, enum gdb_signal signo)
>   {
> -  sigset_t prev_mask;
>     struct lwp_info *lp;
>     int resume_many;
>
> @@ -1988,8 +1959,6 @@ linux_nat_resume (struct target_ops *ops,
>   			 ? strsignal (gdb_signal_to_host (signo)) : "0"),
>   			target_pid_to_str (inferior_ptid));
>
> -  block_child_signals (&prev_mask);
> -
>     /* A specific PTID means `step only this process id'.  */
>     resume_many = (ptid_equal (minus_one_ptid, ptid)
>   		 || ptid_is_pid (ptid));
> @@ -2047,7 +2016,6 @@ linux_nat_resume (struct target_ops *ops,
>   			    "LLR: Short circuiting for status 0x%x\n",
>   			    lp->status);
>
> -      restore_child_signals_mask (&prev_mask);
>         if (target_can_async_p ())
>   	{
>   	  target_async (inferior_event_handler, 0);
> @@ -2080,7 +2048,6 @@ linux_nat_resume (struct target_ops *ops,
>   			(signo != GDB_SIGNAL_0
>   			 ? strsignal (gdb_signal_to_host (signo)) : "0"));
>
> -  restore_child_signals_mask (&prev_mask);
>     if (target_can_async_p ())
>       target_async (inferior_event_handler, 0);
>   }
> @@ -3477,7 +3444,7 @@ linux_nat_wait_1 (struct target_ops *ops,
>         lp->resumed = 1;
>       }
>
> -  /* Make sure SIGCHLD is blocked.  */
> +  /* Make sure SIGCHLD is blocked until the sigsuspend below.  */
>     block_child_signals (&prev_mask);
>
>   retry:
> @@ -4907,6 +4874,8 @@ linux_async_pipe (int enable)
>       {
>         sigset_t prev_mask;
>
> +      /* Block child signals while we create/destroy the pipe, as
> +	 their handler writes to it.  */
>         block_child_signals (&prev_mask);
>
>         if (enable)
>
>
>

Based on the explanation, it makes sense to me.

Thanks,
Luis

next prev parent reply	other threads:[~2013-08-14 18:50 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-14 18:26 Pedro Alves
2013-08-14 18:50 ` Luis Machado [this message]
2013-08-19 13:51   ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=520BD162.1020105@codesourcery.com \
    --to=lgustavo@codesourcery.com \
    --cc=gdb-patches@sourceware.org \
    --cc=palves@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox