From: Luis Machado <lgustavo@codesourcery.com>
To: Pedro Alves <palves@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] linux-nat.c: no need to block child signals so aggressively.
Date: Wed, 14 Aug 2013 18:50:00 -0000 [thread overview]
Message-ID: <520BD162.1020105@codesourcery.com> (raw)
In-Reply-To: <20130814182653.7136.72780.stgit@brno.lan>
On 08/14/2013 03:26 PM, Pedro Alves wrote:
> In http://sourceware.org/ml/gdb-patches/2013-08/msg00174.html , the
> issue of child signal handling around ptrace option support discovery
> being different between GDB and GDBserver came up.
>
> I recalled adding these block_child_signals calls, and the "We don't
> want those ptrace calls to be interrupted" comment, but not exactly
> why. So I looked into it. My first guess is that I got confused.
> The patch that added this
> <http://sourceware.org/ml/gdb-patches/2009-04/msg00125.html> rewrote
> the linux native async support completely, and the old async support
> code had the SIGCHLD handler itself do waitpid, so in places that we'd
> want a blocking waitpid, we'd have to have the signal handler blocked.
> That was probably the mindset I had at the time. Anyway, whatever the
> case, looks like I was wrong on the need for this blocking.
> Given GDBserver doesn't block like this, I investigated why this is
> currently needed on GDB but not on GDBserver.
>
> I removed the block_child_signals (and restore) calls, and hacked
> linux-nat.c to call linux_test_for_tracefork in a loop. Running the
> resulting GDB, I then saw bad things happening. Specifically, I'd end
> up with a bunch of zombies, and eventually, the machine would grind to
> a half with insufficient resources.
>
> The issue is that linux_test_for_tracefork test forks, and has the
> child fork again. If we don't block SIGCHLD on entry to the function,
> the children will inherit SIGCHLD's action/disposition (meaning,
> SIGCHLD will be unblocked in the child). When the first child forks
> again a second child, and that child exits, the first child gets a
> SIGCHLD. Now, when we try to wrap up for the whole options test, we
> kill the first child, and collect the waitstatus. Here, when SIGCHLD
> isn't blocked, GDB will first see the child reporting a stop with
> SIGCHLD. gdbserver's option test does a PTRACE_KILL loop at the end,
> which catches the SIGCHLD, and retries the kill. The GDB version did
> not do that. So the GDB version would proceed, leaving the child
> zombie, as nothing collected its final waitstatus.
>
> So this patch makes the GDB version of linux_test_for_tracefork do the
> exact same as the GDBserver version, removes all this unnecessary
> blocking throughout, and adds a couple comments at places that do need
> it -- namely: places where we'll use sleep with sigsuspend; and
> linux_async_pipe, as that destroys the pipe the signal handler
> touches.
>
> Tested on x86_64 Fedora 17, sync and async.
>
> gdb/
> 2013-08-14 Pedro Alves <palves@redhat.com>
>
> * linux-nat.c (linux_test_for_tracefork)
> (linux_test_for_tracesysgood, linux_child_follow_fork)
> (lin_lwp_attach_lwp, linux_nat_resume): Don't block child signals.
> (linux_nat_wait_1): Extend comment.
> (linux_async_pipe): Add comment.
> ---
> gdb/linux-nat.c | 59 +++++++++++++------------------------------------------
> 1 file changed, 14 insertions(+), 45 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 45a6e5f..db23433 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -395,20 +395,13 @@ linux_test_for_tracefork (int original_pid)
> {
> int child_pid, ret, status;
> long second_pid;
> - sigset_t prev_mask;
> -
> - /* We don't want those ptrace calls to be interrupted. */
> - block_child_signals (&prev_mask);
>
> linux_supports_tracefork_flag = 0;
> linux_supports_tracevforkdone_flag = 0;
>
> ret = ptrace (PTRACE_SETOPTIONS, original_pid, 0, PTRACE_O_TRACEFORK);
> if (ret != 0)
> - {
> - restore_child_signals_mask (&prev_mask);
> - return;
> - }
> + return;
>
> child_pid = fork ();
> if (child_pid == -1)
> @@ -433,7 +426,6 @@ linux_test_for_tracefork (int original_pid)
> if (ret != 0)
> {
> warning (_("linux_test_for_tracefork: failed to kill child"));
> - restore_child_signals_mask (&prev_mask);
> return;
> }
>
> @@ -445,7 +437,6 @@ linux_test_for_tracefork (int original_pid)
> warning (_("linux_test_for_tracefork: unexpected "
> "wait status 0x%x from killed child"), status);
>
> - restore_child_signals_mask (&prev_mask);
> return;
> }
>
> @@ -482,12 +473,14 @@ linux_test_for_tracefork (int original_pid)
> warning (_("linux_test_for_tracefork: unexpected result from waitpid "
> "(%d, status 0x%x)"), ret, status);
>
> - ret = ptrace (PTRACE_KILL, child_pid, 0, 0);
> - if (ret != 0)
> - warning (_("linux_test_for_tracefork: failed to kill child"));
> - my_waitpid (child_pid, &status, 0);
> -
> - restore_child_signals_mask (&prev_mask);
> + do
> + {
> + ret = ptrace (PTRACE_KILL, child_pid, 0, 0);
> + if (ret != 0)
> + warning ("linux_test_for_tracefork: failed to kill child");
> + my_waitpid (child_pid, &status, 0);
> + }
> + while (WIFSTOPPED (status));
> }
>
> /* Determine if PTRACE_O_TRACESYSGOOD can be used to follow syscalls.
> @@ -500,20 +493,14 @@ static void
> linux_test_for_tracesysgood (int original_pid)
> {
> int ret;
> - sigset_t prev_mask;
> -
> - /* We don't want those ptrace calls to be interrupted. */
> - block_child_signals (&prev_mask);
>
> linux_supports_tracesysgood_flag = 0;
>
> ret = ptrace (PTRACE_SETOPTIONS, original_pid, 0, PTRACE_O_TRACESYSGOOD);
> if (ret != 0)
> - goto out;
> + return;
>
> linux_supports_tracesysgood_flag = 1;
> -out:
> - restore_child_signals_mask (&prev_mask);
> }
>
> /* Determine wether we support PTRACE_O_TRACESYSGOOD option available.
> @@ -630,12 +617,9 @@ delete_lwp_cleanup (void *lp_voidp)
> static int
> linux_child_follow_fork (struct target_ops *ops, int follow_child)
> {
> - sigset_t prev_mask;
> int has_vforked;
> int parent_pid, child_pid;
>
> - block_child_signals (&prev_mask);
> -
> has_vforked = (inferior_thread ()->pending_follow.kind
> == TARGET_WAITKIND_VFORKED);
> parent_pid = ptid_get_lwp (inferior_ptid);
> @@ -954,7 +938,6 @@ holding the child stopped. Try \"set detach-on-fork\" or \
> check_for_thread_db ();
> }
>
> - restore_child_signals_mask (&prev_mask);
> return 0;
> }
>
> @@ -1427,13 +1410,10 @@ int
> lin_lwp_attach_lwp (ptid_t ptid)
> {
> struct lwp_info *lp;
> - sigset_t prev_mask;
> int lwpid;
>
> gdb_assert (is_lwp (ptid));
>
> - block_child_signals (&prev_mask);
> -
> lp = find_lwp_pid (ptid);
> lwpid = GET_LWP (ptid);
>
> @@ -1461,7 +1441,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
> /* We've already seen this thread stop, but we
> haven't seen the PTRACE_EVENT_CLONE extended
> event yet. */
> - restore_child_signals_mask (&prev_mask);
> return 0;
> }
> else
> @@ -1478,8 +1457,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
> {
> if (WIFSTOPPED (status))
> add_to_pid_list (&stopped_pids, lwpid, status);
> -
> - restore_child_signals_mask (&prev_mask);
> return 1;
> }
> }
> @@ -1492,7 +1469,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
> to create them. */
> warning (_("Can't attach %s: %s"), target_pid_to_str (ptid),
> safe_strerror (errno));
> - restore_child_signals_mask (&prev_mask);
> return -1;
> }
>
> @@ -1503,10 +1479,7 @@ lin_lwp_attach_lwp (ptid_t ptid)
>
> status = linux_nat_post_attach_wait (ptid, 0, &cloned, &signalled);
> if (!WIFSTOPPED (status))
> - {
> - restore_child_signals_mask (&prev_mask);
> - return 1;
> - }
> + return 1;
>
> lp = add_lwp (ptid);
> lp->stopped = 1;
> @@ -1542,7 +1515,6 @@ lin_lwp_attach_lwp (ptid_t ptid)
> }
>
> lp->last_resume_kind = resume_stop;
> - restore_child_signals_mask (&prev_mask);
> return 0;
> }
>
> @@ -1975,7 +1947,6 @@ static void
> linux_nat_resume (struct target_ops *ops,
> ptid_t ptid, int step, enum gdb_signal signo)
> {
> - sigset_t prev_mask;
> struct lwp_info *lp;
> int resume_many;
>
> @@ -1988,8 +1959,6 @@ linux_nat_resume (struct target_ops *ops,
> ? strsignal (gdb_signal_to_host (signo)) : "0"),
> target_pid_to_str (inferior_ptid));
>
> - block_child_signals (&prev_mask);
> -
> /* A specific PTID means `step only this process id'. */
> resume_many = (ptid_equal (minus_one_ptid, ptid)
> || ptid_is_pid (ptid));
> @@ -2047,7 +2016,6 @@ linux_nat_resume (struct target_ops *ops,
> "LLR: Short circuiting for status 0x%x\n",
> lp->status);
>
> - restore_child_signals_mask (&prev_mask);
> if (target_can_async_p ())
> {
> target_async (inferior_event_handler, 0);
> @@ -2080,7 +2048,6 @@ linux_nat_resume (struct target_ops *ops,
> (signo != GDB_SIGNAL_0
> ? strsignal (gdb_signal_to_host (signo)) : "0"));
>
> - restore_child_signals_mask (&prev_mask);
> if (target_can_async_p ())
> target_async (inferior_event_handler, 0);
> }
> @@ -3477,7 +3444,7 @@ linux_nat_wait_1 (struct target_ops *ops,
> lp->resumed = 1;
> }
>
> - /* Make sure SIGCHLD is blocked. */
> + /* Make sure SIGCHLD is blocked until the sigsuspend below. */
> block_child_signals (&prev_mask);
>
> retry:
> @@ -4907,6 +4874,8 @@ linux_async_pipe (int enable)
> {
> sigset_t prev_mask;
>
> + /* Block child signals while we create/destroy the pipe, as
> + their handler writes to it. */
> block_child_signals (&prev_mask);
>
> if (enable)
>
>
>
Based on the explanation, it makes sense to me.
Thanks,
Luis
next prev parent reply other threads:[~2013-08-14 18:50 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-14 18:26 Pedro Alves
2013-08-14 18:50 ` Luis Machado [this message]
2013-08-19 13:51 ` Pedro Alves
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=520BD162.1020105@codesourcery.com \
--to=lgustavo@codesourcery.com \
--cc=gdb-patches@sourceware.org \
--cc=palves@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox