From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 17317 invoked by alias); 5 Aug 2015 11:41:43 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 14992 invoked by uid 89); 5 Aug 2015 11:41:43 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-pa0-f49.google.com Received: from mail-pa0-f49.google.com (HELO mail-pa0-f49.google.com) (209.85.220.49) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 05 Aug 2015 11:41:40 +0000 Received: by pacgq8 with SMTP id gq8so33920194pac.3 for ; Wed, 05 Aug 2015 04:41:39 -0700 (PDT) X-Received: by 10.66.65.234 with SMTP id a10mr18807134pat.2.1438774898912; Wed, 05 Aug 2015 04:41:38 -0700 (PDT) Received: from E107787-LIN (gcc1-power7.osuosl.org. [140.211.15.137]) by smtp.gmail.com with ESMTPSA id q1sm2644300pap.20.2015.08.05.04.41.35 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 05 Aug 2015 04:41:38 -0700 (PDT) From: Yao Qi To: Pedro Alves Cc: Yao Qi , gdb-patches@sourceware.org Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues References: <1438362229-27653-1-git-send-email-palves@redhat.com> <1438362229-27653-3-git-send-email-palves@redhat.com> <86a8u8e754.fsf@gmail.com> <55BF94BC.6090904@redhat.com> <55C0EB12.9050209@redhat.com> Date: Wed, 05 Aug 2015 11:41:00 -0000 In-Reply-To: <55C0EB12.9050209@redhat.com> (Pedro Alves's message of "Tue, 04 Aug 2015 17:40:50 +0100") Message-ID: <86k2tac68i.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-08/txt/msg00106.txt.bz2 Pedro Alves writes: > I've now extended the test to run in a few different modes, along > a couple different axis. One axis is with/without conditional > breakpoints on the target enabled. That exposes the same fails you > saw on arm, on x86 gdbserver as well. And then the other axis is > with/without joining _all_ threads before exiting. If we gracefully > terminate the breakpoint thread (new mode), then the test should be > passing everywhere, because what fails is gdb's handling of the inferior > disappearing while a thread is stopped (and being inspected). > Therefore that new mode is not kfailed. > > For testing convenience, I've pushed this along with the previous patch > to the users/palves/gdbserver-fork-issues branch on sourceware.org. > Let me know if this works for you. Thanks, Pedro. There are no fails on arm-linux with GDBserver. Some comments on your patch below, > @@ -3089,8 +3142,25 @@ linux_wait_1 (ptid_t ptid, > info_p =3D &info; > else > info_p =3D NULL; > - linux_resume_one_lwp (event_child, event_child->stepping, > - WSTOPSIG (w), info_p); > + > + if (step_over_finished) > + { > + /* We cancelled this thread's step-over above. We still > + need to unsuspend all other LWPs, and set then back s/set then/set them/? > + running again while the signal handler runs. */ > + unsuspend_all_lwps (event_child); > + > + /* Enqueue the pending signal info so that proceed_all_lwps > + doesn't lose it. */ > + enqueue_pending_signal (event_child, WSTOPSIG (w), info_p); > + > + proceed_all_lwps (); > + } > + else > + { > + linux_resume_one_lwp (event_child, event_child->stepping, > + WSTOPSIG (w), info_p); > + } > return ignore_event (ourstatus); > } >=20=20 > @@ -3111,8 +3181,15 @@ linux_wait_1 (ptid_t ptid, > || (current_thread->last_resume_kind =3D=3D resume_step > && !in_step_range) > || event_child->stop_reason =3D=3D TARGET_STOPPED_BY_WATCHPOINT > - || (!step_over_finished && !in_step_range > - && !bp_explains_trap && !trace_event) > + || (!in_step_range > + && !bp_explains_trap > + && !trace_event > + /* A step-over was finished just now? */ > + && !step_over_finished > + /* A step-over had been finished previously, > + and the single-step was left pending? */ > + && !(current_thread->last_resume_kind =3D=3D resume_continue > + && event_child->stop_reason =3D=3D TARGET_STOPPED_BY_SINGLE_STEP)) > || (gdb_breakpoint_here (event_child->stop_pc) I don't fully understand this, what is a case that "step-over had been finished previously, but the single-step was left pending"? > (linux_wait_1): If passing a signal to the inferior after > finishing a step-over, unsuspend and re-resume all lwps. If we > see a single-step event but the thread should be continuing, don't > pass the trap to gdb. however, the explanations in ChangeLog look more reasonable. > && gdb_condition_true_at_breakpoint (event_child->stop_pc) > && gdb_no_commands_at_breakpoint (event_child->stop_pc)) > @@ -4189,16 +4298,36 @@ static int > start_step_over (struct lwp_info *lwp) > { > struct thread_info *thread =3D get_lwp_thread (lwp); > + ptid_t thread_ptid; > struct thread_info *saved_thread; > CORE_ADDR pc; > int step; >=20=20 > + thread_ptid =3D ptid_of (thread); > + > if (debug_threads) > debug_printf ("Starting step-over on LWP %ld. Stopping all threads\= n", > lwpid_of (thread)); >=20=20 > stop_all_lwps (1, lwp); > - gdb_assert (lwp->suspended =3D=3D 0); > + > + /* Re-find the LWP as it may have exited. */ > + lwp =3D find_lwp_pid (thread_ptid); > + if (lwp =3D=3D NULL || lwp_is_marked_dead (lwp)) > + { > + if (debug_threads) > + debug_printf ("Step-over thread died " > + "(another thread exited the process?).\n"); > + unstop_all_lwps (1, lwp); > + return 0; > + } > + > + if (lwp->suspended !=3D 0) > + { > + internal_error (__FILE__, __LINE__, > + "LWP %ld suspended=3D%d\n", lwpid_of (thread), > + lwp->suspended); > + } >=20=20 > if (debug_threads) > debug_printf ("Done stopping all threads for step-over.\n"); > @@ -4229,7 +4358,19 @@ start_step_over (struct lwp_info *lwp) >=20=20 > current_thread =3D saved_thread; >=20=20 > - linux_resume_one_lwp (lwp, step, 0, NULL); > + TRY > + { > + linux_resume_one_lwp_throw (lwp, step, 0, NULL); > + } IIUC, we do TRY/CATCH here because thread may have exited, but we've done that in this function (start_step_over), do we still need to worry about these exited threads? > + CATCH (ex, RETURN_MASK_ERROR) > + { > + unstop_all_lwps (1, lwp); > + > + if (!check_ptrace_stopped_lwp_gone (lwp)) I am thinking about the order of these two function calls. Does the order matter here? Probably we need to check_ptrace_stopped_lwp_gone first before unstop_all_lwps. > + throw_exception (ex); > + return 0; > + } > + END_CATCH >=20=20 > /* Require next event from this LWP. */ > step_over_bkpt =3D thread->entry.id; > @@ -4270,6 +4411,39 @@ finish_step_over (struct lwp_info *lwp) > return 0; > } >=20=20 > +/* If there's a step over in progress, wait until all threads stop > + (that is, until the stepping thread finishes its step), and > + unsuspend all lwps. The stepping thread ends with its status > + pending, which is processed later when we get back to processing > + events. */ > + > +static void > +complete_ongoing_step_over (void) > +{ > + if (!ptid_equal (step_over_bkpt, null_ptid)) > + { > + struct lwp_info *lwp; > + int wstat; > + int ret; > + > + if (debug_threads) > + debug_printf ("detach: step over in progress, finish it first\n"); "detach:" in the debug output looks obsolete. --=20 Yao (=E9=BD=90=E5=B0=A7)