From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-124991-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 17317 invoked by alias); 5 Aug 2015 11:41:43 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 14992 invoked by uid 89); 5 Aug 2015 11:41:43 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-pa0-f49.google.com
Received: from mail-pa0-f49.google.com (HELO mail-pa0-f49.google.com) (209.85.220.49) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 05 Aug 2015 11:41:40 +0000
Received: by pacgq8 with SMTP id gq8so33920194pac.3        for <gdb-patches@sourceware.org>; Wed, 05 Aug 2015 04:41:39 -0700 (PDT)
X-Received: by 10.66.65.234 with SMTP id a10mr18807134pat.2.1438774898912;        Wed, 05 Aug 2015 04:41:38 -0700 (PDT)
Received: from E107787-LIN (gcc1-power7.osuosl.org. [140.211.15.137])        by smtp.gmail.com with ESMTPSA id q1sm2644300pap.20.2015.08.05.04.41.35        (version=TLSv1.2 cipher=RC4-SHA bits=128/128);        Wed, 05 Aug 2015 04:41:38 -0700 (PDT)
From: Yao Qi <qiyaoltc@gmail.com>
To: Pedro Alves <palves@redhat.com>
Cc: Yao Qi <qiyaoltc@gmail.com>,  gdb-patches@sourceware.org
Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
References: <1438362229-27653-1-git-send-email-palves@redhat.com>	<1438362229-27653-3-git-send-email-palves@redhat.com>	<86a8u8e754.fsf@gmail.com> <55BF94BC.6090904@redhat.com>	<55C0EB12.9050209@redhat.com>
Date: Wed, 05 Aug 2015 11:41:00 -0000
In-Reply-To: <55C0EB12.9050209@redhat.com> (Pedro Alves's message of "Tue, 04	Aug 2015 17:40:50 +0100")
Message-ID: <86k2tac68i.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
X-SW-Source: 2015-08/txt/msg00106.txt.bz2

Pedro Alves <palves@redhat.com> writes:

> I've now extended the test to run in a few different modes, along
> a couple different axis.  One axis is with/without conditional
> breakpoints on the target enabled.  That exposes the same fails you
> saw on arm, on x86 gdbserver as well.  And then the other axis is
> with/without joining _all_ threads before exiting.  If we gracefully
> terminate the breakpoint thread (new mode), then the test should be
> passing everywhere, because what fails is gdb's handling of the inferior
> disappearing while a thread is stopped (and being inspected).
> Therefore that new mode is not kfailed.
>
> For testing convenience, I've pushed this along with the previous patch
> to the users/palves/gdbserver-fork-issues branch on sourceware.org.
> Let me know if this works for you.

Thanks, Pedro.  There are no fails on arm-linux with GDBserver.  Some
comments on your patch below,

> @@ -3089,8 +3142,25 @@ linux_wait_1 (ptid_t ptid,
>  	info_p =3D &info;
>        else
>  	info_p =3D NULL;
> -      linux_resume_one_lwp (event_child, event_child->stepping,
> -			    WSTOPSIG (w), info_p);
> +
> +      if (step_over_finished)
> +	{
> +	  /* We cancelled this thread's step-over above.  We still
> +	     need to unsuspend all other LWPs, and set then back

s/set then/set them/?

> +	     running again while the signal handler runs.  */
> +	  unsuspend_all_lwps (event_child);
> +
> +	  /* Enqueue the pending signal info so that proceed_all_lwps
> +	     doesn't lose it.  */
> +	  enqueue_pending_signal (event_child, WSTOPSIG (w), info_p);
> +
> +	  proceed_all_lwps ();
> +	}
> +      else
> +	{
> +	  linux_resume_one_lwp (event_child, event_child->stepping,
> +				WSTOPSIG (w), info_p);
> +	}
>        return ignore_event (ourstatus);
>      }
>=20=20
> @@ -3111,8 +3181,15 @@ linux_wait_1 (ptid_t ptid,
>  		   || (current_thread->last_resume_kind =3D=3D resume_step
>  		       && !in_step_range)
>  		   || event_child->stop_reason =3D=3D TARGET_STOPPED_BY_WATCHPOINT
> -		   || (!step_over_finished && !in_step_range
> -		       && !bp_explains_trap && !trace_event)
> +		   || (!in_step_range
> +		       && !bp_explains_trap
> +		       && !trace_event
> +		       /* A step-over was finished just now?  */
> +		       && !step_over_finished
> +		       /* A step-over had been finished previously,
> +			  and the single-step was left pending?  */
> +		       && !(current_thread->last_resume_kind =3D=3D resume_continue
> +			    && event_child->stop_reason =3D=3D TARGET_STOPPED_BY_SINGLE_STEP))
>  		   || (gdb_breakpoint_here (event_child->stop_pc)

I don't fully understand this, what is a case that "step-over had been
finished previously, but the single-step was left pending"?

> 	(linux_wait_1): If passing a signal to the inferior after
> 	finishing a step-over, unsuspend and re-resume all lwps.  If we
> 	see a single-step event but the thread should be continuing, don't
> 	pass the trap to gdb.

however, the explanations in ChangeLog look more reasonable.

>  		       && gdb_condition_true_at_breakpoint (event_child->stop_pc)
>  		       && gdb_no_commands_at_breakpoint (event_child->stop_pc))


> @@ -4189,16 +4298,36 @@ static int
>  start_step_over (struct lwp_info *lwp)
>  {
>    struct thread_info *thread =3D get_lwp_thread (lwp);
> +  ptid_t thread_ptid;
>    struct thread_info *saved_thread;
>    CORE_ADDR pc;
>    int step;
>=20=20
> +  thread_ptid =3D ptid_of (thread);
> +
>    if (debug_threads)
>      debug_printf ("Starting step-over on LWP %ld.  Stopping all threads\=
n",
>  		  lwpid_of (thread));
>=20=20
>    stop_all_lwps (1, lwp);
> -  gdb_assert (lwp->suspended =3D=3D 0);
> +
> +  /* Re-find the LWP as it may have exited.  */
> +  lwp =3D find_lwp_pid (thread_ptid);
> +  if (lwp =3D=3D NULL || lwp_is_marked_dead (lwp))
> +    {
> +      if (debug_threads)
> +	debug_printf ("Step-over thread died "
> +		      "(another thread exited the process?).\n");
> +      unstop_all_lwps (1, lwp);
> +      return 0;
> +    }
> +
> +  if (lwp->suspended !=3D 0)
> +    {
> +      internal_error (__FILE__, __LINE__,
> +		      "LWP %ld suspended=3D%d\n", lwpid_of (thread),
> +		      lwp->suspended);
> +    }
>=20=20
>    if (debug_threads)
>      debug_printf ("Done stopping all threads for step-over.\n");
> @@ -4229,7 +4358,19 @@ start_step_over (struct lwp_info *lwp)
>=20=20
>    current_thread =3D saved_thread;
>=20=20
> -  linux_resume_one_lwp (lwp, step, 0, NULL);
> +  TRY
> +    {
> +      linux_resume_one_lwp_throw (lwp, step, 0, NULL);
> +    }

IIUC, we do TRY/CATCH here because thread may have exited, but we've
done that in this function (start_step_over), do we still need to worry
about these exited threads?

> +  CATCH (ex, RETURN_MASK_ERROR)
> +    {
> +      unstop_all_lwps (1, lwp);
> +
> +      if (!check_ptrace_stopped_lwp_gone (lwp))

I am thinking about the order of these two function calls.  Does the
order matter here?  Probably we need to check_ptrace_stopped_lwp_gone
first before unstop_all_lwps.


> +	throw_exception (ex);
> +      return 0;
> +    }
> +  END_CATCH
>=20=20
>    /* Require next event from this LWP.  */
>    step_over_bkpt =3D thread->entry.id;
> @@ -4270,6 +4411,39 @@ finish_step_over (struct lwp_info *lwp)
>      return 0;
>  }
>=20=20
> +/* If there's a step over in progress, wait until all threads stop
> +   (that is, until the stepping thread finishes its step), and
> +   unsuspend all lwps.  The stepping thread ends with its status
> +   pending, which is processed later when we get back to processing
> +   events.  */
> +
> +static void
> +complete_ongoing_step_over (void)
> +{
> +  if (!ptid_equal (step_over_bkpt, null_ptid))
> +    {
> +      struct lwp_info *lwp;
> +      int wstat;
> +      int ret;
> +
> +      if (debug_threads)
> +	debug_printf ("detach: step over in progress, finish it first\n");

"detach:" in the debug output looks obsolete.

--=20
Yao (=E9=BD=90=E5=B0=A7)