From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-44403-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 106077 invoked by alias); 11 Jun 2015 13:42:48 -0000
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
Received: (qmail 106065 invoked by uid 89); 11 Jun 2015 13:42:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_50,RCVD_IN_DNSWL_NONE,SPF_SOFTFAIL autolearn=no version=3.3.2
X-HELO: mtaout22.012.net.il
Received: from mtaout22.012.net.il (HELO mtaout22.012.net.il) (80.179.55.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 11 Jun 2015 13:42:46 +0000
Received: from conversion-daemon.a-mtaout22.012.net.il by a-mtaout22.012.net.il (HyperSendmail v2007.08) id <0NPS00L008JV7F00@a-mtaout22.012.net.il> for gdb@sourceware.org; Thu, 11 Jun 2015 16:41:25 +0300 (IDT)
Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout22.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NPS00KGN8P0Q770@a-mtaout22.012.net.il>; Thu, 11 Jun 2015 16:41:25 +0300 (IDT)
Date: Thu, 11 Jun 2015 13:42:00 -0000
From: Eli Zaretskii <eliz@gnu.org>
Subject: Re: Inadvertently run inferior threads
In-reply-to: <83y4jrsgui.fsf@gnu.org>
To: Pedro Alves <palves@redhat.com>
Cc: gdb@sourceware.org
Reply-to: Eli Zaretskii <eliz@gnu.org>
Message-id: <83ioaus6pt.fsf@gnu.org>
References: <83h9tq3zu3.fsf@gnu.org> <55043A63.6020103@redhat.com> <8361a339xd.fsf@gnu.org> <5504555C.804@redhat.com> <550458E0.10206@redhat.com> <83y4jrsgui.fsf@gnu.org>
X-IsSubscribed: yes
X-SW-Source: 2015-06/txt/msg00008.txt.bz2

> Date: Wed, 10 Jun 2015 18:50:13 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: gdb@sourceware.org
> 
> > I see what's going on here:
> > 
> >  #1 - we suppress the *stopped -> *running transitions/notification when
> >    doing an inferior function call (the in_infcall checks in infrun.c).
> > 
> >  #2 - new threads are spawned and given *running state, because well,
> >    they're running.
> > 
> >  #3 - we suppress the running -> *stopped transition when doing
> >    an infcall, like in #1.  (The in_infcall check in normal_stop).
> > 
> >  #4 - result: _new_ threads end up in "running" state, even though they
> >     are stopped.
> > 
> > I don't know off hand what the best fix is.
> > 
> > I think this bug must be in the tree for a while.  Curious that
> > we don't have a test that exercises this...
> > 
> > I can't explain why you see _all_ threads as running instead of
> > only the new ones, though.
> 
> I think I can explain that.
> 
> First, in MinGW native debugging the function set_running, as well as
> most other thread-related functions that change state, are always
> called with minus_one_ptid as their ptid argument, and therefore they
> change the state of all the threads.
> 
> The second part of the puzzle is that when these threads are started,
> we are inside the 'proceed' call made by 'run_inferior_call'.  When a
> thread like this is started during this time, we get
> TARGET_WAITKIND_SPURIOUS event inside 'handle_inferior_event', and
> call 'resume'.  But when 'resume' is called like that, inferior_ptid
> is set to the thread ID of the new thread that was started, and which
> triggered TARGET_WAITKIND_SPURIOUS.  So when 'resume' wants to
> suppress the stopped -> running transition, here:
> 
> 	  if (!tp->control.in_infcall)
> 	    set_running (user_visible_resume_ptid (user_step), 1);
> 
> it winds up calling 'set_running', because the in_infcall flag is set
> on the thread that called the inferior function, not on the thread
> which was started and triggered TARGET_WAITKIND_SPURIOUS.
> 
> So 'set_running' is called, and it is called with minus_one_ptid,
> which then has the effect of marking all the threads as running.
> 
> What I don't understand is why doesn't the breakpoint we set at exit
> from the inferior function countermand that.  I do see the effect of
> that breakpoint if I turn on infrun debugging:
> 
>   infrun: target_wait (-1, status) =
>   infrun:   4608 [Thread 4608.0x4900],
>   infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
>   infrun: TARGET_WAITKIND_STOPPED
>   infrun: stop_pc = 0x88ba9f
>   infrun: BPSTAT_WHAT_STOP_SILENT
>   infrun: stop_waiting
> 
> Why don't we mark all threads as stopped when we hit the breakpoint?
> is that because of #3 above?
> 
> Any ideas how to solve this annoying problem?
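
To restate the 'resume' part of the above in a form that can be
compiled and poked at, here is a toy model.  It is only a sketch of my
reading of the code, not GDB code: toy_thread and toy_resume are
made-up names, and the loop stands in for what set_running does when
it is given minus_one_ptid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread data; not GDB's
   struct thread_info.  */
struct toy_thread
{
  bool running;      /* State reported to the user/frontend.  */
  bool in_infcall;   /* Set only on the thread that made the call.  */
};

/* Toy version of the check in 'resume': look at the in_infcall flag
   of the CURRENT thread only.  If it is clear, mark every thread as
   running (the minus_one_ptid case).  Return true if the threads
   were marked.  */
bool
toy_resume (struct toy_thread *threads, size_t n, size_t current)
{
  assert (current < n);
  if (threads[current].in_infcall)
    return false;		/* Transition suppressed.  */
  for (size_t i = 0; i < n; i++)
    threads[i].running = true;
  return true;
}
```

When 'current' is the thread that made the call, nothing is marked.
When 'current' is the newly started thread, whose flag is clear, all
the threads get marked as running, including the caller.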

I suggested a patch on gdb-patches which solved the problem for me, at
least in the MinGW-compiled GDB on MS-Windows.

And I have a question about your description of what happens on
GNU/Linux.  You say:

>  #4 - result: _new_ threads end up in "running" state, even though they
>     are stopped.

My question is this: who or what stops the new threads that were
started by the function we infcall'ed?  I know who stops them on
MS-Windows: the OS.  This is described in the MSDN documentation:

  When the system notifies the debugger of a debugging event, it also
  suspends all threads in the affected process. The threads do not
  resume execution until the debugger continues the debugging event by
  using ContinueDebugEvent.

Does the same happen on GNU/Linux (and other systems that support
asynchronous execution)?  If so, I don't understand why we suppress
the stopped <-> running transitions when in infcall.  Or at least the
running -> stopped transition.  The comment in normal_stop tries to
explain this:

  /* Let the user/frontend see the threads as stopped, but do nothing
     if the thread was running an infcall.  We may be e.g., evaluating
     a breakpoint condition.  In that case, the thread had state
     THREAD_RUNNING before the infcall, and shall remain set to
     running, all without informing the user/frontend about state
     transition changes.  If this is actually a call command, then the
     thread was originally already stopped, so there's no state to
     finish either.  */
  if (target_has_execution && inferior_thread ()->control.in_infcall)
    discard_cleanups (old_chain);
  else
    do_cleanups (old_chain);

I don't understand this explanation, in particular why it says the
thread was in THREAD_RUNNING state before the infcall.  If we are
evaluating a breakpoint condition, the thread should be stopped, since
we hit a breakpoint, no?
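
To show how I read that fragment, here is a small model of it.  This
is a sketch under my own assumptions, not the real code: the
model_* names are invented, and I assume that the cleanup's only job
is to mark threads as stopped for the user/frontend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_STOPPED, MODEL_RUNNING };

/* Hypothetical per-thread record; not GDB's struct thread_info.  */
struct model_thread
{
  enum model_state state;  /* What the user/frontend is told.  */
  bool in_infcall;         /* Set only on the thread making the call.  */
};

/* Stand-in for the cleanup: mark every thread as stopped.  */
void
model_finish_thread_state (struct model_thread *threads, size_t n)
{
  for (size_t i = 0; i < n; i++)
    threads[i].state = MODEL_STOPPED;
}

/* Stand-in for the end of normal_stop: skip the update entirely
   when the current thread is inside an infcall.  */
void
model_normal_stop (struct model_thread *threads, size_t n,
		   size_t current)
{
  assert (current < n);
  if (threads[current].in_infcall)
    return;		/* Like discard_cleanups: nothing is updated.  */
  model_finish_thread_state (threads, n);	/* Like do_cleanups.  */
}
```

In this model, a thread that was created during the infcall and given
the running state keeps that state after the stop, because the update
is skipped for all threads, not just for the one making the call.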

Or maybe this is just part of my confusion about the THREAD_RUNNING
state, which I described earlier here:

  https://sourceware.org/ml/gdb-patches/2015-06/msg00191.html

In general, we should have the thread state reflect its actual
state, as far as the OS is concerned.  If the above comment is
accurate about the THREAD_RUNNING state, it sounds like we need a
reference count to go with that state, so that we could increment and
decrement it as needed, without the fear of overwriting someone else's
"running" state.

Does this make sense?
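
To make the reference-count idea concrete, something along these
lines is what I have in mind.  It is only a sketch: thread_model and
its functions are hypothetical names, nothing like this exists in GDB
today, and a real change would have to fit into the existing
thread-state code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; not GDB's struct thread_info.  */
struct thread_model
{
  int running_refs;	/* How many requesters want "running".  */
};

/* Record one more reason to report the thread as running.  */
void
thread_model_acquire_running (struct thread_model *t)
{
  t->running_refs++;
}

/* Drop one reason.  The thread is reported as stopped only after
   the last reason is gone.  */
void
thread_model_release_running (struct thread_model *t)
{
  assert (t->running_refs > 0);
  t->running_refs--;
}

/* What the user/frontend would be told.  */
bool
thread_model_is_running (const struct thread_model *t)
{
  return t->running_refs > 0;
}
```

With this, an infcall made while the thread is already considered
running would increment and later decrement the count, and the
earlier "running" state would survive without any special-casing.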

Thanks.