Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Pedro Alves <palves@redhat.com>
To: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>,
	gdb-patches@sourceware.org
Subject: Re: [PATCH v7 5/5] gdb/infrun: handle already-exited threads when attempting to stop
Date: Wed, 13 May 2020 22:15:02 +0100	[thread overview]
Message-ID: <f71767c2-81f6-542f-4123-0c1cbfdacdf8@redhat.com> (raw)
In-Reply-To: <e540fa513347a44c1ae4fc4aa41b506098ad97d8.1587563226.git.tankut.baris.aktemur@intel.com>

Hi Tankut,

I've sent my updated version of this patch, along with the
series, here:
  https://sourceware.org/pipermail/gdb-patches/2020-May/168478.html

See some comments below.

On 4/22/20 4:00 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> In stop_all_threads, GDB sends signals to other threads in an attempt
> to stop them.  While in a typical scenario the expected wait status is
> TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
> to stop has already terminated.  If so, a waitstatus other than
> TARGET_WAITKIND_STOPPED would be received.  Handle this case
> appropriately.
> 
> If a wait status that denotes thread termination is ignored, GDB goes
> into an infinite loop in stop_all_threads.
> E.g.:
> 
>   $ gdb ./a.out
>   (gdb) start
>   ...
>   (gdb) add-inferior -exec ./a.out
>   ...
>   (gdb) inferior 2
>   ...
>   (gdb) start
>   ...
>   (gdb) set schedule-multiple on
>   (gdb) set debug infrun 2
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 10449)
>   infrun: clear_proceed_status_thread (process 10453)
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 10449
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
>   infrun: infrun_async(1)
>   infrun: prepare_to_wait
>   infrun: proceed: resuming process 10453
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
>   infrun: prepare_to_wait
>   infrun: Found 2 inferiors, starting at #0
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   10449.10449.0 [process 10449],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 1 (process 10449) exited normally]
>   infrun: stop_waiting
>   infrun: stop_all_threads
>   infrun: stop_all_threads, pass=0, iterations=0
>   infrun:   process 10453 executing, need stop
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   10453.10453.0 [process 10453],
>   infrun:   status->kind = exited, status = 0
>   infrun: stop_all_threads status->kind = exited, status = 0 process 10453
>   infrun:   process 10453 executing, already stopping
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   -1.0.0 [process -1],
>   infrun:   status->kind = no-resumed
>   infrun: infrun_async(0)
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   ...
> 
> And this polling goes on forever.  This patch prevents the infinite
> looping behavior.  For the same scenario above, we obtain the
> following behavior:
> 
>   ...
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 31229)
>   infrun: clear_proceed_status_thread (process 31233)
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 31229
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
>   infrun: infrun_async(1)
>   infrun: prepare_to_wait
>   infrun: proceed: resuming process 31233
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
>   infrun: prepare_to_wait
>   infrun: Found 2 inferiors, starting at #0
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31229.31229.0 [process 31229],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 1 (process 31229) exited normally]
>   infrun: stop_waiting
>   infrun: stop_all_threads
>   infrun: stop_all_threads, pass=0, iterations=0
>   infrun:   process 31233 executing, need stop
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31233.31233.0 [process 31233],
>   infrun:   status->kind = exited, status = 0
>   infrun: stop_all_threads status->kind = exited, status = 0 process 31233
>   infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
>   infrun:   process 31233 not executing
>   infrun: stop_all_threads, pass=1, iterations=1
>   infrun:   process 31233 not executing
>   infrun: stop_all_threads done
>   (gdb)
> 
> The exit event from Inferior 1 is received and shown to the user.
> The exit event from Inferior 2 is not displayed, but kept pending.
> 
>   (gdb) info inferiors
>     Num  Description       Connection           Executable
>   * 1    <null>                                 a.out
>     2    process 31233     1 (native)           a.out
>   (gdb) inferior 2
>   [Switching to inferior 2 [process 31233] (a.out)]
>   [Switching to thread 2.1 (process 31233)]
>   Couldn't get registers: No such process.
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 31233)
>   infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 31233
>   infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
>   infrun: prepare_to_wait
>   infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31233.31233.0 [process 31233],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 2 (process 31233) exited normally]
>   infrun: stop_waiting
>   (gdb) info inferiors
>     Num  Description       Connection           Executable
>     1    <null>                                 a.out
>   * 2    <null>                                 a.out
>   (gdb)
> 
> Regression-tested on X86_64 Linux.
> 
> gdb/ChangeLog:
> 2020-02-05  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 	    Tom de Vries  <tdevries@suse.de>
> 
> 	PR threads/25478
> 	* infrun.c (stop_all_threads): Do NOT ignore
> 	TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
> 	TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
> 	received from threads we attempt to stop.
> 
> gdb/testsuite/ChangeLog:
> 2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 
> 	* gdb.multi/multi-exit.c: New file.
> 	* gdb.multi/multi-exit.exp: New file.
> 	* gdb.multi/multi-kill.c: New file.
> 	* gdb.multi/multi-kill.exp: New file.
> 
> Change-Id: I7cec98f40283773b79255d998511da434e9cd408
> ---
>  gdb/infrun.c                           | 101 ++++++++++++++++++++--
>  gdb/testsuite/gdb.multi/multi-exit.c   |  22 +++++
>  gdb/testsuite/gdb.multi/multi-exit.exp | 111 +++++++++++++++++++++++++
>  gdb/testsuite/gdb.multi/multi-kill.c   |  42 ++++++++++
>  gdb/testsuite/gdb.multi/multi-kill.exp | 110 ++++++++++++++++++++++++
>  5 files changed, 379 insertions(+), 7 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
>  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
>  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
>  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 167d50ff3ab..93169269553 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4781,7 +4781,47 @@ stop_all_threads (void)
>  	{
>  	  int need_wait = 0;
>  
> -	  update_thread_list ();
> +	  for (inferior *inf : all_non_exited_inferiors ())
> +	    {
> +	      update_thread_list ();
> +
> +	      /* After updating the thread list, it's possible to end
> +		 up with pid != 0 but no threads, if the inf's process
> +		 has exited but we have not processed that event yet.
> +		 The exit event must be waiting somewhere in the queue
> +		 to be processed.  Silently add a thread so that we do
> +		 a wait_one() below to pick the pending event.  */
> +
> +	      bool has_threads = false;
> +	      for (thread_info *tp ATTRIBUTE_UNUSED
> +		     : inf->non_exited_threads ())
> +		{
> +		  has_threads = true;
> +		  break;
> +		}
> +
> +	      if (has_threads)
> +		continue;
> +
> +	      ptid_t ptid (inf->pid, inf->pid, 0);
> +

This here is what make me go think through all this.  I don't really
like it to have common code make up the ptid to use for the new thread.
That should be the responsibility of the target backend.  It may
be that the target backend uses the tid field to store important
information, for example.

I also think that having to re-add a thread is not ideal.
As we move GDB towards more multi-target support, and potentially other
kinds of execution objects, I think that it's likely that we will always
make it the responsibility of the target backend to create a thread, since
it's going to the the target that knows the actual (C++) type of the thread
Imagine target-specific classes that inherit from a common thread_info
class, with virtual methods.

After giving it some thought and experimentation, I think we should
go back to your idea of not deleting the last thread of the process.
But let's keep it simple without current_inferior() checks.
When that solution was brought up before, I pointed at the code 
handle_no_resumed that handles the case of inferiors with no threads.
I managed to reproduce that scenario, and confirm that we still
handle it OK.

I ended up squashing the remote and infrun changes in a single patch,
as they're part of the same logical change.

Finally, while playing with the testcases, I thought I'd make them
spawn more inferiors, so that we're more sure we hit the race window.
The way I've adjusted the testcases, it's simple to make them spawn
any number of inferiors -- you just have to change one global.

Please take a look and let me know what you think.

Thanks,
Pedro Alves



  parent reply	other threads:[~2020-05-13 21:15 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1587563226.git.tankut.baris.aktemur@intel.com>
2020-04-22 15:00 ` [PATCH v7 3/5] gdb/remote: do not delete a thread if it has a pending exit event Tankut Baris Aktemur
2020-05-04 14:43   ` Pedro Alves
2020-05-04 15:33     ` Aktemur, Tankut Baris
2020-05-13 14:20       ` Pedro Alves
2020-04-22 15:00 ` [PATCH v7 5/5] gdb/infrun: handle already-exited threads when attempting to stop Tankut Baris Aktemur
2020-04-22 16:07   ` Aktemur, Tankut Baris
2020-05-03 15:38   ` Aktemur, Tankut Baris
2020-05-13 21:15   ` Pedro Alves [this message]
2020-05-14  8:50     ` Aktemur, Tankut Baris
2020-05-14 11:32       ` Pedro Alves
2020-05-14 11:42         ` Aktemur, Tankut Baris
2020-05-14 11:25     ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f71767c2-81f6-542f-4123-0c1cbfdacdf8@redhat.com \
    --to=palves@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=tankut.baris.aktemur@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox