Re: RFC: skip_inline_frames failed assertion resuming from breakpoint on LynxOS

Mirror of the gdb-patches mailing list

From: Joel Brobecker <brobecker@adacore.com>
To: Pedro Alves <palves@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: RFC: skip_inline_frames failed assertion resuming from breakpoint on LynxOS
Date: Sat, 13 Dec 2014 15:46:00 -0000
Message-ID: <20141213154638.GK5457@adacore.com>
In-Reply-To: <546F173A.3070204@redhat.com>

Hi Pedro,

> The main issue is that we're trying to move the thread past a
> breakpoint.  Barring displaced stepping support, to move the
> thread past the breakpoint, we have to remove the breakpoint from
> the target temporarily.  But then we _cannot_ resume other threads
> but the one that is stopped at the breakpoint, because then those
> other threads could fly by the removed breakpoint and miss it.

Attached is a patch that does just that, tested on ppc-lynx5 and
ppc-lynx178.  I held off posting it here because I wanted to keep
it under observation for a while first.

gdb/gdbserver/ChangeLog:

        * lynx-low.c (lynx_resume): Use PTRACE_SINGLESTEP_ONE if N == 1.
        Remove FIXME comment about assumption about N.

OK to commit?
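
For readers who want the decision in isolation, here is a minimal
standalone sketch of the request selection the patch makes. The enum
names and values below are stand-ins invented for this illustration;
they are not the real LynxOS PTRACE_* constants, and this is not the
code from lynx-low.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in request codes, for illustration only.  The real PTRACE_*
   constants come from the LynxOS headers and have other values.  */
enum sketch_request
{
  SKETCH_CONT,            /* Resume normally.  */
  SKETCH_SINGLESTEP,      /* Step one thread, resume all the others.  */
  SKETCH_SINGLESTEP_ONE   /* Step one thread, leave the others stopped.  */
};

/* Stand-in for the resume kind carried by the first resume request.  */
enum sketch_resume_kind
{
  SKETCH_RESUME_CONTINUE,
  SKETCH_RESUME_STEP
};

/* Pick the request from the kind of the first resume request and
   the number N of resume requests received.  */
enum sketch_request
sketch_choose_request (enum sketch_resume_kind kind, size_t n)
{
  /* The first request is always consulted, so there must be one.  */
  assert (n >= 1);

  if (kind != SKETCH_RESUME_STEP)
    return SKETCH_CONT;

  /* A lone step request means "step this thread only".  */
  return n == 1 ? SKETCH_SINGLESTEP_ONE : SKETCH_SINGLESTEP;
}
```

With a single step request the sketch picks the "one thread only"
variant; with more than one request it keeps the previous behavior.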

Note that, in parallel, I came across another issue, which I am
going to call a limitation for now. Consider the case where we have
2 threads, A and B, and we are trying to next/step some code in thread
A. While doing so, thread B receives a signal, which gets reported
to GDB. GDB sees that this signal is configured as
nostop/noprint/pass, so you would expect it to resume
the inferior, passing that signal to thread B. But how do you do
that while at the same time stepping thread A?

IIRC, what happens currently in this case is that GDB keeps trying
to resume/step thread A, and the kernel keeps telling GDB "no,
thread B just received a signal", and so GDB and the kernel go
into an infinite loop where nothing advances. I'm not quite sure
why we keep getting the signal for thread B: whether it's a new
signal each time, or whether the signal is never passed back (the
program I saw this in is fairly large and complicated).

In any case, I don't see how we could improve this situation
without setting sss-like breakpoints, something I'm not really
eager to do, at least for now, since "set scheduler-locking step"
seems to work around the issue.
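
To make the loop concrete, here is a toy model of it. This is not
GDB or gdbserver code, and every name in it is made up. It also
hard-codes one assumption that I have not verified: that as long as
the other thread is resumed, its signal is reported before thread A's
step completes, every single time.

```c
#include <assert.h>
#include <stdbool.h>

/* What one resume/wait round trip reports in this toy model.  */
enum toy_event
{
  TOY_STEP_DONE,            /* Thread A finished its step.  */
  TOY_OTHER_THREAD_SIGNAL   /* Thread B reported a signal first.  */
};

/* One attempt at stepping thread A.  ASSUMPTION: if the other
   threads are resumed too, thread B's signal always wins the race.  */
enum toy_event
toy_step_thread_a (bool other_threads_resumed)
{
  return other_threads_resumed ? TOY_OTHER_THREAD_SIGNAL : TOY_STEP_DONE;
}

/* Retry the step until it completes.  Return the number of attempts
   used, or -1 if MAX_ATTEMPTS was reached without any progress.  */
int
toy_attempts_to_finish_step (bool other_threads_resumed, int max_attempts)
{
  assert (max_attempts > 0);

  for (int attempt = 1; attempt <= max_attempts; attempt++)
    if (toy_step_thread_a (other_threads_resumed) == TOY_STEP_DONE)
      return attempt;

  return -1;
}
```

Under that assumption, stepping with the other threads resumed never
completes no matter how large the attempt budget is, while stepping
with only thread A running completes on the first attempt. The latter
is the situation "set scheduler-locking step" puts us in.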

Thanks!
-- 
Joel

[-- Attachment #2: 0001-gdbserver-lynxos-Use-PTRACE_SINGLESTEP_ONE-when-sing.patch --]
[-- Type: text/x-diff, Size: 3831 bytes --]

From ea7e173463120d24417a7706f98fff850f9aaa1a Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>
Date: Tue, 25 Nov 2014 11:12:10 -0500
Subject: [PATCH] [gdbserver/lynxos] Use PTRACE_SINGLESTEP_ONE when
 single-stepping one thread.

Currently, when we receive a request to single-step a single thread
(e.g., when single-stepping out of a breakpoint), we use the
PTRACE_SINGLESTEP ptrace request, which does single-step
the corresponding thread, but also resumes execution of all
other threads in the inferior.

This causes problems when debugging programs where another thread
receives multiple debug events while trying to single-step a specific
thread out of a breakpoint (with infrun traces turned on):

    (gdb) continue
    Continuing.
    infrun: clear_proceed_status_thread (Thread 126)
    [...]
    infrun: clear_proceed_status_thread (Thread 142)
    [...]
    infrun: clear_proceed_status_thread (Thread 146)
    infrun: clear_proceed_status_thread (Thread 125)
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=0)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: wait_for_inferior ()
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    infrun: random signal (GDB_SIGNAL_REALTIME_34)
    infrun: switching back to stepped thread
    infrun: Switching context from Thread 146 to Thread 142
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: prepare_to_wait
    [...handling of similar events for threads 145, 144 and 143 snipped...]
    infrun: prepare_to_wait
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

What happens is that GDB keeps sending requests to resume one specific
thread, and keeps receiving debugging events for other threads.
Things break down when one of the other threads receives a debug
event for the second time (thread 146 in the example above).

This patch fixes the problem by making sure that only one thread
gets resumed, thus preventing the other threads from generating
an unexpected event.

gdb/gdbserver/ChangeLog:

        * lynx-low.c (lynx_resume): Use PTRACE_SINGLESTEP_ONE if N == 1.
        Remove FIXME comment about assumption about N.
---
 gdb/gdbserver/lynx-low.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gdb/gdbserver/lynx-low.c b/gdb/gdbserver/lynx-low.c
index 6178e03..3b83669 100644
--- a/gdb/gdbserver/lynx-low.c
+++ b/gdb/gdbserver/lynx-low.c
@@ -320,10 +320,11 @@ lynx_attach (unsigned long pid)
 static void
 lynx_resume (struct thread_resume *resume_info, size_t n)
 {
-  /* FIXME: Assume for now that n == 1.  */
   ptid_t ptid = resume_info[0].thread;
-  const int request = (resume_info[0].kind == resume_step
-                       ? PTRACE_SINGLESTEP : PTRACE_CONT);
+  const int request
+    = (resume_info[0].kind == resume_step
+       ? (n == 1 ? PTRACE_SINGLESTEP_ONE : PTRACE_SINGLESTEP)
+       : PTRACE_CONT);
   const int signal = resume_info[0].sig;

   /* If given a minus_one_ptid, then try using the current_process'
-- 
1.9.1
