Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Tom de Vries <tdevries@suse.de>, gdb-patches@sourceware.org
Subject: Re: [PATCH] [gdb/tdep] Fix unrelocated pc in i386_displaced_step_fixup
Date: Thu, 19 Mar 2026 12:34:03 +0000	[thread overview]
Message-ID: <87h5qc577o.fsf@redhat.com> (raw)
In-Reply-To: <20260317071925.3543-1-tdevries@suse.de>


Hi Tom,

Thanks for fixing this.  Just a couple of minor notes:

Tom de Vries <tdevries@suse.de> writes:

> With test-case gdb.threads/next-fork-other-thread.exp and target board
> unix/-m32 I run into:
> ...
> (gdb) next^M
> [Switching to Thread 0xf643ab40 (LWP 3267939)]^M
> ^M
> Thread 5 "next-fork-other" hit Breakpoint 1, main () at \
>   next-fork-other-thread.c:73^M
> 73        alarm (60);^M
> (gdb) FAIL: $exp: fork_func=fork: target-non-stop=off: non-stop=off: \
>   displaced-stepping=on: i=$n: next to break here
> ...

Please can you clean up the log output to remove ^M line endings when
quoting it in commit messages.  Commit messages are written once, but
read possibly multiple times, so it is right that we invest more effort
when writing them in order to make them easier to read.  Leaving the ^M
endings in place just adds unnecessary noise.

>
> Before we go into how this happens, let's first look at the inferior.
>
> In main, 4 threads are started with the same thread function, leaving all 5
> threads in a loop:
> - the main thread is stuck in a loop calling sleep, and gdb steps through this
>   loop using next
> - the other, non-main threads are stuck in a loop where each thread:
>   - forks off a child process that exits immediately
>   - waits for the child process to exit
>   - calls sleep
>
> The FAIL happens as follows (following snippets from this gdb.log [1]).
>
> One of the non-main threads enters __syscall_cancel_arch to do a sycall (
> either to sleep, or to wait).
>
> Then the non-main thread stops:
> ...
>   [infrun] handle_signal_stop: [2937316.2937324.0] hit another thread's \
>     single-step breakpoint^M
>   [infrun] handle_signal_stop: delayed software breakpoint trap, ignoring^M
>   [infrun] switch_back_to_stepped_thread: need to step [2937316.2937324.0] \
>     over single-step breakpoint^M
> ...
> because we "hit another thread's single-step breakpoint".
>
> AFAIU, this is because of the main thread stepping through
> __syscall_cancel_arch.
>
> To handle this, we're going to try displaced stepping.
>
> The syscall instruction is copied:
> ...
>   [displaced] displaced_step_prepare_throw: original insn 0xf7cdce69: \
>     cd 80      int    $0x80^M
>   [displaced] prepare: selected buffer at 0x80490d2^M
> ...
> to a buffer at _start+2:
> ...
> 080490d0 <_start>:
>  80490d0:       31 ed                   xor    %ebp,%ebp
>  80490d2:       5e                      pop    %esi
> ...
> and we're going to resume execution there.
>
> However, right after resuming we get a GDB_SIGNAL_CHLD for the same thread.
>
> Part of handling that is finalizing the displaced stepping:
> ...
>   [displaced] finish: restored 2937316.2937324.0 0x80490d2^M
>   [displaced] i386_displaced_step_fixup: fixup (0xf7cdce69, 0x80490d2), \
>     insn = 0xcd 0x80 ...^M
>   [displaced] i386_displaced_step_fixup: syscall changed %eip; not relocating^M
>   [infrun] handle_signal_stop: stop_pc=0x80490d2^M
> ...
>
> The stop pc is 0x80490d2, the address of the copied instruction.  In other
> words, we've stopped without making progress.
>
> The problem is that the address is in the displaced stepping buffer, and needs
> relocating, but instead we have "syscall changed %eip; not relocating".
>
> The code in i386_displaced_step_fixup doesn't recognize this situation:
> ...
>       if (i386_syscall_p (insn, &insn_len)
> 	  && pc != to + (insn - insn_start) + insn_len
> 	  /* GDB can get control back after the insn after the syscall.
> 	     Presumably this is a kernel bug.
> 	     i386_displaced_step_copy_insn ensures it's a nop,
> 	     we add one to the length for it.  */
> 	  && pc != to + (insn - insn_start) + insn_len + 1)
>  	displaced_debug_printf ("syscall changed %%eip; not relocating");
> ...
>
> It only handles the cases where the stop pc is:
> - the address after the syscall insn, or
> - the address after the nop after the syscall insn
>
> So, instead of relocating the stop pc back to the original 0xf7cdce69, it
> stays 0x80490d2.  After resuming at that address, the thread:
> - executes the syscall,
> - executes the rest of _start,
> - enters main, and
> - runs into the breakpoint at the start of main.
>
> Since commit cf141dd8ccd ("gdb: fix reg corruption from displaced stepping on
> amd64"), we do handle the "pc == to" case in amd64_displaced_step_fixup:
> ...
>        if (amd64_syscall_p (insn_details, &insn_len)
>  	  /* GDB can get control back after the insn after the syscall.
> 	     Presumably this is a kernel bug.  Fixup ensures its a nop, we
> 	     add one to the length for it.  */
> 	  && (pc < to || pc > (to + insn_len + 1)))
>  	displaced_debug_printf ("syscall changed %%rip; not relocating");
> ...
>
> Fix this in the same way.
>
> Tested on x86_64-linux, with target board unix/-m32.
>
> On openSUSE Tumbleweed (kernel version 6.19.7), this patch fixes the test-case.
>
> On openSUSE Leap 16.0 (kernel version 6.12.0), we still run into PR29040.

I still see these failures:

  FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=4: next to other line
  FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=2: next to other line
  FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=4: next to for loop

The first two are described by PR gdb/29040, but the third is a
different error message, but I don't know if the third failure is a
consequence of the first two failures or not.  Still I do think the fix
you are proposing here is correct, so with the commit message cleaned
up:

Approved-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33997
>
> [1] https://sourceware.org/bugzilla/attachment.cgi?id=16660
> ---
>  gdb/i386-tdep.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> index b9013e183c2..93357b41b10 100644
> --- a/gdb/i386-tdep.c
> +++ b/gdb/i386-tdep.c
> @@ -813,12 +813,11 @@ i386_displaced_step_fixup (struct gdbarch *gdbarch,
>  	 it unrelocated.  Goodness help us if there are PC-relative
>  	 system calls.  */
>        if (i386_syscall_p (insn, &insn_len)
> -	  && pc != to + (insn - insn_start) + insn_len
>  	  /* GDB can get control back after the insn after the syscall.
>  	     Presumably this is a kernel bug.
>  	     i386_displaced_step_copy_insn ensures it's a nop,
>  	     we add one to the length for it.  */
> -	  && pc != to + (insn - insn_start) + insn_len + 1)
> +	  && (pc < to || pc > to + (insn - insn_start) + insn_len + 1))
>  	displaced_debug_printf ("syscall changed %%eip; not relocating");
>        else
>  	{
>
> base-commit: 9cc83ec0ce9b4b75e8cd2b0c46f23d4cbf4b2f2b
> -- 
> 2.51.0


  reply	other threads:[~2026-03-19 12:34 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-17  7:19 Tom de Vries
2026-03-19 12:34 ` Andrew Burgess [this message]
2026-03-19 17:40   ` Schimpe, Christina
2026-03-19 22:34     ` Tom de Vries
2026-03-20 16:33   ` Tom de Vries

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h5qc577o.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=tdevries@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox