Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: "Schimpe, Christina" <christina.schimpe@intel.com>,
	Andrew Burgess <aburgess@redhat.com>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [PATCH] [gdb/tdep] Fix unrelocated pc in i386_displaced_step_fixup
Date: Thu, 19 Mar 2026 23:34:08 +0100	[thread overview]
Message-ID: <22f19822-3ae3-4757-968d-18281486c3c1@suse.de> (raw)
In-Reply-To: <SN7PR11MB7638D4D8E345900CB244389BF94FA@SN7PR11MB7638.namprd11.prod.outlook.com>

On 3/19/26 6:40 PM, Schimpe, Christina wrote:
>> -----Original Message-----
>> From: Andrew Burgess <aburgess@redhat.com>
>> Sent: Donnerstag, 19. März 2026 13:34
>> To: Tom de Vries <tdevries@suse.de>; gdb-patches@sourceware.org
>> Subject: Re: [PATCH] [gdb/tdep] Fix unrelocated pc in
>> i386_displaced_step_fixup
>>
>>
>> Hi Tom,
>>
>> Thanks for fixing this.  Just a couple of minor notes:
>>
>> Tom de Vries <tdevries@suse.de> writes:
>>
>>> With test-case gdb.threads/next-fork-other-thread.exp and target board
>>> unix/-m32 I run into:
>>> ...
>>> (gdb) next^M
>>> [Switching to Thread 0xf643ab40 (LWP 3267939)]^M ^M Thread 5
>>> "next-fork-other" hit Breakpoint 1, main () at \
>>>    next-fork-other-thread.c:73^M
>>> 73        alarm (60);^M
>>> (gdb) FAIL: $exp: fork_func=fork: target-non-stop=off: non-stop=off: \
>>>    displaced-stepping=on: i=$n: next to break here ...
>>
>> Please can you clean up the log output to remove ^M line endings when
>> quoting it in commit messages.  Commit messages are written once, but read
>> possibly multiple times, so it is right that we invest more effort when writing
>> them in order to make them easier to read.  Leaving the ^M endings in place
>> just adds unnecessary noise.
>>
>>>
>>> Before we go into how this happens, let's first look at the inferior.
>>>
>>> In main, 4 threads are started with the same thread function, leaving
>>> all 5 threads in a loop:
>>> - the main thread is stuck in a loop calling sleep, and gdb steps through this
>>>    loop using next
>>> - the other, non-main threads are stuck in a loop where each thread:
>>>    - forks off a child process that exits immediately
>>>    - waits for the child process to exit
>>>    - calls sleep
>>>
>>> The FAIL happens as follows (following snippets from this gdb.log [1]).
>>>
>>> One of the non-main threads enters __syscall_cancel_arch to do a
>>> sycall ( either to sleep, or to wait).
>>>
>>> Then the non-main thread stops:
>>> ...
>>>    [infrun] handle_signal_stop: [2937316.2937324.0] hit another thread's \
>>>      single-step breakpoint^M
>>>    [infrun] handle_signal_stop: delayed software breakpoint trap, ignoring^M
>>>    [infrun] switch_back_to_stepped_thread: need to step
>> [2937316.2937324.0] \
>>>      over single-step breakpoint^M
>>> ...
>>> because we "hit another thread's single-step breakpoint".
>>>
>>> AFAIU, this is because of the main thread stepping through
>>> __syscall_cancel_arch.
>>>
>>> To handle this, we're going to try displaced stepping.
>>>
>>> The syscall instruction is copied:
>>> ...
>>>    [displaced] displaced_step_prepare_throw: original insn 0xf7cdce69: \
>>>      cd 80      int    $0x80^M
>>>    [displaced] prepare: selected buffer at 0x80490d2^M ...
>>> to a buffer at _start+2:
>>> ...
>>> 080490d0 <_start>:
>>>   80490d0:       31 ed                   xor    %ebp,%ebp
>>>   80490d2:       5e                      pop    %esi
>>> ...
>>> and we're going to resume execution there.
>>>
>>> However, right after resuming we get a GDB_SIGNAL_CHLD for the same
>> thread.
>>>
>>> Part of handling that is finalizing the displaced stepping:
>>> ...
>>>    [displaced] finish: restored 2937316.2937324.0 0x80490d2^M
>>>    [displaced] i386_displaced_step_fixup: fixup (0xf7cdce69, 0x80490d2), \
>>>      insn = 0xcd 0x80 ...^M
>>>    [displaced] i386_displaced_step_fixup: syscall changed %eip; not
>> relocating^M
>>>    [infrun] handle_signal_stop: stop_pc=0x80490d2^M ...
>>>
>>> The stop pc is 0x80490d2, the address of the copied instruction.  In
>>> other words, we've stopped without making progress.
>>>
>>> The problem is that the address is in the displaced stepping buffer,
>>> and needs relocating, but instead we have "syscall changed %eip; not
>> relocating".
>>>
>>> The code in i386_displaced_step_fixup doesn't recognize this situation:
>>> ...
>>>        if (i386_syscall_p (insn, &insn_len)
>>> 	  && pc != to + (insn - insn_start) + insn_len
>>> 	  /* GDB can get control back after the insn after the syscall.
>>> 	     Presumably this is a kernel bug.
>>> 	     i386_displaced_step_copy_insn ensures it's a nop,
>>> 	     we add one to the length for it.  */
>>> 	  && pc != to + (insn - insn_start) + insn_len + 1)
>>>   	displaced_debug_printf ("syscall changed %%eip; not relocating");
>>> ...
>>>
>>> It only handles the cases where the stop pc is:
>>> - the address after the syscall insn, or
>>> - the address after the nop after the syscall insn
>>>
>>> So, instead of relocating the stop pc back to the original 0xf7cdce69,
>>> it stays 0x80490d2.  After resuming at that address, the thread:
>>> - executes the syscall,
>>> - executes the rest of _start,
>>> - enters main, and
>>> - runs into the breakpoint at the start of main.
>>>
>>> Since commit cf141dd8ccd ("gdb: fix reg corruption from displaced
>>> stepping on amd64"), we do handle the "pc == to" case in
>> amd64_displaced_step_fixup:
>>> ...
>>>         if (amd64_syscall_p (insn_details, &insn_len)
>>>   	  /* GDB can get control back after the insn after the syscall.
>>> 	     Presumably this is a kernel bug.  Fixup ensures its a nop, we
>>> 	     add one to the length for it.  */
>>> 	  && (pc < to || pc > (to + insn_len + 1)))
>>>   	displaced_debug_printf ("syscall changed %%rip; not relocating");
>>> ...
>>>
>>> Fix this in the same way.
>>>
>>> Tested on x86_64-linux, with target board unix/-m32.
>>>
>>> On openSUSE Tumbleweed (kernel version 6.19.7), this patch fixes the test-
>> case.
>>>
>>> On openSUSE Leap 16.0 (kernel version 6.12.0), we still run into PR29040.
>>
>> I still see these failures:
>>
>>    FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-
>> stop=off: non-stop=off: displaced-stepping=auto: i=4: next to other line
>>    FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-
>> stop=off: non-stop=off: displaced-stepping=on: i=2: next to other line
>>    FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-
>> stop=off: non-stop=off: displaced-stepping=off: i=4: next to for loop
>>
>> The first two are described by PR gdb/29040, but the third is a different error
>> message, but I don't know if the third failure is a consequence of the first two
>> failures or not.  Still I do think the fix you are proposing here is correct, so
>> with the commit message cleaned
>> up:
>>
>> Approved-By: Andrew Burgess <aburgess@redhat.com>
>>
>> Thanks,
>> Andrew
>>
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33997
>>>
>>> [1] https://sourceware.org/bugzilla/attachment.cgi?id=16660
>>> ---
>>>   gdb/i386-tdep.c | 3 +--
>>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c index
>>> b9013e183c2..93357b41b10 100644
>>> --- a/gdb/i386-tdep.c
>>> +++ b/gdb/i386-tdep.c
>>> @@ -813,12 +813,11 @@ i386_displaced_step_fixup (struct gdbarch
>> *gdbarch,
>>>   	 it unrelocated.  Goodness help us if there are PC-relative
>>>   	 system calls.  */
>>>         if (i386_syscall_p (insn, &insn_len)
>>> -	  && pc != to + (insn - insn_start) + insn_len
>>>   	  /* GDB can get control back after the insn after the syscall.
>>>   	     Presumably this is a kernel bug.
>>>   	     i386_displaced_step_copy_insn ensures it's a nop,
>>>   	     we add one to the length for it.  */
>>> -	  && pc != to + (insn - insn_start) + insn_len + 1)
>>> +	  && (pc < to || pc > to + (insn - insn_start) + insn_len + 1))
>>>   	displaced_debug_printf ("syscall changed %%eip; not relocating");
>>>         else
>>>   	{
>>>
>>> base-commit: 9cc83ec0ce9b4b75e8cd2b0c46f23d4cbf4b2f2b
>>> --
>>> 2.51.0
>>
> 
> Thank you for working on this.
> 
> I cannot reproduce the issue(s), as I do not have that specific system available.
> In any case, the fix makes sense to me. However, I am only adding my Reviewed-by
> tag here as this is an area I do not yet feel completely confident about.
> Andrew approved this anyways already (thanks!). So with the commit message fixed:
> 
> Reviewed-By: Christina Schimpe <christina.schimpe@intel.com>

Thanks both for the reviews, pushed.

- Tom

> Christina
> Intel Deutschland GmbH
> Registered Address: Dornacher Straße 1, 85622 Feldkirchen, Germany
> Tel: +49 89 991 430, www.intel.de
> Managing Directors: Harry Demas, Jeffrey Schneiderman, Yin Chong Sorrell
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Seat: Munich
> Commercial Register: Amtsgericht München HRB 186928


  reply	other threads:[~2026-03-19 22:34 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-17  7:19 Tom de Vries
2026-03-19 12:34 ` Andrew Burgess
2026-03-19 17:40   ` Schimpe, Christina
2026-03-19 22:34     ` Tom de Vries [this message]
2026-03-20 16:33   ` Tom de Vries

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=22f19822-3ae3-4757-968d-18281486c3c1@suse.de \
    --to=tdevries@suse.de \
    --cc=aburgess@redhat.com \
    --cc=christina.schimpe@intel.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox