From: Guinevere Larsen <guinevere@redhat.com>
To: Pawel Kupczak <pawel.kupczak@intel.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 1/3] gdb, amd64: extend the amd64 prologue analyzer to skip register pushes
Date: Fri, 18 Jul 2025 10:43:39 -0300 [thread overview]
Message-ID: <91a1e38e-2978-4e81-a168-762ed55d3811@redhat.com> (raw)
In-Reply-To: <20250701104759.52595-2-pawel.kupczak@intel.com>
Hello! Thanks for working on this!
I am not an authority on i386, but I did notice a few things that gave
me pause so I'd like to ask them to make sure I understand the changes.
Skipping the commit message:
On 7/1/25 7:47 AM, Pawel Kupczak wrote:
> ---
> gdb/amd64-tdep.c | 53 +++++++++++-
> .../amd64-extended-prologue-analysis.c | 49 +++++++++++
> .../amd64-extended-prologue-analysis.exp | 86 +++++++++++++++++++
> 3 files changed, 187 insertions(+), 1 deletion(-)
> mode change 100644 => 100755 gdb/amd64-tdep.c
> create mode 100644 gdb/testsuite/gdb.arch/amd64-extended-prologue-analysis.c
> create mode 100644 gdb/testsuite/gdb.arch/amd64-extended-prologue-analysis.exp
>
> diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
> old mode 100644
> new mode 100755
> index 82dd1e07cf3..863b29a8c27
> --- a/gdb/amd64-tdep.c
> +++ b/gdb/amd64-tdep.c
> @@ -2553,6 +2553,56 @@ amd64_analyze_frame_setup (gdbarch *gdbarch, CORE_ADDR pc,
> return pc;
> }
>
> +/* Check whether PC points at code pushing registers onto the stack. If so,
> + update CACHE and return pc after those pushes or CURRENT_PC, whichever is
> + smaller. Otherwise, return PC passed to this function. */
> +
> +static CORE_ADDR
> +amd64_analyze_register_saves (CORE_ADDR pc, CORE_ADDR current_pc,
> + amd64_frame_cache *cache)
> +{
> + gdb_byte op;
> + int offset = 0;
> +
> + /* There are at most 16 registers that would be pushed in the prologue. */
> + for (int i = 0; i < 16 && pc < current_pc; i++)
> + {
> + int reg = 0;
> + int pc_offset = 0;
> +
> + if (target_read_code (pc, &op, 1) == -1)
> + return pc;
> +
> + /* %r8 - %r15 prefix. */
> + if (op == 0x41)
Looking over on the disassembler for record-full, I see that all 0x4-
are considered prefixes, and (in 64 bit targets) they all have this effect.
Is this something that could affect this prologue analyzer? ie, that
some prefix like 0x40 is used to mean "push a register larger than 7",
which would cause us to not skip that instruction?
> + {
> + reg += 8;
> + pc_offset = 1;
> +
> + if (target_read_code (pc + 1, &op, 1) == -1)
> + return pc;
> + }
> +
> + /* push %rax|%rcx|%rdx|%rbx|%rsp|%rbp|%rsi|%rdi
> +
> + or with 0x41 prefix:
> + push %r8|%r9|%r10|%r11|%r12|%r13|%r14|%r15. */
> + if (op < 0x50 || op > 0x57)
> + break;
This code also ignores the "push lz" and "push lb" instructions. Could
they be used as prologue instructions? (again, this comes from the
recording disassembler)
If so, I would think this is worth explicity saying it in the comment at
the top of the function, otherwise it could confuse us in the future.
> +
> + reg += op - 0x50;
> + offset -= 8;
> +
> + int regnum = amd64_arch_reg_to_regnum (reg);
> + cache->saved_regs[regnum] = offset;
> + cache->sp_offset += 8;
> +
> + pc += 1 + pc_offset;
> + }
> +
> + return pc;
> +}
> +
> /* Do a limited analysis of the prologue at PC and update CACHE
> accordingly. Bail out early if CURRENT_PC is reached. Return the
> address where the analysis stopped.
> @@ -2594,7 +2644,8 @@ amd64_analyze_prologue (gdbarch *gdbarch, CORE_ADDR pc, CORE_ADDR current_pc,
> if (current_pc <= pc)
> return current_pc;
>
> - return amd64_analyze_frame_setup (gdbarch, pc, current_pc, cache);
> + pc = amd64_analyze_frame_setup (gdbarch, pc, current_pc, cache);
> + return amd64_analyze_register_saves (pc, current_pc, cache);
> }
>
> /* Work around false termination of prologue - GCC PR debug/48827.
> diff --git a/gdb/testsuite/gdb.arch/amd64-extended-prologue-analysis.c b/gdb/testsuite/gdb.arch/amd64-extended-prologue-analysis.c
> new file mode 100644
> index 00000000000..7d777d23236
> --- /dev/null
> +++ b/gdb/testsuite/gdb.arch/amd64-extended-prologue-analysis.c
> @@ -0,0 +1,49 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2025 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +int __attribute__ ((noinline))
> +bar (int x)
> +{
> + return x + x;
> +}
> +
> +/* This function should generate a prologue in shape of:
> + push %rbp
> + .cfi_def_cfa_offset 16
> + .cfi_offset %rbp, -16
> + mov %rsp, %rbp
> + .cfi_def_cfa_register %rbp
> + push %reg1
> + push %reg2
> + .cfi_offset %reg2, 32
> + .cfi_offset %reg1, 24
> +
> + So to be able to unwind a register, GDB needs to skip prologue past
> + register pushes (to access .cfi directives). */
In general, if possible, I think it's preferred that we actually use
assembly instead of relying on compiler behavior. You should be able to
use asm statements to create a fake function, and either the dwarf
assembler to actually label it as such... Maybe even using global labels
to do that more easily (not sure if GDB would use the prologue skip in
that case, though).
Checking the exp file, that is especially important because r12 is
hardcoded but compilers don't promise to be at all consistent about
which general purpose registers they'll use; and because you're relying
on optimizations which can change drastically between compilers and
versions. Case in point, this test will fail with clang. More below
> +int __attribute__ ((noinline))
> +foo (int a, int b, int c, int d)
> +{
> + a += bar (a) + bar (b) + bar (c);
> + return a;
> +}
> +
> +int
> +main (int argc, char **argv)
> +{
> + int a = foo (argc, argc + 1, argc + 2, argc * 2);
> + return a;
Clang seems to perform some tailcall optimization here, since GDB's
backtrace looks like this:
I also noticed that the test fails when using clang and getting a
backtrace, with the following output:
(gdb) bt
#0 0x000055555555442b in foo ()
#1 0x00007ffff7cc45f5 in __libc_start_call_main () from /lib64/libc.so.6
#2 0x00007ffff7cc46a8 in __libc_start_main_impl () from /lib64/libc.so.6
#3 0x0000555555554345 in _start ()
And the disasembly of "main" looks like this:
0000000000000460 <main>:
460: 55 push %rbp
461: 48 89 e5 mov %rsp,%rbp
464: 8d 77 01 lea 0x1(%rdi),%esi
467: 8d 57 02 lea 0x2(%rdi),%edx
46a: 5d pop %rbp
46b: e9 b0 ff ff ff jmp 420 <foo>
So clang actually noticed that the 'a' variable isn't really necessary,
and that the return of 'foo' is the return of main, so does a straight
jump to foo. Maybe this can be solved by having 'a' be volatile, but I
would still say that generating a .s file or a series of hand-written
assembly instructions to test with is much more reliable.
> +}
(Skipping exp file, since I have no comments about it specifically)
--
Cheers,
Guinevere Larsen
She/Her/Hers
next prev parent reply other threads:[~2025-07-18 13:44 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 10:47 [PATCH 0/3] extending the amd64 prologue analyzer Pawel Kupczak
2025-07-01 10:47 ` [PATCH 1/3] gdb, amd64: extend the amd64 prologue analyzer to skip register pushes Pawel Kupczak
2025-07-18 13:43 ` Guinevere Larsen [this message]
2025-07-18 15:15 ` Kupczak, Pawel
2025-07-23 10:34 ` Kupczak, Pawel
2025-07-23 16:07 ` Guinevere Larsen
2025-07-18 15:23 ` Andrew Burgess
2025-07-18 16:04 ` Schimpe, Christina
2025-07-01 10:47 ` [PATCH 2/3] gdb, amd64: return after amd64_analyze_frame_setup if current_pc reached Pawel Kupczak
2025-07-18 13:46 ` Guinevere Larsen
2025-07-18 15:19 ` Kupczak, Pawel
2025-07-18 14:46 ` Andrew Burgess
2025-07-18 15:21 ` Kupczak, Pawel
2025-07-01 10:47 ` [PATCH 3/3] gdb, amd64: extend the amd64 prologue analyzer to skip stack alloc Pawel Kupczak
2025-07-15 7:37 ` [PING] [PATCH 0/3] extending the amd64 prologue analyzer Kupczak, Pawel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=91a1e38e-2978-4e81-a168-762ed55d3811@redhat.com \
--to=guinevere@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=pawel.kupczak@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox