From: Simon Marchi <simon.marchi@polymtl.ca>
To: Simon Marchi <simon.marchi@efficios.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 2/2] gdb/dwarf: fix internal error when FDEs do not describe the CFA
Date: Tue, 17 Mar 2026 15:33:19 -0400
Message-ID: <f20665fd-090d-4a1c-9fba-25d3f0487dcb@polymtl.ca>
In-Reply-To: <20260316172239.349677-2-simon.marchi@efficios.com>
On 2026-03-16 13:22, Simon Marchi wrote:
> From: Simon Marchi <simon.marchi@polymtl.ca>
>
> This patch fixes an internal error problem that happens when a frame
> description entry does not define the Canonical Frame Address (CFA).
> This problem was initially reported downstream as a ROCgdb issue (see
> Bug trailer below), but I wrote a reproducer that uses the .debug_frame
> functionality added to the DWARF assembler in the previous patch.
>
> The error is:
>
> /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1046: internal-error: Unknown CFA rule.
>
> The original bug was encountered while debugging a GPU kernel written
> with Triton [1]. From what I understand, the generated kernel does not
> really use a stack, so the .debug_frame contents generated is quite
> bare:
>
> $ readelf --debug-dump=frames k
> Contents of the .debug_frame section:
>
> 00000000 000000000000000c ffffffff CIE
> Version: 4
> Augmentation: ""
> Pointer Size: 8
> Segment Size: 0
> Code alignment factor: 4
> Data alignment factor: 4
> Return address column: 16
>
> DW_CFA_nop
>
> 00000010 0000000000000014 00000000 FDE cie=00000000 pc=0000000000001600..0000000000001704
>
> For those who don't speak fluent .debug_frame, what we see here is a
> Frame Description Entry (FDE) that doesn't define any register rule,
> referring to a Common Information Entry (CIE) that also doesn't define
> any initial register rule. This is equivalent to having no unwind
> information at all. One question is: why generate these at all? I
> suppose that this is an edge case, that the compiler is written in a way
> that presumes there will always be some unwind info, and that there is
> no "if unwind info is empty, skip emitting the FDE" check. Anyway, the
> important thing for us is that these can be found in the wild, so GDB
> shouldn't crash.
>
> The first part of the fix is to handle CFA_UNSET in dwarf2_frame_cache
> (and do nothing). CFA_UNSET is the initial state when we start
> interpreting a CFA program, meaning that we don't know yet how the CFA
> is defined. In our case, it remains unset after interpreting the CFA
> program.
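To make the CFA_UNSET part concrete, here is a minimal, self-contained
sketch of the idea. It is not the code in dwarf2/frame.c: cfa_kind,
cfa_rule and compute_cfa are made-up names, and the exception is only a
stand-in for the internal_error call that produces "Unknown CFA rule.".

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

// Hypothetical, simplified stand-ins for GDB's CFA rule kinds.
enum class cfa_kind { unset, reg_offset };

struct cfa_rule
{
  cfa_kind how = cfa_kind::unset;  // State before any CFA instruction runs.
  std::uint64_t reg_value = 0;     // Value already read from the CFA register.
  std::int64_t offset = 0;         // Offset to add to that value.
};

// Return the CFA if the rule defines one, or nothing if it was never set.
std::optional<std::uint64_t>
compute_cfa (const cfa_rule &rule)
{
  switch (rule.how)
    {
    case cfa_kind::unset:
      // The CFA program never defined the CFA: leave it unknown
      // instead of treating the state as an error.
      return std::nullopt;

    case cfa_kind::reg_offset:
      return rule.reg_value + rule.offset;

    default:
      // Stand-in for the internal error hit before the fix.
      throw std::logic_error ("Unknown CFA rule.");
    }
}
```

With a default-constructed cfa_rule, compute_cfa returns an empty
optional rather than throwing, which mirrors "handle CFA_UNSET and do
nothing".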
>
> Then, we would ideally want to get into this `if` below that sets
> undefined_retaddr:
>
>   if (fs.retaddr_column < fs.regs.reg.size ()
>       && fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
>     cache->undefined_retaddr = true;
>
> Setting undefined_retaddr has two effects:
>
> - dwarf2_frame_this_id won't try to build a frame id from the CFA
> - dwarf2_frame_unwind_stop_reason will return UNWIND_OUTERMOST, which
> is the most accurate thing we can return here (there is no outer
> frame)
>
> However, the way it is written currently, we don't get into the if.
> `fs.regs.reg.size ()` is 0, so the condition always evaluates to false.
> The `fs.regs.reg` is a vector that is expanded as needed: if an
> operation sets a rule for register N, then we'll resize the vector so it
> holds at least `N + 1` elements. But conceptually, all register columns
> initially contain "undefined". If we arrive at this condition and the
> vector hasn't been expanded to include a given column, then it means
> that the rule for this column is "undefined". Therefore, rewrite the
> condition to consider the return address as undefined if the vector is
> too small to include retaddr_column.
This is not really true, at least in the real world. In GDB, registers
start as "unspecified", not "undefined". It seems like "undefined"
means: the CFI positively told us that this register's previous value is
unknown. Unspecified means that the CFI hasn't said anything about it,
and this can be interpreted in various ways. See below.
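A minimal sketch of the distinction, using made-up names (reg_rule,
reg_table) rather than GDB's actual types. It assumes a rule table that
grows on demand and whose new elements start out as "unspecified":

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of the per-register rule table.
enum class reg_rule
{
  unspecified,   // The CFI said nothing about this column.
  undefined,     // The CFI explicitly said the previous value is lost.
  saved_offset,  // The previous value is saved at CFA + offset.
};

struct reg_table
{
  std::vector<reg_rule> reg;

  // Record a rule for COLUMN, growing the table as needed.  New
  // elements are value-initialized, so they start as "unspecified".
  void set (std::size_t column, reg_rule rule)
  {
    if (column >= reg.size ())
      reg.resize (column + 1);
    reg[column] = rule;
  }

  // Rule for COLUMN, treating columns past the end like untouched ones.
  reg_rule get (std::size_t column) const
  {
    return column < reg.size () ? reg[column] : reg_rule::unspecified;
  }
};
```

In this model, a column that lies past the end of the vector and a
column that was allocated but never written both read back as
"unspecified". Only an explicit set (..., reg_rule::undefined) yields
"undefined".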
>
> Here are some relevant references to DWARF 5:
>
> - Section 6.4.1. ("Structure of Call Frame Information")
>
> The default rule for all columns before interpretation of the
> initial instructions is the undefined rule.
>
> - Section 6.4.4 ("Call Frame Calling Address")
>
> If a Return Address register is defined in the virtual unwind
> table, and its rule is undefined (for example, by
> DW_CFA_undefined), then there is no return address and no call
> address, and the virtual unwind of stack activations is complete.
>
> Add a test case written using the DWARF assembler that reproduces the
> issue. The user experience in this case is that the frame appears as
> the outermost frame:
>
> (gdb) bt
> #0  0x000055555555511d in main ()
> (gdb) up
> ❌️ Initial frame selected; you cannot go up.
> (gdb) frame 1
> ❌️ No frame at level 1.
>
> [1] https://triton-lang.org/
>
> Change-Id: I67c717ff03a41c0630a73ce9549d88ff363e8cea
> Bug: https://github.com/ROCm/ROCgdb/issues/47
> ---
>  gdb/dwarf2/frame.c                            |  7 ++-
>  .../gdb.dwarf2/debug-frame-no-cfa.exp         | 51 +++++++++++++++++++
>  2 files changed, 56 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.dwarf2/debug-frame-no-cfa.exp
>
> diff --git a/gdb/dwarf2/frame.c b/gdb/dwarf2/frame.c
> index 152bebef0e30..fe74a4f65223 100644
> --- a/gdb/dwarf2/frame.c
> +++ b/gdb/dwarf2/frame.c
> @@ -962,6 +962,9 @@ dwarf2_frame_cache (const frame_info_ptr &this_frame, void **this_cache)
>    /* Calculate the CFA.  */
>    switch (fs.regs.cfa_how)
>      {
> +    case CFA_UNSET:
> +      break;
> +
>      case CFA_REG_OFFSET:
>        cache->cfa = read_addr_from_reg (this_frame, fs.regs.cfa_reg);
>        if (fs.armcc_cfa_offsets_reversed)
> @@ -1074,8 +1077,8 @@ incomplete CFI data; unspecified registers (e.g., %s) at %s"),
>          }
>      }
>
> -  if (fs.retaddr_column < fs.regs.reg.size ()
> -      && fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
> +  if (fs.retaddr_column >= fs.regs.reg.size ()
> +      || fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
>      cache->undefined_retaddr = true;
The Linaro CI informed me that this change causes regressions on
AArch64. The unwind info for AArch64 doesn't appear to (always) include
information about the return address. With my change, we get into this
if, and erroneously think that the return address is undefined.
And then we never unwind past frame 0.
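To illustrate the regression, here is a small sketch that compares the
two conditions on a simplified rule table. The names and the column
numbers used with it are illustrative assumptions, not GDB's code and
not real AArch64 unwind data:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GDB's register rule kinds.
enum class reg_rule { unspecified, undefined, saved_offset };

// Condition before the patch: only an explicit "undefined" rule counts.
bool
retaddr_undefined_old (const std::vector<reg_rule> &reg,
                       std::size_t retaddr_column)
{
  return (retaddr_column < reg.size ()
          && reg[retaddr_column] == reg_rule::undefined);
}

// Condition from the patch: a column past the end of the table also
// counts as "undefined".
bool
retaddr_undefined_new (const std::vector<reg_rule> &reg,
                       std::size_t retaddr_column)
{
  return (retaddr_column >= reg.size ()
          || reg[retaddr_column] == reg_rule::undefined);
}
```

With an empty table, the new condition is true and the old one is
false, which is what the patch wanted for the bare FDE. But with a
table that only holds rules for a few low-numbered columns and a return
address column beyond its end, the new condition is also true, even
though the CFI never said the return address was undefined. That second
case is the shape of the AArch64 failure described above.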
More thinking needed, as well as learning about the CFI corner cases...
Simon