Date: Tue, 17 Mar 2026 15:33:19 -0400
Subject: Re: [PATCH 2/2] gdb/dwarf: fix internal error when FDEs do not describe the CFA
To: Simon Marchi, gdb-patches@sourceware.org
From: Simon Marchi
References: <20260316172239.349677-1-simon.marchi@efficios.com> <20260316172239.349677-2-simon.marchi@efficios.com>
In-Reply-To: <20260316172239.349677-2-simon.marchi@efficios.com>

On 2026-03-16 13:22, Simon Marchi wrote:
> From: Simon Marchi
>
> This patch fixes an internal error that happens when a frame
> description entry does not define the Canonical Frame Address (CFA).
> This problem was initially reported downstream as a ROCgdb issue (see
> Bug trailer below), but I wrote a reproducer that uses the .debug_frame
> functionality added to the DWARF assembler in the previous patch.
>
> The error is:
>
>     /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1046: internal-error: Unknown CFA rule.
>
> The original bug was encountered while debugging a GPU kernel written
> with Triton [1].
> From what I understand, the generated kernel does not really use a
> stack, so the generated .debug_frame contents are quite bare:
>
>     $ readelf --debug-dump=frames k
>     Contents of the .debug_frame section:
>
>     00000000 000000000000000c ffffffff CIE
>       Version:               4
>       Augmentation:          ""
>       Pointer Size:          8
>       Segment Size:          0
>       Code alignment factor: 4
>       Data alignment factor: 4
>       Return address column: 16
>
>       DW_CFA_nop
>
>     00000010 0000000000000014 00000000 FDE cie=00000000 pc=0000000000001600..0000000000001704
>
> For those who don't speak fluent .debug_frame, what we see here is a
> Frame Description Entry (FDE) that doesn't define any register rule,
> referring to a Common Information Entry (CIE) that also doesn't define
> any initial register rule.  This is equivalent to having no unwind
> information at all.  One question is: why generate these at all?  I
> suppose that this is an edge case, that the compiler is written in a way
> that presumes there will always be some unwind info.  That there is no
> "if unwind info is empty, skip emitting the FDE" check.  Anyway, the
> important thing for us is that these can be found in the wild, so GDB
> shouldn't crash.
>
> The first part of the fix is to handle CFA_UNSET in dwarf2_frame_cache
> (and do nothing).  CFA_UNSET is the initial state when we start
> interpreting a CFA program, meaning that we don't know yet how the CFA
> is defined.  In our case, it remains unset after interpreting the CFA
> program.
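
As a rough illustration of that first part of the fix, here is a standalone sketch, not GDB's actual code: `cfa_kind`, `cfa_rule` and `compute_cfa` are hypothetical names, and the register value is stored directly in the rule instead of being read from a frame.  The idea it shows is that an unset CFA rule is reported as "no CFA available" rather than falling through to the "Unknown CFA rule." error path:

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the CFA rule kinds; not GDB's real enum.
enum class cfa_kind
{
  unset,       // the CFA program never defined the CFA
  reg_offset,  // CFA = register value + offset
};

// Hypothetical, simplified CFA rule.  The register value is stored
// directly here instead of being read from a frame.
struct cfa_rule
{
  cfa_kind how = cfa_kind::unset;
  std::uint64_t reg_value = 0;
  std::int64_t offset = 0;
};

// Return true and store the CFA in *CFA if RULE defines one.  Return
// false, leaving *CFA untouched, if the rule is still unset.
bool
compute_cfa (const cfa_rule &rule, std::uint64_t *cfa)
{
  switch (rule.how)
    {
    case cfa_kind::unset:
      return false;

    case cfa_kind::reg_offset:
      *cfa = rule.reg_value + static_cast<std::uint64_t> (rule.offset);
      return true;
    }

  // Only reached for a kind this sketch does not know about.
  throw std::logic_error ("Unknown CFA rule.");
}
```

A caller would check the return value before trying to build a frame id from the CFA.
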
>
> Then, we would ideally want to get into this `if` below that sets
> undefined_retaddr:
>
>     if (fs.retaddr_column < fs.regs.reg.size ()
>         && fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
>       cache->undefined_retaddr = true;
>
> Setting undefined_retaddr has two effects:
>
>  - dwarf2_frame_this_id won't try to build a frame id from the CFA
>  - dwarf2_frame_unwind_stop_reason will return UNWIND_OUTERMOST, which
>    is the most accurate thing we can return here (there is no outer
>    frame)
>
> However, the way it is written currently, we don't get into the `if`.
> `fs.regs.reg.size ()` is 0, so the condition always evaluates to false.
> The `fs.regs.reg` is a vector that is expanded as needed: if an
> operation sets a rule for register N, then we'll resize the vector so it
> holds at least `N + 1` elements.  But conceptually, all register columns
> initially contain "undefined".  If we arrive at this condition and the
> vector hasn't been expanded to include a given column, then it means
> that the rule for this column is "undefined".  Therefore, rewrite the
> condition to consider the return address as undefined if the vector is
> too small to include retaddr_column.

This is not really true, at least in the real world.  In GDB, registers
start as "unspecified", not "undefined".  It seems like "undefined"
means: the CFI positively told us that this register's previous value is
unknown.  "Unspecified" means that the CFI hasn't said anything about
it, and this can be interpreted in various ways.  See below.

>
> Here are some relevant references to DWARF 5:
>
>  - Section 6.4.1 ("Structure of Call Frame Information")
>
>      The default rule for all columns before interpretation of the
>      initial instructions is the undefined rule.
>
>  - Section 6.4.4 ("Call Frame Calling Address")
>
>      If a Return Address register is defined in the virtual unwind
>      table, and its rule is undefined (for example, by
>      DW_CFA_undefined), then there is no return address and no call
>      address, and the virtual unwind of stack activations is complete.
>
> Add a test case written using the DWARF assembler that reproduces the
> issue.  The user experience in this case is that the frame appears as
> the outermost frame:
>
>     (gdb) bt
>     #0  0x000055555555511d in main ()
>     (gdb) up
>     ❌️ Initial frame selected; you cannot go up.
>     (gdb) frame 1
>     ❌️ No frame at level 1.
>
> [1] https://triton-lang.org/
>
> Change-Id: I67c717ff03a41c0630a73ce9549d88ff363e8cea
> Bug: https://github.com/ROCm/ROCgdb/issues/47
> ---
>  gdb/dwarf2/frame.c                            |  7 ++-
>  .../gdb.dwarf2/debug-frame-no-cfa.exp         | 51 +++++++++++++++++++
>  2 files changed, 56 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.dwarf2/debug-frame-no-cfa.exp
>
> diff --git a/gdb/dwarf2/frame.c b/gdb/dwarf2/frame.c
> index 152bebef0e30..fe74a4f65223 100644
> --- a/gdb/dwarf2/frame.c
> +++ b/gdb/dwarf2/frame.c
> @@ -962,6 +962,9 @@ dwarf2_frame_cache (const frame_info_ptr &this_frame, void **this_cache)
>    /* Calculate the CFA.  */
>    switch (fs.regs.cfa_how)
>      {
> +    case CFA_UNSET:
> +      break;
> +
>      case CFA_REG_OFFSET:
>        cache->cfa = read_addr_from_reg (this_frame, fs.regs.cfa_reg);
>        if (fs.armcc_cfa_offsets_reversed)
> @@ -1074,8 +1077,8 @@ incomplete CFI data; unspecified registers (e.g., %s) at %s"),
>        }
>    }
>
> -  if (fs.retaddr_column < fs.regs.reg.size ()
> -      && fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
> +  if (fs.retaddr_column >= fs.regs.reg.size ()
> +      || fs.regs.reg[fs.retaddr_column].how == DWARF2_FRAME_REG_UNDEFINED)
>      cache->undefined_retaddr = true;

The Linaro CI informed me that this change causes regressions on
AArch64.  The unwind info for AArch64 doesn't appear to (always) include
information about the return address.
With my change, we get into this `if` and erroneously conclude that the
return address is undefined.  As a result, we never unwind past frame 0.

More thinking needed, as well as learning about the CFI corner cases...

Simon
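
To make the "unspecified" versus "undefined" distinction discussed above concrete, here is a standalone sketch.  It is not GDB's code: `reg_rule`, `rule_for_column` and `retaddr_is_undefined` are hypothetical names chosen for illustration.  It assumes a sparse rule vector like `fs.regs.reg`, where columns past the end were never mentioned by the CFI, and it treats those columns as unspecified rather than undefined:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for register-rule kinds; not GDB's real enum.
enum class reg_rule
{
  unspecified,   // the CFI said nothing about this column
  undefined,     // the CFI explicitly marked the previous value as unknown
  saved_offset,  // the previous value is saved at CFA + offset
};

// Look up the rule for COLUMN in a sparse rule table.  Columns beyond
// the vector's size were never mentioned by the CFI, so they are
// reported as unspecified, not undefined.
reg_rule
rule_for_column (const std::vector<reg_rule> &rules, std::size_t column)
{
  if (column >= rules.size ())
    return reg_rule::unspecified;

  return rules[column];
}

// The return address counts as undefined only when the CFI explicitly
// said so; an unspecified column does not end the unwind.
bool
retaddr_is_undefined (const std::vector<reg_rule> &rules,
                      std::size_t retaddr_column)
{
  return rule_for_column (rules, retaddr_column) == reg_rule::undefined;
}
```

Under this reading, unwind info that never mentions the return address column (as reported for AArch64) does not stop the unwind at frame 0.  The flip side is that the bare FDE from the original report would not be flagged as outermost by this check alone, so that case would presumably need to be detected some other way.
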