From: Andrew Burgess <aburgess@redhat.com>
To: gdb-patches@sourceware.org
Cc: Keith Seitz <keiths@redhat.com>, Kevin Buettner <kevinb@redhat.com>
Subject: Re: [PATCHv2] gdb/unwinders: better support for $pc not saved
Date: Tue, 05 Mar 2024 16:41:27 +0000 [thread overview]
Message-ID: <87a5ncbphk.fsf@redhat.com> (raw)
In-Reply-To: <2137e3edccdc997be04f3235718d772b5be97727.1706885900.git.aburgess@redhat.com>
Ping!
If there's no further feedback then I plan to push this patch in a week
or so.
Thanks,
Andrew
Andrew Burgess <aburgess@redhat.com> writes:
> In V2 I've addressed the test failure that Keith Pointed out. The
> only changes between V1 and V2 are in the testsuite, the GDB side of
> things is unchanged.
>
> The problem is, I suspect caused by an S390 compiler bug, though I'm
> not 100% sure, maybe I'm just missing something. If there is a bug
> then it's a weird little issue related to debug information
> generation, not a functionality bug.
>
> The test as originally written created a Python unwinder which claimed
> any frame for the function break_bt_here. The frame-id the unwinder
> creates for the break_bt_here frame is just $pc and $sp value unwound
> from the next frame.
>
> This of course is a bad frame-id. A frame-id should remain the same
> for every address within a function, which is why GDB usually uses the
> $pc for the start of a function.
>
> Still, for the sake of this test it didn't really matter. Or so I
> thought.
>
> The failure Keith pointed out only happened when we had a stack like
> this:
>
> main -> break_bt_here -> other_func
>
> My Python unwinder would claim the break_bt_here frame and create a
> frame-id by unwinding the $pc and $sp values from `other_func`.
>
> As other_func is a trivial (empty) leaf function, not every
> architecture is going to need to allocate a stack frame for this
> function. For example, on S390 the return address is placed into r14
> and not onto the stack.
>
> As such, my expectation, for a stack grows down architecture, is that
> the $sp value in other_func would be equal to, or less than the $sp
> value in break_bt_here.
>
> In my experience the DWARF Call Frame Address (CFA) is usually the
> stack pointer value on entry to a function, so my expectation was that
> the CFA of other_func would be equal to, or less than the unwound $sp
> value in break_bt_here, so (I thought) I'd be fine to use the unwound
> $sp value as the frame-id stack address in break_bt_here.
>
> However, on S390, the DWARF CIA that covers other_func looks like this:
>
> 00000000 0000000000000014 00000000 CIE
> Version: 1
> Augmentation: "zR"
> Code alignment factor: 1
> Data alignment factor: -8
> Return address column: 14
> Augmentation data: 1b
> DW_CFA_def_cfa: r15 ofs 160
> DW_CFA_undefined: r14
> DW_CFA_nop
>
> While the FDE for other_func is:
>
> 00000070 0000000000000018 00000048 FDE cie=0000002c pc=00000000010011a0..00000000010011b0
> DW_CFA_advance_loc: 4 to 00000000010011a4
> DW_CFA_register: r11 in r16 (f0)
> DW_CFA_advance_loc: 4 to 00000000010011a8
> DW_CFA_def_cfa_register: r11
> DW_CFA_advance_loc: 6 to 00000000010011ae
> DW_CFA_restore: r11
> DW_CFA_def_cfa_register: r15
>
> Notice the 'DW_CFA_def_cfa: r15 ofs 160' in the CIE. This sets up a
> default offset of 160 bytes for the CFA.
>
> I don't believe there's anything really "wrong" with this, but it does
> seem weird. It means that the stack-address within the frame-id for
> other_func will be 160 bytes higher than the $sp value on entry to the
> function other_func.
>
> Now this isn't normally a problem as break_bt_here, when using the
> DWARF unwinder rather than my custom Python unwinder will also have a
> 160 byte offset, so the frame-id for break_bt_here will still appear
> to be for an earlier stack frame.
>
> However, my custom Python unwinder doesn't understand this "magic" 160
> byte offset, and instead just uses the unwound stack value.
>
> Which means that when I have a stack like this:
>
> main -> break_bt_here -> other_func
>
> The stack address in the frame-id for break_bt_here is actually less
> than the stack address in the frame-id for other_func. This then
> triggers frame_id_inner check within get_prev_frame_always_1 and GDB
> things something has gone wrong with the stack unwind, leading to the
> failure Keith reported.
>
> My solution to this problem is to run up to break_bt_here without the
> Python unwinder loaded. Use 'maint print frame-id' to capture the
> frame-id generated from the DWARF unwinder, and then restart GDB and
> load the Python unwinder.
>
> I can then tell the Python unwinder the contents of the DWARF generate
> frame-id, and the Python unwinder will report this as its frame-id.
> For x86-64 very little changes. But for S390 I now use a stack
> address that includes the "magic" 160 byte offset, and everything is
> fine.
>
> ---
>
> This started with a Red Hat bug report which can be seen here:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1850710
>
> The problem reported here was using GDB on GNU/Linux for S390, the
> user stepped into JIT generated code. As they enter the JIT code GDB
> would report 'PC not saved', and this same message would be reported
> after each step/stepi.
>
> Additionally, the user had 'set disassemble-next-line on', and once
> they entered the JIT code this output was not displayed, nor were any
> 'display' directives displayed.
>
> The user is not making use of the JIT plugin API to provide debug
> information. But that's OK, they aren't expecting any source level
> debug here, they are happy to use 'stepi', but the missing 'display'
> directives are a problem, as is the constant 'PC not saved' (error)
> message.
>
> What is happening here is that as GDB is failing to find any debug
> information for the JIT generated code, it is falling back on to the
> S390 prologue unwinder to try and unwind frame #0. Unfortunately,
> without being able to identify the function boundaries, the S390
> prologue scanner can't help much, in fact, it doesn't even suggest an
> arbitrary previous $pc value (some targets that use a link-register
> will, by default, assume the link-register contains the previous $pc),
> instead the S390 will just say, "sorry, I have no previous $pc value".
>
> The result of this is that when GDB tries to find frame #1 we end
> throwing an error from frame_unwind_pc (the 'PC not saved' error).
> This error is not caught anywhere except at the top-level interpreter
> loop, and so we end up skipping all the 'display' directive handling.
>
> While thinking about this, I wondered, could I trigger the same error
> using the Python Unwinder API? What happens if a Python unwinder
> claims a frame, but then fails to provide a previous $pc value?
>
> Turns out that exactly the same thing happens, which is great, as that
> means we now have a way to reproduce this bug on any target. And so
> the test included with this patch does just this. I have a Python
> unwinder that claims a frame, but doesn't provide any previous
> register values.
>
> I then do two tests, first I stop in the claimed frame (i.e. frame #0
> is the frame that can't be unwound), I perform a few steps, and check
> the backtrace. And second, I stop in a child of the problem
> frame (i.e. frame #1 is the frame that can't be unwound), and from
> here I check the backtrace.
>
> While all this is going on I have a 'display' directive in place, and
> each time GDB stops I check that the display directive triggers.
>
> Additionally, when checking the backtrace, I am checking that the
> backtrace finishes with the message 'Backtrace stopped: frame did not
> save the PC'.
>
> As for the fix I chose to add a call to frame_unwind_pc directly to
> get_prev_frame_always_1. Calling frame_unwind_pc will cache the
> unwound $pc value, so this doesn't add much additional work as
> immediately after the new frame_unwind_pc call, we call
> get_prev_frame_maybe_check_cycle, which actually generates the
> previous frame, which will always (I think) require a call to
> frame_unwind_pc anyway.
>
> The reason for adding the frame_unwind_pc call into
> get_prev_frame_always_1, is that if the frame_unwind_pc call fails we
> want to set the frames 'stop_reason', and get_prev_frame_always_1
> seems to be the place where this is done, so I wanted to keep the new
> stop_reason setting code next to all the existing stop_reason setting
> code.
>
> Additionally, once we enter get_prev_frame_maybe_check_cycle we
> actually create the previous frame, then, if it turns out that the
> previous frame can't be created we need to remove the frame .. this
> seemed more complex than just making the check in
> get_prev_frame_always_1.
>
> With this fix in place the original S390 bug is fixed, and also the
> test added in this commit, that uses the Python API, is also fixed.
>
> Reviewed-By: Kevin Buettner <kevinb@redhat.com>
> ---
> gdb/frame.c | 32 +++++++
> gdb/testsuite/gdb.base/pc-not-saved.c | 48 ++++++++++
> gdb/testsuite/gdb.base/pc-not-saved.exp | 113 ++++++++++++++++++++++++
> gdb/testsuite/gdb.base/pc-not-saved.py | 71 +++++++++++++++
> 4 files changed, 264 insertions(+)
> create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.c
> create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.exp
> create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.py
>
> diff --git a/gdb/frame.c b/gdb/frame.c
> index fae89cbbc0a..9ed4515d4d5 100644
> --- a/gdb/frame.c
> +++ b/gdb/frame.c
> @@ -2419,6 +2419,38 @@ get_prev_frame_always_1 (frame_info_ptr this_frame)
> }
> }
>
> + /* Ensure we can unwind the program counter of THIS_FRAME. */
> + try
> + {
> + /* Calling frame_unwind_pc for the sentinel frame relies on the
> + current_frame being set, which at this point it might not be if we
> + are in the process of setting the current_frame after a stop (see
> + get_current_frame).
> +
> + The point of this check is to ensure that the unwinder for
> + THIS_FRAME can actually unwind the $pc, which we assume the
> + sentinel frame unwinder can always do (it's just a read from the
> + machine state), so we only call frame_unwind_pc for frames other
> + than the sentinel (level -1) frame.
> +
> + Additionally, we don't actually care about the value of the
> + unwound $pc, just that the call completed successfully. */
> + if (this_frame->level >= 0)
> + frame_unwind_pc (this_frame);
> + }
> + catch (const gdb_exception_error &ex)
> + {
> + if (ex.error == NOT_AVAILABLE_ERROR || ex.error == OPTIMIZED_OUT_ERROR)
> + {
> + frame_debug_printf (" -> nullptr // no saved PC");
> + this_frame->stop_reason = UNWIND_NO_SAVED_PC;
> + this_frame->prev = nullptr;
> + return nullptr;
> + }
> +
> + throw;
> + }
> +
> return get_prev_frame_maybe_check_cycle (this_frame);
> }
>
> diff --git a/gdb/testsuite/gdb.base/pc-not-saved.c b/gdb/testsuite/gdb.base/pc-not-saved.c
> new file mode 100644
> index 00000000000..bc6632a97d7
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/pc-not-saved.c
> @@ -0,0 +1,48 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2024 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +volatile int global_var = 0;
> +
> +void
> +other_func (void)
> +{
> + /* Nothing. */
> +}
> +
> +void
> +break_bt_here (void)
> +{
> + /* This is all nonsense; just filler so this function has a body. */
> + if (global_var != 99)
> + global_var++;
> + if (global_var != 98)
> + global_var++;
> + if (global_var != 97)
> + global_var++;
> + if (global_var != 96)
> + global_var++;
> + other_func ();
> + if (global_var != 95)
> + global_var++;
> +}
> +
> +int
> +main (void)
> +{
> + break_bt_here ();
> + return 0;
> +}
> diff --git a/gdb/testsuite/gdb.base/pc-not-saved.exp b/gdb/testsuite/gdb.base/pc-not-saved.exp
> new file mode 100644
> index 00000000000..f267a269f1a
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/pc-not-saved.exp
> @@ -0,0 +1,113 @@
> +# Copyright 2024 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test how GDB handles a frame in which the previous-pc value is not
> +# available. Specifically, check that the backtrace correctly reports
> +# why the backtrace is truncated, and ensure that 'display' directives
> +# still work when 'stepi'-ing through the frame.
> +#
> +# We do this by registering a Python unwinder which doesn't provide
> +# any previous register values.
> +
> +require allow_python_tests
> +
> +standard_testfile
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile}] } {
> + return
> +}
> +
> +set remote_python_file \
> + [gdb_remote_download host "${srcdir}/${subdir}/${testfile}.py"]
> +
> +if { ![runto "break_bt_here"] } {
> + return
> +}
> +
> +# Figuring out the correct frame-id from a Python unwinder is hard.
> +# We need to know the function's start address (not too hard), and the
> +# stack address on entry to the function, which is much harder to
> +# figure out in a cross-target way.
> +#
> +# So instead we run without any Python unwinder in place and use
> +# 'maint print frame-id' to record the frame-id. We then restart GDB,
> +# load the Python unwinder, and tell it to use the frame-id we
> +# recorded here.
> +set pc unknown
> +set cfa unknown
> +gdb_test_multiple "maintenance print frame-id" "store break_bt_here frame-id" {
> + -re -wrap "frame-id for frame #0: \\{stack=($hex),code=($hex),\[^\}\]+\\}" {
> + set cfa $expect_out(1,string)
> + set pc $expect_out(1,string)
> + }
> +}
> +gdb_assert { ![string equal $cfa unknown] } \
> + "check we read the frame's CFA"
> +
> +gdb_assert { ![string equal $pc unknown] } \
> + "check we read the frame's PC"
> +
> +# Restart and load the Python unwinder script.
> +clean_restart $binfile
> +gdb_test_no_output "source ${remote_python_file}" "load python file"
> +
> +# Tell the Python unwinder to use the frame-id we cached above.
> +gdb_test_no_output "python set_break_bt_here_frame_id($pc, $cfa)"
> +
> +# Run up to the function which the unwinder will claim.
> +if { ![runto "break_bt_here"] } {
> + return
> +}
> +
> +# Print the backtrace. Check that the reason for stopping the
> +# backtrace is that the previous $pc is not available.
> +gdb_test "bt" \
> + [multi_line \
> + "^#0 break_bt_here \\(\\) at \[^\r\n\]+" \
> + "Backtrace stopped: frame did not save the PC"] \
> + "backtrace from break_bt_here function"
> +
> +# Ensure we can stepi.
> +gdb_test "stepi" \
> + "(:?$hex\\s+)?$decimal\\s+\[^\r\n\]+" \
> + "stepi without a display in place"
> +
> +# Setup a 'display' directive.
> +gdb_test "display/i \$pc" \
> + [multi_line \
> + "^1: x/i \\\$pc" \
> + "=> $hex <break_bt_here(:?\\+$decimal)?>:\\s+\[^\r\n\]+"]
> +
> +# Step again, check the 'display' directive is shown.
> +gdb_test "stepi" \
> + [multi_line \
> + "(:?$hex\\s+)?$decimal\\s+\[^\r\n\]+" \
> + "1: x/i \\\$pc" \
> + "=> $hex <break_bt_here(:?\\+$decimal)?>:\\s+\[^\r\n\]+"] \
> + "stepi with a display in place"
> +
> +# Continue to a function that is called from within break_bt_here.
> +# The Python unwinder will then be claiming frame #1.
> +gdb_breakpoint other_func
> +gdb_continue_to_breakpoint "continue to other_func"
> +
> +# Print the backtrace and check that the reason for stopping the
> +# backtrace is that the previous $pc is not available.
> +gdb_test "bt" \
> + [multi_line \
> + "#0 other_func \\(\\) at \[^\r\n\]+" \
> + "#1 (:?$hex in )?break_bt_here \\(\\) at \[^\r\n\]+" \
> + "Backtrace stopped: frame did not save the PC"] \
> + "backtrace from other_func function"
> diff --git a/gdb/testsuite/gdb.base/pc-not-saved.py b/gdb/testsuite/gdb.base/pc-not-saved.py
> new file mode 100644
> index 00000000000..65a8e764885
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/pc-not-saved.py
> @@ -0,0 +1,71 @@
> +# Copyright (C) 2024 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +import gdb
> +from gdb.unwinder import Unwinder, FrameId
> +
> +# Cached FrameId. See set_break_bt_here_frame_id for details.
> +break_bt_here_frame_id = None
> +
> +
> +def set_break_bt_here_frame_id(pc, cfa):
> + """Call this to pre-calculate the FrameId for the frame our unwinder
> + is going to claim, this avoids us having to actually figure out a
> + frame-id within the unwinder, something which is going to be hard
> + to do in a cross-target way.
> +
> + Instead we first run the test without the Python unwinder in
> + place, use 'maint print frame-id' to record the frame-id, then,
> + after loading this Python script, we all this function to record
> + the frame-id that the unwinder should use."""
> + global break_bt_here_frame_id
> + break_bt_here_frame_id = FrameId(cfa, pc)
> +
> +
> +class break_unwinding(Unwinder):
> +
> + """An unwinder for the function 'break_bt_here'. This unwinder will
> + claim any frame for the function in question, but doesn't provide
> + any unwound register values. Importantly, we don't provide a
> + previous $pc value, this means that if we are stopped in
> + 'break_bt_here' then we should fail to unwind beyond frame #0."""
> +
> + def __init__(self):
> + Unwinder.__init__(self, "break unwinding")
> +
> + def __call__(self, pending_frame):
> + pc_desc = pending_frame.architecture().registers().find("pc")
> + pc = pending_frame.read_register(pc_desc)
> +
> + if pc.is_optimized_out:
> + return None
> +
> + block = gdb.block_for_pc(pc)
> + if block == None:
> + return None
> + func = block.function
> + if func == None:
> + return None
> + if str(func) != "break_bt_here":
> + return None
> +
> + global break_bt_here_frame_id
> + if break_bt_here_frame_id is None:
> + return None
> +
> + return pending_frame.create_unwind_info(break_bt_here_frame_id)
> +
> +
> +gdb.unwinder.register_unwinder(None, break_unwinding(), True)
>
> base-commit: 05d1b4b4ad7d74a64cc71c53d621241fc393fcb6
> --
> 2.25.4
next prev parent reply other threads:[~2024-03-05 16:42 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-27 15:26 [PATCH] " Andrew Burgess
2024-01-28 20:56 ` Kevin Buettner
2024-01-29 16:49 ` Keith Seitz
2024-02-02 15:19 ` [PATCHv2] " Andrew Burgess
2024-03-05 16:41 ` Andrew Burgess [this message]
2024-03-07 14:43 ` Keith Seitz
2024-03-11 10:07 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a5ncbphk.fsf@redhat.com \
--to=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=keiths@redhat.com \
--cc=kevinb@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox