Re: [RFA] use frame IDs to detect function calls while stepping

From: Joel Brobecker <brobecker@gnat.com>
To: Andrew Cagney <cagney@gnu.org>, kettenis@gnu.org
Cc: Elena Zannoni <ezannoni@redhat.com>, gdb-patches@sources.redhat.com
Subject: Re: [RFA] use frame IDs to detect function calls while stepping
Date: Tue, 02 Mar 2004 06:16:00 -0000
Message-ID: <20040302061642.GW1051@gnat.com>
In-Reply-To: <20040301235239.GP1051@gnat.com>

> Here is a description of the regression which appears if we don't
> include the following test:
> 
> +      if (IN_SOLIB_CALL_TRAMPOLINE (stop_pc, ecs->stop_func_name))
> +        {
> +          /* We landed in a shared library call trampoline, so it
> +             is a subroutine call.  */
> +          handle_step_into_function (ecs);
> +          return;
> +        }
> 
> The problem occurs with ending-run:
> 
>         % gdb ending-run
>         (gdb) b ending-run.c:33
>         Breakpoint 1 at 0x10828: file gdb.base/ending-run.c, line 33.
>         (gdb) run
>         Starting program: /[...]/gdb.base/ending-run 
>         -1 2 7 14 23 34 47 62 79  Goodbye!
>         
>         Breakpoint 1, main () at gdb.base/ending-run.c:33
>         33      }
>         (gdb) n
>         0x000105ec in _start ()
>         (gdb) n
>         Single stepping until exit from function _start, 
>         which has no line number information.
>         0x00020950 in _PROCEDURE_LINKAGE_TABLE_ ()
> 
> The expected behavior for the last "next" command is for GDB
> to run until the inferior exits:
> 
>         (gdb) n
>         Single stepping until exit from function _start, 
>         which has no line number information.
>         
>         Program exited normally.
> 
> Unfortunately, here is what happens. At 0x000105ec, before we do
> our second "next" command, we are about to execute the following
> code:
> 
>         0x000105ec <_start+100>:        call  0x20950 <exit>
>         0x000105f0 <_start+104>:        nop 
> 
> After two iterations (one for the call insn, and one for the delay
> slot), GDB lands at the beginning of function "exit" at 0x00020950,
> which is:
> 
>         0x00020950 <exit+0>:    sethi  %hi(0xf000), %g1
>         0x00020954 <exit+4>:    b,a   0x20914 <_PROCEDURE_LINKAGE_TABLE_>
>         0x00020958 <exit+8>:    nop 
> 
> So at this point, the register window has not been rotated.
> I don't know if this is the cause for this problem, but at this
> point GDB is unable to unwind the call stack:
> 
>         (gdb) bt
>         #0  0x00020950 in _PROCEDURE_LINKAGE_TABLE_ ()
> 
> (And gets the wrong procedure name as well, but that's a separate
> issue - although "x /i" does report what I believe is the correct
> name, strange!).
> 
> I am looking into the sparc unwinder code right now, to try to
> understand a bit better the source of the problem.

I think I found the source of the glitch. I may have a fix for it,
but my little finger is telling me that it might be a bit too
extreme... Maybe MarkK has some comments about this?

What happens is that, at the point when we reach function "exit",
the FP register is null:

        (gdb) p /x $fp
        $2 = 0x0

The sparc unwinder in sparc_frame_cache() detects this, thinks there is
something wrong, and aborts early. So, we never unwind the "_start"
frame, and hence the following frame ID check doesn't notice the
function call, as it should have in this case:

+      if (frame_id_eq (get_frame_id (get_prev_frame (get_current_frame ())),
+                       step_frame_id))
+        {
+          /* It's a subroutine call.  */
+          handle_step_into_function (ecs);
+          return;
+        }
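
To make the failure mode concrete, here is a minimal, self-contained
sketch of the idea behind that check. It is a toy model, not GDB code:
the toy_* types and functions are hypothetical stand-ins for GDB's
frame_info / frame_id machinery. A NULL "prev" pointer models the case
where the unwinder gives up before finding the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a frame ID: a stack address plus a code address.
   (Hypothetical; GDB's real struct frame_id is richer.)  */
struct toy_frame_id
{
  unsigned long stack_addr;
  unsigned long code_addr;
};

/* Toy frame.  PREV points at the caller's frame, or is NULL when the
   unwinder could not produce one.  */
struct toy_frame
{
  struct toy_frame_id id;
  const struct toy_frame *prev;
};

static bool
toy_frame_id_eq (struct toy_frame_id a, struct toy_frame_id b)
{
  return a.stack_addr == b.stack_addr && a.code_addr == b.code_addr;
}

/* Return true if CURRENT was called from the frame we were stepping
   in, i.e. if the caller's ID equals STEP_ID.  */
static bool
toy_stepped_into_subroutine (const struct toy_frame *current,
                             struct toy_frame_id step_id)
{
  assert (current != NULL);

  /* Unwinding failed: there is no caller to compare against, so the
     call goes unnoticed.  */
  if (current->prev == NULL)
    return false;

  return toy_frame_id_eq (current->prev->id, step_id);
}
```

With a frame for "exit" whose prev is the "_start" frame, the function
returns true when given the ID of "_start". With prev set to NULL, as
when the unwinder bails out early, it returns false and the stepping
logic has no way to tell that a call took place.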

With this example in mind, it seemed to me that the assumption that the
%fp register is never null is unfortunately incorrect. Given that the
rest of the code in sparc_frame_cache() wasn't using the value of that
register, I commented out the check, and retried.

<<
@@ -620,8 +620,8 @@ sparc_frame_cache (struct frame_info *ne
      frame.  */
 
   cache->base = frame_unwind_register_unsigned (next_frame, SPARC_FP_REGNUM);
-  if (cache->base == 0)
-    return cache;
+  // if (cache->base == 0)
+  //   return cache;
 
   cache->pc = frame_func_unwind (next_frame);
   if (cache->pc != 0)
>>

That causes the problem above to disappear. I even went as far as
to run the testsuite: No change (apart from fixing the regression
I observed).

sparc_frame_cache() seems well designed to handle null %fp registers.
It doesn't use its value when scanning the prologue, and then knows
to use %sp in its place as the frame base. So the frame should be
correctly unwound.
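
The fallback described above can be sketched roughly as follows. This
is a toy model and not the actual sparc-tdep.c code: toy_cache and
toy_frame_cache are hypothetical names, and the trigger for the
fallback (the prologue scan finding no "save" instruction, meaning the
register window has not been rotated) is an assumption made for the
sake of illustration.

```c
#include <stdbool.h>

/* Toy stand-in for the per-frame cache (hypothetical).  */
struct toy_cache
{
  unsigned long base;   /* Frame base address.  */
  bool frameless;       /* True if no "save" was seen in the prologue.  */
};

/* Compute the frame base from the unwound %fp and %sp values.
   SAW_SAVE_INSN is what a prologue scan would report.  Note that a
   zero FP does not cause an early return here.  */
static struct toy_cache
toy_frame_cache (unsigned long fp, unsigned long sp, bool saw_save_insn)
{
  struct toy_cache cache;

  cache.base = fp;                  /* May be 0 at this point.  */
  cache.frameless = !saw_save_insn;

  /* Assumed rule: without a "save", %fp still belongs to the caller,
     so use %sp as this frame's base instead.  */
  if (cache.frameless)
    cache.base = sp;

  return cache;
}
```

In this model, a null %fp on entry to "exit" is harmless: the function
is still at its first instruction, no "save" has been seen, and the
base is taken from %sp.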

Comments?
-- 
Joel
