[rfc, frame] Move the corrupt frame checks earlier

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [rfc, frame] Move the corrupt frame checks earlier
@ 2006-08-20 13:10 Daniel Jacobowitz
  2006-08-20 16:48 ` Mark Kettenis
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Jacobowitz @ 2006-08-20 13:10 UTC (permalink / raw)
  To: gdb-patches

This patch is a follow-on to the unwind_stop_reason patch, although not
actually dependent.  It moves the frame-went-inner and frame-same-ID
checks earlier in the backtrace process.  Although they move down
in the function, they also check PREV rather than THIS.

The benefit of doing this is that we display the stop reason before,
rather than after, the frame we've decided is corrupt.  For instance,
suppose we have two frames with the same ID.  That means they have the
same CFA and the same function.  In other words, the same stack frame.
So the second backtrace entry will be a duplicate of the first. There's
no added value in displaying it, and logically it doesn't need to exist
on the frame chain.

The risk of doing this is performance; we call get_frame_id in
get_prev_frame_1 now.  That defeats some of the advantages of the lazy
frame structure.  But, it turns out, that's OK.  The lazy frame
structure is still good and useful for the first frame, where it allows
us to maintain our principle of "there is always a frame" without
having to figure out much about that frame.  But any time that someone
wants the second frame on the stack, they usually want its ID too. So
we fetch it slightly earlier and there's no net change in work done.

I verified this by staring at "set debug frame 1" output for a couple
of different test cases.  Step within a function and step into a
function cause the same number of ID fetches (zero and two
respectively).  Backtrace also fetches roughly the same number of IDs;
it may do one extra at the bottom of the stack, which is fine.  I
also did some benchmarking; it was in the noise.

Any thoughts on this patch?  I think it's an improvement, and a
template for where more similar checks should go (like the zero PC
check, whether it's an error or not).

-- 
Daniel Jacobowitz
CodeSourcery

2006-08-19  Daniel Jacobowitz  <dan@codesourcery.com>

	* frame.c (get_prev_frame_1): Move some unwinding failure
	checks to after the frame is created, and one frame earlier.

---
 gdb/frame.c |   80 ++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 46 insertions(+), 34 deletions(-)

Index: src/gdb/frame.c
===================================================================
--- src.orig/gdb/frame.c	2006-08-19 00:29:06.000000000 -0400
+++ src/gdb/frame.c	2006-08-19 00:30:13.000000000 -0400
@@ -1077,40 +1077,6 @@ get_prev_frame_1 (struct frame_info *thi
       return NULL;
     }
 
-  /* Check that this frame's ID isn't inner to (younger, below, next)
-     the next frame.  This happens when a frame unwind goes backwards.
-     Exclude signal trampolines (due to sigaltstack the frame ID can
-     go backwards) and sentinel frames (the test is meaningless).  */
-  if (this_frame->next->level >= 0
-      && this_frame->next->unwind->type != SIGTRAMP_FRAME
-      && frame_id_inner (this_id, get_frame_id (this_frame->next)))
-    {
-      if (frame_debug)
-	{
-	  fprintf_unfiltered (gdb_stdlog, "-> ");
-	  fprint_frame (gdb_stdlog, NULL);
-	  fprintf_unfiltered (gdb_stdlog, " // this frame ID is inner }\n");
-	}
-      this_frame->stop_reason = UNWIND_INNER_ID;
-      return NULL;
-    }
-
-  /* Check that this and the next frame are not identical.  If they
-     are, there is most likely a stack cycle.  As with the inner-than
-     test above, avoid comparing the inner-most and sentinel frames.  */
-  if (this_frame->level > 0
-      && frame_id_eq (this_id, get_frame_id (this_frame->next)))
-    {
-      if (frame_debug)
-	{
-	  fprintf_unfiltered (gdb_stdlog, "-> ");
-	  fprint_frame (gdb_stdlog, NULL);
-	  fprintf_unfiltered (gdb_stdlog, " // this frame has same ID }\n");
-	}
-      this_frame->stop_reason = UNWIND_SAME_ID;
-      return NULL;
-    }
-
   /* Allocate the new frame but do not wire it in to the frame chain.
      Some (bad) code in INIT_FRAME_EXTRA_INFO tries to look along
      frame->next to pull some fancy tricks (of course such code is, by
@@ -1147,6 +1113,52 @@ get_prev_frame_1 (struct frame_info *thi
   this_frame->prev = prev_frame;
   prev_frame->next = this_frame;
 
+  /* Now that the frame chain is in a consistant state, check whether
+     this frame is useful.  If it is not, unlink it.  Its storage will
+     be reclaimed the next time the frame cache is flushed, and we
+     will not try to unwind THIS_FRAME again.  We can't do these tests
+     earlier; we call things like get_frame_type, which need the frame
+     to be hooked up.  None of these checks apply to the current
+     frame; we deliberately avoid doing them for the current frame, to
+     improve single step performance.  */
+
+  /* Check that the new frame's ID isn't inner to (younger, below,
+     next) the current frame.  This happens when a frame unwind goes
+     backwards.  Exclude signal trampolines (due to sigaltstack the
+     frame ID can go backwards) and sentinel frames (the test is
+     meaningless).  */
+  if (prev_frame->level > 0
+      && prev_frame->next->unwind->type != SIGTRAMP_FRAME
+      && frame_id_inner (get_frame_id (prev_frame), this_id))
+    {
+      if (frame_debug)
+	{
+	  fprintf_unfiltered (gdb_stdlog, "-> ");
+	  fprint_frame (gdb_stdlog, NULL);
+	  fprintf_unfiltered (gdb_stdlog, " // this frame ID is inner }\n");
+	}
+      this_frame->stop_reason = UNWIND_INNER_ID;
+      this_frame->prev = NULL;
+      return NULL;
+    }
+
+  /* Check that this and the next frame are not identical.  If they
+     are, there is most likely a stack cycle.  As with the inner-than
+     test above, avoid comparing the inner-most and sentinel frames.  */
+  if (prev_frame->level > 0
+      && frame_id_eq (get_frame_id (prev_frame), this_id))
+    {
+      if (frame_debug)
+	{
+	  fprintf_unfiltered (gdb_stdlog, "-> ");
+	  fprint_frame (gdb_stdlog, NULL);
+	  fprintf_unfiltered (gdb_stdlog, " // this frame has same ID }\n");
+	}
+      this_frame->stop_reason = UNWIND_SAME_ID;
+      this_frame->prev = NULL;
+      return NULL;
+    }
+
   if (frame_debug)
     {
       fprintf_unfiltered (gdb_stdlog, "-> ");


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfc, frame] Move the corrupt frame checks earlier
  2006-08-20 13:10 [rfc, frame] Move the corrupt frame checks earlier Daniel Jacobowitz
@ 2006-08-20 16:48 ` Mark Kettenis
  2006-08-21 11:07   ` Daniel Jacobowitz
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Kettenis @ 2006-08-20 16:48 UTC (permalink / raw)
  To: drow; +Cc: gdb-patches

> Date: Sat, 19 Aug 2006 11:56:54 -0400
> From: Daniel Jacobowitz <drow@false.org>
> 
> This patch is a follow-on to the unwind_stop_reason patch, although not
> actually dependent.  It moves the frame-went-inner and frame-same-ID
> checks earlier in the backtrace process.  Although they move down
> in the function, they also check PREV rather than THIS.
> 
> The benefit of doing this is that we display the stop reason before,
> rather than after, the frame we've decided is corrupt.  For instance,
> suppose we have two frames with the same ID.  That means they have the
> same CFA and the same function.  In other words, the same stack frame.
> So the second backtrace entry will be a duplicate of the first. There's
> no added value in displaying it, and logically it doesn't need to exist
> on the frame chain.

Sorry it's not clear to me what exactly the effect is.  Does it make
frames disappear that are there now?  I wouldn't like that, since it
the past I clearly remember getting useful information out of the last
frame printed in a bad backtrace.

Perhaps you can give an example of the old & new behaviour?

Mark


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfc, frame] Move the corrupt frame checks earlier
  2006-08-20 16:48 ` Mark Kettenis
@ 2006-08-21 11:07   ` Daniel Jacobowitz
  2006-08-22 20:32     ` Mark Kettenis
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Jacobowitz @ 2006-08-21 11:07 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Aug 20, 2006 at 04:32:40PM +0200, Mark Kettenis wrote:
> > Date: Sat, 19 Aug 2006 11:56:54 -0400
> > From: Daniel Jacobowitz <drow@false.org>
> > 
> > This patch is a follow-on to the unwind_stop_reason patch, although not
> > actually dependent.  It moves the frame-went-inner and frame-same-ID
> > checks earlier in the backtrace process.  Although they move down
> > in the function, they also check PREV rather than THIS.
> > 
> > The benefit of doing this is that we display the stop reason before,
> > rather than after, the frame we've decided is corrupt.  For instance,
> > suppose we have two frames with the same ID.  That means they have the
> > same CFA and the same function.  In other words, the same stack frame.
> > So the second backtrace entry will be a duplicate of the first. There's
> > no added value in displaying it, and logically it doesn't need to exist
> > on the frame chain.
> 
> Sorry it's not clear to me what exactly the effect is.  Does it make
> frames disappear that are there now?  I wouldn't like that, since it
> the past I clearly remember getting useful information out of the last
> frame printed in a bad backtrace.
> 
> Perhaps you can give an example of the old & new behaviour?

It does eliminate the last frame from the backtrace when displaying one
of these errors, yes.  There is one bit of potentially useful
information there, but it isn't lost.

Suppose you've got a backtrace that looks like this.

#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7ea61cb in poll () from /lib/tls/i686/cmov/libc.so.6
#2  0x0810c713 in gdb_do_one_event ()
#3  0x08109a43 in catch_errors ()
#4  0x080b8c13 in _initialize_tui_hooks ()
#5  0x08109d1d in current_interp_command_loop ()
#6  0x08077a3b in main ()

Then I manually stitch up the stack frame, to get this:

#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7ea61cb in poll () from /lib/tls/i686/cmov/libc.so.6
#2  0x0810c713 in gdb_do_one_event ()
#3  0x0810c713 in gdb_do_one_event ()
Previous frame identical to this frame (corrupt stack?)

(By editing the saved %eip and %ebp for frame #2).

The question is, does frame #3 in that second backtrace display any
information other than what the error message does?  It's got the same
CFA and PC, so it's literally a duplicate of the previous frame.

The one small difference is that you can go to frame 2 and 3, and type
info reg ebx, and get different results; that's because we have the
prev_register.  But you can also go to frame 2 and type info frame, and
that will show you where the saved registers for this frame are; which
I think is just as useful.

For the "frame id inner" check it's more of a tossup.  Now there's
extra information in the frame output not in the error message:
we'll print out the unwound value of PC.  But both the address of
the saved eip and the value of the saved eip are available in info
frame.

My feeling is that this corrupt frame is not part of the backtrace,
and if you want to know more about it, having to type "info frame NUM"
is just fine.

Maybe I should add a bit to the manual about dealing with failed
backtraces?  Whether they're program corruption or GDB failure,
the useful things to do are the same.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfc, frame] Move the corrupt frame checks earlier
  2006-08-21 11:07   ` Daniel Jacobowitz
@ 2006-08-22 20:32     ` Mark Kettenis
  2006-08-22 21:04       ` Daniel Jacobowitz
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Kettenis @ 2006-08-22 20:32 UTC (permalink / raw)
  To: drow; +Cc: gdb-patches

> Date: Sun, 20 Aug 2006 12:48:48 -0400
> From: Daniel Jacobowitz <drow@false.org>
> 
> On Sun, Aug 20, 2006 at 04:32:40PM +0200, Mark Kettenis wrote:
> > > Date: Sat, 19 Aug 2006 11:56:54 -0400
> > > From: Daniel Jacobowitz <drow@false.org>
> > > 
> > > This patch is a follow-on to the unwind_stop_reason patch, although not
> > > actually dependent.  It moves the frame-went-inner and frame-same-ID
> > > checks earlier in the backtrace process.  Although they move down
> > > in the function, they also check PREV rather than THIS.
> > > 
> > > The benefit of doing this is that we display the stop reason before,
> > > rather than after, the frame we've decided is corrupt.  For instance,
> > > suppose we have two frames with the same ID.  That means they have the
> > > same CFA and the same function.  In other words, the same stack frame.
> > > So the second backtrace entry will be a duplicate of the first. There's
> > > no added value in displaying it, and logically it doesn't need to exist
> > > on the frame chain.
> > 
> > Sorry it's not clear to me what exactly the effect is.  Does it make
> > frames disappear that are there now?  I wouldn't like that, since it
> > the past I clearly remember getting useful information out of the last
> > frame printed in a bad backtrace.
> > 
> > Perhaps you can give an example of the old & new behaviour?
> 
> It does eliminate the last frame from the backtrace when displaying one
> of these errors, yes.  There is one bit of potentially useful
> information there, but it isn't lost.
> 
> Suppose you've got a backtrace that looks like this.
> 
> #0  0xffffe410 in __kernel_vsyscall ()
> #1  0xb7ea61cb in poll () from /lib/tls/i686/cmov/libc.so.6
> #2  0x0810c713 in gdb_do_one_event ()
> #3  0x08109a43 in catch_errors ()
> #4  0x080b8c13 in _initialize_tui_hooks ()
> #5  0x08109d1d in current_interp_command_loop ()
> #6  0x08077a3b in main ()
> 
> Then I manually stitch up the stack frame, to get this:
> 
> #0  0xffffe410 in __kernel_vsyscall ()
> #1  0xb7ea61cb in poll () from /lib/tls/i686/cmov/libc.so.6
> #2  0x0810c713 in gdb_do_one_event ()
> #3  0x0810c713 in gdb_do_one_event ()
> Previous frame identical to this frame (corrupt stack?)
> 
> (By editing the saved %eip and %ebp for frame #2).
> 
> The question is, does frame #3 in that second backtrace display any
> information other than what the error message does?  It's got the same
> CFA and PC, so it's literally a duplicate of the previous frame.
> 
> The one small difference is that you can go to frame 2 and 3, and type
> info reg ebx, and get different results; that's because we have the
> prev_register.  But you can also go to frame 2 and type info frame, and
> that will show you where the saved registers for this frame are; which
> I think is just as useful.

I don't think this is a small difference.  I've certainly made use of
it in the past, for both debugging stack corruption in a program being
and tracking down problems with gdb's unwinders.  Even if you know
where the saved registers are, you have to remember how big they are
before you can print them.

> For the "frame id inner" check it's more of a tossup.  Now there's
> extra information in the frame output not in the error message:
> we'll print out the unwound value of PC.  But both the address of
> the saved eip and the value of the saved eip are available in info
> frame.
> 
> My feeling is that this corrupt frame is not part of the backtrace,
> and if you want to know more about it, having to type "info frame NUM"
> is just fine.

Well, my feeling is that they are part of the backtrace and that it is
a good thing to show that a backtrace it corrupt.

Mark


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfc, frame] Move the corrupt frame checks earlier
  2006-08-22 20:32     ` Mark Kettenis
@ 2006-08-22 21:04       ` Daniel Jacobowitz
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Jacobowitz @ 2006-08-22 21:04 UTC (permalink / raw)
  To: gdb-patches

On Tue, Aug 22, 2006 at 10:22:24PM +0200, Mark Kettenis wrote:
> I don't think this is a small difference.  I've certainly made use of
> it in the past, for both debugging stack corruption in a program being
> and tracking down problems with gdb's unwinders.  Even if you know
> where the saved registers are, you have to remember how big they are
> before you can print them.

I'm not sure what my opinion is on this one any more :-(

> Well, my feeling is that they are part of the backtrace and that it is
> a good thing to show that a backtrace it corrupt.

Well sure.  That's what motivated the previous (stop reason) patch:
notify the user that the backtrace is corrupt, and hint them where
to look for more information.  I could add a manual section near
"backtrace" which referenced the prefix for backtrace errors.
I think this is actually more useful than the extra frame, for a
bunch of reasons - like the "don't show too much information" we
just discussed.

For instance, here's another option that I like.  We could show saved
register values in info frame in addition to addresses.  I think that
would actually be convenient - I'd certainly use it, the first thing I
do after info frame is usually a bunch of x/ commands.  But I'm mildly
worried about what it will do to your SPARC64 example, at the same time.
We do paginate, though, and most people use the GDB console in
something with a scrollback buffer...

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-08-22 20:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-08-20 13:10 [rfc, frame] Move the corrupt frame checks earlier Daniel Jacobowitz
2006-08-20 16:48 ` Mark Kettenis
2006-08-21 11:07   ` Daniel Jacobowitz
2006-08-22 20:32     ` Mark Kettenis
2006-08-22 21:04       ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox