Re: RFA: prologue value modules

From: Jim Blandy <jimb@codesourcery.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: gdb-patches@sources.redhat.com
Subject: Re: RFA: prologue value modules
Date: Tue, 28 Mar 2006 02:24:00 -0000
Message-ID: <vt21wwntp5m.fsf@theseus.home.>
In-Reply-To: <vt2ek0r1rqq.fsf@theseus.home.> (Jim Blandy's message of "Fri, 24 Mar 2006 13:02:05 -0800")


Jim Blandy <jimb@codesourcery.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>>> From: Jim Blandy <jimb@codesourcery.com>
>>> Date: Wed, 22 Mar 2006 20:37:20 -0800
>>> 
>>> To recap: this patch adds prologue-value.[ch], support libraries for
>>> the new prologue analysis strategy used in the Renesas M32C port and
>>> some internal Red Hat ports.  I think they make prologue analysis
>>> simpler to write and more robust.
>>
>> Are we going to have this documented in gdbint.texinfo?
>
> Yes --- I promised that before.  Thanks for reminding me.

How's this doc patch?

src/gdb/doc/ChangeLog:
2006-03-27  Jim Blandy  <jimb@redhat.com>

	* gdbint.texinfo (Prologue Analysis): New section.

Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame 
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
 
+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved.  Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames.  This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit DWARF call frame information (``CFI'')
+that gives @value{GDBN} all the information it needs to do this.  But
+CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers.  A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on.  Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful.  Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers.  Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate.  However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile.  Keeping the prologue analyzers working as GCC (and the
+ISAs themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in
+@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provides a
+general framework for writing prologue analyzers that are simpler and
+more robust than ad-hoc analyzers.  When we analyze a prologue using
+the prologue-value framework, we're really doing ``abstract
+interpretation'' or ``pseudo-evaluation'': running the function's code
+in simulation, but using conservative approximations of the values
+registers and memory would hold when the code actually runs.  For
+example, if our function starts with the instruction:
+
+@example
+addi r1, 42     # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22     # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+mov r2, r1      # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers.  So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable.  If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3      # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex.  But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''.  We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values.  Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the
+register's original value.  If the ABI suggests a standard place
+to save a given register, then we can check there first, but
+really, anything that will get us back the original value will
+probably work.
+@end itemize
+
+This does take some work.  But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions.  Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port.  So I think it's
+worthwhile to look for an approach that will be easier to understand
+and maintain.  In the approach used here:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction.  As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling
 
 @cindex breakpoints
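
For anyone who wants to see the idea from the doc text in running form,
here is a minimal sketch in Python.  It is only an illustration under
stated assumptions, not the interface declared in prologue-value.h:
every name in it (Value, State, addi, store, find_saved, and so on) is
made up for this example, and it assumes all stores are the same size
and aligned.  It tracks each register as "original value of some
register plus a constant" or "unknown", and replays the instructions
used as examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    """Conservative approximation of a register or stack slot.

    known=False means "unknown".  Otherwise the value is the original
    (function-entry) value of `reg` plus `offset`, or a plain constant
    when `reg` is None.  Hypothetical type, for illustration only.
    """
    known: bool = True
    reg: Optional[str] = None
    offset: int = 0


UNKNOWN = Value(known=False)


def original(reg):
    """The value `reg` held on entry to the function."""
    return Value(reg=reg)


def add_constant(value, k):
    """Adding a constant keeps a known value known; unknown stays unknown."""
    if not value.known:
        return UNKNOWN
    return Value(reg=value.reg, offset=value.offset + k)


class State:
    def __init__(self, regs):
        self.regs = {r: original(r) for r in regs}
        self.slots = {}  # address Value -> stored Value

    def addi(self, reg, k):
        self.regs[reg] = add_constant(self.regs[reg], k)

    def mov(self, dst, src):
        self.regs[dst] = self.regs[src]

    def xor(self, dst, src):
        # Too complex to track: give up on dst rather than guess.
        self.regs[dst] = UNKNOWN

    def store(self, base, disp, src):
        addr = add_constant(self.regs[base], disp)
        if not addr.known:
            # A store to an unknown address might hit any slot.
            self.slots.clear()
            return
        # Slots addressed from a different base might alias this one,
        # so forget them (assumes same-size, aligned stores).
        self.slots = {a: v for a, v in self.slots.items() if a.reg == addr.reg}
        self.slots[addr] = self.regs[src]

    def frame_size(self):
        """Frame size if SP is its original value minus a constant."""
        sp = self.regs["sp"]
        if sp.known and sp.reg == "sp":
            return -sp.offset
        return None

    def find_saved(self, reg):
        """Address of a slot still holding reg's original value, if any."""
        for addr, value in self.slots.items():
            if value == original(reg):
                return addr
        return None


state = State(["r1", "r2", "r3", "sp", "fp"])
state.addi("sp", -16)      # allocate 16 bytes of frame
state.addi("r1", 42)       # addi r1, 42
state.addi("r1", 22)       # addi r1, 22
state.mov("r2", "r1")      # mov r2, r1
state.store("fp", 4, "r2")  # mov (fp+4), r2
state.store("fp", 8, "r3")  # save r3's original value at fp+8
state.xor("r1", "r3")      # xor r1, r3
```

After this sequence, state.frame_size() reports the 16 bytes
allocated, state.find_saved("r3") returns the fp+8 slot, and r1 is
marked unknown because of the xor.  The point of the sketch is the
last bullet in the doc text: the two queries at the end needed no
extra bookkeeping in the per-instruction code.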
