From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-43390-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 28181 invoked by alias); 27 Mar 2006 23:59:38 -0000
Received: (qmail 28170 invoked by uid 22791); 27 Mar 2006 23:59:36 -0000
X-Spam-Check-By: sourceware.org
Received: from intranet.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.6)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 27 Mar 2006 23:59:36 +0000
Received: (qmail 24773 invoked from network); 27 Mar 2006 23:59:33 -0000
Received: from unknown (HELO localhost) (jimb@127.0.0.2)   by mail.codesourcery.com with ESMTPA; 27 Mar 2006 23:59:33 -0000
To: Eli Zaretskii <eliz@gnu.org>
Cc: gdb-patches@sources.redhat.com
Subject: Re: RFA: prologue value modules
References: <vt2ek0trd33.fsf@theseus.home.> <uhd5o5hua.fsf@gnu.org> 	<vt2ek0r1rqq.fsf@theseus.home.>
From: Jim Blandy <jimb@codesourcery.com>
Date: Tue, 28 Mar 2006 02:24:00 -0000
In-Reply-To: <vt2ek0r1rqq.fsf@theseus.home.> (Jim Blandy's message of "Fri, 24 Mar 2006 13:02:05 -0800")
Message-ID: <vt21wwntp5m.fsf@theseus.home.>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-03/txt/msg00313.txt.bz2


Jim Blandy <jimb@codesourcery.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>>> From: Jim Blandy <jimb@codesourcery.com>
>>> Date: Wed, 22 Mar 2006 20:37:20 -0800
>>> 
>>> To recap: this patch adds prologue-value.[ch], support libraries for
>>> the new prologue analysis strategy used in the Renesas M32C port and
>>> some internal Red Hat ports.  I think they make prologue analysis
>>> simpler to write and more robust.
>>
>> Are we going to have this documented in gdbint.texinfo?
>
> Yes --- I promised that before.  Thanks for reminding me.

How's this doc patch?

src/gdb/doc/ChangeLog:
2006-03-27  Jim Blandy  <jimb@redhat.com>

	* gdbint.texinfo (Prologue Analysis): New section.

Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame 
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
 
+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved.  Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames.  This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit DWARF call frame information (``CFI'')
+that gives @value{GDBN} all the information it needs to do this.  But
+CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers.  A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on.  Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful.  Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers.  Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate.  However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile.  Keeping the prologue analyzers working as GCC (and the
+ISAs themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in
+@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provides a
+general framework for writing prologue analyzers that are simpler and
+more robust than ad-hoc analyzers.  When we analyze a prologue using
+the prologue-value framework, we're really doing ``abstract
+interpretation'' or ``pseudo-evaluation'': running the function's code
+in simulation, but using conservative approximations of the values
+registers and memory would hold when the code actually runs.  For
+example, if our function starts with the instruction:
+
+@example
+addi r1, 42     # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22     # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+mov r2, r1      # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers.  So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable.  If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3      # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex.  But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''.  We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values.  Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the
+register's original value.  If the ABI suggests a standard place
+to save a given register, then we can check there first, but
+really, anything that will get us back the original value will
+probably work.
+@end itemize
+
+This does take some work.  But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions.  Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port.  So I think it's
+worthwhile to look for an approach that will be easier to understand
+and maintain.  In the approach used here:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction.  As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling
 
 @cindex breakpoints
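
For anyone reading the new section without prologue-value.h at hand,
here is a rough, self-contained C sketch of the pseudo-evaluation it
describes.  Every name in it (sk_value, sim_addi, sk_find_saved, and
so on) is made up for this illustration and is not the interface
declared in prologue-value.h.  The five-register machine, the small
slot table, and the assumption that every store writes one aligned
word are simplifications chosen to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
/* Illustrative only: every name here is hypothetical, not GDB's API.  */
enum sk_kind { SK_UNKNOWN, SK_REGISTER };
enum { R1, R2, R3, SP, FP, SK_NUM_REGS };
#define SK_MAX_SLOTS 8

/* "Original value of register REG, plus K", or unknown.  */
struct sk_value { enum sk_kind kind; int reg; long k; };

struct sk_state {
  struct sk_value reg[SK_NUM_REGS];
  struct { struct sk_value addr, val; } slot[SK_MAX_SLOTS];
  int num_slots;
};

static struct sk_value sk_add_constant (struct sk_value v, long k)
{
  if (v.kind != SK_UNKNOWN) v.k += k;   /* unknown stays unknown */
  return v;
}

static bool sk_equal (struct sk_value a, struct sk_value b)
{
  return (a.kind == SK_REGISTER && b.kind == SK_REGISTER
          && a.reg == b.reg && a.k == b.k);
}

/* On entry, each register holds its own original value plus zero.  */
static void sk_init (struct sk_state *st)
{
  for (int r = 0; r < SK_NUM_REGS; r++)
    st->reg[r] = (struct sk_value) { SK_REGISTER, r, 0 };
  st->num_slots = 0;
}

static void sim_addi (struct sk_state *st, int r, long imm)
{ st->reg[r] = sk_add_constant (st->reg[r], imm); }
static void sim_mov (struct sk_state *st, int dst, int src)
{ st->reg[dst] = st->reg[src]; }
/* Too complex to track: mark DST unknown rather than guess.  */
static void sim_xor (struct sk_state *st, int dst, int src)
{ (void) src; st->reg[dst] = (struct sk_value) { SK_UNKNOWN, 0, 0 }; }

/* Simulate "mov (BASE+OFF), SRC", assuming one aligned word per store.  */
static void sim_store (struct sk_state *st, int base, long off, int src)
{
  struct sk_value addr = sk_add_constant (st->reg[base], off);
  int i;
  /* A store at an unknown address, or through a different base, might
     alias any tracked slot, so forget them all.  */
  if (addr.kind == SK_UNKNOWN
      || (st->num_slots > 0 && st->slot[0].addr.reg != addr.reg))
    st->num_slots = 0;
  for (i = 0; i < st->num_slots; i++)
    if (sk_equal (st->slot[i].addr, addr))
      break;
  if (addr.kind == SK_UNKNOWN || i == SK_MAX_SLOTS)
    { st->num_slots = 0; return; }   /* lose information, stay safe */
  st->slot[i].addr = addr;
  st->slot[i].val = st->reg[src];
  if (i == st->num_slots)
    st->num_slots++;
}

/* *SIZE is meaningful only when this returns true.  */
static bool sk_frame_size (const struct sk_state *st, long *size)
{
  *size = -st->reg[SP].k;
  return st->reg[SP].kind == SK_REGISTER && st->reg[SP].reg == SP;
}

/* Index of a slot still holding R's original value, or -1.  */
static int sk_find_saved (const struct sk_state *st, int r)
{
  struct sk_value orig = { SK_REGISTER, r, 0 };
  for (int i = 0; i < st->num_slots; i++)
    if (sk_equal (st->slot[i].val, orig))
      return i;
  return -1;
}

void sk_example (void)
{
  struct sk_state st;
  long size;
  sk_init (&st);
  sim_addi (&st, SP, -16);      /* hypothetical frame allocation */
  sim_addi (&st, R1, 42);
  sim_addi (&st, R1, 22);
  sim_mov (&st, R2, R1);
  sim_store (&st, FP, 4, R2);   /* mov (fp+4), r2 */
  sim_store (&st, FP, 8, R3);   /* hypothetical save of r3 */
  sim_xor (&st, R1, R3);
  assert (sk_frame_size (&st, &size) && size == 16);
  assert (st.reg[R2].reg == R1 && st.reg[R2].k == 64);
  assert (st.reg[R1].kind == SK_UNKNOWN && sk_find_saved (&st, R3) == 1);
}
```

sk_example replays the instruction sequence from the section text,
plus two invented instructions (the stack adjustment and the save of
r3) so that both questions from the bulleted list have something to
answer.  Note that no simulation routine knows anything about frame
sizes or saved registers; those are read off the tracked state
afterwards, which is the orthogonality the section argues for.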