From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 28181 invoked by alias); 27 Mar 2006 23:59:38 -0000
Received: (qmail 28170 invoked by uid 22791); 27 Mar 2006 23:59:36 -0000
X-Spam-Check-By: sourceware.org
Received: from intranet.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.6) by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 27 Mar 2006 23:59:36 +0000
Received: (qmail 24773 invoked from network); 27 Mar 2006 23:59:33 -0000
Received: from unknown (HELO localhost) (jimb@127.0.0.2) by mail.codesourcery.com with ESMTPA; 27 Mar 2006 23:59:33 -0000
To: Eli Zaretskii
Cc: gdb-patches@sources.redhat.com
Subject: Re: RFA: prologue value modules
References:
From: Jim Blandy
Date: Tue, 28 Mar 2006 02:24:00 -0000
In-Reply-To: (Jim Blandy's message of "Fri, 24 Mar 2006 13:02:05 -0800")
Message-ID:
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-03/txt/msg00313.txt.bz2

Jim Blandy writes:
> Eli Zaretskii writes:
>>> From: Jim Blandy
>>> Date: Wed, 22 Mar 2006 20:37:20 -0800
>>>
>>> To recap: this patch adds prologue-value.[ch], support libraries for
>>> the new prologue analysis strategy used in the Renesas M32C port and
>>> some internal Red Hat ports. I think they make prologue analysis
>>> simpler to write and more robust.
>>
>> Are we going to have this documented in gdbint.texinfo?
>
> Yes --- I promised that before. Thanks for reminding me.

How's this doc patch?

src/gdb/doc/ChangeLog:
2006-03-27 Jim Blandy

	* gdbint.texinfo (Prologue Analysis): New section.

Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
 
+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved. Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames. This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit DWARF call frame information (``CFI'')
+that gives @value{GDBN} all the information it needs to do this. But
+CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers. A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on. Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful. Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers. Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate. However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile. Keeping the prologue analyzers working as GCC (and the
+ISAs themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in
+@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provides a
+general framework for writing prologue analyzers that are simpler and
+more robust than ad-hoc analyzers. When we analyze a prologue using
+the prologue-value framework, we're really doing ``abstract
+interpretation'' or ``pseudo-evaluation'': running the function's code
+in simulation, but using conservative approximations of the values
+registers and memory would hold when the code actually runs. For
+example, if our function starts with the instruction:
+
+@example
+addi r1, 42  # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22  # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+mov r2, r1  # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers. So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable. If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3  # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex. But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''. We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values. Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the
+register's original value. If the ABI suggests a standard place
+to save a given register, then we can check there first, but
+really, anything that will get us back the original value will
+probably work.
+@end itemize
+
+This does take some work. But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions. Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port. So I think it's
+worthwhile to look for an approach that will be easier to understand
+and maintain. In the approach used here:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction. As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling
 
 @cindex breakpoints
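
For readers who want something concrete, the pseudo-evaluation described
in the new section can be sketched in C roughly as follows. This is only
an illustration under stated assumptions, not the interface declared in
prologue-value.h: every name here (struct value, struct state, sim_addi,
sim_mov, sim_store_fp, sim_xor, frame_size) is hypothetical, the toy
machine has five registers and 4-byte stack slots at non-negative
offsets from the original FP, and the stack is assumed to grow downward.

```c
#include <assert.h>

/* All names are hypothetical; this is not the prologue-value.h API. */
enum { R1, R2, R3, SP, FP, NUM_REGS };
enum { NUM_SLOTS = 16, SLOT_SIZE = 4 };

enum kind { UNKNOWN, REG_PLUS_K };

/* A conservative value: "unknown", or "original value of REG, plus K". */
struct value {
  enum kind kind;
  int reg;
  long k;
};

struct state {
  struct value reg[NUM_REGS];
  struct value slot[NUM_SLOTS];  /* slot[i]: memory at original FP + 4*i */
};

static struct value unknown(void)
{
  struct value v = { UNKNOWN, 0, 0 };
  return v;
}

static void forget_slots(struct state *st)
{
  for (int i = 0; i < NUM_SLOTS; i++)
    st->slot[i] = unknown();
}

/* At function entry each register holds its own original value plus 0. */
void state_init(struct state *st)
{
  for (int r = 0; r < NUM_REGS; r++) {
    st->reg[r].kind = REG_PLUS_K;
    st->reg[r].reg = r;
    st->reg[r].k = 0;
  }
  forget_slots(st);
}

/* addi r, imm: a known value shifts by imm; unknown stays unknown. */
void sim_addi(struct state *st, int r, long imm)
{
  assert(r >= 0 && r < NUM_REGS);
  if (st->reg[r].kind == REG_PLUS_K)
    st->reg[r].k += imm;
}

/* mov dst, src: dst takes whatever we know about src. */
void sim_mov(struct state *st, int dst, int src)
{
  st->reg[dst] = st->reg[src];
}

/* mov (fp+off), src: record the store only if we can place it exactly. */
void sim_store_fp(struct state *st, long off, int src)
{
  struct value fp = st->reg[FP];
  if (fp.kind == REG_PLUS_K && fp.reg == FP) {
    long addr = fp.k + off;
    if (addr >= 0 && addr % SLOT_SIZE == 0 && addr / SLOT_SIZE < NUM_SLOTS) {
      st->slot[addr / SLOT_SIZE] = st->reg[src];
      return;
    }
  }
  forget_slots(st);  /* the store could have hit any tracked slot */
}

/* xor dst, src: too complex to track, so mark the result unknown. */
void sim_xor(struct state *st, int dst, int src)
{
  (void) src;
  st->reg[dst] = unknown();
}

/* Returns 1 and sets *size if SP is "original SP minus a constant". */
int frame_size(const struct state *st, long *size)
{
  struct value sp = st->reg[SP];
  if (sp.kind != REG_PLUS_K || sp.reg != SP || sp.k > 0)
    return 0;
  *size = -sp.k;
  return 1;
}
```

Feeding it the instructions from the doc text (addi r1, 42; addi r1, 22;
mov r2, r1; mov (fp+4), r2; xor r1, r3) leaves r2 and the slot at fp+4
holding "original r1 plus 64" and r1 marked unknown. Note that adding
sim_xor required no change to the other instruction handlers, which is
the orthogonality the last bullet list is getting at.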