From: Jan Kratochvil
To: Eli Zaretskii
Cc: gdb-patches@sourceware.org
Date: Mon, 19 Sep 2011 18:30:00 -0000
Subject: Re: [patch 00/12] entryval#2: Fix x86_64 parameters, virtual tail call frames
Message-ID: <20110919175141.GA24872@host1.jankratochvil.net>
In-Reply-To: <83zki3tult.fsf@gnu.org>
References: <20110913194306.GA12849@host1.jankratochvil.net> <20110916093554.GA20792@host1.jankratochvil.net> <83zki3tult.fsf@gnu.org>

On Sat, 17 Sep 2011 10:56:46 +0200, Eli Zaretskii wrote:
> > maybe:
> > Control display of debugging info for determining virtual tail call frames.
> > The tail call frames GDB determines from similar inferior debug info content
> > like the @entry parameter values.
>
> Would it be good enough to lose the second sentence entirely?  This is
> just a NEWS entry, so it doesn't need to go into all the details.

done.

> > Added:
> > Entry values are normally also printed at the function parameter list
> > according to @xref{set print entry-values}.
>
> @ref, not @xref.  The latter produces a capitalized "See", which is
> inappropriate in the middle of a sentence.

Changed, it produced (in .info)
    function parameter list according to *Note set print entry-values::.
and now it produces:
    function parameter list according to *note set print entry-values::.

> > the whole sentence is now:
> > In some cases @value{GDBN} can determine the value of function argument which
> > was passed by the function caller, even if the value was modified inside the
> > called function and therefore are different.
>
> "is different" ("argument" is single).

done.

> > > > +this feature will behave in the default @code{default} setting the same way as
> > >                                    ^^^^^^^
> > > Lose this "default", it's redundant.
> >
> > done although I disagree a bit.  The setting is called "default".  That it is
> > the default choice is a coincidence.
>
> That's true, but consider how it will look in the manual:
>
>   this feature will behave in the `default' setting the same way as ...
>
> Do you see any problems with reading and understanding this?

Unless we change the `default' name in fact I find both ways confusing but the
`default' name looks natural for the setting to me.  OK, kept it that way.

> > During execution of function @code{C}, there will be no indication in
> > the stack frames that it was tail-called from @code{B}.
>
> How about "in the function call stack frames"?

done.

> > > However, in some cases @value{GDBN} can determine that @code{C} was
> > > tail-called from @code{B}, and will then create fictitious call
> > > frame for that, with the return address set up as if @code{B} called
> > > @code{C} normally.
> >
> > Used as you wrote it.
> >
> > Really no nominative?  ", and will then" -> ", and it will then"?
>
> I don't mind adding "it", if you think it will make the sentence more
> clear.

OK, put it there.

> > Put there - isn't it too long?
>
> Wrap it at or before column 72 and you are good.

I meant more if there aren't too many lines (not columns).  Anyway OK,
shortened to 66 columns now.

@smallexample
(gdb) x/i $pc - 2
   0x40066b : jmp 0x400640
(gdb) info frame
Stack level 1, frame at 0x7fffffffda30:
 rip = 0x40066d in b (amd64-entry-value.cc:59); saved rip 0x4004c5
 tail call frame, caller of frame at 0x7fffffffda30
 source language c++.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp is 0x7fffffffda30
@end smallexample

> > > > +The detection of all the possible code path executions can find them ambiguous.
> > > > +There is no execution history stored (possible @ref{Reverse Execution} is never
> > > > +used for this purpose) and the last known caller could have reached the known
> > > > +callee by multiple different jump sequences.  In such case @value{GDBN} still
> > > > +tries to show at least all the unambiguous top tail callers and all the
> > > > +unambiguous bottom tail callees, if any.
> > >
> > > I don't understand what this means in practice.  If you show an
> > > example of this, I could suggest some rewording.
> >
> > This is discussed above:
> >
> > tailcall: initial: 0x400735(amb_a(int)) 0x400725(amb_b(int)) 0x40071e(amb(int)) 0x400705(amb_x(int)) 0x4006f5(amb_y(int)) 0x4006e4(amb_z(int))
> > tailcall: compare: 0x400735(amb_a(int)) 0x400725(amb_b(int)) 0x400719(amb(int)) 0x400705(amb_x(int)) 0x4006f5(amb_y(int)) 0x4006e4(amb_z(int))
> > tailcall: reduced: 0x400735(amb_a(int)) 0x400725(amb_b(int)) | 0x400705(amb_x(int)) 0x4006f5(amb_y(int)) 0x4006e4(amb_z(int))
>
> But this is output only with "debug tailcall" set, isn't it?

Yes.

> The text
> I commented on is describing the functionality in general, not the
> effect of the "debug tailcall" setting.  I was asking what does
> "@value{GDBN} still tries to show at least all the unambiguous top
> tail callers and all the unambiguous bottom tail calees, if any" mean
> when "debug tailcall" is _not_ set?  Are you talking about output of
> some other command, like "bt"?  if so, can you show me its output in
> such an ambiguous case?

The real frames (with legacy GDB, no virtual tail call frames, the amb_* frames
below are all virtual):

#0  d (i=<optimized out>, j=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:34
#1  0x000000000040040c in main () at ./gdb.arch/amd64-entry-value.cc:125

Normal output of this patchset (with ambiguous frames suppression):

#0  d (i=<optimized out>, j=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:34
#1  0x00000000004005b4 in amb_z (i=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:58
#2  0x00000000004005c5 in amb_y (i=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:64
#3  0x00000000004005d5 in amb_x (i=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:70
#4  0x00000000004005f5 in amb_b (i=101) at ./gdb.arch/amd64-entry-value.cc:85
#5  0x0000000000400605 in amb_a (i=100) at ./gdb.arch/amd64-entry-value.cc:91
#6  0x000000000040040c in main () at ./gdb.arch/amd64-entry-value.cc:125

Output with the ambiguous-frames-suppression algorithm disabled:

#0  d (i=125, j=125.5) at ./gdb.arch/amd64-entry-value.cc:34
#1  0x00000000004005b4 in amb_z (i=117) at ./gdb.arch/amd64-entry-value.cc:58
#2  0x00000000004005c5 in amb_y (i=111) at ./gdb.arch/amd64-entry-value.cc:64
#3  0x00000000004005d5 in amb_x (i=106) at ./gdb.arch/amd64-entry-value.cc:70
#4  0x00000000004005ee in amb (i=103) at ./gdb.arch/amd64-entry-value.cc:77
#5  0x00000000004005f5 in amb_b (i=101) at ./gdb.arch/amd64-entry-value.cc:85
#6  0x0000000000400605 in amb_a (i=100) at ./gdb.arch/amd64-entry-value.cc:91
#7  0x000000000040040c in main () at ./gdb.arch/amd64-entry-value.cc:125

That frame #4 could be also different (0x4005e9, not 0x4005ee), it is unsure
what to print there, therefore GDB normally omits printing the frame #4.

In this case there is only one ambiguous frame but GDB omits printing all the
frames between first ambiguous one and last ambiguous one.

> > It is the testcase "self: bt verbose", without `set verbose' there is:
> >
> > #0  d (i=<optimized out>, j=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:34
> > #1  0x0000000000400715 in self (i=<optimized out>) at ./gdb.arch/amd64-entry-value.cc:120
> > #2  0x00000000004004d9 in main () at ./gdb.arch/amd64-entry-value.cc:191
> >
> > and one can be curious why those were not recovered by the
> > `entry values' feature.  With `set verbose' it will print the reason:
> >
> > #0  d (DW_OP_GNU_entry_value resolving has found function "self(int)" at 0x4006e0 can call itself via tail calls
>
> I would suggest adding this example, both without and with "verbose"
> setting, to the text.  It makes the description much more clear and
> understandable.

(a) I have removed `set verbose' to have effect on this patchset.
(b) I have renamed `set debug tailcall' to `set debug entry-values'.
    It is difficult to describe which part of debug messages belongs to which
    of those former two settings, if they should be really official.

Added a similar example.

> > Again the example given for NEWS `set debug tailcall' entry above.
>
> That example, together with some explanation of the "initial",
> "compare", and "reduced" lines, would go a long way towards making the
> docs clear.

Added.

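The relation between the "initial", "compare" and "reduced" lines can be
illustrated with a short Python sketch.  This is only a model of the behaviour
described in this mail, not the code from the patchset: the names
reduce_chains and format_reduced, and the representation of a chain as a list
of (pc, function) pairs ordered from the top caller to the bottom callee, are
assumptions made for the example.  It keeps the longest run of callers common
to both chains at the top and the longest run of callees common to both at the
bottom, and drops whatever lies in between.

```python
def reduce_chains(initial, compare):
    """Return (callers, callees) that two candidate tail call chains share.

    Each chain is a list of (pc, function_name) tuples ordered from the
    outermost caller to the innermost callee (an assumed representation).
    """
    limit = min(len(initial), len(compare))

    # Longest common prefix: the unambiguous top tail callers.
    top = 0
    while top < limit and initial[top] == compare[top]:
        top += 1

    # Longest common suffix that does not overlap the prefix:
    # the unambiguous bottom tail callees.
    bottom = 0
    while (bottom < limit - top
           and initial[-1 - bottom] == compare[-1 - bottom]):
        bottom += 1

    callers = initial[:top]
    callees = initial[len(initial) - bottom:]
    return callers, callees


def format_reduced(callers, callees):
    """Render the result in the style of the 'tailcall: reduced:' line."""
    def fmt(chain):
        return "".join(" 0x%x(%s)" % (pc, name) for pc, name in chain)
    return "tailcall: reduced:" + fmt(callers) + " |" + fmt(callees)


# The two chains from the a/b/c/d/e/f example in the doc patch below.
INITIAL = [(0x4004d2, "a"), (0x4004ce, "b"), (0x4004b2, "c"), (0x4004a2, "d")]
COMPARE = [(0x4004d2, "a"), (0x4004cc, "b"), (0x400492, "e")]

print(format_reduced(*reduce_chains(INITIAL, COMPARE)))
# prints: tailcall: reduced: 0x4004d2(a) |
```

Feeding it the amb_* chains quoted earlier gives amb_a and amb_b on the left of
the bar and amb_x, amb_y, amb_z on the right, which corresponds to the frames
kept in the normal backtrace above.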
Besides the archer-jankratochvil-entryval branch attaching diff against FSF GDB
HEAD only for gdb/doc/ .  I will submit the patch part updates when the doc
gets final.


Thanks,
Jan


--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -7277,6 +7277,23 @@ If you ask to print an object whose contents are unknown to
 by the debug information, @value{GDBN} will say @samp{<incomplete
 type>}.  @xref{Symbols, incomplete type}, for more about this.
 
+If you append @kbd{@@entry} string to a function parameter name you get its
+value at the time the function got called.  If the value is not available an
+error message is printed.  Entry values are available only with some compilers.
+Entry values are normally also printed at the function parameter list according
+to @ref{set print entry-values}.
+
+@smallexample
+Breakpoint 1, d (i=30) at gdb.base/entry-value.c:29
+29        i++;
+(gdb) next
+30        e (i);
+(gdb) print i
+$1 = 31
+(gdb) print i@@entry
+$2 = 30
+@end smallexample
+
 Strings are identified as arrays of @code{char} values without specified
 signedness.  Arrays of either @code{signed char} or @code{unsigned char} get
 printed as arrays of 1 byte sized integers.  @code{-fsigned-char} or
@@ -7941,6 +7958,121 @@ thus speeding up the display of each Ada frame.
 @item show print frame-arguments
 Show how the value of arguments should be displayed when printing a frame.
 
+@anchor{set print entry-values}
+@item set print entry-values @var{value}
+@kindex set print entry-values
+Set printing of frame argument values at function entry.  In some cases
+@value{GDBN} can determine the value of function argument which was passed by
+the function caller, even if the value was modified inside the called function
+and therefore is different.  With optimized code, the current value could be
+unavailable, but the entry value may still be known.
+
+The default value is @code{default} (see below for its description).  Older
+@value{GDBN} behaved as with the setting @code{no}.  Compilers not supporting
+this feature will behave in the @code{default} setting the same way as with the
+@code{no} setting.
+
+This functionality is currently supported only by DWARF 2 debugging format and
+the compiler has to produce @samp{DW_TAG_GNU_call_site} tags.  With
+@value{NGCC}, you need to specify @option{-O -g} during compilation, to get
+this information.
+
+The @var{value} parameter can be one of the following:
+
+@table @code
+@item no
+Print only actual parameter values, never print values from function entry
+point.
+@smallexample
+#0  equal (val=5)
+#0  different (val=6)
+#0  lost (val=<optimized out>)
+#0  born (val=10)
+#0  invalid (val=<optimized out>)
+@end smallexample
+
+@item only
+Print only parameter values from function entry point.  The actual parameter
+values are never printed.
+@smallexample
+#0  equal (val@@entry=5)
+#0  different (val@@entry=5)
+#0  lost (val@@entry=5)
+#0  born (val@@entry=<optimized out>)
+#0  invalid (val@@entry=<optimized out>)
+@end smallexample
+
+@item preferred
+Print only parameter values from function entry point.  If value from function
+entry point is not known while the actual value is known, print the actual
+value for such parameter.
+@smallexample
+#0  equal (val@@entry=5)
+#0  different (val@@entry=5)
+#0  lost (val@@entry=5)
+#0  born (val=10)
+#0  invalid (val@@entry=<optimized out>)
+@end smallexample
+
+@item if-needed
+Print actual parameter values.  If actual parameter value is not known while
+value from function entry point is known, print the entry point value for such
+parameter.
+@smallexample
+#0  equal (val=5)
+#0  different (val=6)
+#0  lost (val@@entry=5)
+#0  born (val=10)
+#0  invalid (val=<optimized out>)
+@end smallexample
+
+@item both
+Always print both the actual parameter value and its value from function entry
+point, even if values of one or both are not available due to compiler
+optimizations.
+@smallexample
+#0  equal (val=5, val@@entry=5)
+#0  different (val=6, val@@entry=5)
+#0  lost (val=<optimized out>, val@@entry=5)
+#0  born (val=10, val@@entry=<optimized out>)
+#0  invalid (val=<optimized out>, val@@entry=<optimized out>)
+@end smallexample
+
+@item compact
+Print the actual parameter value if it is known and also its value from
+function entry point if it is known.  If neither is known, print for the actual
+value @code{<optimized out>}.  If not in MI mode (@pxref{GDB/MI}) and if both
+values are known and identical, print the shortened
+@code{param=param@@entry=VALUE} notation.
+@smallexample
+#0  equal (val=val@@entry=5)
+#0  different (val=6, val@@entry=5)
+#0  lost (val@@entry=5)
+#0  born (val=10)
+#0  invalid (val=<optimized out>)
+@end smallexample
+
+@item default
+Always print the actual parameter value.  Print also its value from function
+entry point, but only if it is known.  If not in MI mode (@pxref{GDB/MI}) and
+if both values are known and identical, print the shortened
+@code{param=param@@entry=VALUE} notation.
+@smallexample
+#0  equal (val=val@@entry=5)
+#0  different (val=6, val@@entry=5)
+#0  lost (val=<optimized out>, val@@entry=5)
+#0  born (val=10)
+#0  invalid (val=<optimized out>)
+@end smallexample
+@end table
+
+For analysis messages on possible failures of frame argument values at function
+entry resolution see @ref{set debug entry-values}.
+
+@item show print entry-values
+Show the method being used for printing of frame argument values at function
+entry.
+
 @item set print repeats
 @cindex repeated array elements
 Set the threshold for suppressing display of repeated array
@@ -9486,6 +9618,7 @@ please report it to us as a bug (including a test case!).
 
 @menu
 * Inline Functions::            How @value{GDBN} presents inlining
+* Tail Call Frames::            @value{GDBN} analysis of jumps to functions
 @end menu
 
 @node Inline Functions
@@ -9553,6 +9686,153 @@ and print a variable where your program stored the return value.
 
 @end itemize
 
+@node Tail Call Frames
+@section Tail Call Frames
+@cindex tail call frames, debugging
+
+Function @code{B} can call function @code{C} in its very last statement.  In
+unoptimized compilation the call of @code{C} is immediately followed by return
+instruction at the end of @code{B} code.  Optimizing compiler may replace the
+call and return in function @code{B} into one jump to function @code{C}
+instead.  Such use of a jump instruction is called @dfn{tail call}.
+
+During execution of function @code{C}, there will be no indication in the
+function call stack frames that it was tail-called from @code{B}.  If function
+@code{A} regularly calls function @code{B} which tail-calls function @code{C},
+then @value{GDBN} will see @code{A} as the caller of @code{C}.  However, in
+some cases @value{GDBN} can determine that @code{C} was tail-called from
+@code{B}, and it will then create fictitious call frame for that, with the
+return address set up as if @code{B} called @code{C} normally.
+
+This functionality is currently supported only by DWARF 2 debugging format and
+the compiler has to produce @samp{DW_TAG_GNU_call_site} tags.  With
+@value{NGCC}, you need to specify @option{-O -g} during compilation, to get
+this information.
+
+@kbd{info frame} command (@pxref{Frame Info}) will indicate the tail call frame
+kind by text @code{tail call frame} such as in this sample @value{GDBN} output:
+
+@smallexample
+(gdb) x/i $pc - 2
+   0x40066b : jmp 0x400640
+(gdb) info frame
+Stack level 1, frame at 0x7fffffffda30:
+ rip = 0x40066d in b (amd64-entry-value.cc:59); saved rip 0x4004c5
+ tail call frame, caller of frame at 0x7fffffffda30
+ source language c++.
+ Arglist at unknown address.
+ Locals at unknown address, Previous frame's sp is 0x7fffffffda30
+@end smallexample
+
+The detection of all the possible code path executions can find them ambiguous.
+There is no execution history stored (possible @ref{Reverse Execution} is never
+used for this purpose) and the last known caller could have reached the known
+callee by multiple different jump sequences.  In such case @value{GDBN} still
+tries to show at least all the unambiguous top tail callers and all the
+unambiguous bottom tail callees, if any.
+
+@table @code
+@anchor{set debug entry-values}
+@item set debug entry-values
+@kindex set debug entry-values
+When set to on, enables printing of analysis messages for both frame argument
+values at function entry and tail calls.  It will show all the possible valid
+tail calls code paths it has considered.  It will also print the intersection
+of them with the final unambiguous (possibly partial or even empty) code path
+result.
+
+@item show debug entry-values
+@kindex show debug entry-values
+Show the current state of analysis messages printing for both frame argument
+values at function entry and tail calls.
+@end table
+
+The analysis messages for tail calls can for example show why the virtual tail
+call frame for function @code{c} has not been recognized (due to the indirect
+reference by variable @code{x}):
+
+@smallexample
+static void __attribute__((noinline, noclone)) c (void);
+void (*x) (void) = c;
+static void __attribute__((noinline, noclone)) a (void) @{ x++; @}
+static void __attribute__((noinline, noclone)) c (void) @{ a (); @}
+int main (void) @{ x (); return 0; @}
+
+Breakpoint 1, DW_OP_GNU_entry_value resolving cannot find
+DW_TAG_GNU_call_site 0x40039a in main
+a () at t.c:3
+3       static void __attribute__((noinline, noclone)) a (void) @{ x++; @}
+(gdb) bt
+#0  a () at t.c:3
+#1  0x000000000040039a in main () at t.c:5
+@end smallexample
+
+Another possibility is an ambiguous virtual tail call frames resolution:
+
+@smallexample
+int i;
+static void __attribute__((noinline, noclone)) f (void) @{ i++; @}
+static void __attribute__((noinline, noclone)) e (void) @{ f (); @}
+static void __attribute__((noinline, noclone)) d (void) @{ f (); @}
+static void __attribute__((noinline, noclone)) c (void) @{ d (); @}
+static void __attribute__((noinline, noclone)) b (void)
+@{ if (i) c (); else e (); @}
+static void __attribute__((noinline, noclone)) a (void) @{ b (); @}
+int main (void) @{ a (); return 0; @}
+
+tailcall: initial: 0x4004d2(a) 0x4004ce(b) 0x4004b2(c) 0x4004a2(d)
+tailcall: compare: 0x4004d2(a) 0x4004cc(b) 0x400492(e)
+tailcall: reduced: 0x4004d2(a) |
+(gdb) bt
+#0  f () at t.c:2
+#1  0x00000000004004d2 in a () at t.c:8
+#2  0x0000000000400395 in main () at t.c:9
+@end smallexample
+
+Frames #0 and #2 are real, #1 is a virtual tail call frame.  The code can have
+possible execution paths
+@code{main@arrow{}a@arrow{}b@arrow{}c@arrow{}d@arrow{}f} or
+@code{main@arrow{}a@arrow{}b@arrow{}e@arrow{}f}, @value{GDBN} cannot find which
+one from the inferior state.
+
+@code{initial:} state shows some random possible calling sequence @value{GDBN}
+has found.  It then finds another possible calling sequence - that one is
+prefixed by @code{compare:}.  The non-ambiguous intersection of these two is
+printed as the @code{reduced:} calling sequence.  That one could have many
+further @code{compare:} and @code{reduced:} statements as long as there remain
+any non-ambiguous sequence entries.
+
+For the frame of function @code{b} in both cases there are different possible
+@code{$pc} values (@code{0x4004cc} or @code{0x4004ce}), therefore this frame is
+also ambiguous.  The only non-ambiguous frame is the one for function @code{a},
+therefore this one is displayed to the user while the ambiguous frames are
+omitted.
+
+There can be also reasons why printing of frame argument values at function
+entry may fail:
+
+@smallexample
+int v;
+static void __attribute__((noinline, noclone)) c (int i) @{ v++; @}
+static void __attribute__((noinline, noclone)) a (int i);
+static void __attribute__((noinline, noclone)) b (int i) @{ a (i); @}
+static void __attribute__((noinline, noclone)) a (int i)
+@{ if (i) b (i - 1); else c (0); @}
+int main (void) @{ a (5); return 0; @}
+
+(gdb) bt
+#0  c (i=i@@entry=0) at t.c:2
+#1  0x0000000000400428 in a (DW_OP_GNU_entry_value resolving has found
+function "a" at 0x400420 can call itself via tail calls
+i=<optimized out>) at t.c:6
+#2  0x000000000040036e in main () at t.c:7
+@end smallexample
+
+@value{GDBN} cannot find out from the inferior state if and how many times did
+function @code{a} call itself (via function @code{b}) as these calls would be
+tail calls.  Such tail calls would modify the @code{i} variable, therefore
+@value{GDBN} cannot be sure the value it knows would be right - @value{GDBN}
+prints @code{<optimized out>} instead.
 
 @node Macros
 @chapter C Preprocessor Macros
@@ -23063,6 +23343,9 @@ inferior function call.
 A frame representing an inlined function.  The function was inlined
 into a @code{gdb.NORMAL_FRAME} that is older than this one.
 
+@item gdb.TAILCALL_FRAME
+A frame representing a tail call.  @xref{Tail Call Frames}.
+
 @item gdb.SIGTRAMP_FRAME
 A signal trampoline frame.  This is the frame created by the OS when it
 calls into a signal handler.
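
The gdb.TAILCALL_FRAME constant added in the last hunk could be used from a
Python script to pick the virtual frames out of a backtrace.  A minimal sketch
follows, assuming the frame objects offer the type(), older() and name()
methods of gdb.Frame.  The FakeFrame class and the NORMAL/TAILCALL placeholder
values are hypothetical stand-ins so that the sketch also runs outside GDB;
they are not part of GDB.

```python
class FakeFrame:
    """Hypothetical stand-in for gdb.Frame, only for running outside GDB."""

    def __init__(self, name, frame_type, older=None):
        self._name = name
        self._type = frame_type
        self._older = older

    def name(self):
        return self._name

    def type(self):
        return self._type

    def older(self):
        return self._older


# Placeholder frame type values, not GDB's real constants.
NORMAL, TAILCALL = 0, 1


def tail_call_frames(newest, tailcall_type):
    """Walk from the newest frame outwards and collect the names of the
    frames whose type equals tailcall_type."""
    found = []
    frame = newest
    while frame is not None:
        if frame.type() == tailcall_type:
            found.append(frame.name())
        frame = frame.older()
    return found


# Model of the "bt" from the ambiguous example: #0 f, #1 a (virtual), #2 main.
main_frame = FakeFrame("main", NORMAL)
a_frame = FakeFrame("a", TAILCALL, older=main_frame)
f_frame = FakeFrame("f", NORMAL, older=a_frame)

print(tail_call_frames(f_frame, TAILCALL))  # prints ['a']
```

Inside a GDB session with Python support the same function would be called as
tail_call_frames (gdb.newest_frame (), gdb.TAILCALL_FRAME).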