From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 9654 invoked by alias); 8 Nov 2002 21:57:39 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 9647 invoked from network); 8 Nov 2002 21:57:39 -0000
Received: from unknown (HELO mail1.microsoft.com) (131.107.3.125)
  by sources.redhat.com with SMTP; 8 Nov 2002 21:57:39 -0000
Received: from mail6.microsoft.com ([157.54.6.196])
  by mail1.microsoft.com with Microsoft SMTPSVC(5.0.2195.5600);
  Fri, 8 Nov 2002 13:57:38 -0800
Received: from inet-vrs-06.redmond.corp.microsoft.com ([157.54.6.181])
  by mail6.microsoft.com with Microsoft SMTPSVC(5.0.2195.5600);
  Fri, 8 Nov 2002 13:57:37 -0800
Received: from 157.54.8.155
  by inet-vrs-06.redmond.corp.microsoft.com (InterScan E-Mail VirusWall NT);
  Fri, 08 Nov 2002 13:57:37 -0800
Received: from red-msg-08.redmond.corp.microsoft.com ([157.54.12.5])
  by inet-hub-04.redmond.corp.microsoft.com with Microsoft SMTPSVC(5.0.2195.5600);
  Fri, 8 Nov 2002 13:57:37 -0800
X-MimeOLE: Produced By Microsoft Exchange V6.0.6334.0
Content-Class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: RE: Single step vs. "tail recursion" optimization
Date: Fri, 08 Nov 2002 13:57:00 -0000
From: "Donn Terry"
To: "Michael Snyder"
X-OriginalArrivalTime: 08 Nov 2002 21:57:37.0760 (UTC) FILETIME=[D9F59600:01C28771]
X-SW-Source: 2002-11/txt/msg00252.txt.bz2

(I'm sorry to have to be the messenger on this one...)

Here's a mini testcase. I've also attached the resulting .s files for
-O2 and -O3. Shudder.

Andrew's speculation about s not working because there were no symbols
is correct. S-ing works until the call to getpid().
I haven't actually tried to figure out why gdb isn't doing it right in
that case because there's actually something potentially even uglier
going on in the -O3 case. This is something that the "management" of
gdb and the "management" of gcc are going to have to take on and
resolve as either "no, you can't sanely debug -O3" or "we need some
help from the compiler to sort this one out". (And if the latter, then
the same help may be useful with the -O2 case!) (I haven't seen this
addressed, but I could easily have missed it.)

Note that in the case of -O3, foo() and bar() are NEVER actually called
from main, but rather getpid() is called directly. (Note also the
reordering of the functions.)

(Seeing that this sort of optimization is pretty compellingly needed
for C++ code, "don't do that" seems an unlikely outcome.)

Donn

P.S. This may explain some instances of "stack unwind missed a frame"
bugs.

bar()
{
   getpid();
}

foo()
{
   bar();
}

main()
{
   foo();
}

------------------ -O2 -------------------
        .file   "bat.c"
        .global __fltused
        .text
        .p2align 4,,15
.globl _bar
        .def    _bar;   .scl    2;      .type   32;     .endef
_bar:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        jmp     _getpid
        .p2align 4,,15
.globl _foo
        .def    _foo;   .scl    2;      .type   32;     .endef
_foo:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        jmp     _bar
        .def    ___main;        .scl    2;      .type   32;     .endef
        .p2align 4,,15
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %eax
        pushl   %eax
        xorl    %eax, %eax
        andl    $-16, %esp
        call    __alloca
        call    ___main
        call    _foo
        movl    %ebp, %esp
        popl    %ebp
        ret
        .def    _getpid;        .scl    2;      .type   32;     .endef

------------------------ -O3 ---------------------------------
        .file   "bat.c"
        .global __fltused
        .def    ___main;        .scl    2;      .type   32;     .endef
        .text
        .p2align 4,,15
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %eax
        pushl   %eax
        xorl    %eax, %eax
        andl    $-16, %esp
        call    __alloca
        call    ___main
        call    _getpid         <<< NO CALL TO foo()
        movl    %ebp, %esp
        popl    %ebp
        ret
        .p2align 4,,15
.globl _bar
        .def    _bar;   .scl    2;      .type   32;     .endef
_bar:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        jmp     _getpid
        .p2align 4,,15
.globl _foo
        .def    _foo;   .scl    2;      .type   32;     .endef
_foo:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        jmp     _getpid         <<< NOTE THAT foo() doesn't call bar() either!
        .def    _getpid;        .scl    2;      .type   32;     .endef

-----Original Message-----
From: Michael Snyder [mailto:msnyder@redhat.com]
Sent: Friday, November 08, 2002 11:43 AM
To: Donn Terry
Cc: gdb-patches@sources.redhat.com
Subject: Re: Single step vs. "tail recursion" optimization

Donn Terry wrote:
>
> While debugging gdb, I ran across a really nasty little issue: the gcc
> guys (for the "bleeding edge", at least) have generated an
> optimization such that if the last thing in function x is a function
> call to y, it will short-circuit the return from x, and set things up
> so it returns directly from y. (A special case of tail recursion
> optimizations.)
>
> If you try to n (or s) over that, the debugged program runs away
> because gdb doesn't know about that magic. The real example is
> regcache_raw_read, which ends in a memcpy. Instead of jsr-ing to the
> memcpy and then returning, it fiddles with the stack and jmps to
> memcpy. Is this a known issue, and is it being worked, or have I just
> run across something new to worry about?
>
> (This is on Interix (x86, obviously from the code below) with a gcc
> that's less than a week old. I have no idea how long it might
> actually have been this way. I doubt
> the problem is actually unique to the x86 as this is a very general
> optimization.)
>
> Donn

Tail-recursion isn't a new optimization, but I have almost no
(only the vaguest) recollection of ever having run up against
it before. Could be there's a change with the way GCC is
implementing it. Could be we never handled it before.

This sounds like a good argument for parsing the epilogue...
;-(

Michael

>
> Here's the code:
>
> 0x466e37:  mov    0x1c(%eax),%ecx
> 0x466e3a:  mov    0x18(%eax),%eax
> 0x466e3d:  mov    (%eax,%esi,4),%edx
> 0x466e40:  mov    0x4(%ebx),%eax
> 0x466e43:  add    %eax,%edx
> 0x466e45:  mov    (%ecx,%esi,4),%eax
> 0x466e48:  mov    %eax,0x10(%ebp)
> 0x466e4b:  mov    %edx,0xc(%ebp)
> 0x466e4e:  mov    %edi,0x8(%ebp)
> 0x466e51:  lea    0xfffffff4(%ebp),%esp
> 0x466e54:  pop    %ebx
> 0x466e55:  pop    %esi
> 0x466e56:  pop    %edi
> 0x466e57:  pop    %ebp
> 0x466e58:  jmp    0x77d91e60
> 0x466e5d:  lea    0x0(%esi),%esi
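
As an illustration of the "parsing the epilogue" idea mentioned in the
thread, here is a minimal sketch in C. It is not gdb code and does not
describe how gdb handles this. It assumes 32-bit x86 and recognizes only
the handful of frame-teardown instructions that appear in the listings
above (pop of a register, leave, "movl %ebp, %esp", and lea with an
8-bit displacement off %ebp into %esp). The names classify_epilogue and
enum epilogue_kind are hypothetical. The caller is assumed to pass a
pointer to the first teardown instruction; a jmp that is not preceded by
any teardown would also be reported as a tail jump, so treat this as a
heuristic only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; an illustration, not gdb's code. */
enum epilogue_kind {
    EPILOGUE_UNKNOWN,   /* ran out of bytes or hit an unrecognized opcode */
    EPILOGUE_RET,       /* frame teardown followed by ret */
    EPILOGUE_TAIL_JMP   /* frame teardown followed by jmp (tail call) */
};

/* Classify 32-bit x86 instruction bytes starting at a frame teardown.
   Assumption: only the few encodings listed below are recognized. */
enum epilogue_kind classify_epilogue(const unsigned char *code, size_t len)
{
    size_t i = 0;

    assert(code != NULL);

    while (i < len) {
        unsigned char op = code[i];

        if (op >= 0x58 && op <= 0x5f) {
            i += 1;                         /* pop %reg */
        } else if (op == 0xc9) {
            i += 1;                         /* leave */
        } else if (op == 0x89 && i + 1 < len && code[i + 1] == 0xec) {
            i += 2;                         /* movl %ebp, %esp */
        } else if (op == 0x8d && i + 2 < len && code[i + 1] == 0x65) {
            i += 3;                         /* lea disp8(%ebp), %esp */
        } else if (op == 0xc3 || op == 0xc2) {
            return EPILOGUE_RET;            /* ret, or ret imm16 */
        } else if (op == 0xe9 || op == 0xeb) {
            return EPILOGUE_TAIL_JMP;       /* jmp rel32, or jmp rel8 */
        } else {
            return EPILOGUE_UNKNOWN;        /* anything else: give up */
        }
    }
    return EPILOGUE_UNKNOWN;
}
```

Fed the bytes of the regcache_raw_read tail quoted above (lea, four
pops, then jmp), this returns EPILOGUE_TAIL_JMP; fed the end of main
from the -O2 listing (movl %ebp, %esp; popl %ebp; ret), it returns
EPILOGUE_RET. A debugger could in principle use such a result to decide
that control will come back to the caller's return address rather than
to the function being stepped, though whether that is the right fix is
exactly what the thread leaves open.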