From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 5959 invoked by alias); 12 Jun 2005 07:48:27 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help:
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 5952 invoked by uid 22791); 12 Jun 2005 07:48:21 -0000
Received: from sibelius.xs4all.nl (HELO sibelius.xs4all.nl) (82.92.89.47)
    by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Sun, 12 Jun 2005 07:48:21 +0000
Received: from elgar.sibelius.xs4all.nl (root@elgar.sibelius.xs4all.nl [192.168.0.2])
    by sibelius.xs4all.nl (8.13.0/8.13.0) with ESMTP id j5C7mGAw025423;
    Sun, 12 Jun 2005 09:48:16 +0200 (CEST)
Received: from elgar.sibelius.xs4all.nl (kettenis@localhost.sibelius.xs4all.nl [127.0.0.1])
    by elgar.sibelius.xs4all.nl (8.13.4/8.13.3) with ESMTP id j5C7mGEx015882;
    Sun, 12 Jun 2005 09:48:16 +0200 (CEST)
Received: (from kettenis@localhost)
    by elgar.sibelius.xs4all.nl (8.13.4/8.13.4/Submit) id j5C7m75C011986;
    Sun, 12 Jun 2005 09:48:07 +0200 (CEST)
Date: Sun, 12 Jun 2005 07:48:00 -0000
Message-Id: <200506120748.j5C7m75C011986@elgar.sibelius.xs4all.nl>
From: Mark Kettenis
To: jason-swarelist@molenda.com
CC: gdb-patches@sources.redhat.com
In-reply-to: <20050608095805.A67988@molenda.com> (message from Jason Molenda on Wed, 8 Jun 2005 09:58:05 -0700)
Subject: Re: The gdb x86 function prologue parser
References: <85C775AE-3B05-431E-96D2-49EA9D1413E6@apple.com> <20050608132431.GA4970@nevyn.them.org> <20050608095805.A67988@molenda.com>
X-SW-Source: 2005-06/txt/msg00118.txt.bz2

   Date: Wed, 8 Jun 2005 09:58:05 -0700
   From: Jason Molenda

   Hi Daniel, thanks for the comments.

   On Wed, Jun 08, 2005 at 09:24:31AM -0400, Daniel Jacobowitz wrote:
   > I looked at your table.
   >
   > (A) You've added jump instructions to it.
   > Assuming that I'm following
   > what you're doing with this table correctly, I'm not real comfortable
   > with that without special cases checking the targets of the jumps.

   These showed up in a couple of functions that had hand-written
   assembly at the start of the function.  For example, there's one that
   has a special agreement with its caller about the contents of EDX,
   and it'd compare EDX to a value and then jump to an alternate
   location if it matched.  The only way this code is executed is if the
   PC is past the jmp instruction, so I wasn't too concerned about it --
   SOMEHOW we got past the jmp and back again.

You are aware of the fact that the prologue scanner is used for two
purposes?  If not, here they are:

1. Finding out the gory details about the stack frame being executed.

2. Determining the first bit of code after the prologue, i.e. the
   first bit of real code.

For (2) following jumps is usually a very bad thing to do.

That said, I'm not sure this dual usage of the prologue scanner really
makes sense these days.  There is a certain lack of consistency in how
gdb handles this anyway.  Maybe the best thing to do is to not use the
prologue scanner for (2) at all.

   >
   > > And for goodness sakes, if we can't figure out anything
   > > about a function that's not at the top of the stack, don't you think
   > > it'd be reasonable to assume that the function has set up a stack
   > > frame and saved the caller's EBP?
   >
   > Because there is a GDB policy to determine information about the frame
   > based on the current frame, not based on where it lies on the stack.
   > I've experimented with this before; this change can have some weird
   > consequences... for instance, in any case where we can backtrace
   > through "foo" only because of the addition of this case, we won't be
   > able to backtrace through "foo" if it is on top of the stack.

   I'd say that's an expected behavior, but yes, it's true that this can happen.
   It'd be great if the prologue analyzer never got confused and could
   always figure out how to find a function's caller's saved fp/pc, but
   even if we switch to using the opcodes disassembler so we never lose
   on another instruction, on MacOS X we can have libraries where the
   functions up the stack have no symbols whatsoever.  We have no idea
   where the function might begin--all we know is a saved address in the
   middle of a function.  In such a situation, is it preferable that we
   can't backtrace past tricky functions like these?

   After a month of working on the x86 port, I got so frustrated I wrote
   a user command that could backtrace --

   define x86-bt
     set $frameno = 1
     set $cur_ebp = $ebp
     printf "frame 0 EBP: 0x%08x EIP: 0x%08x\n", $ebp, $eip
     x/1i $eip
     # The caller's saved EBP is at [EBP], its return address at [EBP+4].
     set $prev_ebp = *((uint32_t *) $cur_ebp)
     set $prev_eip = *((uint32_t *) ($cur_ebp + 4))
     # Follow the saved-EBP chain until it terminates in 0.
     while $prev_ebp != 0
       printf "frame %d EBP: 0x%08x EIP: 0x%08x\n", $frameno, $prev_ebp, $prev_eip
       x/1i $prev_eip
       set $cur_ebp = $prev_ebp
       set $prev_ebp = *((uint32_t *) $cur_ebp)
       set $prev_eip = *((uint32_t *) ($cur_ebp + 4))
       set $frameno = $frameno + 1
     end
   end

   because I was having to do backtraces by manually walking the stack
   so often.  That's when I said, "enough is enough, this is stupid that
   gdb can't do this."

   > You can find more information about this in the list archives, in
   > plenty of places; most recently Mark pulled together an implementation
   > of "set i386 trust-frame-pointer".

   Yeah, I couldn't comment at the time.  Mark's change was wrong.

Oh, I'm so happy I don't live in that silly corporate world where you
grudgingly have to bite your tongue, because taking part in a
technical discussion would reveal a little too much about the
company's strategy ;-).

   He said himself,

      You probably want to reset it to 0 before continuing your program
      since I found out that bad things happen with some of the tests in
      the gdb testsuite with this turned on.
      http://sourceware.org/ml/gdb/2005-04/msg00177.html

   That's neither necessary nor acceptable.

   Mark's initial reading of the Sleep() vs SleepEx() was IMO not correct.

      http://sourceware.org/ml/gdb/2005-04/msg00156.html

   Sleep() sets up a stack frame, then jumps to SleepEx().  SleepEx
   doesn't set up a stack frame, but that's fine -- Sleep() did.  This
   is another instance that bolsters my "if the function MUST have
   stored the caller's pc/fp, assume it did" method -- if you try to
   analyze SleepEx() where the PC is, you'll see a frameless function.
   But it's in the middle of the stack; it can't be frameless.

Ah, but SleepEx() is a valid entry point itself.  That's where things
get tricky.  I'm not saying that there is no way out, but I simply
don't see it.

   > > +      /* We found a function-start address,
   > > +         or $pc is at 0x0 (someone jmp'ed thru NULL ptr).  */
   > > +      if ((cache->pc != 0 || current_pc == 0)
   >
   > No way that's right.  A jump through 0x0 is no different from a jump
   > through any other unmapped, non-code address.  Normally one uses a
   > different frame unwinder for that case.

   OK, I didn't know the right practice.  Right now it goes through
   i386_cache_frame.  It's "frameless", of course, but we don't have a
   function symbol for it so cache->pc (WTF is up with that structure
   variable name, anyway--it means the start address of the function
   for this frame) is 0 (i.e. unset).

It's probably historic; perhaps we should rename it.

Mark
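
For readers who want to see the walk that the x86-bt user command
performs outside of gdb, here is a minimal sketch in Python.  It is not
gdb code and not anyone's proposed patch.  The names walk_frame_chain,
read_u32 and max_frames are made up for illustration, the memory image
is fake, and the max_frames guard is an addition that the original
command does not have.  It only works under the same assumption the
thread is arguing about: every function on the stack has pushed the
caller's EBP and set up a frame.

```python
import struct


def walk_frame_chain(read_u32, ebp, eip, max_frames=64):
    """Return [(frame_no, ebp, eip), ...] by following saved-EBP links.

    read_u32(addr) is a hypothetical callback that returns the 32-bit
    little-endian word at addr in the debuggee's memory.
    """
    frames = [(0, ebp, eip)]
    cur_ebp = ebp
    frameno = 1
    # max_frames is an assumed safety limit against a corrupt or cyclic chain.
    while frameno < max_frames:
        prev_ebp = read_u32(cur_ebp)      # caller's saved EBP at [EBP]
        prev_eip = read_u32(cur_ebp + 4)  # return address at [EBP + 4]
        if prev_ebp == 0:                 # chain ends, as in x86-bt
            break
        frames.append((frameno, prev_ebp, prev_eip))
        cur_ebp = prev_ebp
        frameno += 1
    return frames


# Fake stack image: three frames linked through their saved EBP slots.
BASE = 0x1000
mem = bytearray(0x30)
struct.pack_into("<II", mem, 0x00, 0x1010, 0x08048123)
struct.pack_into("<II", mem, 0x10, 0x1020, 0x08048456)
struct.pack_into("<II", mem, 0x20, 0x0000, 0x08048789)


def read_u32(addr):
    """Read one little-endian 32-bit word from the fake image."""
    return struct.unpack_from("<I", mem, addr - BASE)[0]


frames = walk_frame_chain(read_u32, 0x1000, 0x08048ABC)
for no, frame_ebp, frame_eip in frames:
    print("frame %d EBP: 0x%08x EIP: 0x%08x" % (no, frame_ebp, frame_eip))
```

Note that the walk never looks at the code of any function, which is
why it keeps going through functions without symbols, and also why it
silently skips a caller whenever a frameless function is on the stack.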