From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 13782 invoked by alias); 16 May 2006 20:45:22 -0000
Received: (qmail 13772 invoked by uid 22791); 16 May 2006 20:45:21 -0000
X-Spam-Check-By: sourceware.org
Received: from nevyn.them.org (HELO nevyn.them.org) (66.93.172.17)
    by sourceware.org (qpsmtpd/0.31.1) with ESMTP; Tue, 16 May 2006 20:45:05 +0000
Received: from drow by nevyn.them.org with local (Exim 4.54)
    id 1Fg6Pf-0003d3-Nw; Tue, 16 May 2006 16:45:03 -0400
Date: Tue, 16 May 2006 21:38:00 -0000
From: Daniel Jacobowitz
To: Mark Kettenis
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC] Move the frame zero PC check earlier
Message-ID: <20060516204503.GC13210@nevyn.them.org>
Mail-Followup-To: Mark Kettenis , gdb-patches@sourceware.org
References: <20060510180312.GA12606@nevyn.them.org> <200605130946.k4D9kZ2M001331@elgar.sibelius.xs4all.nl> <20060513151338.GB3721@nevyn.them.org> <200605131642.k4DGgiqa018273@elgar.sibelius.xs4all.nl>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200605131642.k4DGgiqa018273@elgar.sibelius.xs4all.nl>
User-Agent: Mutt/1.5.11+cvs20060403
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-05/txt/msg00366.txt.bz2

I won't reply paragraph-by-paragraph; there's getting to be a lot of
paragraphs. Please, let me know if I've cut something that I shouldn't
have.

On Sat, May 13, 2006 at 06:42:44PM +0200, Mark Kettenis wrote:
> > Explanatory output ("why did that backtrace stop?") is available in
> > "set debug frame 1". If you think it's routinely useful, then we can
> > make it available in some prettier form, perhaps in "info frame" for
> > the outermost frame.
>
> If we can reliably tell that a frame is the outermost frame, we might
> indeed print that as part of "info frame".

That's just about the opposite of what I'm suggesting. I think that
"the stack ended jaggedly" might be useful in "info frame".

> > Also, I don't think that "gdb is confused" errors are as desirable as
> > you think they are. This extra frame has been reported to me as a bug
> > at least three times that I can think of (twice for RTOSes and once for
> > Linux KGDB).
>
> I can imagine you'd like to get these people off your back. And
> perhaps they're right that the extra frame is caused by a bug in GDB.
> But that bug is not the printing of the extra frame itself. The bug
> is GDB not being able to determine that it is at the end of the stack,
> which might actually be a bug in the compiler or system libraries
> they're using.

No! No no no.

First of all, I'm not just trying to get them off my back. I think
they're right and it shouldn't be displayed.

Second, this _is_ GDB being able to determine that it's at the end of
the stack. A return address of zero is a fairly common convention for
this.

It's natural, if you think about it. On architectures with a well
entrenched frame pointer (exhibit A, our earlier conversation about
cache->base and %ebp on x86) then that can be initialized to zero
either before calling a generic higher-level function or else by
handwritten startup code. The same is true for the return address; it
can be set to return to nowhere. Neither "makes sense", so they are
useful markers. About the only other option is the stack pointer, and
you can't do that unless you're calling handwritten startup code that
also knows where the stack is supposed to go - pretty rare in modern
systems where that code is being called by anything other than a reset
vector.

> Then we should improve the unwinder. If we didn't error out with that
> error, the backtrace would never end.

As you well know, in many cases it is either impractical or downright
impossible to improve the prologue unwinder, e.g.
when OS vendors ship system libraries with neither unwind information
nor symbols.

> > And Joel recently reported that Ada tasking generates this message
> > on at least one platform, and users are unhappy about that, too.
>
> IIRC this is a case where the outermost frame wasn't marked properly,
> or at least not detected as such by GDB. That's the problem that
> needs to be fixed.

I guess that depends what you mean by "marked properly" and what you
mean by "fixed". The problem was that a routine in the system
libraries was called directly from new threads. The name of the
routine is OS-dependent and maybe even OS-version-dependent. Changing
the system libraries is out of the question here; all the world isn't
free software and I believe the system in question was either HP-UX or
OSF. Recognizing it by name would, I suppose, work - but you'd have to
be careful of the user reusing a fairly generic function name! Or else
limit the check to a specific shared object, and ditch caring about
static linking. Seems kind of sketchy to me.

> > And when we've run out of useful information, the stack appears to
> > end, and we're quite justified in reporting that the stack ended.
> > It's quite complex enough already without reporting "but the end of
> > the stack looks a little funny to me...".
>
> No, if a stack doesn't end properly on a platform where it should end
> properly, that's useful information that should be reported to the
> user.

I answered this one in another message, and it's basically the same as
my first bit above in this message. I think it's precisely proper to
report this as an end of stack condition.

We're reading the stack frame from memory, so we're automatically
vulnerable to displaying corrupted information if the user has
scribbled on the stack. But I don't think that merits displaying
something that we know is garbage.
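
As a rough illustration of the policy argued for in this message (treat a
zero PC as the end of the stack and do not display it, while still
displaying a non-zero PC that has no symbol), here is a minimal C sketch.
It is a toy model and not GDB's unwinder: `toy_frame`, `toy_backtrace`,
and `toy_stop_reason` are hypothetical names invented for this example,
and the frames are assumed to be already unwound into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy model of an unwound frame; not a GDB data structure. */
struct toy_frame {
    uintptr_t pc;        /* PC / return address recovered for this frame */
    const char *symbol;  /* NULL when no symbol is known for pc */
};

/* Why the walk stopped; kept separate so it could be reported elsewhere. */
enum toy_stop_reason {
    TOY_STOP_ZERO_PC,    /* hit the conventional end-of-stack marker */
    TOY_STOP_RAN_OUT     /* ran out of frames without seeing a marker */
};

/* Walk frames[0..n) innermost first.  A frame whose pc is zero is taken
   as the end of the stack and is NOT printed.  A non-zero pc with no
   symbol is printed as "??".  If out is NULL nothing is printed.
   Returns the number of frames shown and stores the stop reason.  */
size_t toy_backtrace(const struct toy_frame *frames, size_t n,
                     FILE *out, enum toy_stop_reason *why)
{
    size_t shown = 0;

    assert(frames != NULL || n == 0);
    assert(why != NULL);

    *why = TOY_STOP_RAN_OUT;
    for (size_t i = 0; i < n; i++) {
        if (frames[i].pc == 0) {
            /* Zero PC: report end of stack instead of an extra frame. */
            *why = TOY_STOP_ZERO_PC;
            break;
        }
        if (out != NULL)
            fprintf(out, "#%zu  0x%08lx in %s ()\n", shown,
                    (unsigned long) frames[i].pc,
                    frames[i].symbol != NULL ? frames[i].symbol : "??");
        shown++;
    }
    return shown;
}
```

Calling `toy_backtrace` on frames with PCs `0x1000`, `0x2000` (no
symbol), and `0` would show two frames and record `TOY_STOP_ZERO_PC`;
the stop reason is returned separately so that explanatory output could
be surfaced somewhere other than the backtrace itself.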
A non-zero unknown PC might be garbage or it might be code we don't
have symbols for; we display it as if it were code we didn't have a
symbol for. I posit that there's a difference in kind between that and
a zero PC, which might be garbage or might be the end of the stack; and
by analogy, I'm suggesting we display it as if it is the end of the
stack.

Anyway, that's my opinion. I have no idea how to proceed on this; I
don't really expect any of that to change your mind, and you probably
don't use any system where this extra frame is a serious annoyance.

-- 
Daniel Jacobowitz
CodeSourcery