From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7378 invoked by alias); 1 Apr 2004 10:43:10 -0000 Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sources.redhat.com Received: (qmail 7368 invoked from network); 1 Apr 2004 10:43:08 -0000 Received: from unknown (HELO nile.gnat.com) (205.232.38.5) by sources.redhat.com with SMTP; 1 Apr 2004 10:43:08 -0000 Received: from localhost (localhost [127.0.0.1]) by nile.gnat.com (Postfix) with ESMTP id 9F4E5F2EA3; Thu, 1 Apr 2004 05:43:07 -0500 (EST) Received: from nile.gnat.com ([127.0.0.1]) by localhost (nile.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 27799-01-4; Thu, 1 Apr 2004 05:43:07 -0500 (EST) Received: by nile.gnat.com (Postfix, from userid 1345) id 57A9EF2EA1; Thu, 1 Apr 2004 05:43:07 -0500 (EST) From: Paul Hilfinger To: drow@false.org Cc: cagney@gnu.org, gdb-patches@sources.redhat.com In-reply-to: <20040331165810.GA32347@nevyn.them.org> (message from Daniel Jacobowitz on Wed, 31 Mar 2004 11:58:10 -0500) Subject: Re: [RFA] Add language-dependent post-parser References: <20040330092413.2E716F281D@nile.gnat.com> <20040330142656.GA18340@nevyn.them.org> <20040331080245.C139FF2B8B@nile.gnat.com> <20040331153004.GA29623@nevyn.them.org> <20040331153650.GA30084@nevyn.them.org> <406AF6AE.5040106@gnu.org> <20040331165810.GA32347@nevyn.them.org> Message-Id: <20040401104307.57A9EF2EA1@nile.gnat.com> Date: Thu, 01 Apr 2004 10:43:00 -0000 X-Virus-Scanned: by amavisd-new at nile.gnat.com X-SW-Source: 2004-04/txt/msg00014.txt.bz2 OK guys, where do we stand? As Daniel has indicated, more elegant options than mine appear to involve a great deal of work to existing, stable parsing and evaluation code that works quite well right now, thank you, and whose only offense is to be inelegant. Ada customers have had the functionality of being able to detect, report, and resolve ambiguous references at parsing time (rather than at unexpected later moments when a saved expression is evaluated) for years, and it works pretty well (I commend the approach to the C++ folks, in fact). How about it? A few responses: > > >But I guess the point is, this is no more elegant than a second pass, > > >and whatever you implement I should probably use for C++ anyway so it > > >won't be an Ada-specific hook. Does anyone else have an opinion? I guess I don't share Daniel's distaste for multiple passes. Generally, I've found them a useful way to cleanly modularize sequences of transformations on an AST. In production compilers, the downside is performance, but as Daniel also says, this really is not an issue with the scale of parsing we do in GDB. > > Ok, two thoughts: > > > > - how come it's in this compressed postfix form? > > This could hardly be a memory usage problem? > > Hardly - since expressions are so short-lived. I think it's more > likely the emphasis was on postfix than on compressed. I wasn't around > when any of this was being designed, of course :) But there are two > plausible ways to structure this sort of yacc parser - either postfix > or tree. Apparently someone prefered postfix. Which is then awkward > to work with so it becomes prefix later. > > If we're going to really clean this up, I think that using a tree > instead would be the way to go. That's a lot of work though. Agreed, alas. And this change would then have to propagate through the evaluator code as well. > > - could multi-pass be better / cleaner long term? > > Is there a language (that we care about) with overload semantics so > > screwed up that both the containing expression and the parameters are > > needed when resolving the name? > > I don't think there is. Well, there are actually two: Ada and C++ (quick, who can tell me where C++ uses type context in name resolution?). However, in both cases, the GDB implementors (i.e., me in the case of Ada) chose to ignore this part of the language for the most part, and rely on simple bottom-up resolution. Type resolution by context can be handy for producing "special effects" of abstraction, but very seldom in cases that matter much to people typing expressions into GDB. > > One way to get an answer is to ask: how to the corresponding compilers > > (Ada, Java, ObjC, C++) all implement this? > > The only ones I'm familiar with (GCC, EDG, etc.) all do it using a tree > structure. A linearized representation is just too restrictive. And > multi-pass is out of the question if you want good performance; while > for GDB the performance of the expression parser is pretty marginal, > and the expressions we parse are pretty small, for a compiler this is a > critical bottleneck. Every additional pass over the parse tree has a > high cost. At the risk of starting an irrelevant tangential discussion, I can't resist commenting on this. In the case of tree representations, this cost is not as high as one might think. Suppose we have a 10,000-line program with 100 AST nodes per parsed line (sounds fairly generous to me). The overhead of a tree traversal then involves one dynamic type dispatch (we'll say this is a C++ implementation) per node, plus a fetch and null test for each node (for the edge leading from its parent, counting empty trees among the 100 figure). On today's machines, we're talking small fractions of a second here. Paul Hilfinger