From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-31999-listarch-gdb-patches=sources.redhat.com@sources.redhat.com>
Received: (qmail 7378 invoked by alias); 1 Apr 2004 10:43:10 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 7368 invoked from network); 1 Apr 2004 10:43:08 -0000
Received: from unknown (HELO nile.gnat.com) (205.232.38.5)
  by sources.redhat.com with SMTP; 1 Apr 2004 10:43:08 -0000
Received: from localhost (localhost [127.0.0.1])
	by nile.gnat.com (Postfix) with ESMTP
	id 9F4E5F2EA3; Thu,  1 Apr 2004 05:43:07 -0500 (EST)
Received: from nile.gnat.com ([127.0.0.1])
 by localhost (nile.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP
 id 27799-01-4; Thu,  1 Apr 2004 05:43:07 -0500 (EST)
Received: by nile.gnat.com (Postfix, from userid 1345)
	id 57A9EF2EA1; Thu,  1 Apr 2004 05:43:07 -0500 (EST)
From: Paul Hilfinger <hilfingr@gnat.com>
To: drow@false.org
Cc: cagney@gnu.org, gdb-patches@sources.redhat.com
In-reply-to: <20040331165810.GA32347@nevyn.them.org> (message from Daniel
	Jacobowitz on Wed, 31 Mar 2004 11:58:10 -0500)
Subject: Re: [RFA] Add language-dependent post-parser
References: <20040330092413.2E716F281D@nile.gnat.com> <20040330142656.GA18340@nevyn.them.org> <20040331080245.C139FF2B8B@nile.gnat.com> <20040331153004.GA29623@nevyn.them.org> <20040331153650.GA30084@nevyn.them.org> <406AF6AE.5040106@gnu.org> <20040331165810.GA32347@nevyn.them.org>
Message-Id: <20040401104307.57A9EF2EA1@nile.gnat.com>
Date: Thu, 01 Apr 2004 10:43:00 -0000
X-Virus-Scanned: by amavisd-new at nile.gnat.com
X-SW-Source: 2004-04/txt/msg00014.txt.bz2


OK guys, where do we stand?  As Daniel has indicated, more elegant
options than mine appear to involve a great deal of work to existing,
stable parsing and evaluation code that works quite well right now,
thank you, and whose only offense is to be inelegant.  Ada customers
have had the functionality of being able to detect, report, and
resolve ambiguous references at parsing time (rather than at
unexpected later moments when a saved expression is evaluated) for
years, and it works pretty well (I commend the approach to the C++
folks, in fact).  How about it?

A few responses:

> > >But I guess the point is, this is no more elegant than a second pass,
> > >and whatever you implement I should probably use for C++ anyway so it
> > >won't be an Ada-specific hook.  Does anyone else have an opinion?

I guess I don't share Daniel's distaste for multiple passes.
Generally, I've found them a useful way to cleanly modularize
sequences of transformations on an AST.   In production compilers, the 
downside is performance, but as Daniel also says, this really is not
an issue with the scale of parsing we do in GDB.


> > Ok, two thoughts:
> > 
> > - how come it's in this compressed postfix form?
> > This could hardly be a memory usage problem?
> 
> Hardly - since expressions are so short-lived.  I think it's more
> likely the emphasis was on postfix than on compressed.  I wasn't around
> when any of this was being designed, of course :)  But there are two
> plausible ways to structure this sort of yacc parser - either postfix
> or tree.  Apparently someone prefered postfix.  Which is then awkward
> to work with so it becomes prefix later.
> 
> If we're going to really clean this up, I think that using a tree
> instead would be the way to go.  That's a lot of work though.

Agreed, alas.  And this change would then have to propagate through the 
evaluator code as well.  

> > - could multi-pass be better / cleaner long term?
> > Is there a language (that we care about) with overload semantics so 
> > screwed up that both the containing expression and the parameters are 
> > needed when resolving the name?
> 
> I don't think there is.

Well, there are actually two: Ada and C++ (quick, who can tell me
where C++ uses type context in name resolution?).  However, in both
cases, the GDB implementors (i.e., me in the case of Ada) chose to
ignore this part of the language for the most part, and rely on simple
bottom-up resolution.  Type resolution by context can be handy for
producing "special effects" of abstraction, but very seldom in cases
that matter much to people typing expressions into GDB.

> > One way to get an answer is to ask: how to the corresponding compilers 
> > (Ada, Java, ObjC, C++) all implement this?
> 
> The only ones I'm familiar with (GCC, EDG, etc.) all do it using a tree
> structure.  A linearized representation is just too restrictive.  And
> multi-pass is out of the question if you want good performance; while
> for GDB the performance of the expression parser is pretty marginal,
> and the expressions we parse are pretty small, for a compiler this is a
> critical bottleneck.  Every additional pass over the parse tree has a
> high cost.

At the risk of starting an irrelevant tangential discussion, I can't
resist commenting on this.  In the case of tree representations, this
cost is not as high as one might think.  Suppose we have a 10,000-line
program with 100 AST nodes per parsed line (sounds fairly generous to
me).  The overhead of a tree traversal then involves one dynamic type
dispatch (we'll say this is a C++ implementation) per node, plus a
fetch and null test for each node (for the edge leading from its
parent, counting empty trees among the 100 figure).  On today's
machines, we're talking small fractions of a second here.

Paul Hilfinger