Date: Fri, 05 Apr 2002 22:34:00 -0000
From: Daniel Jacobowitz
To: Jim Blandy
Cc: gdb@sources.redhat.com, Benjamin Kosnik, Daniel Berlin
Subject: Re: C++ nested classes, namespaces, structs, and compound statements
Message-ID: <20020406013408.A4570@nevyn.them.org>
In-Reply-To: <20020406044204.245E45EA11@zwingli.cygnus.com>

Much to say, much to say...

On Fri, Apr 05, 2002 at 11:42:04PM -0500, Jim Blandy wrote:
>
> At the moment, GDB doesn't handle C++ namespaces or nested classes
> very well. I have a general idea of how we could address these
> limitations, which I'd like to put up for shredding M-DEL discussion.
>
> Let me admit up front that I don't really know C++, so I may be saying
> stupid things. Please set me straight if you notice something.

I know C++ fairly well, but my grasp of the technical terminology of
the language is lacking. So don't go looking up any phrases I use in
here; I'm probably making them up :) I'm sure Daniel Berlin can
correct any egregious errors.

> You can also declare "static" struct members --- you can access them
> with the `->' and `.'
> operators, just like ordinary members, but
> they're actually variables at fixed addresses in the .data segment ---
> much like a "static" variable in a C compound statement. But this
> means that a simple offset from a base address is no longer sufficient
> to describe a struct's member's location --- you actually start
> needing something like GDB's enum address_class. Multiple inheritance
> and virtual base classes introduce further complexity here.

I believe you're generalizing too much here. Statics are a special
case; they're essentially global variables, whose name is given in the
local class scope. You've also got constant data members, which are
not necessarily backed by real symbols in C++ (I believe they always
are, in C...). Everything else is a member.

The complexity from multiple inheritance and virtual base classes is
essentially orthogonal. It just affects the scopes you search.

> There's another difference between compound statements and structs
> goes away. In C, you can only reference a struct's members using the
> `.' and `->' operators, whereas you refer to a compound statement's
> variables by simply naming them. But in C++, a struct's member
> functions can refer to the struct's members by simply naming them.
> The struct's bindings become another rib in the search path for
> identifier bindings.
>
> In summary, the data structure GDB needs to represent C++ structs
> (classes, unions, whatever) has a lot of similarities to the structure
> GDB needs to represent the local variables of a compound statement.
> They both need to carry bindings for several namespaces (ordinary
> identifiers and structure tags). The names can refer to any manner of
> things: variables, functions, namespaces, base classes, and so on.
> For variables, there are a variety of locations they might occupy.

GDB already does a great deal of this by the very simple method of
using fully qualified names.
It's served us remarkably well, although of course we're hitting its
limits now. But let's not be too quick to discard that approach, for
the present at least.

Also, while they're often both searched, don't confuse the structure
inheritance search path with the enclosing structure/namespace search
path. For instance, foo->x() searches only the inheritance paths.
foo->A::x is even worse (and gdb handles it badly or not at all at
present, as Michael mentioned in his mail a moment ago).

> So I would like to introduce to GDB a new type, `struct environment'
> (or is `struct env' better?) which does about the same thing that the
> `nsyms' and `sym' members of `struct block', and the `nfields' and
> `fields' members of `struct type', do now: it's just a bunch of
> bindings for names. We would use `struct environment':
>
> - in `struct block', to represent the block's local variables, replacing
>   `nsyms' and `sym';
> - in `struct type', to represent a struct's members, instead of
>   `struct fields'; and
> - in our representation for C++ namespaces, which seem pretty much
>   like structs that can only contain static members and member
>   functions (i.e., you can't ever create an instance of one).
>
> There'd be a single set of functions for building `struct environment'
> objects, and looking up bindings in them; you'd use it for variable
> lookup, and in the `.' and `->' operators. It could handle hashing,
> when appropriate.
>
> Basically, we would take two distinct areas of GDB (and a third,
> namespaces, which we haven't implemented yet but will need to), and
> support them all with a single structure and a single bunch of
> support functions. GDB would become easier to read.

How about -containing- `struct fields', instead of replacing? i.e. let
the name search happen in the `struct environment', as before, but the
data items would be fields (could be indicated in a flag in the
environment, with a pointer to the type or symbol for the enclosing
structure).
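
To make the "containing" suggestion concrete, here is a minimal sketch
of an environment that only does the name search and points back at
existing field records. Every name in it (`field`, `type`, `env_kind`,
`binding`, `environment`, `make_field_env`) is a hypothetical stand-in
invented for illustration; none of them is GDB's actual definition.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; these are NOT GDB's real structures.
struct field  { std::string name; int bit_offset; };  // describes a member
struct symbol { std::string name; };                  // variable or function

struct type {
    std::string name;
    std::vector<field> fields;   // members stay fields, not symbols
};

// The "flag in the environment": what its bindings refer to.
enum class env_kind { symbols, fields };

struct binding {
    const field  *fld;   // non-null when the environment holds fields
    const symbol *sym;   // non-null when the environment holds symbols
};

struct environment {
    env_kind kind;
    const type *owner;   // enclosing structure; null for a plain block
    std::unordered_map<std::string, binding> names;

    // Name search only; deliberately no chaining to other environments.
    const binding *lookup(const std::string &name) const {
        auto it = names.find(name);
        return it == names.end() ? nullptr : &it->second;
    }
};

// Build a field environment whose bindings point into an existing type.
environment make_field_env(const type &t) {
    environment env;
    env.kind = env_kind::fields;
    env.owner = &t;
    for (const field &f : t.fields) {
        binding b;
        b.fld = &f;
        b.sym = nullptr;
        env.names[f.name] = b;
    }
    return env;
}
```

The point of the sketch is only that the lookup table and the data
items are separate: the environment answers "is this name bound
here?", while the `field` keeps describing the member.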
I don't think turning members into symbols is a good idea.

As a side note, at the same time we should generalize our overloading
support to functions in addition to methods. This would give us the
framework to make that painless. The environment could describe an
overloaded name...

> As a half-baked idea, perhaps a `struct environment' object would have
> a list of other `struct environment' objects you should search, too,
> if you didn't find what you were looking for. We could use this to
> search a compound statement's enclosing scopes to find bindings
> further out, find members inherited from base classes, and resolve C++
> `using' declarations.

As I said above, I think that going this route is a bad idea. It
should have a pointer to the enclosing object and to that object's
environment, probably, but that's the extent of it.

> How does this strike people?
>
> Open issues:
>
> - This "list of other places to search" thing may be ill-formed. I
>   mean, sure, there are a set of similar behaviors going on there, but
>   are they similar enough? For example:

We're thinking along the same paths here... I suspect that it is in
fact ill-formed.

> - What really happens when you start using `struct symbol' objects for
>   structure members? Do we need new address classes now for `offset
>   from object base address'? Does the LOC_COMPUTED idea I've been
>   pushing still work?

Why do we want members to be symbols? A `struct field' expresses all
the properties of members; symbols have other properties. I think we
use symbols in too many places already.

> - How do member functions work in this arrangement? Virtual member
>   functions? Virtual base classes?

If we leave searching out of it, we're fine on this front.

> - How would we introduce this incrementally? Do we want to?

No, I'm serious. Incremental solutions are more practical to
implement, but they will come with more baggage. Baggage will haunt us
for a very long time.
If we can completely handle non-minimal-symbol lookup in this way,
that's a big win.

Some other thoughts:

This is all a question of scope. As I said, right now we handle this
mostly by searching a small set of 'namespaces' (type, struct,
variable...) for a given term. We try to look up fully qualified
names, and qualify them as necessary. This scales fairly badly.

We want to search for names in the appropriate scope, breaking up
qualification as necessary. This is as opposed to searching for fully
qualified names.

What we want is essentially:
 - The concept of an enclosing scope
 - Language-dependent hooks to specify which scopes to search for a
   name.

We should be able to get a very nice behavior here, which we sort of
have now but not cleanly, and which well illustrates why the order to
search can be language dependent. Consider:

  struct X {
    int foo();
    int bar();
  };

  int baz();

  int X::foo()
  {
    ...
    int z = bar();
  }

We're debugging in X::foo() and the user says 'list'. They see
'bar()'. They decide to ask 'print bar()'. We want in this case to
find X::bar(). We do even (much) worse if X::foo is a static member
function.

On the other hand, this->bar() should work and this->baz() should not.

My benchmark for whether a solution is even adequate enough to
consider is whether it obsoletes DW_AT_MIPS_linkage_name. This should.
I'd really like to make us independent of that, since it will make
constructor handling much less of a gross special case. Daniel pointed
out at one point how much space this could save in binaries.

There's probably more I meant to say about this, but it's 1:30AM
here...

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer
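
For illustration, the two requirements listed under "Some other
thoughts" (an enclosing-scope link, plus a language hook that picks the
search order) can be sketched as below, using the X::foo / bar / baz
example from the message. This is a rough sketch under stated
assumptions, not GDB code: `scope`, `search_order_hook`, `cxx_order`,
`lookup` and `lookup_member` are all hypothetical names, and neither
inheritance nor the static-member-function case is modeled.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical scope record; not a GDB structure.
struct scope {
    std::string name;        // "" for the global scope
    const scope *enclosing;  // the "enclosing scope" concept; null at the top
    std::map<std::string, std::string> bindings;  // name -> qualified name
};

// Language hook: given the innermost scope, return the scopes to search,
// in order. A different language could supply a different order.
using search_order_hook =
    std::function<std::vector<const scope *>(const scope &innermost)>;

// Assumed C++-like order: walk outward through the enclosing scopes.
std::vector<const scope *> cxx_order(const scope &innermost) {
    std::vector<const scope *> order;
    for (const scope *s = &innermost; s != nullptr; s = s->enclosing)
        order.push_back(s);
    return order;
}

// Bare-name lookup, as for 'print bar()': the first match in hook order
// wins. Returns "" when the name is not bound anywhere.
std::string lookup(const std::string &name, const scope &innermost,
                   const search_order_hook &hook) {
    for (const scope *s : hook(innermost)) {
        auto it = s->bindings.find(name);
        if (it != s->bindings.end())
            return it->second;
    }
    return "";
}

// Member lookup, as for 'this->bar()': only the class scope is searched,
// never the enclosing scopes.
std::string lookup_member(const std::string &name, const scope &cls) {
    auto it = cls.bindings.find(name);
    return it == cls.bindings.end() ? "" : it->second;
}
```

With a global scope binding baz, a class scope X binding foo and bar,
and a function scope for X::foo enclosed by X, a bare `bar` resolves to
X::bar and a bare `baz` to the global one, while member lookup of `baz`
in X fails, which is the behavior asked for above.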