From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 26660 invoked by alias); 26 Feb 2004 16:54:09 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sources.redhat.com
Received: (qmail 26391 invoked from network); 26 Feb 2004 16:54:07 -0000
Received: from unknown (HELO nevyn.them.org) (66.93.172.17)
  by sources.redhat.com with SMTP; 26 Feb 2004 16:54:07 -0000
Received: from drow by nevyn.them.org with local (Exim 4.30 #1 (Debian))
  id 1AwOlv-0008WK-SI; Thu, 26 Feb 2004 11:54:03 -0500
Date: Thu, 26 Feb 2004 16:54:00 -0000
From: Daniel Jacobowitz
To: Elena Zannoni
Cc: Eli Zaretskii, Daniel Berlin, cagney@gnu.org,
  gdb@sources.redhat.com, mec.gnu@mindspring.com
Subject: Re: Branch created for inter-compilation-unit references
Message-ID: <20040226165403.GA32247@nevyn.them.org>
Mail-Followup-To: Elena Zannoni, Eli Zaretskii, Daniel Berlin,
  cagney@gnu.org, gdb@sources.redhat.com, mec.gnu@mindspring.com
References: <1037DDEA-67B5-11D8-9146-000A95DA505C@dberlin.org>
  <403CEE5C.5080100@gnu.org> <20040226150526.GB13921@nevyn.them.org>
  <16446.7027.398485.643190@localhost.redhat.com>
  <20040226162539.GA30921@nevyn.them.org>
  <16446.8064.51905.953992@localhost.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <16446.8064.51905.953992@localhost.redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2004-02/txt/msg00392.txt.bz2

On Thu, Feb 26, 2004 at 11:32:00AM -0500, Elena Zannoni wrote:
> Thanks for understanding. To help streamline this, can I ask you to
> post a little summary of the major changes you needed to do? I am
> thinking we should start having a dwarf2 handling section in the doco
> and this would be a starting point.

Sure. Here's the bird's-eye view of the new code.

Partial symbol handling

Our existing partial symbol handling is one-pass: we read in one DIE at
a time, grab whatever information we need from it, and then go on to
either its children or its sibling. As David found out some time ago,
this is incompatible with DW_AT_specification/DW_AT_abstract_origin.
For C++, this means that it becomes very awkward to determine a class's
fully qualified name; it may have a non-defining declaration as the
child of a namespace and a defining declaration with
DW_AT_specification.

We have limited partial DIE support for these attributes, which we can
use to find the DW_AT_name in the referenced DIE, but it is
incompatible with processing multiple compilation units at a time.

So to solve both of these problems, I implemented a two-pass algorithm
much like the one currently used for full symbol reading. It's a
little slower, but I took pains to keep the impact minimal for code
that does not use namespaces and specifications.

Normally we process one compilation unit at a time. When we first
encounter an inter-compilation-unit reference, we switch gears
(somewhat inelegantly): we build a table of all compilation unit
lengths and offsets, and store it in a splay tree. Then we load
compilation units into memory as needed and free them when they've been
unused for a tunable amount of time.

Full symbol handling

The full symbol handling (with the bug fix I've posted for HEAD this
week) was much closer to supporting inter-compilation-unit references
than the partial symbol handling.

We know (from the partial symbol table reading) whether or not we will
need to support references. If we will, then for each compilation unit
we cache die->type in a hash table. This allows us to recreate the
type information for a compilation unit without keeping the entire DIE
tree in memory.

If we're in reference-supporting mode, we build full symbols using a
queue.
We add the current psymtab's CU to the queue; load full DIEs into
memory for everything on the queue; recreate any already-read-in
compilation units using the saved die->type hash table; and process any
compilation units not yet read in. While loading full DIEs we queue
any referenced compilation units for reading. All queued compilation
units will be read in before any are turned into symtabs.

I did it this way because making the symtab-building code re-entrant
(so that we could build up two psymtabs at the same time) would have
been very challenging, but the DIE loading code is much simpler. Also,
it turns out that we can have circular type references between
compilation units, which requires that we be able to build types in one
pass and symbols in another.

A couple of functions that return DIE pointers were adjusted to also
return the pointer to the compilation unit containing the DIE, so that
relative references could be followed.

I think that's pretty much everything.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer
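
For readers who want a concrete picture of the queue-based full symbol
reading described above, here is a rough sketch. It is not the GDB
code (which is C): it is a small Python model, and every name in it
(load_with_references, cu_refs, type_cache) is made up for
illustration. Compilation units are modeled as plain strings,
"loading full DIEs" is modeled as appending to a list, and the saved
die->type hash table is modeled as a dict keyed by compilation unit.

```python
from collections import deque


def load_with_references(start_cu, cu_refs, type_cache):
    """Model of the queue described above (hypothetical names throughout).

    start_cu   -- the CU of the psymtab being expanded
    cu_refs    -- dict: CU -> list of CUs it references (assumed input)
    type_cache -- dict: CU -> saved type info for CUs already read in

    Returns (loaded, recreated, processed):
      loaded    -- every CU whose DIEs were loaded, in queue order
      recreated -- CUs whose types come back from the saved cache
      processed -- CUs that still need full processing
    """
    queue = deque([start_cu])
    queued = {start_cu}  # guards against circular references
    loaded = []

    # Phase 1: load DIEs for everything on the queue, queueing any
    # referenced CUs as they are discovered.
    while queue:
        cu = queue.popleft()
        loaded.append(cu)  # stands in for loading the full DIEs
        for ref in cu_refs.get(cu, ()):
            if ref not in queued:
                queued.add(ref)
                queue.append(ref)

    # Phase 2: only now, with the whole queue drained, decide how each
    # CU is handled. Nothing is turned into a symtab before this point.
    recreated = [cu for cu in loaded if cu in type_cache]
    processed = [cu for cu in loaded if cu not in type_cache]
    return loaded, recreated, processed


if __name__ == "__main__":
    # "a" and "b" reference each other (a circular reference); "c" was
    # read in earlier, so its types are already in the cache.
    refs = {"a": ["b"], "b": ["a", "c"], "c": []}
    cache = {"c": {"die_offset_0x10": "int"}}
    print(load_with_references("a", refs, cache))
```

The point of the two phases is the one made above: because all queued
compilation units are loaded before any symbol building starts,
circular references between units do not require re-entrant
symtab-building code.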