Date: Tue, 17 Sep 2002 12:50:00 -0000
Subject: Re: struct environment
Cc: Andrew Cagney, gdb
To: David Carlton
From: Daniel Berlin
X-SW-Source: 2002-09/txt/msg00236.txt.bz2

>> Having done this for regcache and reggroup it is worth it -> you get
>> to see what actually works rather than what everyone else thinks
>> should work.  You can also pick out the cleanups that will make the
>> final change easier.  For instance - tightening up the way GDB
>> creates symbol tables (DanielB suggested moving psymtab into stabs)
>> and performs lookups on them.
>
> Personally, I'd rather tighten up the interface to symbol lookups
> first and only then move on to optimizations like changing how
> reading in a part of the symbol information is implemented.

These are orthogonal issues.
Reading in part of the symbol information is *not* an optimization (as
the source stands now).
It is a necessity controlled *by the symbol lookup*.
If psymtabs were private to stabs, then this would not be the case
anymore, but as it stands now, the readin of psymtabs and conversion to
symtabs is controlled by the symbol lookup functions.
If your lookup doesn't do this as well, *and* you don't change how
symbol tables are created (i.e. you do nothing), stuff will break.

>
>> Instead, initially, I think these tables could be simple linear
>> lists (that is what ``struct block'' currently implements so it
>> can't be any worse :-) (just the global / final table is special
>> :-).
>
> For struct block, I'm tracking existing design extremely closely.
>

No offense, but you shouldn't do that.
The current design does *not* scale.  Or at least, it didn't until hash
tables were introduced, which happened in the past 3 months.
I've only tested scaling of hash tables up to 100 meg of debug info.
Even then, due to the number of blocks one has to search, things start
to get a bit slow.
There are plenty of projects these days with 500 meg or 1 gig of debug
info.
You *need* to be thinking about these things whenever you design the
data structures and algorithms.
The reason GDB was so slow on these large cases before hash tables is
precisely because nobody seemed to even consider what would happen 10
years in the future.
It's perfectly reasonable to assume that in 10 years, we will have 10x
as much debug info to deal with.
If we don't, great.  If we do, and we haven't even considered it, we
are *really* screwed.

> Currently, struct block has these options:
>
> * Hash tables.  (Used almost everywhere.)  This is a new development.
> * Lists that aren't sorted by name.  (Used in blocks that are the body
>   of functions, and in Java's dynamic class table.)
> * Lists that are sorted by name.  (Used by mdebugread.c in places
>   where hash tables should be used.)
>
> Most of these can't grow, but jv-lang.c and mdebugread.c both have
> code by which lists can grow.
>
> When the struct block conversion is done, the only change will be that
> sorted lists will go away (since they're only used by mdebugread.c,
> which should be converted over to using hash tables instead), and that
> the common code in jv-lang.c and mdebugread.c to allow lists that grow
> will all be in one place.  No other changes: extremely similar data
> structures, exactly the same algorithms.

Which haven't served particularly well for the past 5 years (before
that, maybe).
No offense, and not your fault.
And I'm not saying they *aren't* adequate.  It just needs more thought
than "it's currently done this way, thus, we should do it that way in
the future".

>
>> - Am I correct to think that the objective is to create a directed
>> acyclic graph of nametabs and have lookups search through each in
>> turn.
>
> Not quite as Daniel has explained; C++'s name lookup rules are pretty
> complicated.  Which isn't to say that we have to track those name
> lookup rules perfectly: e.g. it's not the end of the world if we don't
> notice that a name lookup is ambiguous where a C++ compiler would be
> required to notice that.

No, we should actually be able to do exactly this.
Even if we *don't*, we should be able to.

> But it is, I think, a bad idea if we miss a
> symbol that we should find, or if we return the wrong symbol instead
> of the right one.
>

Which is what happens if you don't track the name lookup rules
perfectly.
The rules specify what is ill formed.
If you find symbols you shouldn't, it's not only confusing to the user,
it's the wrong symbol.
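
To make the ambiguity point concrete, here is a minimal sketch in C. It is not GDB code: `struct sym`, `struct nametab`, `nametab_find` and `lookup_same_level` are hypothetical names invented for illustration, and the "same level" rule is a heavy simplification of the real C++ lookup rules. It only shows the difference between returning the first hit and noticing that two tables visible at the same level both define the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; none of these exist in GDB.  */
struct sym
{
  const char *name;   /* unqualified name */
  const char *scope;  /* where it was declared, for display only */
};

struct nametab
{
  const struct sym *syms;
  size_t nsyms;
};

enum lookup_status { LOOKUP_NOT_FOUND, LOOKUP_FOUND, LOOKUP_AMBIGUOUS };

/* Linear search of a single table.  */
const struct sym *
nametab_find (const struct nametab *tab, const char *name)
{
  for (size_t i = 0; i < tab->nsyms; i++)
    if (strcmp (tab->syms[i].name, name) == 0)
      return &tab->syms[i];
  return NULL;
}

/* Search every table that is visible at the same level (assumption:
   the caller has already worked out which tables those are).  If two
   tables supply different symbols for NAME, report an ambiguity
   instead of silently returning whichever was found first.  */
enum lookup_status
lookup_same_level (const struct nametab *const *tabs, size_t ntabs,
                   const char *name, const struct sym **result)
{
  const struct sym *found = NULL;

  assert (result != NULL);
  for (size_t i = 0; i < ntabs; i++)
    {
      const struct sym *s = nametab_find (tabs[i], name);

      if (s == NULL)
        continue;
      if (found != NULL && found != s)
        {
          *result = NULL;
          return LOOKUP_AMBIGUOUS;
        }
      found = s;
    }
  *result = found;
  return found != NULL ? LOOKUP_FOUND : LOOKUP_NOT_FOUND;
}
```

A first-hit lookup over the same tables would hand back one of the two candidates, which is exactly the "wrong symbol" case: the user gets an answer where the language rules say there is none.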