From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Received: (qmail 20276 invoked from network); 9 Jan 2003 21:51:55 -0000
Received: from unknown (HELO jackfruit.Stanford.EDU) (171.64.38.136)
  by 209.249.29.67 with SMTP; 9 Jan 2003 21:51:55 -0000
Received: (from carlton@localhost)
  by jackfruit.Stanford.EDU (8.11.6/8.11.6) id h09LpbC17928;
  Thu, 9 Jan 2003 13:51:37 -0800
X-Authentication-Warning: jackfruit.Stanford.EDU: carlton set sender to carlton@math.stanford.edu using -f
To: Paul Hilfinger
Cc: Elena Zannoni, Adam Fedor, GDB Patches, Daniel Jacobowitz
Subject: Re: Demangling and searches
References: <200301090237.SAA22983@tully.CS.Berkeley.EDU>
From: David Carlton
Date: Thu, 09 Jan 2003 21:51:00 -0000
In-Reply-To: <200301090237.SAA22983@tully.CS.Berkeley.EDU>
Message-ID:
User-Agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2003-01/txt/msg00151.txt.bz2

On Wed, 08 Jan 2003 18:37:57 -0800, Paul Hilfinger said:

>> I'm curious: in Ada, what does the mangling do?  In particular, how
>> much type info does it contain?  In C++, the mangled name contains
>> type info for the arguments for functions; I don't see how, using
>> GDB's current data structures, to allow us to allow users to, say,
>> break on a function without requiring them to specify the types of the
>> arguments, if we took your approach.  (Though it might be possible to
>> modify GDB's data structures to allow that.)

> In Ada, the mangled name does not contain type information, but we
> actually solve an even harder problem.  The mangled name contains
> certain information that the user doesn't necessarily know, so that
> the system CANNOT reconstruct the full mangled name from the user's
> input a priori.

That's true for C++, too: the users don't know the types in question,
among other issues (anonymous namespaces!).
Which is why the hash function that we apply to demangled names
doesn't look at the entire demangled name.

> I am asking why we can't change P to
>     K equals f (SYMBOL_NAME (s), SYMBOL_LANGUAGE (s)),
> or, more abstractly, to something like
>     compare_demangled (K, SYMBOL_NAME (s), SYMBOL_LANGUAGE (s)) == 0
> saving considerable space in the process.  (Daniel Berlin points out
> that ABI also figures into these, but here I'll just go by the
> parameterization in symtab.h).  The answer is "because f is costly
> when applied to lots of symbols."  But this answer really makes sense
> only if strategy 1 above is ineffective.

> If your symbol-search structure is a hash table, then all you have
> to do is use SYMBOL_SOURCE_NAME (s) as the hash key; it is
> irrelevant whether you actually store the SYMBOL_SOURCE_NAME in s.

That could work.  Right now, it turns out that demangling C++ names is
probably more expensive in time than in memory, so the naive
implementation of what you propose (call the demangler to compute
SYMBOL_SOURCE_NAME, but then forget the demangled name) wouldn't be
much of a win.  But it should, I think, be possible to write a
function that computes our hash value directly from the mangled name,
and to compute it much more quickly than calling the demangler, and to
write an appropriate comparison function.

It seems like a pain to implement and a maintenance burden, and I
don't know exactly what the tradeoffs would be.  But you might well be
correct that it's possible.

David Carlton
carlton@math.stanford.edu
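
To make the last idea concrete, here is a minimal sketch of hashing
straight from the mangled name.  It is an illustration under stated
assumptions, not GDB code: the function names (hash_step,
hash_demangled_prefix, hash_source_name, hash_mangled_prefix) and the
hash formula are hypothetical, and it assumes Itanium C++ ABI style
mangling restricted to plain names such as _Z3fooi and _ZN1A3fooEi.
Templates, qualifiers, constructors, substitutions and every other
mangling scheme are rejected, and a real version would have to fall
back to the demangler for those.  The hash on the demangled side stops
at the parameter list, so "foo" and "foo(int)" land in the same
bucket.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Fold one character into a running hash.  The formula is an
   arbitrary illustrative choice, not the one GDB uses.  */
static unsigned int
hash_step (unsigned int h, char c)
{
  return h * 31 + (unsigned char) c;
}

/* Hash a demangled name, looking only at the part before the
   parameter list, so "A::foo" and "A::foo(int)" hash alike.  */
unsigned int
hash_demangled_prefix (const char *name)
{
  unsigned int h = 0;

  for (; *name != '\0' && *name != '('; name++)
    h = hash_step (h, *name);
  return h;
}

/* P points at a <length><identifier> component.  Fold the identifier
   into *H and return a pointer just past it, or NULL if P does not
   start with a well-formed component.  */
static const char *
hash_source_name (const char *p, unsigned int *h)
{
  size_t len = 0;

  if (!isdigit ((unsigned char) *p))
    return NULL;
  while (isdigit ((unsigned char) *p))
    len = len * 10 + (size_t) (*p++ - '0');
  if (len == 0 || strlen (p) < len)
    return NULL;
  while (len-- > 0)
    *h = hash_step (*h, *p++);
  return p;
}

/* Compute from MANGLED the value hash_demangled_prefix would give
   for its demangled form, without building that form.  Return 1 and
   set *RESULT on success, 0 if MANGLED is outside the subset.  */
int
hash_mangled_prefix (const char *mangled, unsigned int *result)
{
  unsigned int h = 0;
  const char *p = mangled;
  int first = 1;

  if (strncmp (p, "_Z", 2) != 0)
    return 0;
  p += 2;

  if (*p != 'N')
    {
      /* Unscoped name; reject template arguments ('I').  */
      p = hash_source_name (p, &h);
      if (p == NULL || *p == 'I')
        return 0;
      *result = h;
      return 1;
    }

  /* Nested name: N <component>+ E, joined with "::".  */
  p++;
  while (*p != 'E')
    {
      if (!first)
        {
          h = hash_step (h, ':');
          h = hash_step (h, ':');
        }
      p = hash_source_name (p, &h);
      if (p == NULL)
        return 0;
      first = 0;
    }
  if (first)
    return 0;
  *result = h;
  return 1;
}
```

The matching comparison function would walk the mangled name the same
way and compare component by component against the user's key.  The
maintenance burden mentioned above is visible even at this size:
every mangling feature the sketch rejects is a case the fast path
would eventually have to learn.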