From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 17029 invoked by alias); 16 Feb 2004 22:57:45 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 17021 invoked from network); 16 Feb 2004 22:57:43 -0000
Received: from unknown (HELO nevyn.them.org) (66.93.172.17)
  by sources.redhat.com with SMTP; 16 Feb 2004 22:57:43 -0000
Received: from drow by nevyn.them.org with local (Exim 4.30 #1 (Debian))
  id 1AsrgN-0000V6-F4; Mon, 16 Feb 2004 17:57:43 -0500
Date: Mon, 16 Feb 2004 22:57:00 -0000
From: Daniel Jacobowitz
To: Elena Zannoni
Cc: gdb-patches@sources.redhat.com
Subject: Re: [rfa] Add SYMBOL_SET_LINKAGE_NAME
Message-ID: <20040216225743.GA1714@nevyn.them.org>
Mail-Followup-To: Elena Zannoni, gdb-patches@sources.redhat.com
References: <20040216212406.GC17141@nevyn.them.org> <16433.15049.172030.577544@localhost.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <16433.15049.172030.577544@localhost.redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2004-02/txt/msg00447.txt.bz2

On Mon, Feb 16, 2004 at 04:48:57PM -0500, Elena Zannoni wrote:
> Daniel Jacobowitz writes:
>  > After this patch and my others from today there are no direct
>  > assignments to the symbol name.  In addition to the cleanup value, I'm
>  > testing an approach which would change the storage of symbol names,
>  > which prompted me to do this.
>
> can you elaborate on where you are going?

Sure.  I'm not sure if it's actually going to end up this way, since
I'm thinking it wasn't a great idea and it has some truly gross bits I
haven't figured out what to do with yet - it was just a hack job last
weekend.  But here's what my current tree does.

The C++ demangled name pointer in lang_specific is removed.
The name pointer becomes a union, and a flag bit (there's about a byte's
worth of empty space in general_symbol_info) is added.  They look like
this:

  /* The name(s) of this symbol.  Storage for the names will generally
     be allocated on the objfile_obstack for the associated objfile,
     through a hash table or bcache.  */

  union
  {
    /* If the flag HAS_DEMANGLED_NAMES (below) is clear, this points to
       the name of the symbol.  */
    char *name;

    /* If HAS_DEMANGLED_NAMES is set, this points to a structure
       describing this symbol's names.  The structure may be shared.  */
    struct symbol_name_info *names;
  } name_union;

  /* A flag indicating which member of NAME_UNION is in use.  */
  unsigned int has_demangled_names : 1;

struct symbol_name_info
{
  /* ADD MORE COMMENTS */
  char *linkage_name;
  char *demangled_name;
  unsigned int has_full_demangled_name : 1;
  unsigned int linkage_len : 31;
};

I store several different pieces of data here.  ->linkage_name +
->linkage_len points to the result of demangling linkage_name without
DMGL_PARAMS: that is, without any function arguments or return type
information.  If HAS_FULL_DEMANGLED_NAME is set, then demangled_name
points to the normal (full) demangled name.  Otherwise it points to the
appropriate obstack to store the demangled name on.

Then I convert the symbol readers to only use the short demangled name,
which I call SYMBOL_DEMANGLED_SEARCH_NAME.  SYMBOL_DEMANGLED_NAME fills
in lazily when needed.

Some memory is saved directly because the symbol_name_info structure is
shared between partial, full, and minimal symbols; removing the
demangled_name pointer leaves a net savings of (very roughly) a word
per C++ symbol.  Symbols which do not have a demangled name no longer
have space allocated for one, so there is also a savings for straight
C code.

I wanted the without-arguments names available because, in 99% of
cases, they are all we need.  Less memory, less time in the demangler,
et cetera.
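
As a rough illustration of how accessors over this layout might look,
here is a minimal, self-contained sketch.  It is not code from the
patch.  The names name_holder, make_name_info, sym_linkage_name,
sym_search_name, sym_full_demangled_name and toy_demangle are all
hypothetical.  It also assumes that the short demangled name is stored
directly after the linkage name, that linkage_len counts the
terminating NUL, that malloc stands in for obstack storage, and that
demangled_name is simply NULL until it is filled in (rather than
pointing at an obstack).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as in the message.  Assumption: the short demangled name
   follows the linkage name in one buffer, and linkage_len includes the
   terminating NUL.  */
struct symbol_name_info
{
  char *linkage_name;
  char *demangled_name;
  unsigned int has_full_demangled_name : 1;
  unsigned int linkage_len : 31;
};

/* Hypothetical stand-in for the relevant part of general_symbol_info.  */
struct name_holder
{
  union
  {
    char *name;
    struct symbol_name_info *names;
  } name_union;
  unsigned int has_demangled_names : 1;
};

/* Hypothetical constructor: pack LINKAGE and SHORT_NAME into one
   buffer.  Uses malloc where the message describes obstack storage.  */
static struct symbol_name_info *
make_name_info (const char *linkage, const char *short_name)
{
  size_t llen = strlen (linkage) + 1;
  size_t slen = strlen (short_name) + 1;
  struct symbol_name_info *info = malloc (sizeof *info);
  assert (info != NULL);
  info->linkage_name = malloc (llen + slen);
  assert (info->linkage_name != NULL);
  memcpy (info->linkage_name, linkage, llen);
  memcpy (info->linkage_name + llen, short_name, slen);
  info->demangled_name = NULL;
  info->has_full_demangled_name = 0;
  info->linkage_len = llen;
  return info;
}

/* The linkage name, whichever union member is in use.  */
static const char *
sym_linkage_name (const struct name_holder *sym)
{
  return (sym->has_demangled_names
          ? sym->name_union.names->linkage_name
          : sym->name_union.name);
}

/* The short (no-arguments) name; plain symbols just use their name.  */
static const char *
sym_search_name (const struct name_holder *sym)
{
  if (!sym->has_demangled_names)
    return sym->name_union.name;
  const struct symbol_name_info *info = sym->name_union.names;
  return info->linkage_name + info->linkage_len;
}

/* The full demangled name, computed on first use via DEMANGLE (a
   caller-supplied callback returning a malloc'd string).  Returns NULL
   for symbols without demangled names.  */
static const char *
sym_full_demangled_name (struct name_holder *sym,
                         char *(*demangle) (const char *))
{
  if (!sym->has_demangled_names)
    return NULL;
  struct symbol_name_info *info = sym->name_union.names;
  if (!info->has_full_demangled_name)
    {
      info->demangled_name = demangle (info->linkage_name);
      info->has_full_demangled_name = 1;
    }
  return info->demangled_name;
}

/* Toy stand-in for a real demangler: it only knows one name.  */
static char *
toy_demangle (const char *linkage)
{
  const char *full = strcmp (linkage, "_Z3fooi") == 0 ? "foo(int)" : linkage;
  char *copy = malloc (strlen (full) + 1);
  assert (copy != NULL);
  strcpy (copy, full);
  return copy;
}
```

In this sketch a C symbol pays only for the single name pointer, while
a C++ symbol pays for the full demangled string only once something
actually asks for it.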

This whole project grew out of profiling results for a large dwarf2 C
application, which shows a similar profile to C++: the biggest hot spot
in startup time today is compare_psymbols.  My hope is to do that using
strcmp, or something even more efficient, instead of ~6,000,000 calls
to strcmp_iw.  I started working on the symbol name cleanups first, but
I think saving the hash code in the symbol_name_info might be even more
effective.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer