From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Peter.Schauer"
To: dberlin@redhat.com (Daniel Berlin)
Cc: gdb@sources.redhat.com, gdb-patches@sources.redhat.com
Subject: Re: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 12:53:00 -0000
Message-id: <200012042053.VAA28364@reisser.regent.e-technik.tu-muenchen.de>
References:
X-SW-Source: 2000-12/msg00013.html

: Why do we not do a lookup_minimal_symbol in
: a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
: and if we get back a symbol, use the demangled name from that,
: otherwise, demangle it.

For some symbol formats (e.g. a.out) the linkage and debugging symbols are
intermixed. By the time you want to record a partial symbol, the minimal
symbol might have not been seen yet. Or the minimal symbol has been seen,
but the minimal symbols are not yet installed, so lookup_minimal_symbol
will fail.

You might be able to work around this by going over all psymbols and fill
in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
are installed, but I am not yet convinced that you don't have to pay your
price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.

> Demangled names were removed from partial symbols to speed start up
> times a few years ago.
>
> However, with the minsym demangled hash table now around, we demangle
> all minimal symbols when we install minimal symbols (IE we init the
> demangled name on them,unconditionally).
>
> Since the minimal symbol table ends up including a large subset of the
> mangled partial symbols (if not all of them), this means we already have a large
> subset of the partial symbol names demangled for us at start up
> anyway.
>
> Why do we not do a lookup_minimal_symbol in
> a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> and if we get back a symbol, use the demangled name from that,
> otherwise, demangle it.
>
> Even tests on 100 meg of debug info show we barely add any startup
> time at all (5 seconds without, 6 seconds with) .
> In fact, all added startup time is attributable to the
> fact that to save memory, I had it bcache the demangled name in
> SYMBOL_INIT_DEMANGLED_NAME. If you don't bcache it (like right now),
> it's in memory in at least the full symbol, and the minimal
> symbol (it's actually in memory once for every time
> SYMBOL_INIT_DEMANGLED_NAME is called on a symbol, and the demangling succeeds).
>
> I think 1 second on 100 meg of debug info is worth it to not have to
> linear search on every symbol lookup, which is amazingly
> slow, and if you have gdb using swap at all because of the number of
> symbols, you are almost guaranteed to hit the swap
> hard on *every* single lookup, since we have to go through every
> single symbol.
>
> This would solve the problem of not being able to lookup partial
> symbols by demangled name, and allow us to binary search them without
> fear of missing a symbol.
>
> Would this be acceptable?
>
> My next trick after that would be to add a mangled->demangled mapping
> structure, if it's necessary to improve speed, and just use that to
> lookup the names before demangling the
> name over again, in cases where we do (ie SYMBOL_INIT_DEMANGLED) need
> to find a demangled name for a mangled one, and use that
> rather than the minimal symbol table to try to find the name.
> The reason for this is that a hash table (in this case, we are
> using the minimal symbol demangled hash table as a lookup table) is the wrong structure
> for this, since demangled names can be *very* large (average of 82
> chars on my large C++ programs), and we always have to hash the entire
> string, then do a whole bunch of string compares, because the chains are
> long. This is okay when we hit (except for the long chains), but on
> misses we waste the same amount of times as hits, if not more. The
> string compares on hits also cost a lot because of the length of the string.
> We really should use a ternary search tree or some structure like it,
> which on hits is actually faster (since we don't need multiple
> string compares), and on misses is a whole ton faster, since we abort
> much sooner.
>
> --Dan
>
>
>

--
Peter Schauer pes@regent.e-technik.tu-muenchen.de

>From dberlin@cygnus.com Mon Dec 04 14:13:00 2000
From: Daniel Berlin
To: "Peter.Schauer"
Cc: gdb@sources.redhat.com, gdb-patches@sources.redhat.com
Subject: Re: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 14:13:00 -0000
Message-id:
References: <200012042053.VAA28364@reisser.regent.e-technik.tu-muenchen.de>
X-SW-Source: 2000-12/msg00014.html
Content-length: 2626

On Mon, 4 Dec 2000, Peter.Schauer wrote:

> : Why do we not do a lookup_minimal_symbol in
> : a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> : and if we get back a symbol, use the demangled name from that,
> : otherwise, demangle it.
>
> For some symbol formats (e.g. a.out) the linkage and debugging symbols are
> intermixed. By the time you want to record a partial symbol, the minimal
> symbol might have not been seen yet. Or the minimal symbol has been seen,
> but the minimal symbols are not yet installed, so lookup_minimal_symbol
> will fail.
>

So i could just use the mapping structure i describe below, and have
both SYMBOL_INIT_DEMANGLED_NAME, et al (IE all the ways we init the
demangled name) attempt to look up the mangled name in it to see if
we've demangled it before. Then it wouldn't matter what order this
happened in.

> You might be able to work around this by going over all psymbols and fill
> in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
> are installed, but I am not yet convinced that you don't have to pay your
> price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.

Except we already pay the price on ELF, where unless you've stripped
it, your minsyms contain just about all the symbols used in the
program.

It's also an issue of saving ~50% of the memory we use now (Remember,
the slowdown comes from the bcaching, which would also not be
necessary with the mapping structure), and making lookups
dramatically faster.

Without the bcaching, profiling shows all the time being spent in
lookup_minimal_symbol, as you would expect. Further analysis shows
the msymbol_hash_iw function isn't very good, and even if it was
perfect we have 379 buckets, total, per object file, so on 100000
symbols, most concentrated in one object file (the main program in
this case, which has 100 meg of debug info, and 100k symbols), we
have chains with an average length of 286. If we assume we have to
search half those, we are doing 143 strcmps to find the minimal
symbol to get the demangled name out of. The ternary search tree i
propose would require the equivalent of one strcmp to find the name.

Users i've talked to are more annoyed that once gdb gets going, it
takes forever to do anything, because the symbol lookups take so
long, than they are that gdb takes 3 minutes vs 3 minutes, 30
seconds, to start up.

Let me finish integrating the mapping structure (I have the ternary
search tree in libiberty, in case others wanted to use it), and do
some tests to see how much faster it is.

--Dan

>From cgd@sibyte.com Mon Dec 04 14:35:00 2000
From: cgd@sibyte.com (Chris G. Demetriou)
To: fche@redhat.com (Frank Ch. Eigler)
Cc: gdb@sourceware.cygnus.com
Subject: Re: How to adequately test 'sim' changes?
Date: Mon, 04 Dec 2000 14:35:00 -0000
Message-id: <5titozak8j.fsf@highland.sibyte.com>
References: <5twvdj29mi.fsf@highland.sibyte.com>
X-SW-Source: 2000-12/msg00015.html
Content-length: 2342

fche@redhat.com (Frank Ch. Eigler) writes:
> : (2) which MIPS targets should be used for testing of
> : architecture-dependent changes?
>
> AFAIK, mipstx39-elf and mips64vr5000-elf are a reasonable pair.
>
> : (3) which non-MIPS targets should be used for (additional) testing of
> : architecture-independent changes? (I'm not thinking anything big
> : here, just the need for compile/smoke-test of minor additions.) [...]
>
> The fr30, arm, mn10300, erc32 (sparc) would give reasonably wide
> coverage for build-breakage testing.

FYI, what i settled on was the script below. I've been running it on
a solaris-sparc host, with a single source tree containing:

	* binutils
	* gdb + sim
	* libgloss + newlib
	* gcc (_only_ gcc, not the additional subtrees)

("it only chews about 1.3GB per test tree!" 8-)

Anyway, the setup seems to work well for some of the boards, and of
course there are classes of optimizations, etc., that cause some of
the targets' GCC tests to fall right over.

I've got a question about this type of testing procedure. I've been
going under the assumption that, if a reasonable number of the GCC
'execute' tests pass, this configuration is reasonable for testing
the simulator for that target.

As it turns out, some of the configurations _don't_ actually pass any
of the execute tests (fr30-elf and sparclite-elf lose building them,
and so don't even attempt to execute), and e.g. fr30 runs into some
issues in the sim tests.

I'm using dejagnu 1.3.1-19990614 from
ftp://gcc.gnu.org/pub/egcs/infrastructure .

Any advice on this, either targets/boards to use, or to upgrade & use
the dejagnu from CVS?

cgd
========
#!/bin/ksh

# Print a timestamped progress message.
do_date()
{
	echo "*** `date`: $*"
}

# Configure, build and run "make check" for one target in its own tree.
do_target()
{
	tgt=$1
	# Map the target triplet to a dejagnu simulator board name.
	case $tgt in
	mipstx39-elf)
		board=tx39-sim ;;
	*)
		board=`echo $tgt | sed -e 's,-elf,-sim,'` ;;
	esac

	do_date start
	mkdir BUILD.$tgt
	cd BUILD.$tgt
	../src/configure --disable-shared --disable-nls --enable-languages=c \
	    --target=$tgt --prefix=`pwd`/../INSTALL.$tgt --with-gnu-as \
	    --with-gnu-ld
	do_date configured
	gmake
	do_date made
	gmake -k check RUNTESTFLAGS="--target_board=$board"
	do_date checked
}

do_date start
# Run every target in parallel, each in a subshell with its own log.
for target in \
    mipstx39-elf mips64-elf fr30-elf arm-elf mn10300-elf sparclite-elf
do
	( do_target $target ) > LOG.$target 2>&1 &
done
wait
do_date all done

>From ac131313@cygnus.com Mon Dec 04 15:10:00 2000
From: Andrew Cagney
To: "Chris G. Demetriou"
Cc: "Frank Ch. Eigler" , gdb@sourceware.cygnus.com
Subject: Re: How to adequately test 'sim' changes?
Date: Mon, 04 Dec 2000 15:10:00 -0000
Message-id: <3A2C221E.1D2DA89C@cygnus.com>
References: <5twvdj29mi.fsf@highland.sibyte.com> <5titozak8j.fsf@highland.sibyte.com>
X-SW-Source: 2000-12/msg00016.html
Content-length: 576

"Chris G. Demetriou" wrote:

> > : (3) which non-MIPS targets should be used for (additional) testing of
> > : architecture-independent changes? (I'm not thinking anything big
> > : here, just the need for compile/smoke-test of minor additions.) [...]
> >
> > The fr30, arm, mn10300, erc32 (sparc) would give reasonably wide
> > coverage for build-breakage testing.

FYI, I'm trying to prepare a list of targets that are known to cross compile.
See: http://sources.redhat.com/ml/gdb-patches/2000-12/msg00024.html

I've an SH script that goes with the changes :-)

	Andrew

>From dberlin@redhat.com Mon Dec 04 15:36:00 2000
From: Daniel Berlin
To: "Peter.Schauer"
Cc: gdb@sources.redhat.com, gdb-patches@sources.redhat.com
Subject: Re: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 15:36:00 -0000
Message-id:
References: <200012042053.VAA28364@reisser.regent.e-technik.tu-muenchen.de>
X-SW-Source: 2000-12/msg00017.html
Content-length: 5887

"Peter.Schauer" writes:

Just a little more data.

I rigged lookup_minimal_symbol to print the number of times (to a
file, one number per line) it had to do "msymbol =
msymbol->hash_next", if the number was > 0. All I did for this print
out is start gdb like "gdb ~/corr101", then quit as soon as we got a
prompt.

corr101 has 27905 minsyms in it (according to maint print msymbols).

We do at least 1 strcmp, possibly 2, for each of the compares in the
loop looking for the minimal symbol name (which ends when we find the
symbol, or msymbol is NULL), because it uses SYMBOL_MATCHES_NAME.

The max number of times we ended up doing this SYMBOL_MATCHES_NAME,
is 559 (which occurs ~1500 times). We had 544476 lookups which took
greater than 0 of these "compares" (which each is 1 strcmp, and maybe
1 strcmp_iw). We did 40875096 "compares" in these lookups. Or, an
average of 75 "compares" per lookup of a given name.

So, even if the ternary search tree was the equivalent of 3 compares
(due to overhead or whatever.), we'd still be able to do the partial
demangling fillin (which is completely dominated, at 99.99% of the
time being spent looking up the minsym to get the demangled name
right now) an average 25 times faster than lookup_minimal_symbol does
it.

So i'm pretty confident we can completely eliminate the 20% slowdown
i was seeing.

Note, i'm not counting the compares both passes lookup_minimal_symbol
does, just the first pass. The second pass is over the minsym
demangled hash table, and goes slower, and is a waste of time for
looking up the symbol to get the mangled name, but i'm just trying to
prove a point about how much faster the demangling by lookup i want
to use in the partial symbol stuff could be. Let me finish
implementing and see if i'm right.

--Dan

> : Why do we not do a lookup_minimal_symbol in
> : a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> : and if we get back a symbol, use the demangled name from that,
> : otherwise, demangle it.
>
> For some symbol formats (e.g. a.out) the linkage and debugging symbols are
> intermixed. By the time you want to record a partial symbol, the minimal
> symbol might have not been seen yet. Or the minimal symbol has been seen,
> but the minimal symbols are not yet installed, so lookup_minimal_symbol
> will fail.
>
> You might be able to work around this by going over all psymbols and fill
> in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
> are installed, but I am not yet convinced that you don't have to pay your
> price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.
>
> > Demangled names were removed from partial symbols to speed start up
> > times a few years ago.
> >
> > However, with the minsym demangled hash table now around, we demangle
> > all minimal symbols when we install minimal symbols (IE we init the
> > demangled name on them,unconditionally).
> >
> > Since the minimal symbol table ends up including a large subset of the
> > mangled partial symbols (if not all of them), this means we already have a large
> > subset of the partial symbol names demangled for us at start up
> > anyway.
> >
> > Why do we not do a lookup_minimal_symbol in
> > a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> > and if we get back a symbol, use the demangled name from that,
> > otherwise, demangle it.
> >
> > Even tests on 100 meg of debug info show we barely add any startup
> > time at all (5 seconds without, 6 seconds with) .
> > In fact, all added startup time is attributable to the
> > fact that to save memory, I had it bcache the demangled name in
> > SYMBOL_INIT_DEMANGLED_NAME. If you don't bcache it (like right now),
> > it's in memory in at least the full symbol, and the minimal
> > symbol (it's actually in memory once for every time
> > SYMBOL_INIT_DEMANGLED_NAME is called on a symbol, and the demangling succeeds).
> >
> > I think 1 second on 100 meg of debug info is worth it to not have to
> > linear search on every symbol lookup, which is amazingly
> > slow, and if you have gdb using swap at all because of the number of
> > symbols, you are almost guaranteed to hit the swap
> > hard on *every* single lookup, since we have to go through every
> > single symbol.
> >
> > This would solve the problem of not being able to lookup partial
> > symbols by demangled name, and allow us to binary search them without
> > fear of missing a symbol.
> >
> > Would this be acceptable?
> >
> > My next trick after that would be to add a mangled->demangled mapping
> > structure, if it's necessary to improve speed, and just use that to
> > lookup the names before demangling the
> > name over again, in cases where we do (ie SYMBOL_INIT_DEMANGLED) need
> > to find a demangled name for a mangled one, and use that
> > rather than the minimal symbol table to try to find the name.
> > The reason for this is that a hash table (in this case, we are
> > using the minimal symbol demangled hash table as a lookup table) is the wrong structure
> > for this, since demangled names can be *very* large (average of 82
> > chars on my large C++ programs), and we always have to hash the entire
> > string, then do a whole bunch of string compares, because the chains are
> > long. This is okay when we hit (except for the long chains), but on
> > misses we waste the same amount of times as hits, if not more. The
> > string compares on hits also cost a lot because of the length of the string.
> > We really should use a ternary search tree or some structure like it,
> > which on hits is actually faster (since we don't need multiple
> > string compares), and on misses is a whole ton faster, since we abort
> > much sooner.
> >
> > --Dan
> >
> >
> >
> --
> Peter Schauer pes@regent.e-technik.tu-muenchen.de
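
For readers following the thread, here is a rough sketch of the
mangled->demangled mapping idea discussed above, written in Python
purely for illustration. It is not the libiberty ternary search tree
Dan mentions, and every name in it (TernarySearchTree, demangled_name,
the demangle callback) is hypothetical. The tree is keyed by mangled
name, stores the demangled string at the node for the last character,
and calls the demangler only on a miss, so the order in which partial
and minimal symbols are read should not matter. A lookup walks the key
roughly one character at a time and gives up as soon as no branch
matches.

```python
class _Node:
    """One character of a key, with lower/equal/higher child links."""
    __slots__ = ("ch", "lo", "eq", "hi", "value")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.value = None  # set only on the node ending a stored key


class TernarySearchTree:
    """Maps non-empty string keys to values (hypothetical sketch)."""

    def __init__(self):
        self._root = None

    def insert(self, key, value):
        if not key:
            raise ValueError("key must be non-empty")
        if self._root is None:
            self._root = _Node(key[0])
        node, i = self._root, 0
        while True:
            ch = key[i]
            if ch < node.ch:
                if node.lo is None:
                    node.lo = _Node(ch)
                node = node.lo
            elif ch > node.ch:
                if node.hi is None:
                    node.hi = _Node(ch)
                node = node.hi
            elif i == len(key) - 1:
                node.value = value  # last character: store the mapping
                return
            else:
                i += 1
                if node.eq is None:
                    node.eq = _Node(key[i])
                node = node.eq

    def lookup(self, key):
        """Return the stored value, or None when the key is absent."""
        node, i = self._root, 0
        while node is not None and i < len(key):
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i == len(key) - 1:
                return node.value
            else:
                node, i = node.eq, i + 1
        return None  # ran off the tree: a miss, detected early


def demangled_name(cache, mangled, demangle):
    """Look up `mangled` in the cache; call `demangle` only on a miss."""
    hit = cache.lookup(mangled)
    if hit is None:
        hit = demangle(mangled)  # caller-supplied demangler (assumption)
        if hit is not None and mangled:
            cache.insert(mangled, hit)
    return hit
```

With a cache like this, each distinct mangled name would be demangled
at most once no matter how many symbol readers ask for it. Whether that
actually recovers the startup time is what the measurements promised in
the thread would have to show.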