Removal of demangled names from partial symbols

Mirror of the gdb mailing list
 help / color / mirror / Atom feed

* Removal of demangled names from partial symbols
@ 2000-12-04 10:22 Daniel Berlin
  2000-12-04 12:53 ` Peter.Schauer
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Berlin @ 2000-12-04 10:22 UTC (permalink / raw)
  To: gdb; +Cc: gdb-patches

Demangled names were removed from partial symbols to speed start up
times a few years ago.

However, with the minsym demangled hash table now around, we demangle
all minimal symbols when we install minimal symbols (IE we init the
demangled name on them,unconditionally).

Since the minimal symbol table ends up including a large subset of the
mangled partial symbols (if not all of them), this means we already have a large
subset of the partial symbol names demangled for us at start up
anyway.

Why do we not do a lookup_minimal_symbol in
a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
and if we get  back a symbol, use the demangled name from that,
otherwise, demangle it.

Even tests on 100 meg of debug info show we barely add any startup
time at all (5 seconds without, 6 seconds with) . 
In fact, all added startup time is attributable to the
fact that to save memory, I had it bcache the demangled name in
SYMBOL_INIT_DEMANGLED_NAME.  If you don't bcache it (like right now),
it's in memory  in at least the full symbol, and the minimal
symbol (it's  actually in memory once for every time
SYMBOL_INIT_DEMANGLED_NAME is called on a symbol, and the demangling succeeds).

I think 1 second on 100 meg of debug info is worth it to not have to
linear search on every symbol lookup, which is amazingly 
slow, and if you have gdb using swap at all because of the number of
symbols, you are almost guaranteed to hit the swap 
hard on *every* single lookup, since we have to go through every
single symbol. 

This would solve the problem of not being able to lookup partial
symbols by demangled name, and allow us to binary search them without
fear of missing a symbol.

Would this be acceptable?

My next trick after that would be to add a mangled->demangled mapping
structure, if it's necessary to improve speed, and just use that to
lookup the names before demangling the 
name over again, in cases where we do (ie SYMBOL_INIT_DEMANGLED) need
to find a demangled name for a mangled one, and use that
rather than the minimal symbol table to try to find the name.
The reason for this is that a hash table (in this case, we are
using the minimal symbol demangled hash table as a lookup table) is the wrong structure
for this, since demangled names can be *very* large (average of 82
chars on my large C++ programs), and we always have to hash the entire
string, then do a whole bunch of string compares, because the chains are
long. This is okay when we hit (except for the long chains), but on
misses we waste the same amount of times as hits, if not more. The
string compares on hits also cost a lot because of the length of the string.
We really should use a ternary search tree or some structure like it,
which on hits is actually faster (since we don't need multiple
string compares), and on misses is a whole ton faster, since we abort
much sooner.

--Dan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Removal of demangled names from partial symbols
  2000-12-04 10:22 Removal of demangled names from partial symbols Daniel Berlin
@ 2000-12-04 12:53 ` Peter.Schauer
  0 siblings, 0 replies; 3+ messages in thread
From: Peter.Schauer @ 2000-12-04 12:53 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb, gdb-patches

: Why do we not do a lookup_minimal_symbol in
: a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
: and if we get  back a symbol, use the demangled name from that,
: otherwise, demangle it.

For some symbol formats (e.g. a.out) the linkage and debugging symbols are
intermixed. By the time you want to record a partial symbol, the minimal
symbol might have not been seen yet. Or the minimal symbol has been seen,
but the minimal symbols are not yet installed, so lookup_minimal_symbol
will fail.

You might be able to work around this by going over all psymbols and fill
in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
are installed, but I am not yet convinced that you don't have to pay your
price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.

> Demangled names were removed from partial symbols to speed start up
> times a few years ago.
> 
> However, with the minsym demangled hash table now around, we demangle
> all minimal symbols when we install minimal symbols (IE we init the
> demangled name on them,unconditionally).
> 
> Since the minimal symbol table ends up including a large subset of the
> mangled partial symbols (if not all of them), this means we already have a large
> subset of the partial symbol names demangled for us at start up
> anyway.
> 
> Why do we not do a lookup_minimal_symbol in
> a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> and if we get  back a symbol, use the demangled name from that,
> otherwise, demangle it.
> 
> Even tests on 100 meg of debug info show we barely add any startup
> time at all (5 seconds without, 6 seconds with) . 
> In fact, all added startup time is attributable to the
> fact that to save memory, I had it bcache the demangled name in
> SYMBOL_INIT_DEMANGLED_NAME.  If you don't bcache it (like right now),
> it's in memory  in at least the full symbol, and the minimal
> symbol (it's  actually in memory once for every time
> SYMBOL_INIT_DEMANGLED_NAME is called on a symbol, and the demangling succeeds).
> 
> I think 1 second on 100 meg of debug info is worth it to not have to
> linear search on every symbol lookup, which is amazingly 
> slow, and if you have gdb using swap at all because of the number of
> symbols, you are almost guaranteed to hit the swap 
> hard on *every* single lookup, since we have to go through every
> single symbol. 
> 
> This would solve the problem of not being able to lookup partial
> symbols by demangled name, and allow us to binary search them without
> fear of missing a symbol.
> 
> Would this be acceptable?
> 
> My next trick after that would be to add a mangled->demangled mapping
> structure, if it's necessary to improve speed, and just use that to
> lookup the names before demangling the 
> name over again, in cases where we do (ie SYMBOL_INIT_DEMANGLED) need
> to find a demangled name for a mangled one, and use that
> rather than the minimal symbol table to try to find the name.
> The reason for this is that a hash table (in this case, we are
> using the minimal symbol demangled hash table as a lookup table) is the wrong structure
> for this, since demangled names can be *very* large (average of 82
> chars on my large C++ programs), and we always have to hash the entire
> string, then do a whole bunch of string compares, because the chains are
> long. This is okay when we hit (except for the long chains), but on
> misses we waste the same amount of times as hits, if not more. The
> string compares on hits also cost a lot because of the length of the string.
> We really should use a ternary search tree or some structure like it,
> which on hits is actually faster (since we don't need multiple
> string compares), and on misses is a whole ton faster, since we abort
> much sooner.
> 
> --Dan
> 
> 
> 

-- 
Peter Schauer			pes@regent.e-technik.tu-muenchen.de
From dberlin@cygnus.com Mon Dec 04 14:13:00 2000
From: Daniel Berlin <dberlin@cygnus.com>
To: "Peter.Schauer" <Peter.Schauer@Regent.E-Technik.TU-Muenchen.DE>
Cc: gdb@sources.redhat.com, gdb-patches@sources.redhat.com
Subject: Re: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 14:13:00 -0000
Message-id: <Pine.SOL.3.91.1001204133004.11865A-100000@cse.cygnus.com>
References: <200012042053.VAA28364@reisser.regent.e-technik.tu-muenchen.de>
X-SW-Source: 2000-12/msg00014.html
Content-length: 2626

On Mon, 4 Dec 2000, Peter.Schauer wrote:

> : Why do we not do a lookup_minimal_symbol in
> : a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> : and if we get  back a symbol, use the demangled name from that,
> : otherwise, demangle it.
> 
> For some symbol formats (e.g. a.out) the linkage and debugging symbols are
> intermixed. By the time you want to record a partial symbol, the minimal
> symbol might have not been seen yet. Or the minimal symbol has been seen,
> but the minimal symbols are not yet installed, so lookup_minimal_symbol
> will fail.
> 

So i could just use the mapping structure i describe below, and have both 
SYMBOL_INIT_DEMANGLED_NAME, et al (IE all the ways we init the demangled 
name) attempt to look up the mangled name in it to  see if we've  demangled 
it before. Then it wouldn't matter what order this happened it.

> You might be able to work around this by going over all psymbols and fill
> in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
> are installed, but I am not yet convinced that you don't have to pay your
> price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.
Except we already pay the price on ELF, where unless you've stripped it, 
your minsyms contain just about all the symbols used in the program.

It's also an issue of saving ~50% of the memory we use now (Remember, the 
slowdown comes from the bcaching, which would also not be necessary with 
the mapping structure), and making lookups dramatically faster.
Without the bcaching, profiling shows all the time being used because 
the lookup_minimal symbol, as you would expect.

Further analysis shows msymbol_hash_iw function isn't very good, and even 
if it was perfect  we have 379 buckets, total, per object file, so on 100000 
symbols, most  concentrated in one object file (the main program in this 
case, which  has 100 meg of debug info, and 100k symbols) ,we have chains 
with an  average length of 286. If we assume we have to search half those, 
we are doing 143 strcmps to find the minimal symbol to get the demangled 
name out of. The ternary search tree i propose would require the equivalent of one strcmp to find the 
name.

Users i've talked to are  more annoyed that once gdb gets going, it takes 
for ever to do anything, because the symbol lookups take so long, than 
they are that gdb takes 3 minutes vs 3 minutes, 30 seconds, to start up.

Let me finish integrating the mapping structure (I have the ternary 
search tree in libiberty, in case others wanted to use it), and do some 
tests to see how much faster it is.

--Dan
From cgd@sibyte.com Mon Dec 04 14:35:00 2000
From: cgd@sibyte.com (Chris G. Demetriou)
To: fche@redhat.com (Frank Ch. Eigler)
Cc: gdb@sourceware.cygnus.com
Subject: Re: How to adequately test 'sim' changes?
Date: Mon, 04 Dec 2000 14:35:00 -0000
Message-id: <5titozak8j.fsf@highland.sibyte.com>
References: <5twvdj29mi.fsf@highland.sibyte.com> <o5g0k7hnk8.fsf@toenail.toronto.redhat.com>
X-SW-Source: 2000-12/msg00015.html
Content-length: 2342

fche@redhat.com (Frank Ch. Eigler) writes:
> : (2) which MIPS targets should be used for testing of
> : architecture-dependent changes?
> 
> AFAIK, mipstx39-elf and mips64vr5000-elf are a reasonable pair.
> 
> : (3) which non-MIPS targets should be used for (additional) testing of
> : architecture-independent changes?  (I'm not thinking anything big
> : here, just the need for compile/smoke-test of minor additions.) [...]
> 
> The fr30, arm, mn10300, erc32 (sparc) would give reasonably wide
> coverage for build-breakage testing.

FYI, what i settled on was the script below.

I've been running it on a solaris-sparc host, with a single source
tree containing:

	* binutils
	* gdb + sim
	* libgloss + newlib
	* gcc (_only_ gcc, not the additional subtrees)

("it only chews about 1.3GB per test tree!"  8-)

Anyway, the setup seems to work well for some of the boards, and of
course there are classes of optimizations, etc., that cause some of
the targets GCC tests to fall right over.

I've got a auestion about this type of testing procedure.  I've been
going under the assumption that, if a reasonable number of the GCC
'execute' tests pass, this configuration is reasonable for testing the
simulator for that target.  As it turns out, some of the
configurations _don't_ actually pass any of the execute tests
(fr30-elf and sparclite-elf lose building them, and so don't even
attempt to execute), and e.g. fr30 runs into some issues in the sim
tests.  I'm using dejagnu 1.3.1-19990614 from
ftp://gcc.gnu.org/pub/egcs/infrastructure .  Any advice on this, either
targets/boards to use, or to upgrade & use the dejagnu from CVS?

cgd
========
#!/bin/ksh

do_date() {
	echo "*** `date`: $*"
}

do_target() {

	tgt=$1
	case $target in
	mipstx39-elf)
		board=tx39-sim
		;;
	*)
		board=`echo $tgt | sed -e 's,-elf,-sim,'`
		;;
	esac

	do_date start
	mkdir BUILD.$tgt
	cd BUILD.$tgt
	../src/configure --disable-shared --disable-nls --enable-languages=c \
	    --target=$tgt --prefix=`pwd`/../INSTALL.$tgt --with-gnu-as \
	    --with-gnu-ld
	do_date configured
	gmake
	do_date made
	gmake -k check RUNTESTFLAGS="--target_board=$board"
	do_date checked
}

do_date start

for target in \
    mipstx39-elf mips64-elf fr30-elf arm-elf     mn10300-elf sparclite-elf
do
	( do_target $target ) > LOG.$target 2>&1 &
done

wait

do_date all done
From ac131313@cygnus.com Mon Dec 04 15:10:00 2000
From: Andrew Cagney <ac131313@cygnus.com>
To: "Chris G. Demetriou" <cgd@sibyte.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>, gdb@sourceware.cygnus.com
Subject: Re: How to adequately test 'sim' changes?
Date: Mon, 04 Dec 2000 15:10:00 -0000
Message-id: <3A2C221E.1D2DA89C@cygnus.com>
References: <5twvdj29mi.fsf@highland.sibyte.com> <o5g0k7hnk8.fsf@toenail.toronto.redhat.com> <5titozak8j.fsf@highland.sibyte.com>
X-SW-Source: 2000-12/msg00016.html
Content-length: 576

"Chris G. Demetriou" wrote:

> > : (3) which non-MIPS targets should be used for (additional) testing of
> > : architecture-independent changes?  (I'm not thinking anything big
> > : here, just the need for compile/smoke-test of minor additions.) [...]
> >
> > The fr30, arm, mn10300, erc32 (sparc) would give reasonably wide
> > coverage for build-breakage testing.

FYI,

I'm trying to prepare a list of targets that are known to cross
compile.  See:

http://sources.redhat.com/ml/gdb-patches/2000-12/msg00024.html

I've an SH script that goes with the changes :-)

	Andrew
From dberlin@redhat.com Mon Dec 04 15:36:00 2000
From: Daniel Berlin <dberlin@redhat.com>
To: "Peter.Schauer" <Peter.Schauer@regent.e-technik.tu-muenchen.de>
Cc: gdb@sources.redhat.com, gdb-patches@sources.redhat.com
Subject: Re: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 15:36:00 -0000
Message-id: <m3y9xvpxmn.fsf@dan2.cygnus.com>
References: <200012042053.VAA28364@reisser.regent.e-technik.tu-muenchen.de>
X-SW-Source: 2000-12/msg00017.html
Content-length: 5887

"Peter.Schauer" <Peter.Schauer@regent.e-technik.tu-muenchen.de> writes:

Just a little more data.

I rigged lookup_minimal_symbol to print the number of times (to a
file, one number per line) it had to
do "msymbol = msymbol->hash_next", if the number was > 0.

All I did for this print out is start gdb like "gdb ~/corr101", then
quit as soon as we got a prompt.

corr101 has  27905 minsyms in it (according to maint print msymbols).

We do at least 1 strcmp, possibly 2, for each of the compares in the
loop looking for the minimal symbol name (which ends when we find the
symbol, or msymbol is NULL), because it uses SYMBOL_MATCHES_NAME.

The max number of times we ended up doing this SYMBOL_MATCHES_NAME, is
559 (which occurs ~1500 times).

We had 544476 lookups which took greater than 0 of these "compares"
(which each is 1 strcmp, and maybe 1 strcmp_iw).

We did 40875096 "compares" in these lookups.

Or, an average of 75 "compares" per lookup of a given name.

So, even if the ternary search tree was the equivalent of 3 compares
(due to overhead or whatever.), we'd still be able to do the partial
demangling fillin (which is completely dominated, at 99.99% of the
time being spent looking up the minsym to get the demangled name right
now) an average 25 times faster than lookup_minimal_symbol does it. So
i'm pretty confident we can completely eliminate the 20% slowdown i
was seeing.
Note, i'm not counting the compares both passes lookup_minimal_symbol does, just
the first pass. The second pass is over the minsym demangled hash
table, and goes slower, and is a waste of time for looking up the
symbol to get the mangled name, but i'm just trying to prove a point about how
much faster the demangling by lookup i want to use in the partial
symbol stuff could be.

Let me finish implementing and see if i'm right.

--Dan

> : Why do we not do a lookup_minimal_symbol in
> : a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> : and if we get  back a symbol, use the demangled name from that,
> : otherwise, demangle it.
> 
> For some symbol formats (e.g. a.out) the linkage and debugging symbols are
> intermixed. By the time you want to record a partial symbol, the minimal
> symbol might have not been seen yet. Or the minimal symbol has been seen,
> but the minimal symbols are not yet installed, so lookup_minimal_symbol
> will fail.
> 
> You might be able to work around this by going over all psymbols and fill
> in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
> are installed, but I am not yet convinced that you don't have to pay your
> price on slower systems. After all, 5 secs to 6 secs is a 20% slowdown.
> 
> > Demangled names were removed from partial symbols to speed start up
> > times a few years ago.
> > 
> > However, with the minsym demangled hash table now around, we demangle
> > all minimal symbols when we install minimal symbols (IE we init the
> > demangled name on them,unconditionally).
> > 
> > Since the minimal symbol table ends up including a large subset of the
> > mangled partial symbols (if not all of them), this means we already have a large
> > subset of the partial symbol names demangled for us at start up
> > anyway.
> > 
> > Why do we not do a lookup_minimal_symbol in
> > a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> > and if we get  back a symbol, use the demangled name from that,
> > otherwise, demangle it.
> > 
> > Even tests on 100 meg of debug info show we barely add any startup
> > time at all (5 seconds without, 6 seconds with) . 
> > In fact, all added startup time is attributable to the
> > fact that to save memory, I had it bcache the demangled name in
> > SYMBOL_INIT_DEMANGLED_NAME.  If you don't bcache it (like right now),
> > it's in memory  in at least the full symbol, and the minimal
> > symbol (it's  actually in memory once for every time
> > SYMBOL_INIT_DEMANGLED_NAME is called on a symbol, and the demangling succeeds).
> > 
> > I think 1 second on 100 meg of debug info is worth it to not have to
> > linear search on every symbol lookup, which is amazingly 
> > slow, and if you have gdb using swap at all because of the number of
> > symbols, you are almost guaranteed to hit the swap 
> > hard on *every* single lookup, since we have to go through every
> > single symbol. 
> > 
> > This would solve the problem of not being able to lookup partial
> > symbols by demangled name, and allow us to binary search them without
> > fear of missing a symbol.
> > 
> > Would this be acceptable?
> > 
> > My next trick after that would be to add a mangled->demangled mapping
> > structure, if it's necessary to improve speed, and just use that to
> > lookup the names before demangling the 
> > name over again, in cases where we do (ie SYMBOL_INIT_DEMANGLED) need
> > to find a demangled name for a mangled one, and use that
> > rather than the minimal symbol table to try to find the name.
> > The reason for this is that a hash table (in this case, we are
> > using the minimal symbol demangled hash table as a lookup table) is the wrong structure
> > for this, since demangled names can be *very* large (average of 82
> > chars on my large C++ programs), and we always have to hash the entire
> > string, then do a whole bunch of string compares, because the chains are
> > long. This is okay when we hit (except for the long chains), but on
> > misses we waste the same amount of times as hits, if not more. The
> > string compares on hits also cost a lot because of the length of the string.
> > We really should use a ternary search tree or some structure like it,
> > which on hits is actually faster (since we don't need multiple
> > string compares), and on misses is a whole ton faster, since we abort
> > much sooner.
> > 
> > --Dan
> > 
> > 
> > 
> 
> -- 
> Peter Schauer			pes@regent.e-technik.tu-muenchen.de

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Removal of demangled names from partial symbols
       [not found] <200012050836.JAA29089@reisser.regent.e-technik.tu-muenchen.de>
@ 2000-12-06 17:32 ` Daniel Berlin
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Berlin @ 2000-12-06 17:32 UTC (permalink / raw)
  To: Peter.Schauer; +Cc: gdb-patches, gdb

"Peter.Schauer" <Peter.Schauer@regent.e-technik.tu-muenchen.de> writes:

> > symbols, most  concentrated in one object file (the main program in this 
> > case, which  has 100 meg of debug info, and 100k symbols) ,we have chains 
> 
> Are you really meaning that 100 MB of debug info are generated from one
> .o file ?
> 
Not only am I really meaning it, it's not uncommon.
Because with the STL and whatnot, you end up with a lot of code in
headers, and with template instantiations, where you get debug info
for every instantiation, even if the duplicate *code* is factored out
by linkonce sections, you have lots of debug info in C++ apps.
The two main issues with this type of thing are memory usage, and
lookup speed.

I'm working on both, the memory usage, and the speed of the symbol
lookup.
As it stands, it takes >200 meg of memory (easily) to debug this 100
meg app.

100 meg of this comes from the way we just read the entire .debug_info
section in in one huge block of memory we allocate, rather than doing
something smart, like mmap. I have patches that use BFD's file_window
stuff (which uses mmap), that i've been meaning to submit. Saves us a
bunch of memory, and the memory it does use is now shared, rather than
just ours. 

This is also why i spent time working up ways to remove duplicate info
and whatnot.

The mozilla folks (fer instance) are desperate to be able to actually
debug GDB on a workstation that has less than 4 gig of memory. And
they should be able to.

Currently, the symbol lookup isn't just slow in these situations, when
we try to debug large apps  with "only" double the memory that the app's debug
info takes up, we  hit the swap very hard, and it's completely
unusuable.  We get no  relief, because it has to touch literally every
symbol until it finds the symbol, so we end up having to swap out
just about everything, and then swap it back in when we find the
symbol, damning us to have to swap it back out again when we go
looking for another symbol.

If you have an app as large on disk as you have memory, you are
literally guaranteed to get worst case swapping behavior on every symbol
lookup.

If we were binary searches, we'd touch log (100k), or 16, symbols max,
to find one in all of them. In reality, since we are doing it block by
block, the number is a bit higher, but not much.
That means we touch such a ridicuously lower amount of swap, it's not
funny.

It's like 1 or 2 pages per block of symbols versus all the pages for
the symbols in that block.

> In that case I am beginning to understand why you need to get rid of the
> linear search so desperately...
> 

Yup.
gdb is worthless on medium->large size C++ apps right now.
literally worthless.

--Dan

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2000-12-06 17:32 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-12-04 10:22 Removal of demangled names from partial symbols Daniel Berlin
2000-12-04 12:53 ` Peter.Schauer
     [not found] <200012050836.JAA29089@reisser.regent.e-technik.tu-muenchen.de>
2000-12-06 17:32 ` Daniel Berlin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox