From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Berlin
To: gdb@sources.redhat.com
Cc: gdb-patches@sources.redhat.com
Subject: Removal of demangled names from partial symbols
Date: Mon, 04 Dec 2000 10:22:00 -0000
Message-id:
X-SW-Source: 2000-12/msg00012.html

Demangled names were removed from partial symbols a few years ago to speed up startup. However, now that the minsym demangled hash table exists, we demangle all minimal symbols when we install them (i.e. we init the demangled name on them, unconditionally). Since the minimal symbol table ends up including a large subset of the mangled partial symbols (if not all of them), we already have a large subset of the partial symbol names demangled for us at startup anyway.

Why not, in a new function, add_psymbol_and_dem_name_to_list, do a lookup_minimal_symbol on the mangled name and, if we get back a symbol, use the demangled name from that; otherwise, demangle it ourselves?

Even tests on 100 meg of debug info show we barely add any startup time at all (5 seconds without, 6 seconds with). In fact, all of the added startup time is attributable to the fact that, to save memory, I had it bcache the demangled name in SYMBOL_INIT_DEMANGLED_NAME. If you don't bcache it (as is the case right now), it's in memory in at least the full symbol and the minimal symbol (it's actually in memory once for every time SYMBOL_INIT_DEMANGLED_NAME is called on a symbol and the demangling succeeds).

I think 1 second on 100 meg of debug info is worth it to not have to do a linear search on every symbol lookup, which is amazingly slow. If gdb is using swap at all because of the number of symbols, you are almost guaranteed to hit swap hard on *every* single lookup, since we have to go through every single symbol.

This would solve the problem of not being able to look up partial symbols by demangled name, and allow us to binary search them without fear of missing a symbol.

Would this be acceptable?
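To make the proposal concrete, here is a rough sketch in C of just the name-resolution step that add_psymbol_and_dem_name_to_list would perform. It is not gdb code: everything named toy_* is a hypothetical stand-in (toy_lookup_minsym for lookup_minimal_symbol, toy_demangle for the real demangler), the table is scanned linearly only to keep the sketch short, and the mangled names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a minimal symbol: both names are already
   known because demangling happened when the table was installed.  */
struct toy_minsym {
    const char *mangled;
    const char *demangled;
};

/* Toy table; a real table would be hashed rather than scanned.  */
static const struct toy_minsym toy_minsyms[] = {
    { "_Z3foov", "foo()" },
    { "_Z3bari", "bar(int)" },
};

/* Counts how often we had to fall back to the demangler.  */
static int toy_demangle_calls = 0;

/* Hypothetical lookup by mangled name; returns NULL on a miss.  */
static const struct toy_minsym *toy_lookup_minsym(const char *mangled)
{
    size_t i;

    for (i = 0; i < sizeof toy_minsyms / sizeof toy_minsyms[0]; i++)
        if (strcmp(toy_minsyms[i].mangled, mangled) == 0)
            return &toy_minsyms[i];
    return NULL;
}

/* Placeholder for the real demangler: it knows exactly one name.  */
static const char *toy_demangle(const char *mangled)
{
    toy_demangle_calls++;
    if (strcmp(mangled, "_Z3bazv") == 0)
        return "baz()";
    return NULL;                /* not a name we can demangle */
}

/* Reuse the minimal symbol's demangled name when there is one;
   only demangle again when the minimal symbol table misses.  */
static const char *toy_psym_demangled_name(const char *mangled)
{
    const struct toy_minsym *msym;

    assert(mangled != NULL);
    msym = toy_lookup_minsym(mangled);
    if (msym != NULL)
        return msym->demangled;
    return toy_demangle(mangled);
}
```

With this shape, the demangler only runs for partial symbols that have no matching minimal symbol, which is presumably why the measured startup cost stays small.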
My next trick after that would be to add a mangled->demangled mapping structure, if it's necessary to improve speed. In cases where we need to find a demangled name for a mangled one (i.e. SYMBOL_INIT_DEMANGLED_NAME), we would look the name up in that structure before demangling it over again, rather than using the minimal symbol table to try to find it.

The reason is that a hash table (in this case, the minimal symbol demangled hash table used as a lookup table) is the wrong structure for this. Demangled names can be *very* large (an average of 82 chars in my large C++ programs), and we always have to hash the entire string, then do a whole bunch of string compares, because the chains are long. This is okay when we hit (except for the long chains), but on misses we waste the same amount of time as on hits, if not more. The string compares on hits also cost a lot because of the length of the strings.

We really should use a ternary search tree or some structure like it, which on hits is actually faster (since we don't need multiple string compares), and on misses is a whole ton faster, since we abort much sooner.

--Dan
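For readers who haven't met one, a minimal ternary search tree in C might look like the sketch below. It is only an illustration of the structure, not a proposed patch: the tst_* names are made up for this example, keys are assumed to be non-empty NUL-terminated strings, and nothing is ever freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One node per key character.  lo/hi branch on the current character,
   eq advances to the next character of the key.  */
struct tst_node {
    char split;                 /* character stored at this node */
    struct tst_node *lo;        /* current char < split */
    struct tst_node *eq;        /* current char == split */
    struct tst_node *hi;        /* current char > split */
    const char *value;          /* non-NULL only where a key ends */
};

/* Insert key -> value below n and return the (possibly new) subtree
   root.  The key must be non-empty.  */
static struct tst_node *tst_insert(struct tst_node *n, const char *key,
                                   const char *value)
{
    assert(key != NULL && *key != '\0');
    if (n == NULL) {
        n = malloc(sizeof *n);
        if (n == NULL)
            abort();
        n->split = *key;
        n->lo = n->eq = n->hi = NULL;
        n->value = NULL;
    }
    if (*key < n->split)
        n->lo = tst_insert(n->lo, key, value);
    else if (*key > n->split)
        n->hi = tst_insert(n->hi, key, value);
    else if (key[1] != '\0')
        n->eq = tst_insert(n->eq, key + 1, value);
    else
        n->value = value;       /* last character: key ends here */
    return n;
}

/* Return the value stored for key, or NULL if the key is absent.
   A miss stops at the first character with no branch to follow.  */
static const char *tst_lookup(const struct tst_node *n, const char *key)
{
    if (*key == '\0')
        return NULL;
    while (n != NULL) {
        if (*key < n->split)
            n = n->lo;
        else if (*key > n->split)
            n = n->hi;
        else if (key[1] == '\0')
            return n->value;
        else {
            key++;
            n = n->eq;
        }
    }
    return NULL;
}
```

A lookup never hashes or compares whole strings: it inspects one character at a time and gives up as soon as no branch matches, which is the early abort on misses described above. Keys that share a long prefix, as mangled names often do, also share the nodes for that prefix.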