Date: Wed, 18 Feb 2004 21:09:00 -0000
From: Daniel Jacobowitz
To: carlton@kealia.com, gdb@sources.redhat.com
Subject: Huge slowdown since 6.0
Message-ID: <20040218210927.GA16641@nevyn.them.org>

I finished prototyping my partial-symbol-sorting changes, as discussed
on gdb-patches, and went to measure their impact.  The results were a
little confusing, but I've tracked them down.  Profiling numbers are at
the bottom of this message if you want to see them; the summary is that
I hope they won't be needed.

Basically, the bottleneck in strcmp_iw_ordered that I am attacking was
not there in 6.0.  This baffled me: there were just as many psymbols,
and there should have been just as many calls, since the change to use
strcmp_iw_ordered predates 6.0.  The only reasonable explanation is
that the number of global symbols has vastly increased, and that
appears to be the case.

David, the blame appears to be yours, in dwarf2read.c revision 1.120:

@@ -1519,14 +1556,16 @@ add_partial_symbol (struct partial_die_i
 	    /* For C++, these implicitly act as typedefs as well.  */
 	    add_psymbol_to_list (actual_name, strlen (actual_name),
 				 VAR_DOMAIN, LOC_TYPEDEF,
-				 &objfile->static_psymbols,
+				 &objfile->global_psymbols,
 				 0, (CORE_ADDR) 0, cu_language, objfile);
 	  }
 	break;
       case DW_TAG_enumerator:
 	add_psymbol_to_list (actual_name, strlen (actual_name),
 			     VAR_DOMAIN, LOC_CONST,
-			     &objfile->static_psymbols,
+			     cu_language == language_cplus
+			     ? &objfile->static_psymbols
+			     : &objfile->global_psymbols,
 			     0, (CORE_ADDR) 0, cu_language, objfile);
 	break;
       default:

Could you re-explain the need for this change, please?  You said:

+	  /* NOTE: carlton/2003-11-10: C++ class symbols shouldn't
+	     really ever be static objects: otherwise, if you try
+	     to, say, break on a class's method and you're in a file
+	     which doesn't mention that class, it won't work unless
+	     the check for all static symbols in lookup_symbol_aux
+	     saves you.  See the OtherFileClass tests in
+	     gdb.c++/namespace.exp.  */

but is that also necessary for enumerators?  As things stand, one
large enum in a widely included header can lead to worse-than-linear
slowdown of GDB startup for DWARF-2, plus a couple of megabytes of
wasted memory.

Profiling information:

HEAD:

% time /opt/src/gdb/x86-as/gdb/gdb -batch /usr/lib/debug/libc.so.6
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
/opt/src/gdb/x86-as/gdb/gdb -batch /usr/lib/debug/libc.so.6  3.35s user 0.32s system 97% cpu 3.751 total

With my changes:

drow@nevyn:/big/fsf/projects/sym-full-name/ot% time ../obj/gdb/gdb -batch /usr/lib/debug/libc.so.6
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
../obj/gdb/gdb -batch /usr/lib/debug/libc.so.6  2.43s user 0.21s system 99% cpu 2.656 total

Awesome!  28% speedup.  But:

6.0:

% time ./gdb/gdb -batch /usr/lib/debug/libc.so.6
./gdb/gdb -batch /usr/lib/debug/libc.so.6  1.95s user 0.26s system 96% cpu 2.288 total

So it recovers barely half of what we've lost since GDB 6.0.  All
binaries were built with the same compiler and CFLAGS (-O2).
Profiles for 6.0:

  %   cumulative   self              self     total
 time   samples   samples    calls  T1/call  T1/call  name
28.46    9128.00  9128.00                             bcache
10.92   12631.00  3503.00                             read_partial_die
 9.10   15549.00  2918.00                             hash
 8.13   18155.00  2606.00                             htab_find_slot_with_hash
 5.57   19943.00  1788.00                             read_attribute_value
 5.11   21581.00  1638.00                             htab_hash_string
 3.80   22799.00  1218.00                             read_unsigned_leb128
 ...
 0.04   31769.00    14.00                             strcmp_iw_ordered
 ...
 0.01   32011.00     2.00                             compare_psymbols

For HEAD:

  %   cumulative   self              self     total
 time   samples   samples    calls  T1/call  T1/call  name
19.00    8696.00  8696.00                             strcmp_iw_ordered
16.31   16162.00  7466.00                             bcache_data
 7.50   19594.00  3432.00                             read_partial_die
 6.63   22630.00  3036.00                             htab_find_slot_with_hash
 4.93   24888.00  2258.00                             hash
 4.87   27118.00  2230.00                             symbol_natural_name
 3.94   28923.00  1805.00                             read_attribute_value
 3.69   30611.00  1688.00                             htab_hash_string
 3.28   32111.00  1500.00                             _init
 2.95   33461.00  1350.00                             compare_psymbols

With my changes:

  %   cumulative   self              self     total
 time   samples   samples    calls  T1/call  T1/call  name
17.53    8015.00  8015.00                             bcache_data
 7.52   11455.00  3440.00                             htab_find_slot_with_hash
 7.31   14799.00  3344.00                             msymbol_hash_iw
 7.18   18082.00  3283.00                             read_partial_die
 5.13   20426.00  2344.00                             hash
 4.30   22392.00  1966.00                             compare_psymbols
 4.23   24328.00  1936.00                             gnu_debuglink_crc32
 3.92   26122.00  1794.00                             read_attribute_value
 3.67   27800.00  1678.00                             htab_hash_string

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer