Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* Slow handling of C++ symbol names
@ 2003-11-19 21:10 Will Cohen
  2003-11-19 22:10 ` Daniel Berlin
  2003-11-20  2:05 ` Michael Snyder
  0 siblings, 2 replies; 5+ messages in thread
From: Will Cohen @ 2003-11-19 21:10 UTC (permalink / raw)
  To: gdb-patches, wcohen

When debugging some C++ program with gdb (both versions 5.3 and 6.0) it 
takes a long time to load the debugging information and get a 
command-line prompt. The particular example that demonstrates this 
problem is monotone (http://www.venge.net/monotone/) which makes use of 
the boost libraries (http://www.boost.org/). gdb loading monotone with 
the associated debugging information takes over 30 seconds to get the 
initial command-line prompt on a 1.7GHz Athlon with 512MB or DRAM. 
monotone is not a small program, over 3MB of debugging information. I 
tooks some measurements with OProfile to find out where gdb 5.3 was 
spending its time. 3/4 of the time is spent in dyn_string_substring and 
__GI_strncpy, the column below the '%'. I recompiled gdb with '-pg' to 
find that this is due to the C++ name demangling. The highlights of 
gprof output are shown below. I looked at the symbols for monotone/boost 
in the executable, some of them are VERY long and for monotone/boost 
they average 147 characters in length. An example of ONE symbol over 2KB 
is at the end of this mail (there are others like that). I suspect that 
some of the algorithms used in gdb may not expect such long symbol names 
and be part of the reason for the slow startup, e.g. work 
O(length_of_string ^2). Has anyone else noticed this type of problem before?

[wcohen@slingshot monotone-0.7]$ oprofpp -l -r -k /usr/bin/gdb|more
Cpu type: Athlon
Cpu speed was (MHz estimation) : 1687.55
Counter 0 counted CPU_CLK_UNHALTED events (Cycles outside of halt state) 
with a
unit mask of 0x00 (No unit mask) count 843500
vma      samples  %           symbol name             image name
081e20c0 36781    55.9424     dyn_string_substring 
/usr/lib/debug/usr/bin/gdb.debug
00077290 14743    22.4235     __GI_strncpy            /lib/tls/libc-2.3.2.so
080f5e40 3434     5.22297     msymbol_hash_iw 
/usr/lib/debug/usr/bin/gdb.debug
00021dc0 1748     2.65864     __ctype_b_loc           /lib/tls/libc-2.3.2.so
00076990 1603     2.4381      __GI_strcpy             /lib/tls/libc-2.3.2.so
081e1e10 1417     2.1552      dyn_string_insert 
/usr/lib/debug/usr/bin/gdb.debug


The profile information from gprof.

Flat profile:

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
  88.41     18.46    18.46   354132     0.00     0.00  dyn_string_substring
   5.84     19.68     1.22    19299     0.00     0.00  msymbol_hash_iw
...


                 0.00    0.00  220617 demangle_class_enum_type <cycle 2> 
[423]
                 0.00    0.00      95             demangle_function_type 
<cycle 2> [482]
-----------------------------------------------
                 0.00    0.74   14108/354132      demangle_name <cycle 
2> [29]
                 0.00    1.04   19908/354132      demangle_type_ptr 
<cycle 2> [28]
                 0.00    4.55   87228/354132      demangle_prefix <cycle 
2> [24]
                 0.01   12.16  232888/354132      demangle_type <cycle 
2> [23]
[21]    88.6    0.01   18.49  354132         substitution_add [21]
                18.46    0.01  354132/354132      dyn_string_substring [22]
                 0.00    0.01  354132/979183      dyn_string_new [40]
-----------------------------------------------
                18.46    0.01  354132/354132      substitution_add [21]
[22]    88.5   18.46    0.01  354132         dyn_string_substring [22]
                 0.01    0.01  354132/4003799     dyn_string_resize [35]
-----------------------------------------------
                                13904             demangle_type <cycle 
2> [23]
                                   55             demangle_operator_name 
<cycle 2>


Find out how many symbols monotone has
$ nm monotone|wc
   23623   70853 3657239



Some of symbol names from the boost library are really, really, really 
long. Here is an example that is over 2K in length:

08145572 W 
_ZN5boost6spirit12subrule_listINS0_14subrule_parserILi0ENS0_8sequenceINS3_INS3_INS3_INS0_6actionINS0_7subruleILi1ENS0_15closure_contextI17rcs_admin_closureEEEEN7phoenix5actorINSA_9compositeINSA_9assign_opENSB_INSC_INSA_21member_var_ptr_actionI9rcs_admin8rcs_fileEENSB_INSA_14closure_memberILi0ENSA_7closureISG_NSA_5nil_tESK_EEEEEESK_SK_SK_EEEENSB_INSA_8argumentILi0EEEEESK_SK_EEEEEENS0_11alternativeINS0_11kleene_starINS4_INS5_ILi3ENS6_I17rcs_delta_closureEEEENSB_INSC_INSA_26member_function_ptr_actionIvSG_RK9rcs_deltaSK_SK_SK_EESN_SS_SK_SK_EEEEEEEENS3_INSX_INS5_ILi12ENS0_14parser_contextEEEEES19_EEEEEENSW_INS5_ILi2ENS6_I16rcs_desc_closureEEEENS3_IS1C_S1I_EEEEEENSX_INS4_INS5_ILi4ENS6_I21rcs_deltatext_closureEEEENSB_INSC_INS11_IvSG_RK13rcs_deltatextSK_SK_SK_EESN_SS_SK_SK_EEEEEEEEEENS0_10contiguousINSX_INS0_5chlitIcEEEEEEEENS6_I16rcs_file_closureEEEENS1_INS2_ILi10ES20_S1A_EENS1_INS2_ILi14ENS0_8positiveINSW_INS0_12digit_parserES20_EEEES1A_EENS1_INS2_ILi13ENS0_10differenceINS0_12graph

_parserENS0_5chsetIcEEEES1A_EENS1_INS2_ILi12ENS3_INS3_INS5_ILi8ENS6_I14string_closureEEEENSX_INS5_ILi9ES2K_EEEEEENS5_ILi10ES1A_EEEES1A_EENS1_INS2_ILi9ENSW_INSW_INSW_IS2L_NS5_ILi7ES2K_EEEENS5_ILi5ES2K_EEEES2P_EES2K_EENS1_INS2_ILi4ENS3_INS3_INS3_INS3_INS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsS1P_EENSB_INSI_ILi0ENSJ_IS1P_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEENS0_6strlitIPKcEEEENS4_IS2U_S35_EEEENSW_IS3A_NS3_IS1C_S3A_EEEEEES3C_EES1N_EENS1_INS2_ILi2ENS3_IS3A_NS4_IS2U_NSB_INSC_ISD_NSB_INSI_ILi0ENSJ_ISsSK_SK_EEEEEESS_SK_SK_EEEEEEEES1H_EENS1_INS2_ILi3ENS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsS12_EENSB_INSI_ILi0ENSJ_IS12_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEES3A_EES3Z_EES2P_EES3A_EENS4_IS2L_S3Y_EEEES2P_EES3A_EENS0_8optionalIS44_EEEES2P_EES3A_EENSX_INS4_IS2S_NSB_INSC_INS11_IvS12_RKSsSK_SK_SK_EES3U_SS_SK_SK_EEEEEEEEEES2P_EES3A_EENS48_IS3Z_EEEES2P_EESZ_EENS1_INS2_ILi1ENS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS

3_INS3_INS3_INS3_INS3_INS3_IS3A_NS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsSF_EENSB_INSI_ILi0ENSJ_ISF_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEEEES2P_EENS48_IS51_EEEES3A_EENSX_INS4_IS2L_S4Y_EEEEEES2P_EES3A_EENSX_INS3_INS3_INS5_ILi6ES2K_EENS5_ILi11ES1A_EEEES2S_EEEEEES2P_EES3A_EENSX_INS3_INS3_IS2L_S5B_EES2S_EEEEEES2P_EENS48_INS3_IS3A_S2P_EEEEEENS48_INS3_INS3_IS3A_NS4_IS2U_S4Y_EEEES2P_EEEEEES5T_EES8_EENS0_5nil_tEEEEEEEEEEEEEEEEEEEEC1ERKS26_RKS66_

Doing a little computation on the wc of the nm output to figure out the 
average length of each symbol. Account for the 8 character address and 1 
character symbol type information:

(3657239 char - (8 char address + 1 char type) * 23623)/23623= 146 char


-Will


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Slow handling of C++ symbol names
  2003-11-19 21:10 Slow handling of C++ symbol names Will Cohen
@ 2003-11-19 22:10 ` Daniel Berlin
  2003-11-20  2:05 ` Michael Snyder
  1 sibling, 0 replies; 5+ messages in thread
From: Daniel Berlin @ 2003-11-19 22:10 UTC (permalink / raw)
  To: Will Cohen; +Cc: gdb-patches

>
> Find out how many symbols monotone has
> $ nm monotone|wc
>    23623   70853 3657239
>
>
>
> Some of symbol names from the boost library are really, really, really
> long. Here is an example that is over 2K in length:
>
That's not too bad, actually.
Consider that your 2k symbol expands to over 80k when demangled!


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Slow handling of C++ symbol names
  2003-11-19 21:10 Slow handling of C++ symbol names Will Cohen
  2003-11-19 22:10 ` Daniel Berlin
@ 2003-11-20  2:05 ` Michael Snyder
  2003-11-20 14:27   ` Will Cohen
  1 sibling, 1 reply; 5+ messages in thread
From: Michael Snyder @ 2003-11-20  2:05 UTC (permalink / raw)
  To: Will Cohen; +Cc: gdb-patches

Will Cohen wrote:
> When debugging some C++ program with gdb (both versions 5.3 and 6.0) it 
> takes a long time to load the debugging information and get a 
> command-line prompt. The particular example that demonstrates this 
> problem is monotone (http://www.venge.net/monotone/) which makes use of 
> the boost libraries (http://www.boost.org/). gdb loading monotone with 
> the associated debugging information takes over 30 seconds to get the 
> initial command-line prompt on a 1.7GHz Athlon with 512MB or DRAM. 

30 seconds!  Why, you young whipper-snapper... I remember when you
used to have to go get a cup of coffee while gdb loaded symbols!   ;-)
Hell, sometimes you could go out and *buy* the coffee...




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Slow handling of C++ symbol names
  2003-11-20  2:05 ` Michael Snyder
@ 2003-11-20 14:27   ` Will Cohen
  2003-11-20 20:22     ` Michael Snyder
  0 siblings, 1 reply; 5+ messages in thread
From: Will Cohen @ 2003-11-20 14:27 UTC (permalink / raw)
  To: Michael Snyder; +Cc: gdb-patches

Michael Snyder wrote:
> Will Cohen wrote:
> 
>> When debugging some C++ program with gdb (both versions 5.3 and 6.0) 
>> it takes a long time to load the debugging information and get a 
>> command-line prompt. The particular example that demonstrates this 
>> problem is monotone (http://www.venge.net/monotone/) which makes use 
>> of the boost libraries (http://www.boost.org/). gdb loading monotone 
>> with the associated debugging information takes over 30 seconds to get 
>> the initial command-line prompt on a 1.7GHz Athlon with 512MB or DRAM. 
> 
> 
> 30 seconds!  Why, you young whipper-snapper... I remember when you
> used to have to go get a cup of coffee while gdb loaded symbols!   ;-)
> Hell, sometimes you could go out and *buy* the coffee...

In the early 90's I did a port of GCC to the Gould NP1, a dual processor 
  machine (18MHz ECL multiboard processors) with a staggering 256MB of 
RAM at Purdue University. Unfortunately, the machine was the main 
machine for ALL the undergraduates. During the afternoons it was typical 
to see gcc 1.40 take 3 hours to build when there were 70 or more people 
on the machine.

The process for gdb debugging monotone wouldn't fit in the NP1's 256MB 
of memory. According to top SIZE is 265MB and RSS is 262MB.  So maybe 30 
seconds for filling up a quarter gigabyte of memory isn't so bad.

-Will


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Slow handling of C++ symbol names
  2003-11-20 14:27   ` Will Cohen
@ 2003-11-20 20:22     ` Michael Snyder
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Snyder @ 2003-11-20 20:22 UTC (permalink / raw)
  To: Will Cohen; +Cc: gdb-patches

Will Cohen wrote:
> Michael Snyder wrote:
> 
>> Will Cohen wrote:
>>
>>> When debugging some C++ program with gdb (both versions 5.3 and 6.0) 
>>> it takes a long time to load the debugging information and get a 
>>> command-line prompt. The particular example that demonstrates this 
>>> problem is monotone (http://www.venge.net/monotone/) which makes use 
>>> of the boost libraries (http://www.boost.org/). gdb loading monotone 
>>> with the associated debugging information takes over 30 seconds to 
>>> get the initial command-line prompt on a 1.7GHz Athlon with 512MB or 
>>> DRAM. 
>>
>>
>>
>> 30 seconds!  Why, you young whipper-snapper... I remember when you
>> used to have to go get a cup of coffee while gdb loaded symbols!   ;-)
>> Hell, sometimes you could go out and *buy* the coffee...
> 
> 
> In the early 90's I did a port of GCC to the Gould NP1, a dual processor 
>  machine (18MHz ECL multiboard processors) with a staggering 256MB of 
> RAM at Purdue University. Unfortunately, the machine was the main 
> machine for ALL the undergraduates. During the afternoons it was typical 
> to see gcc 1.40 take 3 hours to build when there were 70 or more people 
> on the machine.
> 
> The process for gdb debugging monotone wouldn't fit in the NP1's 256MB 
> of memory. According to top SIZE is 265MB and RSS is 262MB.  So maybe 30 
> seconds for filling up a quarter gigabyte of memory isn't so bad.

It would be bad if it were just copying from disk to memory,
but it's not bad for parsing one database of that size into another
with a completely different format.



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2003-11-20 20:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-11-19 21:10 Slow handling of C++ symbol names Will Cohen
2003-11-19 22:10 ` Daniel Berlin
2003-11-20  2:05 ` Michael Snyder
2003-11-20 14:27   ` Will Cohen
2003-11-20 20:22     ` Michael Snyder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox