* Re: Slow handling of C++ symbol names
@ 2003-12-03 16:48 Michael Elizabeth Chastain
2003-12-03 16:52 ` David Carlton
2003-12-03 16:58 ` Ian Lance Taylor
0 siblings, 2 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 16:48 UTC (permalink / raw)
To: ian, mec.gnu; +Cc: ac131313, drow, gdb, wcohen
Ian Lance Taylor writes:
> Could you extract a few of the larger demangled names from each version,
> and post them? It might be a good double-check that something isn't
> weirdly broken.
I'm working on it.
I'm writing a Perl script to pass each name through both demanglers,
strip the differences, and compare the results. There are a lot of
picky differences to account for.
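
Editorial sketch of the comparison step described above, in Python rather than the Perl script mentioned in the message (which was not posted). It assumes the demangled output of each demangler has already been collected into a dictionary keyed by mangled name; the function names and the single normalization rule are illustrative assumptions, not the author's script.

```python
import re

# Verbose spelling printed by the old demangler where the new one
# prints std::string (see the OLD/NEW example in this thread).
VERBOSE_STRING = ("std::basic_string<char, std::char_traits<char>, "
                  "std::allocator<char> >")


def normalize(demangled):
    """Remove differences that are assumed to be purely cosmetic."""
    text = demangled.replace(VERBOSE_STRING, "std::string")
    # Collapse runs of spaces but keep single spaces, so that
    # "operator< <" stays distinguishable from "operator<<".
    return re.sub(r" {2,}", " ", text).strip()


def real_differences(old, new):
    """Return the mangled names whose normalized demanglings still differ."""
    return [mangled for mangled in sorted(old)
            if mangled in new and normalize(old[mangled]) != normalize(new[mangled])]
```

Only names that survive normalization need to be read by a human, which is the point of the exercise when the raw output runs to hundreds of megabytes.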
So far I've found one type of difference that looks like a bug in the
new demangler.
_ZStltI9file_pathSsEbRKSt4pairIT_T0_ES6_
OLD: bool std::operator< <file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)
NEW: bool std::operator<<file_path, std::string>(std::pair<file_path, std::string> const&, std::pair<file_path, std::string> const&)
The old demangler produces "operator< <", and the new demangler produces
"operator<<". I'm not a name mangling expert but I think that
"operator< <" is correct here and the new demangler suffers from the
shift-operator-versus-template-syntax gotcha.
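
Editorial sketch of the spacing rule at issue, in Python. It is not taken from either demangler's source; `format_template_function` is a hypothetical helper that only illustrates why a space is needed when the function name itself ends in '<'.

```python
def format_template_function(name, template_args):
    """Join a function name and its template argument list.

    If the name ends in '<' (operator< or operator<<), emit a space so
    the '<' that opens the argument list cannot fuse with the operator.
    """
    separator = " " if name.endswith("<") else ""
    return f"{name}{separator}<{', '.join(template_args)}>"
```

Without the separator, operator< instantiated on a template argument list prints as "operator<<...", which reads as the shift operator.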
Also, why do you want the demangled names? I would think you would
have the same old+new demanglers that I do, so that the mangled names
would suffice. My reason is that the mangled names which are different
amount to 3 megabytes or so, but the demangled names which are different
amount to 300 megabytes or so.
Michael C
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Slow handling of C++ symbol names
2003-12-03 16:48 Slow handling of C++ symbol names Michael Elizabeth Chastain
@ 2003-12-03 16:52 ` David Carlton
2003-12-03 16:58 ` Ian Lance Taylor
1 sibling, 0 replies; 29+ messages in thread
From: David Carlton @ 2003-12-03 16:52 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: ian, ac131313, drow, gdb, wcohen
On Wed, 3 Dec 2003 11:47:34 -0500 (EST), mec.gnu@mindspring.com (Michael Elizabeth Chastain) said:
> _ZStltI9file_pathSsEbRKSt4pairIT_T0_ES6_
> OLD: bool std::operator< <file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)
> NEW: bool std::operator<<file_path, std::string>(std::pair<file_path, std::string> const&, std::pair<file_path, std::string> const&)
Fascinating.
> The old demangler produces "operator< <", and the new demangler
> produces "operator<<". I'm not a name mangling expert but I think
> that "operator< <" is correct here and the new demangler suffers from
> the shift-operator-versus-template-syntax gotcha.
Yeah, that's a bug, even one which could cause problems for GDB.
David Carlton
carlton@kealia.com
* Re: Slow handling of C++ symbol names
2003-12-03 16:48 Slow handling of C++ symbol names Michael Elizabeth Chastain
2003-12-03 16:52 ` David Carlton
@ 2003-12-03 16:58 ` Ian Lance Taylor
2003-12-03 17:08 ` Daniel Jacobowitz
1 sibling, 1 reply; 29+ messages in thread
From: Ian Lance Taylor @ 2003-12-03 16:58 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: ac131313, drow, gdb, wcohen
mec.gnu@mindspring.com (Michael Elizabeth Chastain) writes:
> So far I've found one type of difference that looks like a bug in the
> new demangler.
>
> _ZStltI9file_pathSsEbRKSt4pairIT_T0_ES6_
> OLD: bool std::operator< <file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)
> NEW: bool std::operator<<file_path, std::string>(std::pair<file_path, std::string> const&, std::pair<file_path, std::string> const&)
>
> The old demangler produces "operator< <", and the new demangler produces
> "operator<<". I'm not a name mangling expert but I think that
> "operator< <" is correct here and the new demangler suffers from the
> shift-operator-versus-template-syntax gotcha.
Whoops. Quite right.
The old name you list above was generated using DMGL_VERBOSE. I can
tell by use of std::basic_string<char, std::char_traits<char>,
std::allocator<char> > where the new demangler just uses std::string.
The old demangler generates the former with DMGL_VERBOSE, or
std::string without DMGL_VERBOSE.
If the old demangler is being run with DMGL_VERBOSE, then I'm not
surprised that the resulting strings are much larger. That was the
main thing which concerned me, since I didn't see anything in gdb
which passed DMGL_VERBOSE to the demangler.
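
Editorial sketch of the two spellings being discussed, in Python. The "Ss" abbreviation in the mangled name stands for the std::basic_string specialization shown below; the helper names and the boolean `verbose` switch are assumptions standing in for the old demangler's DMGL_VERBOSE handling, not its actual code.

```python
SHORT_FORM = "std::string"
LONG_FORM = ("std::basic_string<char, std::char_traits<char>, "
             "std::allocator<char> >")


def expand_ss(verbose):
    """Spell out the 'Ss' abbreviation in long or short form."""
    return LONG_FORM if verbose else SHORT_FORM


def render_pair_argument(verbose):
    """Render the 'std::pair<file_path, Ss> const&' argument from the example."""
    inner = expand_ss(verbose)
    # Keep the traditional space between adjacent closing angle brackets.
    closing = " >" if inner.endswith(">") else ">"
    return f"std::pair<file_path, {inner}{closing} const&"
```

Every occurrence of the abbreviation pays the full length of the long form in verbose mode, so names that mention std::string many times grow accordingly.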
> Also, why do you want the demangled names? I would think you would
> have the same old+new demanglers that I do, so that the mangled names
> would suffice. My reason is that the mangled names which are different
> amount to 3 megabytes or so, but the demangled names which are different
> amount to 300 megabytes or so.
Oh sure, for some reason I thought it was easier for you to send the
demangled names. Doesn't make much sense, I guess. The mangled names
are fine.
Ian
* Re: Slow handling of C++ symbol names
2003-12-03 16:58 ` Ian Lance Taylor
@ 2003-12-03 17:08 ` Daniel Jacobowitz
2003-12-03 17:34 ` Ian Lance Taylor
0 siblings, 1 reply; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-12-03 17:08 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: Michael Elizabeth Chastain, ac131313, gdb, wcohen
On Wed, Dec 03, 2003 at 11:57:56AM -0500, Ian Lance Taylor wrote:
> mec.gnu@mindspring.com (Michael Elizabeth Chastain) writes:
>
> > So far I've found one type of difference that looks like a bug in the
> > new demangler.
> >
> > _ZStltI9file_pathSsEbRKSt4pairIT_T0_ES6_
> > OLD: bool std::operator< <file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, std::pair<file_path, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)
> > NEW: bool std::operator<<file_path, std::string>(std::pair<file_path, std::string> const&, std::pair<file_path, std::string> const&)
> >
> > The old demangler produces "operator< <", and the new demangler produces
> > "operator<<". I'm not a name mangling expert but I think that
> > "operator< <" is correct here and the new demangler suffers from the
> > shift-operator-versus-template-syntax gotcha.
>
> Whoops. Quite right.
>
> The old name you list above was generated using DMGL_VERBOSE. I can
> tell by use of std::basic_string<char, std::char_traits<char>,
> std::allocator<char> > where the new demangler just uses std::string.
> The old demangler generates the former with DMGL_VERBOSE, or
> std::string without DMGL_VERBOSE.
>
> If the old demangler is being run with DMGL_VERBOSE, then I'm not
> surprised that the resulting strings are much larger. That was the
> main thing which concerned me, since I didn't see anything in gdb
> which passed DMGL_VERBOSE to the demangler.
GDB never does set DMGL_VERBOSE. Are you sure the old demangler
wouldn't produce that without DMGL_VERBOSE?
Maybe the old demangler had a test reversed. ISTR that c++filt passes
DMGL_VERBOSE, and that generates std::string rather than
std::basic_string for the above.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-12-03 17:08 ` Daniel Jacobowitz
@ 2003-12-03 17:34 ` Ian Lance Taylor
2003-12-03 17:38 ` Daniel Jacobowitz
0 siblings, 1 reply; 29+ messages in thread
From: Ian Lance Taylor @ 2003-12-03 17:34 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Michael Elizabeth Chastain, ac131313, gdb, wcohen
Daniel Jacobowitz <drow@mvista.com> writes:
> GDB never does set DMGL_VERBOSE. Are you sure the old demangler
> wouldn't produce that without DMGL_VERBOSE?
>
> Maybe the old demangler had a test reversed. ISTR that c++filt passes
> DMGL_VERBOSE, and that generates std::string rather than
> std::basic_string for the above.
Well, hmmm. The code looks right to me. When I run my copy of
c++filt built with the old demangler, I get the long string, as
expected.
I note that DMGL_VERBOSE was only added on 2002-02-05, so if you're
using a c++filt from sources before that you will get the smaller
demangling.
Ian
* Re: Slow handling of C++ symbol names
2003-12-03 17:34 ` Ian Lance Taylor
@ 2003-12-03 17:38 ` Daniel Jacobowitz
2003-12-03 17:54 ` Ian Lance Taylor
0 siblings, 1 reply; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-12-03 17:38 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: Michael Elizabeth Chastain, ac131313, gdb, wcohen
On Wed, Dec 03, 2003 at 12:34:31PM -0500, Ian Lance Taylor wrote:
> Daniel Jacobowitz <drow@mvista.com> writes:
>
> > GDB never does set DMGL_VERBOSE. Are you sure the old demangler
> > wouldn't produce that without DMGL_VERBOSE?
> >
> > Maybe the old demangler had a test reversed. ISTR that c++filt passes
> > DMGL_VERBOSE, and that generates std::string rather than
> > std::basic_string for the above.
>
> Well, hmmm. The code looks right to me. When I run my copy of
> c++filt built with the old demangler, I get the long string, as
> expected.
>
> I note that DMGL_VERBOSE was only added on 2002-02-05, so if you're
> using a c++filt from sources before that you will get the smaller
> demangling.
I'm using one from binutils as of a month or so ago, and I get the
short string. Hmm. I don't know quite what's going on.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-12-03 17:38 ` Daniel Jacobowitz
@ 2003-12-03 17:54 ` Ian Lance Taylor
0 siblings, 0 replies; 29+ messages in thread
From: Ian Lance Taylor @ 2003-12-03 17:54 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Michael Elizabeth Chastain, ac131313, gdb, wcohen
Daniel Jacobowitz <drow@mvista.com> writes:
> On Wed, Dec 03, 2003 at 12:34:31PM -0500, Ian Lance Taylor wrote:
> > Daniel Jacobowitz <drow@mvista.com> writes:
> >
> > > GDB never does set DMGL_VERBOSE. Are you sure the old demangler
> > > wouldn't produce that without DMGL_VERBOSE?
> > >
> > > Maybe the old demangler had a test reversed. ISTR that c++filt passes
> > > DMGL_VERBOSE, and that generates std::string rather than
> > > std::basic_string for the above.
> >
> > Well, hmmm. The code looks right to me. When I run my copy of
> > c++filt built with the old demangler, I get the long string, as
> > expected.
> >
> > I note that DMGL_VERBOSE was only added on 2002-02-05, so if you're
> > using a c++filt from sources before that you will get the smaller
> > demangling.
>
> I'm using one from binutils as of a month or so ago, and I get the
> short string. Hmm. I don't know quite what's going on.
Me neither.
Everything makes sense to me and I get the results I expect. If we
want to pursue this further, though it may not matter much, I think
you need to debug your c++filt. See if flag_verbose is set. The
relevant function is called demangle_substitution(). The first call
to the function will call result_add() with either std::string or the
longer version.
Ian
* Re: Slow handling of C++ symbol names
@ 2003-12-03 22:30 Michael Elizabeth Chastain
0 siblings, 0 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 22:30 UTC (permalink / raw)
To: cagney, mec.gnu; +Cc: drow, gdb, ian
Here is a new table:
gdb   memory  utime  stime  elapsed
5.3   263M    85.47  26.80   189.68
6.0   264M    83.76  26.08   187.90
H19   263M    83.44  25.99   190.00
H21    46M     7.37   2.07     9.46
H30    46M     2.72   0.29     3.03
H19 is gdb HEAD 2003-11-19 16:00:00 UTC.
This has the exact same demangler as gdb 6.0 (modulo comments).
H21 is gdb HEAD 2003-11-21 17:00:00 UTC.
This has the old demangler with two bug-fix patches.
H30 is gdb HEAD 2003-11-30 06:08:24 UTC.
This is the new demangler.
Nothing happened to performance from 5.3 to 6.0 or from 6.0 to H19.
Something happened between H19 and H21. So I did:
cvs diff -u -D "2003-11-19 16:00:00 UTC" -D "2003-11-21 17:00:00 UTC"
This gives about 2000 lines of changes from H19 to H21, very manageable.
Nothing changed in gdb c++ support. All the good changes were in
libiberty/cp-demangle.c. So I believe the bug fixes in the old
demangler were very effective. Either that, or they introduced new bugs
which cause gdb to get to its first prompt a lot sooner.
Something else happened from H21 to H30. I don't know if it's
gdb or the demangler, but I'm inclined to credit the demangler.
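
Editorial sketch that turns the table above into elapsed-time ratios, in Python. The figures are copied from the table; the dictionary layout and function name are illustrative assumptions.

```python
# (memory in MB, user seconds, system seconds, elapsed seconds)
RUNS = {
    "5.3": (263, 85.47, 26.80, 189.68),
    "6.0": (264, 83.76, 26.08, 187.90),
    "H19": (263, 83.44, 25.99, 190.00),
    "H21": (46, 7.37, 2.07, 9.46),
    "H30": (46, 2.72, 0.29, 3.03),
}


def elapsed_speedup(before, after):
    """Ratio of elapsed time between two rows of the table."""
    return RUNS[before][3] / RUNS[after][3]


if __name__ == "__main__":
    for before, after in [("H19", "H21"), ("H21", "H30"), ("H19", "H30")]:
        ratio = elapsed_speedup(before, after)
        print(f"{before} -> {after}: {ratio:.1f}x less elapsed time")
```

By these numbers the old-demangler bug fixes account for roughly a 20x drop in elapsed time, and the step to H30 for roughly a further 3x.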
Michael C
* Re: Slow handling of C++ symbol names
@ 2003-12-03 20:27 Michael Elizabeth Chastain
2003-12-03 20:35 ` Ian Lance Taylor
0 siblings, 1 reply; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 20:27 UTC (permalink / raw)
To: ian; +Cc: gdb, wcohen
As far as DMGL_VERBOSE and other DMGL_* options go, I am actually just
running two versions of c++filt with no special options and then
comparing the results. So I am not covering the layer of software where
gdb uses different DMGL_* options.
A little grep fu says that gdb uses combinations of:
DMGL_NO_OPTS
DMGL_ANSI
DMGL_ARM
DMGL_JAVA
DMGL_PARAMS
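
Editorial sketch of how such option bits combine, in Python. The names follow the DMGL_* list above; the numeric values are believed to match libiberty's demangle.h but should be treated as illustrative and checked against the header. `wants_verbose` is a hypothetical helper.

```python
from enum import IntFlag


class Dmgl(IntFlag):
    """Demangler option bits (values illustrative; see demangle.h)."""
    NO_OPTS = 0
    PARAMS = 1 << 0
    ANSI = 1 << 1
    JAVA = 1 << 2
    VERBOSE = 1 << 3
    ARM = 1 << 11


def wants_verbose(options):
    """True when the verbose bit is set in a combination of options."""
    return bool(options & Dmgl.VERBOSE)
```

A combination such as `Dmgl.PARAMS | Dmgl.ANSI` does not set the verbose bit, which matches the observation that gdb never asks for verbose output.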
Michael C
* Re: Slow handling of C++ symbol names
2003-12-03 20:27 Michael Elizabeth Chastain
@ 2003-12-03 20:35 ` Ian Lance Taylor
0 siblings, 0 replies; 29+ messages in thread
From: Ian Lance Taylor @ 2003-12-03 20:35 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: gdb, wcohen
mec.gnu@mindspring.com (Michael Elizabeth Chastain) writes:
> As far as DMGL_VERBOSE and other DMGL_* options go, I am actually just
> running two versions of c++filt with no special options and then
> comparing the results. So I am not covering the layer of software where
> gdb uses different DMGL_* options.
Oh, OK. c++filt always passes DMGL_VERBOSE. That causes the old
demangler to print much longer names. The new demangler ignores
DMGL_VERBOSE. At least for now.
Ian
* Re: Slow handling of C++ symbol names
@ 2003-12-03 20:17 Michael Elizabeth Chastain
0 siblings, 0 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 20:17 UTC (permalink / raw)
To: cagney, mec.gnu; +Cc: drow, gdb, ian
Crap, I thought my timeline looked funny!
Try moving some of those 2003-08-XX to 2003-11-XX:
2003-05-03 Fix typo in "std::basic_istream ... char_traints ..." [sic]
2003-08-12 Fix license (no code change)
2003-11-19 Some bug fixes to old demangler by ILT
2003-11-20 Another ILT bug fix to old demangler
2003-11-21 New demangler
Michael C
* Re: Slow handling of C++ symbol names
@ 2003-12-03 20:14 Michael Elizabeth Chastain
0 siblings, 0 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 20:14 UTC (permalink / raw)
To: cagney; +Cc: drow, gdb, ian
mec> Here are some memory usage and time numbers.
mec>
mec> gdb   memory  utime  stime  elapsed
mec> 5.3   263M    84.79  26.52   185.92
mec> 6.0   263M    83.17  26.45   181.04
mec> HEAD   43M     2.78   0.27     3.23
mec>
mec> gdb HEAD is from '2003-11-30 06:08:24 UTC'.
ac> Michael, not to overload you but would it also be possible to do the
ac> numbers on a just-before-new-demangler-mainline?
Yeah, that's pretty straightforward.
Here's a timeline on cp-demangle.c (times are from the src/ archive).
2003-05-03 Fix typo in "std::basic_istream ... char_traints ..." [sic]
2003-08-12 Fix license (no code change)
2003-08-19 Some bug fixes to old demangler by ILT
2003-08-20 Another ILT bug fix to old demangler
2003-08-21 New demangler
So I'll do:
gdb HEAD 2003-11-19 16:00:00 UTC
which is just before the bug fixes of the old demangler,
gdb HEAD 2003-11-21 17:00:00 UTC
which is just before the new demangler appeared in src
Michael C
* Re: Slow handling of C++ symbol names
@ 2003-12-03 19:57 Michael Elizabeth Chastain
2003-12-03 20:16 ` Daniel Jacobowitz
0 siblings, 1 reply; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-03 19:57 UTC (permalink / raw)
To: drow, ian; +Cc: gdb
ilt> Everything makes sense to me and I get the results I expect.
That's good enough for me. I don't care much about the verbosity.
My problem is: I've got 20,000 mangled names, and they demangle
to 250 megabytes of text, and I want to find real bugs like the
"operator< <" versus "operator<<" bug.
It's a big needle-in-haystack problem. I'm messing around with
a Perl script that helps sort out the haystack. After "monotone",
I'd like to run it on mangled names from cygwin and mozilla and
open-office too.
Michael C
* Re: Slow handling of C++ symbol names
2003-12-03 19:57 Michael Elizabeth Chastain
@ 2003-12-03 20:16 ` Daniel Jacobowitz
0 siblings, 0 replies; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-12-03 20:16 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: ian, gdb
On Wed, Dec 03, 2003 at 02:56:45PM -0500, Michael Chastain wrote:
> ilt> Everything makes sense to me and I get the results I expect.
>
> That's good enough for me. I don't care much about the verbosity.
> My problem is: I've got 20,000 mangled names, and they demangle
> to 250 megabytes of text, and I want to find real bugs like the
> "operator< <" versus "operator<<" bug.
>
> It's a big needle-in-haystack problem. I'm messing around with
> a Perl script that helps sort out the haystack. After "monotone",
> I'd like to run it on mangled names from cygwin and mozilla and
> open-office too.
If what you're trying to do is canonicalize the names, consider the 20K
or so parser I posted to the "C++ / Java regressions" thread on Monday.
It has a couple of bugs left but it's pretty solid.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
@ 2003-12-02 22:38 Michael Elizabeth Chastain
0 siblings, 0 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-02 22:38 UTC (permalink / raw)
To: ian, mec.gnu; +Cc: ac131313, drow, gdb, wcohen
> Could you extract a few of the larger demangled names from
> each version, and post them? It might be a good double-check that
> something isn't weirdly broken. It would be nice to have the mangled
> symbol name too, but not critical.
Actually the demangled names are too big to post!
How about if I extract all the mangled names and put up a tarball?
I'll get some representative demangled names from each version too.
I gotta write Perl just to process these things, I can't open up
a 250 megabyte file in any text editor on my 128 megabyte laptop!
(I am going out dancing tonight, back in 6 hours or so. If you are
really impatient you could grab Will Cohen's original binary from
that URL).
Michael C
* Re: Slow handling of C++ symbol names
@ 2003-12-02 22:09 Michael Elizabeth Chastain
2003-12-02 22:23 ` Ian Lance Taylor
2003-12-03 18:52 ` Andrew Cagney
0 siblings, 2 replies; 29+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-02 22:09 UTC (permalink / raw)
To: ac131313, wcohen; +Cc: drow, gdb
Andrew Cagney writes:
> I guess we get to restart the analysis :-(
Sounds like a job for GDB QA Guy! :)
I downloaded Will's executables from:
http://people.redhat.com/wcohen/gdb_tuning/monostrip
http://people.redhat.com/wcohen/gdb_tuning/monotone
Here are some memory usage and time numbers.
gdb   memory  utime  stime  elapsed
5.3   263M    84.79  26.52   185.92
6.0   263M    83.17  26.45   181.04
HEAD   43M     2.78   0.27     3.23
gdb HEAD is from '2003-11-30 06:08:24 UTC'.
My system is a Dell Inspiron 2500 with an Intel Celeron 800 MHz, 128
megabytes of memory, running without the X window system. I used "top"
to monitor the memory usage. I am just reporting the biggest "SIZE"
that I saw in "top" before gdb exited.
So there was no increase in resource usage from gdb 5.3 to gdb 6.0.
All three versions of gdb print "(no debugging symbols found)...".
All three versions can do "break main; run; cont; backtrace"
and print a good looking backtrace. ("cont" hits a segfault).
I checked "maint print msymbols" with gdb 6.0 and gdb HEAD.
Both versions had a highest msymbol number of 23521.
However, gdb 6.0 produced an output file 266966644 bytes long,
and gdb HEAD produced an output file only 32695256 bytes long.
So next I counted the length of each line in the output. The length of
the line is a few characters longer than the actual demangled minsym
because of the line index and stuff.
The top ten line lengths for gdb 6.0 were:
1911946
1740037
1699747
1697323
1636399
1626161
1601007
1512726
1512726
1512641
And the top ten line lengths for gdb HEAD were:
77302
77302
77172
77172
77172
77154
77147
77147
77147
77147
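
Editorial sketch of the line-length count described above, in Python. The message does not say how the counting was done; `longest_line_lengths` is a hypothetical helper that works on any iterable of lines, such as an open file holding the "maint print msymbols" output.

```python
import heapq


def longest_line_lengths(lines, count=10):
    """Return the `count` largest line lengths, longest first."""
    lengths = (len(line.rstrip("\n")) for line in lines)
    return heapq.nlargest(count, lengths)
```

Because the lengths are produced by a generator and only the top `count` are kept, the file never has to be held in memory at once, which matters for a 250 megabyte dump on a 128 megabyte machine.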
So, the new demangler is printing different, much smaller output.
In every case that I've seen, the new demangler has been correct,
and the old demangler has had many bugs. So I have no reason to doubt
the new demangler now. It would be useful to take some of these
2000-byte mangled symbols and check that the 70K demangled forms are
correct, but I don't have the resources for that yet.
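
Editorial sketch of why a 2000-byte mangled symbol can plausibly demangle to tens of kilobytes or more, in Python. It is a toy model, not the mangling scheme itself: it assumes that a mangled name can refer back to an earlier component with a short token, while the demangled form must spell that component out in full each time. The `pair` nesting and the function name are illustrative.

```python
def nested_pair(depth):
    """Spell out pair<X, X> nested `depth` times, starting from 'T'.

    Each level mentions the previous level twice, so the spelled-out
    length roughly doubles per level, while a back-reference scheme
    would add only a constant-size token per level.
    """
    text = "T"
    for _ in range(depth):
        text = f"pair<{text}, {text}>"
    return text
```

In this model the spelled-out length after d levels is 9 * 2**d - 8 characters, so linear growth in the compact form becomes exponential growth in the expanded form.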
I don't know why gdb HEAD is printing "(no debugging symbols found)...",
even for "monotone". That's probably the most important bug now.
Michael C
* Re: Slow handling of C++ symbol names
2003-12-02 22:09 Michael Elizabeth Chastain
@ 2003-12-02 22:23 ` Ian Lance Taylor
2003-12-02 22:29 ` Daniel Jacobowitz
2003-12-03 18:52 ` Andrew Cagney
1 sibling, 1 reply; 29+ messages in thread
From: Ian Lance Taylor @ 2003-12-02 22:23 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: ac131313, wcohen, drow, gdb
mec.gnu@mindspring.com (Michael Elizabeth Chastain) writes:
> So, the new demangler is printing different, much smaller output.
That's odd. I could understand it if gdb were passing DMGL_VERBOSE to
cplus_demangle(). But as far as I know it is not doing so in any
version. Could you extract a few of the larger demangled names from
each version, and post them? It might be a good double-check that
something isn't weirdly broken. It would be nice to have the mangled
symbol name too, but not critical.
Thanks.
Ian
* Re: Slow handling of C++ symbol names
2003-12-02 22:23 ` Ian Lance Taylor
@ 2003-12-02 22:29 ` Daniel Jacobowitz
0 siblings, 0 replies; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-12-02 22:29 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: Michael Elizabeth Chastain, ac131313, wcohen, gdb
On Tue, Dec 02, 2003 at 05:23:16PM -0500, Ian Lance Taylor wrote:
> mec.gnu@mindspring.com (Michael Elizabeth Chastain) writes:
>
> > So, the new demangler is printing different, much smaller output.
>
> That's odd. I could understand it if gdb were passing DMGL_VERBOSE to
> cplus_demangle(). But as far as I know it is not doing so in any
> version. Could you extract a few of the larger demangled names from
> each version, and post them? It might be a good double-check that
> something isn't weirdly broken. It would be nice to have the mangled
> symbol name too, but not critical.
>
> Thanks.
My guess is that this has something to do with DMGL_PARAMS. But again
that's odd since we ought to be passing that when we demangle msyms.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-12-02 22:09 Michael Elizabeth Chastain
2003-12-02 22:23 ` Ian Lance Taylor
@ 2003-12-03 18:52 ` Andrew Cagney
1 sibling, 0 replies; 29+ messages in thread
From: Andrew Cagney @ 2003-12-03 18:52 UTC (permalink / raw)
To: Michael Elizabeth Chastain; +Cc: ac131313, wcohen, drow, gdb
> Andrew Cagney writes:
>
>> I guess we get to restart the analysis :-(
>
>
> Sounds like a job for GDB QA Guy! :)
>
> I downloaded Will's executables from:
>
> http://people.redhat.com/wcohen/gdb_tuning/monostrip
> http://people.redhat.com/wcohen/gdb_tuning/monotone
>
> Here are some memory usage and time numbers.
>
> gdb   memory  utime  stime  elapsed
> 5.3   263M    84.79  26.52   185.92
> 6.0   263M    83.17  26.45   181.04
> HEAD   43M     2.78   0.27     3.23
>
> gdb HEAD is from '2003-11-30 06:08:24 UTC'.
Michael, not to overload you but would it also be possible to do the
numbers on a just-before-new-demangler-mainline? At present there's a gap
in our knowledge - we don't know the effect that just the demangler
change had. Also lets us compare:
6.0 vs PRE vs POST
Andrew
[parent not found: <3FBBDC27.50204@redhat.com>]
* Re: Slow handling of C++ symbol names
[not found] <3FBBDC27.50204@redhat.com>
@ 2003-11-19 21:13 ` Daniel Jacobowitz
2003-11-20 16:09 ` Andrew Cagney
0 siblings, 1 reply; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-11-19 21:13 UTC (permalink / raw)
To: Will Cohen; +Cc: gdb
First of all, please use gdb@, not gdb-patches@.
On Wed, Nov 19, 2003 at 04:09:59PM -0500, Will Cohen wrote:
> When debugging some C++ program with gdb (both versions 5.3 and 6.0) it
> takes a long time to load the debugging information and get a
> command-line prompt. The particular example that demonstrates this
> problem is monotone (http://www.venge.net/monotone/) which makes use of
> the boost libraries (http://www.boost.org/). gdb loading monotone with
> the associated debugging information takes over 30 seconds to get the
> initial command-line prompt on a 1.7GHz Athlon with 512MB of DRAM.
> monotone is not a small program, over 3MB of debugging information. I
> took some measurements with OProfile to find out where gdb 5.3 was
> spending its time. 3/4 of the time is spent in dyn_string_substring and
> __GI_strncpy, the column below the '%'. I recompiled gdb with '-pg' to
> find that this is due to the C++ name demangling. The highlights of
> gprof output are shown below. I looked at the symbols for monotone/boost
> in the executable, some of them are VERY long and for monotone/boost
> they average 147 characters in length. An example of ONE symbol over 2KB
> is at the end of this mail (there are others like that). I suspect that
> some of the algorithms used in gdb may not expect such long symbol names
> and be part of the reason for the slow startup, e.g. work
> O(length_of_string ^2). Has anyone else noticed this type of problem before?
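
Editorial sketch of the quadratic-cost hypothesis raised above, in Python. It does not model dyn_string_substring or strncpy themselves; it only counts characters copied under the assumption that the whole partial result is recopied each time a piece is added, compared with copying each piece once.

```python
def chars_copied_by_recopying(piece_lengths):
    """Characters copied if the entire string so far is recopied per piece."""
    total = 0
    current = 0
    for length in piece_lengths:
        current += length
        total += current  # the whole partial result is copied again
    return total


def chars_copied_by_appending(piece_lengths):
    """Characters copied if each piece is copied exactly once."""
    return sum(piece_lengths)
```

For n one-character pieces the first function returns n(n+1)/2, so a symbol ten times longer would cost about a hundred times more copying under that assumption.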
Have you tried a more recent version of GDB? This may have been
improved. Definitely some startup time issues were fixed.
Also, the demangler actually comes from GCC, not GDB. All we can do is
try to call it less often.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-11-19 21:13 ` Daniel Jacobowitz
@ 2003-11-20 16:09 ` Andrew Cagney
2003-11-20 16:19 ` Daniel Jacobowitz
0 siblings, 1 reply; 29+ messages in thread
From: Andrew Cagney @ 2003-11-20 16:09 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Will Cohen, gdb
>
> Have you tried a more recent version of GDB? This may have been
> improved. Definitely some startup time issues were fixed.
Nothing in the ChangeLogs jumped out.
> Also, the demangler actually comes from GCC, not GDB.
I don't see GCC being motivated to fix it though :-(
> All we can do is
> try to call it less often.
Which leads to the question. Why does GDB demangle symbols? My
simplistic understanding of the code is that GDB only needs the "iw"
(a.k.a., demangled string up to but excluding the lparen and ignoring
white space) part of the symbol in the search table (the rest isn't so
critical and can be constructed on-demand).
Andrew
* Re: Slow handling of C++ symbol names
2003-11-20 16:09 ` Andrew Cagney
@ 2003-11-20 16:19 ` Daniel Jacobowitz
2003-11-20 16:54 ` Andrew Cagney
0 siblings, 1 reply; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-11-20 16:19 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Will Cohen, gdb
On Thu, Nov 20, 2003 at 11:08:59AM -0500, Andrew Cagney wrote:
> >
> >Have you tried a more recent version of GDB? This may have been
> >improved. Definitely some startup time issues were fixed.
>
> Nothing in the ChangeLogs jumped out.
Well, I remember fixing some startup time issues since then :P For
instance, the cache shared between minimal and partial symbols should
cut demangling time about in half.
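
Editorial sketch of the caching idea, in Python. It is not GDB's implementation; `DemangleCache` is a hypothetical class, and the demangling function is passed in as a parameter. It shows why loading the same mangled names into two tables costs only one demangling per name when a cache is shared.

```python
class DemangleCache:
    """Remember demangled names so each mangled name is demangled once."""

    def __init__(self, demangle):
        self._demangle = demangle
        self._cache = {}
        self.misses = 0  # number of real demangler calls made

    def lookup(self, mangled):
        if mangled not in self._cache:
            self.misses += 1
            self._cache[mangled] = self._demangle(mangled)
        return self._cache[mangled]
```

If every name appears once in each of two symbol tables, the second pass hits the cache every time, which is where the "about in half" estimate comes from.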
> >Also, the demangler actually comes from GCC, not GDB.
>
> I don't see GCC being motivated to fix it though :-(
>
> > All we can do is
> > try to call it less often.
>
> Which leads to the question. Why does GDB demangle symbols? My
> simplistic understanding of the code is that GDB only needs the "iw"
> (a.k.a., demangled string up to but excluding the lparen and ignoring
> white space) part of the symbol in the search table (the rest isn't so
> critical and can be constructed on-demand).
A substantial amount of demangling is needed to produce the part of
the symbol before the lparen; consider templates. Also, we need the
full names in the minimal symbol for break 'foo(int)' with quotes to
work. And there are assumptions of unique symbol names in our
hashing/searching, IIRC.
I'm sure there are tricks we can do to cut down on how early or often
we demangle, but it still seems to be necessary.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-11-20 16:19 ` Daniel Jacobowitz
@ 2003-11-20 16:54 ` Andrew Cagney
2003-11-20 16:58 ` Daniel Jacobowitz
2003-11-21 17:58 ` Daniel Jacobowitz
0 siblings, 2 replies; 29+ messages in thread
From: Andrew Cagney @ 2003-11-20 16:54 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Will Cohen, gdb
>
> Well, I remember fixing some startup time issues since then :P For
> instance, the cache shared between minimal and partial symbols should
> cut demangling time about in half.
Ah, this: symbol_set_names? I was looking at the demangler.
>> Which leads to the question. Why does GDB demangle symbols? My
>> simplistic understanding of the code is that GDB only needs the "iw"
>> (a.k.a., demangled string up to but excluding the lparen and ignoring
>> white space) part of the symbol in the search table (the rest isn't so
>> critical and can be constructed on-demand).
>
>
> A substantial amount of demangling is needed to produce the part of
> the symbol before the lparen; consider templates. Also, we need the
> full names in the minimal symbol for break 'foo(int)' with quotes to
> work. And there are assumptions of unique symbol names in our
> hashing/searching, IIRC.
But without looking at the data we've no idea how substantial any
particular part really is. For instance, when analysing the bcache I
found that when debugging a C program every entry is 28 bytes in size!
'foo(int)' can be broken down into "foo" "(int)", the latter only being
demangled and stored on-demand.
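
Editorial sketch of the proposed split, in Python. It works on an already demangled string and therefore says nothing about how much demangling work could be saved, which is the open question in this exchange. `split_at_parameters` is a hypothetical helper; it assumes angle brackets appear only as template delimiters, so names containing operator<, operator>, operator-> or operator() are not handled.

```python
def split_at_parameters(demangled):
    """Split 'name(params)' at the '(' that opens the parameter list.

    Parentheses inside template argument lists are skipped by tracking
    the nesting depth of '<' and '>'.
    """
    depth = 0
    for index, char in enumerate(demangled):
        if char == "<":
            depth += 1
        elif char == ">":
            depth -= 1
        elif char == "(" and depth == 0:
            return demangled[:index], demangled[index:]
    return demangled, ""
```

For 'foo(int)' this yields "foo" and "(int)"; for a templated name the first part still contains the full template argument list, which is the "consider templates" objection raised earlier in the thread.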
> I'm sure there are tricks we can do to cut down on how early or often
> we demangle, but it still seems to be necessary.
* Re: Slow handling of C++ symbol names
2003-11-20 16:54 ` Andrew Cagney
@ 2003-11-20 16:58 ` Daniel Jacobowitz
2003-11-21 17:58 ` Daniel Jacobowitz
1 sibling, 0 replies; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-11-20 16:58 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Will Cohen, gdb
On Thu, Nov 20, 2003 at 11:54:33AM -0500, Andrew Cagney wrote:
> >
> >Well, I remember fixing some startup time issues since then :P For
> >instance, the cache shared between minimal and partial symbols should
> >cut demangling time about in half.
>
> Ah, this: symbol_set_names? I was looking at the demangler.
>
> >>Which leads to the question. Why does GDB demangle symbols? My
> >>simplistic understanding of the code is that GDB only needs the "iw"
> >>(a.k.a., demangled string up to but excluding the lparen and ignoring
> >>white space) part of the symbol in the search table (the rest isn't so
> >>critical and can be constructed on-demand).
> >
> >
> >A substantial amount of demangling is needed to produce the part of
> >the symbol before the lparen; consider templates. Also, we need the
> >full names in the minimal symbol for break 'foo(int)' with quotes to
> >work. And there are assumptions of unique symbol names in our
> >hashing/searching, IIRC.
>
> But without looking at the data we've no idea how substantial any
> particular part really is. For instance, when analysing the bcache
> found that when debugging a C program every entry is 28 bytes in size!
>
> 'foo(int)' can be broken down into "foo" "(int)" the latter only being
> demangled and stored on-demand.
But does doing that save any measurable time in the demangling? I'm
guessing very little, because of the way v3 mangled substitutions work.
Someone more familiar with the demangler could have a look at this.
Also, the demangler is currently being rewritten (again). The result might
be faster; DJ posted it to gcc-patches a few days ago.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-11-20 16:54 ` Andrew Cagney
2003-11-20 16:58 ` Daniel Jacobowitz
@ 2003-11-21 17:58 ` Daniel Jacobowitz
2003-11-26 14:27 ` Andrew Cagney
1 sibling, 1 reply; 29+ messages in thread
From: Daniel Jacobowitz @ 2003-11-21 17:58 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Will Cohen, gdb
On Thu, Nov 20, 2003 at 11:54:33AM -0500, Andrew Cagney wrote:
> >
> >Well, I remember fixing some startup time issues since then :P For
> >instance, the cache shared between minimal and partial symbols should
> >cut demangling time about in half.
>
> Ah, this: symbol_set_names? I was looking at the demangler.
>
> >>Which leads to the question. Why does GDB demangle symbols? My
> >>simplistic understanding of the code is that GDB only needs the "iw"
> >>(a.k.a., demangled string up to but excluding the lparen and ignoring
> >>white space) part of the symbol in the search table (the rest isn't so
> >>critical and can be constructed on-demand).
> >
> >
> >A substantial amount of demangling is needed to produce the part of
> >the symbol before the lparen; consider templates. Also, we need the
> >full names in the minimal symbol for break 'foo(int)' with quotes to
> >work. And there are assumptions of unique symbol names in our
> >hashing/searching, IIRC.
>
> But without looking at the data we've no idea how substantial any
> particular part really is. For instance, when analysing the bcache
> found that when debugging a C program every entry is 28 bytes in size!
>
> 'foo(int)' can be broken down into "foo" "(int)" the latter only being
> demangled and stored on-demand.
"They" tell me that the demangler did not support doing this. But it
may soon.
Will, you may want to test a CVS snapshot from today or later. The
demangler for GNU v3 was replaced last night; the new one appears to be
about twice as efficient.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: Slow handling of C++ symbol names
2003-11-21 17:58 ` Daniel Jacobowitz
@ 2003-11-26 14:27 ` Andrew Cagney
2003-11-26 16:24 ` Will Cohen
2003-11-26 20:45 ` Will Cohen
0 siblings, 2 replies; 29+ messages in thread
From: Andrew Cagney @ 2003-11-26 14:27 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Will Cohen, gdb
>> But without looking at the data we've no idea how substantial any
>> particular part really is. For instance, when analysing the bcache
>> found that when debugging a C program every entry is 28 bytes in size!
>>
>> 'foo(int)' can be broken down into "foo" "(int)" the latter only being
>> demangled and stored on-demand.
>
>
> "They" tell me that the demangler did not support doing this. But it
> may soon.
>
> Will, you may want to test a CVS snapshot from today or later. The
> demangler for GNU v3 was replaced last night; the new one appears to be
> about twice as efficient.
FYI, Will got hold of an RPM based on 20031117 and I believe the
performance numbers came back the same as 6.0 :-( That would place it
after your changes but before the new demangler.
I guess we get to restart the analysis :-(
Andrew
* Re: Slow handling of C++ symbol names
2003-11-26 14:27 ` Andrew Cagney
@ 2003-11-26 16:24 ` Will Cohen
2003-11-26 20:48 ` Andrew Cagney
2003-11-26 20:45 ` Will Cohen
1 sibling, 1 reply; 29+ messages in thread
From: Will Cohen @ 2003-11-26 16:24 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Daniel Jacobowitz, gdb
Andrew Cagney wrote:
>>> But without looking at the data we've no idea how substantial any
>>> particular part really is. For instance, when analysing the bcache
>>> found that when debugging a C program every entry is 28 bytes in size!
>>>
>>> 'foo(int)' can be broken down into "foo" "(int)" the latter only
>>> being demangled and stored on-demand.
>>
>>
>>
>> "They" tell me that the demangler did not support doing this. But it
>> may soon.
>>
>> Will, you may want to test a CVS snapshot from today or later. The
>> demangler for GNU v3 was replaced last night; the new one appears to be
>> about twice as efficient.
>
>
> FYI, Will got hold of an RPM based on 20031117 and I believe the
> performance numbers came back the same as 6.0 :-( That would make it
> post your changes but pre the new demangler.
>
> I guess we get to restart the analysis :-(
>
> Andrew
I tried building an uberbaum tree with a snapshot from this morning
(2003/11/26). The uberbaum build produced a gdb, but didn't complete
building everything.
I tried using this gdb with the new demangler on the monotone executable
that takes 30 seconds or so to load. gdb segfaults when loading the
debugging information. This is definitely caused by reading in the
debugging information: gdb was able to read in a stripped version of the
same file. gdb dies in libiberty/cp-demangle.c:d_print_comp, which
is called from the demangler.
I have made the monotone files available at
http://people.redhat.com/wcohen/gdb_tuning/
--Will
* Re: Slow handling of C++ symbol names
2003-11-26 14:27 ` Andrew Cagney
2003-11-26 16:24 ` Will Cohen
@ 2003-11-26 20:45 ` Will Cohen
1 sibling, 0 replies; 29+ messages in thread
From: Will Cohen @ 2003-11-26 20:45 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Daniel Jacobowitz, gdb
Andrew Cagney wrote:
>>> But without looking at the data we've no idea how substantial any
>>> particular part really is. For instance, when analysing the bcache
>>> found that when debugging a C program every entry is 28 bytes in size!
>>>
>>> 'foo(int)' can be broken down into "foo" "(int)" the latter only
>>> being demangled and stored on-demand.
>>
>>
>>
>> "They" tell me that the demangler did not support doing this. But it
>> may soon.
>>
>> Will, you may want to test a CVS snapshot from today or later. The
>> demangler for GNU v3 was replaced last night; the new one appears to be
>> about twice as efficient.
>
>
> FYI, Will got hold of an RPM based on 20031117 and I believe the
> performance numbers came back the same as 6.0 :-( That would make it
> post your changes but pre the new demangler.
>
> I guess we get to restart the analysis :-(
>
> Andrew
I tried a very recent snapshot of gdb and libiberty (I used the uberbaum
checkout to get the pieces, 2003/11/25). The uberbaum appears to be
broken and doesn't build to completion, but it does build a gdb
executable. However, the new cp-demangle.c gets a segfault when
the gdb executable is loading the monotone executable. I have filed a
bugzilla entry, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13206 . It
looks like there is some more work required.
The problem is definitely related to the processing of the debugging
information. The gdb executable loads the stripped version of the
monotone executable without problem.
-Will
Thread overview: 29+ messages
2003-12-03 16:48 Slow handling of C++ symbol names Michael Elizabeth Chastain
2003-12-03 16:52 ` David Carlton
2003-12-03 16:58 ` Ian Lance Taylor
2003-12-03 17:08 ` Daniel Jacobowitz
2003-12-03 17:34 ` Ian Lance Taylor
2003-12-03 17:38 ` Daniel Jacobowitz
2003-12-03 17:54 ` Ian Lance Taylor
-- strict thread matches above, loose matches on Subject: below --
2003-12-03 22:30 Michael Elizabeth Chastain
2003-12-03 20:27 Michael Elizabeth Chastain
2003-12-03 20:35 ` Ian Lance Taylor
2003-12-03 20:17 Michael Elizabeth Chastain
2003-12-03 20:14 Michael Elizabeth Chastain
2003-12-03 19:57 Michael Elizabeth Chastain
2003-12-03 20:16 ` Daniel Jacobowitz
2003-12-02 22:38 Michael Elizabeth Chastain
2003-12-02 22:09 Michael Elizabeth Chastain
2003-12-02 22:23 ` Ian Lance Taylor
2003-12-02 22:29 ` Daniel Jacobowitz
2003-12-03 18:52 ` Andrew Cagney
[not found] <3FBBDC27.50204@redhat.com>
2003-11-19 21:13 ` Daniel Jacobowitz
2003-11-20 16:09 ` Andrew Cagney
2003-11-20 16:19 ` Daniel Jacobowitz
2003-11-20 16:54 ` Andrew Cagney
2003-11-20 16:58 ` Daniel Jacobowitz
2003-11-21 17:58 ` Daniel Jacobowitz
2003-11-26 14:27 ` Andrew Cagney
2003-11-26 16:24 ` Will Cohen
2003-11-26 20:48 ` Andrew Cagney
2003-11-26 20:45 ` Will Cohen