* Associating C++ new with constructor
@ 2007-02-15 11:50 Ross Boylan
2007-02-15 13:35 ` Daniel Jacobowitz
2007-02-15 15:38 ` Michael Veksler
0 siblings, 2 replies; 6+ messages in thread
From: Ross Boylan @ 2007-02-15 11:50 UTC
To: gdb; +Cc: Ross Boylan
This is NOT about the problems setting a breakpoint on a C++
constructor!
I've been looking for a way to count creations and destructions of C++
objects, with the counts kept on a per class basis. I've looked at
various tools to track memory use, but none of them appear to provide
this directly (even when attention is limited to the free store).
Some capture the call stack when the call to new is made, but this
doesn't definitively identify the class in question. (It can provide
the line of the call, but that's an imperfect indicator. If there are
multiple calls on one line, or the call is split over several lines,
things are ugly. And if we have
    template<typename T> class Foo {
        .....
        T* p = new T(); // maybe needs to be new typename T()?
        .....
    };
the source is insufficient to identify the type.)
The problem is that the call to new precedes the call to the
constructor; new doesn't know the type, and the type is not on the
call stack.
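A minimal sketch of why the allocation function cannot help, assuming a placement-style operator new that takes file, function, and line arguments. The names g_last_size and example are made up for illustration, and the bookkeeping is deliberately trivial:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping: the size seen by the most recent allocation.
static std::size_t g_last_size = 0;

// Placement-style allocation function. It receives the byte count plus the
// extra arguments from the new-expression, but never the type being built.
void* operator new(std::size_t size, const char* file, const char* func,
                   unsigned long line) {
    (void)file; (void)func; (void)line;  // a real tracker would record these
    g_last_size = size;
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching placement delete; the compiler calls it only if the constructor throws.
void operator delete(void* p, const char*, const char*, unsigned long) noexcept {
    std::free(p);
}

struct A { int x; };
struct B { int y; };  // same size as A, unrelated type

void example() {
    A* pA = new (__FILE__, __func__, __LINE__) A();
    assert(g_last_size == sizeof(A));
    B* pB = new (__FILE__, __func__, __LINE__) B();
    assert(g_last_size == sizeof(B));  // indistinguishable from the A request
    // Storage came from std::malloc, so release it with std::free.
    pA->~A();
    pB->~B();
    std::free(pA);
    std::free(pB);
}
```

Both requests look identical from inside operator new: the same size, and a call site that differs only by line number.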
Might it be possible, programmatically, to trace the stack up, locate
the call that will be made next, and get the constructor that way?
Here's a toy example on linux-i386:
Source line is
A *pA = new A();
disassembly
0x08048c9c <main+86>: movl $0x2c,0xc(%esp)
0x08048ca4 <main+94>: movl $0x804910d,0x8(%esp)
0x08048cac <main+102>: movl $0x8049166,0x4(%esp)
0x08048cb4 <main+110>: movl $0x4,(%esp)
0x08048cbb <main+117>: call 0x8048eac <_ZnwjPKcS0_m>
0x08048cc0 <main+122>: mov %eax,%ebx
0x08048cc2 <main+124>: mov %ebx,(%esp)
0x08048cc5 <main+127>: call 0x8048e9e <A>
0x08048cca <main+132>: mov %ebx,0xfffffff4(%ebp)
The call stack has main+122 on it. Adding 5 gets me to the next call,
and
(gdb) info symbol 0x8048e9e
A::A() in section .text
(which I suppose is what the <A> annotation in the disassembly was
about).
So my recipe is
a) get the address of the caller
b) add 5
c) extract the location being called there
d) get the symbol being called, and use it to identify the class
(this last step involves several substeps, notably doing the
disassembly, getting the symbol, unmangling the symbol, and extracting
the class name).
Does that have any chance of working with any generality?
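The unmangling part of step d) can be sketched as follows, assuming GCC or another compiler that follows the Itanium C++ ABI and ships <cxxabi.h>. The helper names demangle and class_of_constructor are hypothetical, and the string cutting is naive (it would be confused by template arguments that themselves contain parentheses):

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)

// Demangle an Itanium-ABI symbol; returns an empty string on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && raw) ? raw : "";
    std::free(raw);  // the buffer is malloc'ed by __cxa_demangle
    return result;
}

// Hypothetical helper: given a demangled constructor such as "ns::A::A()",
// return "ns::A" by cutting at the last "::" before the parameter list.
std::string class_of_constructor(const std::string& demangled) {
    std::string::size_type paren = demangled.find('(');
    if (paren == std::string::npos) return "";
    std::string::size_type sep = demangled.rfind("::", paren);
    if (sep == std::string::npos) return "";
    return demangled.substr(0, sep);
}

void example() {
    // _ZN1AC1Ev is the mangled name of the complete-object constructor A::A().
    std::string name = demangle("_ZN1AC1Ev");
    assert(name == "A::A()");
    assert(class_of_constructor(name) == "A");
}
```

This only covers turning a symbol into a class name; finding the right symbol in the first place is the hard part of the recipe.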
I'm not necessarily looking to do the instrumentation from within gdb,
though I'd certainly like to avoid having to redo the logic of getting
debug info, etc.
The most obvious concern is that calls to the c'tor might get
optimized away (for example, class A above is empty, but I compiled
with g++ -O0).
It would also be nice to get the address of new automatically. Can
anyone explain why this symbolic lookup failed? (the program wasn't
running)
(gdb) info symb 0x8048eac # works
operator new(unsigned int, char const*, char const*, unsigned long) in section .text
(gdb) info add 'operator new(unsigned int, char const*, char const*, unsigned long)' # fails
No symbol "'operator new(unsigned int, char const*, char const*, unsigned long)'" in current context.
I tried several variants of the name for new; none worked.
Incidentally, it would be nice to be able to get all c'tor calls, not
just those associated with the heap, but that obviously does run into
the problems putting breakpoints on them. It would also require
identifying all the c'tors.
Thanks.
Ross Boylan
* Re: Associating C++ new with constructor
From: Daniel Jacobowitz @ 2007-02-15 13:35 UTC
To: Ross Boylan; +Cc: gdb
On Wed, Feb 14, 2007 at 11:35:16PM -0800, Ross Boylan wrote:
> Does that have any chance of working with any generality?
No, none. The compiler can do an arbitrary amount of work between the
two calls. They can end up with branches in between, or a call to new
could be used to conditionally create one of two objects that happen
to have the same size.
--
Daniel Jacobowitz
CodeSourcery
* Re: Associating C++ new with constructor
From: Michael Veksler @ 2007-02-15 15:38 UTC
To: Ross Boylan; +Cc: gdb
Ross Boylan wrote:
> This is NOT about the problems setting a breakpoint on a C++
> constructor!
>
> I've been looking for a way to count creations and destructions of C++
> objects, with the counts kept on a per class basis.
Many (9?) years ago I wrote a dumb perl script to do this task. It is
still used from time to time, despite valgrind's massif.
I can't give you the script, without considerable paperwork, since
it is (C) IBM (and I won't try because it is too embarrassing to
release such a trivial script).
The idea is simple, and took me less than a day to implement:
Define a template class that can monitor object creation and
destruction.
Insert a member of this type into all non-POD classes:
    class MyFoo {
        ....
        ....
        ...
        Count<MyFoo> m_count_for_MyFoo;
    };
This member will be constructed every time your MyFoo is
constructed. Inside the Count class you can do whatever
you want, every Count<T> class may update a global
registry, from which you can print statistics for all Count
classes.
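A minimal sketch of such a Count class, not the original script's output: the registry layout, the names CountStats and count_registry, and the use of typeid for the key are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <typeinfo>

// Hypothetical per-class statistics kept in the global registry.
struct CountStats {
    std::size_t constructed = 0;
    std::size_t destroyed = 0;
};

// Registry keyed by the implementation-defined type name (mangled on GCC).
inline std::map<std::string, CountStats>& count_registry() {
    static std::map<std::string, CountStats> registry;  // built on first use
    return registry;
}

template <typename T>
class Count {
public:
    Count() { ++stats().constructed; }
    Count(const Count&) { ++stats().constructed; }    // copies create objects too
    Count& operator=(const Count&) { return *this; }  // assignment creates nothing
    ~Count() { ++stats().destroyed; }
private:
    static CountStats& stats() {
        // Looked up once per T; std::map never moves its elements.
        static CountStats& s = count_registry()[typeid(T).name()];
        return s;
    }
};

class MyFoo {
public:
    MyFoo() {}
    explicit MyFoo(int) {}
private:
    Count<MyFoo> m_count_for_MyFoo;  // the inserted member
};

void example() {
    const std::string key = typeid(MyFoo).name();
    {
        MyFoo a;
        MyFoo b(1);
        MyFoo c(a);  // implicit copy constructor is counted as well
        assert(count_registry()[key].constructed == 3);
        assert(count_registry()[key].destroyed == 0);
    }
    assert(count_registry()[key].destroyed == 3);
}
```

Statistics for all classes can then be printed by iterating over count_registry(); the keys would need demangling to be readable on GCC.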
Note that the perl script simply detects "class ....." and
inserts the member into it. The nice thing about the
template approach is that it should work well even
if your MyFoo is itself a template, and it works
for all possible constructors.
It is imperfect since it can't reliably detect all
non-POD classes, and it does not work for std::***
classes. To make it 100% reliable, I would have had
to write a complete C++ parser!
--
Michael Veksler
http://tx.technion.ac.il/~mveksler
* Re: Associating C++ new with constructor
From: Ross Boylan @ 2007-02-15 21:24 UTC
To: Michael Veksler; +Cc: Ross Boylan, gdb
Daniel, thanks for your response. I've been trying to avoid messing
with the source code, but it sounds as if I'm stuck with it.
I suppose one other option would be to go to the source line of the call
to new and try to pull the class out of it. It might work well enough
to do the job.
I was thinking of tools to rewrite the source automatically, and
Michael provides one suggestion below. I have a few questions about
what he wrote.
Before I get to that, though, can anyone explain why gdb was able to
give me the name of the new() function given the address, but couldn't
give me the address when given the name of the function (details in
the original post)?
On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
> Ross Boylan wrote:
> >This is NOT about the problems setting a breakpoint on a C++
> >constructor!
> >
> >I've been looking for a way to count creations and destructions of C++
> >objects, with the counts kept on a per class basis.
> Many (9?) years ago I have written a dumb perl script to do this task.
> It is
> still being used from time to time, despite valgrind's massif.
> I can't give you the script, without considerable paperwork, since
> it is (C) IBM (and I won't try because it is too embarrassing to
> release such a trivial script).
No problem.
>
> The idea is simple, and took me less than a day to implement:
> Define a template class that can monitor object creation and
> destruction.
> Insert a member of this type into all non-POD classes:
> class MyFoo {
> ....
> ....
> ...
> Count<MyFoo> m_count_for_MyFoo;
> };
>
> This member will be constructed every time your MyFoo is
> constructed. Inside the Count class you can do whatever
> you want, every Count<T> class may update a global
> registry, from which you can print statistics for all Count
> classes.
One nice thing about this vs. the memory corruption detection tools is
that it catches all instance creation, not just stuff on the heap.
>
> Note that the perl script simply detects "class ....." and
> inserts the members into it. The nice thing about this
> template thing, is that it should work well even
> if your MyFoo is also templated, and it works
> well for all possible constructors.
>
> It is imperfect since it can't reliably detect all
> non-POD classes, and it does not work for std::***
> classes. To make it 100% reliable, I had to write
> a complete C++ parser!.
Why isn't this reliable for all classes in your source code (at least
if you add a search for "struct")? Are you referring to cases that
use pre-processor magic, or is there something else?
Another approach is in libcwd, which records the class along with the
memory allocation if you insert a call to AllocTag() after the call to
new. AllocTag uses templates under the hood to get the type of the
pointer automatically, and libcwd provides a macro NEW that will take
care of it automatically, e.g., A *pA = NEW(A()).
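The general idea of deducing the type from the pointer can be sketched as below. This is not libcwd's code: alloc_tags, tag_allocation, and TAGGED_NEW are hypothetical names, and the real AllocTag() records considerably more than a type name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeinfo>

// Hypothetical stand-in for a per-allocation tag table.
inline std::map<const void*, std::string>& alloc_tags() {
    static std::map<const void*, std::string> tags;
    return tags;
}

// T is deduced from the pointer, so the caller never spells out the type.
template <typename T>
T* tag_allocation(T* p) {
    alloc_tags()[p] = typeid(T).name();  // static type; name is mangled on GCC
    return p;
}

// Hypothetical macro in the spirit of NEW. It expands to a single
// expression, so uses of it can be nested; the variadic form keeps
// commas in constructor arguments intact.
#define TAGGED_NEW(...) tag_allocation(new __VA_ARGS__)

struct B {};
struct A {
    explicit A(B* b = nullptr) : b_(b) {}
    A(const A&) = delete;
    A& operator=(const A&) = delete;
    ~A() { delete b_; }
    B* b_;
};

void example() {
    A* pA = TAGGED_NEW(A(TAGGED_NEW(B)));
    assert(alloc_tags().size() == 2);
    assert(alloc_tags()[pA] == typeid(A).name());
    assert(alloc_tags()[pA->b_] == typeid(B).name());
    delete pA;
}
```

A real tool would also have to remove the tag when the memory is freed, which this sketch leaves out.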
This problem was driving me crazy enough that I was looking into
putting hooks in gcc, which has the advantage of providing a C++
parser. That approach has a few disadvantages. Aside from being a
big project (at least for someone like myself who knows little about
compilers), it would be impractical to redistribute.
There seems to be some possibility that the upcoming revision of the
C++ standard will incorporate improved reflection capabilities; that
might help with this problem.
Ross
* Re: Associating C++ new with constructor
From: Michael Veksler @ 2007-02-15 22:10 UTC
To: Ross Boylan; +Cc: gdb
Ross Boylan wrote:
...
> On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
>
>> The idea is simple, and took me less than a day to implement:
>> Define a template class that can monitor object creation and
>> destruction.
>> Insert a member of this type into all non-POD classes:
>> class MyFoo {
>> ....
>> ....
>> ...
>> Count<MyFoo> m_count_for_MyFoo;
>> };
>>
The difficulty with this, is that it will work reliably only for
non-POD classes. For POD (Plain Old Data) classes or
structs, you are not allowed to automatically modify them,
because they might participate in code such as
    struct A {
        int a, b;
    };
    void Serialize(std::ostream& out, const A& a) {
        // Dump the raw object representation of a.
        out.write(reinterpret_cast<const char*>(&a), sizeof(a));
    }
This is well defined only if A is a POD. If you add a Count
instance, the program
1. becomes undefined, and
2. outputs wrong stuff to the stream.
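The effect on the struct can be checked at compile time. The sketch below assumes a C++11 compiler with <type_traits>; Count here is an empty stand-in for the counting member, and CountedA is a hypothetical name for the modified struct:

```cpp
#include <type_traits>

// Empty stand-in for the counting member; the user-provided copy
// constructor and destructor are what make it non-trivial.
template <typename T>
struct Count {
    Count() {}
    Count(const Count&) {}
    ~Count() {}
};

struct A {
    int a, b;
};

// The same struct after the script has inserted the member.
struct CountedA {
    int a, b;
    Count<CountedA> m_count_for_CountedA;
};

// The original can be treated as a bucket of bits...
static_assert(std::is_trivially_copyable<A>::value,
              "A is expected to be trivially copyable");
// ...but the modified one no longer qualifies,
static_assert(!std::is_trivially_copyable<CountedA>::value,
              "CountedA is expected not to be trivially copyable");
// and its size has grown, so sizeof-based I/O writes extra bytes.
static_assert(sizeof(CountedA) > sizeof(A),
              "the inserted member is expected to enlarge the struct");
```

So even an empty counter changes what Serialize writes, and data written by the unmodified program would no longer match the new layout.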
> One nice thing about this vs. the memory corruption detection tools is
> that it catches all instance creation, not just stuff on the heap.
>
And that's one of the reasons it is still used despite other tools.
>> It is imperfect since it can't reliably detect all
>> non-POD classes, and it does not work for std::***
>> classes. To make it 100% reliable, I had to write
>> a complete C++ parser!.
>>
>
> Why isn't this reliable for all classes in your source code (at least
> if you add a search for "struct")? Are you referring to cases that
> use pre-processor magic, or is there something else?
>
Read my comment above about the importance of non-POD checks.
There is no preprocessor magic, only a trivial perl script that inserts
instances into non-POD classes. The inaccuracy comes from
1. It is difficult to find all definitions of all classes. You should
   ignore occurrences of the "class" word in strings and in /* */
   comments. You should find "class" words expanded by some user macros.
   This is virtually impossible to get right without at least running
   a preprocessor.
2. It is difficult to be certain that some class is a non-POD class,
   since its POD-ness depends also on its parents and members.
   Also, user's macros may hide stuff, like virtual methods.
   Templates make it even more difficult.
I simply made some dumb assumptions in my script that all "class"
definitions are non-POD (or at least never used as POD), and
all "struct" definitions are suspected to be POD. Fortunately,
this assumption is correct most of the time due to common
coding conventions.
> Another approach is in libcwd, which records the class along with the
> memory allocation if you insert a call to AllocTag() after the call to
> new. AllocTag uses templates under the hood to get the type of the
> pointer automatically, and libcwd provides a macro NEW that will take
> care of it automatically, e.g., A *pA = NEW(A()).
>
So you have to write your code this way, or write a preprocessor script
to modify
    A *pA = new A(new B, new C[5]);
to
    A *pA = NEW(A(NEW(B), NEW_ARRAY(C, 5)));
(I am just guessing how new over an array is done; I don't really use libcwd.)
> This problem was driving me crazy enough that I was looking into
> putting hooks in gcc, which has the advantage of providing a C++
> parser. That approach has a few disadvantages. Aside from being a
> big project (at least for someone like myself who knows little about
> compilers), it would be impractical to redistribute.
>
I don't think there should be any problem redistributing such a tool.
I know of at least one such project.
--
Michael Veksler
http://tx.technion.ac.il/~mveksler
* Re: Associating C++ new with constructor
From: Ross Boylan @ 2007-02-16 5:03 UTC
To: Michael Veksler; +Cc: ross, Ross Boylan, gdb
On Thu, 2007-02-15 at 22:29 +0200, Michael Veksler wrote:
> Ross Boylan wrote:
> ...
> > On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
> >
> >> The idea is simple, and took me less than a day to implement:
> >> Define a template class that can monitor object creation and
> >> destruction.
> >> Insert a member of this type into all non-POD classes:
> >> class MyFoo {
> >> ....
> >> ....
> >> ...
> >> Count<MyFoo> m_count_for_MyFoo;
> >> };
> >>
> The difficulty with this, is that it will work reliably only for
> non-POD classes. For POD (Plain Old Data) classes or
> structs, you are not allowed to automatically modify them,
> because they might participate in code such as
> struct A {
> int a, b;
> };
> void Serialize(ostream & out, const A& a) {
> out.write(reinterpret_cast<const char*>(&a), sizeof(a));
> }
>
> This is well defined only if A is a POD, if you add a Count
> instance then the program becomes:
> 1. undefined
> 2. outputs wrong stuff to the stream.
I'm not following. Do you mean that the C++ standard doesn't like the
code, or just that streaming out (and back in) an arbitrary object (the
counter instance) is unlikely to work, along with any other operations
that treat A as a bucket of bits (e.g., memmove)?
....
> >> It is imperfect since it can't reliably detect all
> >> non-POD classes, and it does not work for std::***
> >> classes. To make it 100% reliable, I had to write
> >> a complete C++ parser!.
> >>
> >
> > Why isn't this reliable for all classes in your source code (at least
> > if you add a search for "struct")? Are you referring to cases that
> > use pre-processor magic, or is there something else?
> >
> Read my comment above about the importance of non-POD checks.
> There is no preprocessor magic, only a trivial perl script that inserts
> instances into non-POD classes. The inaccuracy comes from
>
> 1. It is difficult to find all definitions of all classes. You should
> ignore
> occurrences of the "class" word in strings and in /* */ comments.
> You should find "class" words expanded by some user macros.
That's what I meant by my reference to "preprocessor magic" above.
> This is virtually impossible to get right without at least running
> a preprocessor.
> 2. It is difficult to be certain that some class is a non-POD class,
> since its POD-ness depends also on its parents and members.
> Also, user's macros may hide stuff, like virtual methods.
> Templates make it even more difficult.
>
> I simply made some dumb assumptions in my script that all "class"
> definitions are non-POD (or at least never used as POD), and
> all "struct" definitions are suspected to be POD. Fortunately,
> this assumption is correct most of the time due to common
> coding conventions.
> > Another approach is in libcwd, which records the class along with the
> > memory allocation if you insert a call to AllocTag() after the call to
> > new. AllocTag uses templates under the hood to get the type of the
> > pointer automatically, and libcwd provides a macro NEW that will take
> > care of it automatically, e.g., A *pA = NEW(A()).
> >
> So you have to write your code this way, or write a preprocessor script
> to modify
>
> A *pA= new A(new B, new C[5]));
>
> to
>
> A *pA= NEW(NEW(B), NEW_ARRAY(C, 5));
Yes (I'm not sure about the array syntax either). Actually, nesting the
NEWs might blow up, since the macro expands to a couple of statements (I
think; I guess it could use a "," operator instead).
>
> (I just guess how new over an array are done, I don't really use libcwd).
> > This problem was driving me crazy enough that I was looking into
> > putting hooks in gcc, which has the advantage of providing a C++
> > parser. That approach has a few disadvantages. Aside from being a
> > big project (at least for someone like myself who knows little about
> > compilers), it would be impractical to redistribute.
> >
> I don't think there should be any problem to redistribute such a tool.
> I know of at least one such project.
If my instructions for building the test suite start with "build a
custom compiler with this patch set" that is likely to be off-putting.
In fact, it's off-putting for me :)
There's a project to make it easier to create addons to gcc (GEM), but
since it's not in the main distribution one has to build a custom
compiler to use an addon anyway.
--
Ross Boylan wk: (415) 514-8146
185 Berry St #5700 ross@biostat.ucsf.edu
Dept of Epidemiology and Biostatistics fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739 hm: (415) 550-1062