From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2267 invoked by alias); 15 Feb 2007 20:29:39 -0000
Received: (qmail 2257 invoked by uid 22791); 15 Feb 2007 20:29:38 -0000
X-Spam-Check-By: sourceware.org
Received: from mailgw2.technion.ac.il (HELO mailgw2.technion.ac.il) (132.68.238.33)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 15 Feb 2007 20:29:33 +0000
Received: from localhost (localhost.localdomain [127.0.0.1])
    by mailgw2.technion.ac.il (Postfix) with ESMTP id 99ABD39017D;
    Thu, 15 Feb 2007 22:29:30 +0200 (IST)
Received: from mailgw2.technion.ac.il ([127.0.0.1])
    by localhost (mailgw2.technion.ac.il [127.0.0.1]) (amavisd-new, port 10024)
    with LMTP id oDCceA2Hm+DP; Thu, 15 Feb 2007 22:29:30 +0200 (IST)
Received: from techunix.technion.ac.il (techunix.technion.ac.il [132.68.1.28])
    by mailgw2.technion.ac.il (Postfix) with ESMTP id 61CBA39014A;
    Thu, 15 Feb 2007 22:29:25 +0200 (IST)
Received: from tp-veksler.haifa.ibm.com (techunix.technion.ac.il [132.68.1.28])
    by techunix.technion.ac.il (Postfix) with ESMTP id 7576F1496E;
    Thu, 15 Feb 2007 22:29:25 +0200 (IST)
    (envelope-from mveksler@tx.technion.ac.il)
Message-ID: <45D4C2A5.4010807@tx.technion.ac.il>
Date: Thu, 15 Feb 2007 22:10:00 -0000
From: Michael Veksler
User-Agent: Thunderbird 2.0b2 (X11/20070116)
MIME-Version: 1.0
To: Ross Boylan
Cc: gdb@sourceware.org
Subject: Re: Associating C++ new with constructor
References: <20070215073516.GW5871@wheat.betterworld.us> <45D46192.30504@tx.technion.ac.il> <20070215182625.GX5871@wheat.betterworld.us>
In-Reply-To: <20070215182625.GX5871@wheat.betterworld.us>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-owner@sourceware.org
X-SW-Source: 2007-02/txt/msg00163.txt.bz2

Ross Boylan wrote:
...
> On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
>
>> The idea is simple, and took me less than a day to implement:
>> Define a template class that can monitor object creation and
>> destruction.
>> Insert a member of this type into all non-POD classes:
>>     class MyFoo {
>>       ....
>>       ....
>>       ...
>>       Count<MyFoo> m_count_for_MyFoo;
>>     };
>>
The difficulty with this is that it works reliably only for non-POD
classes. POD (Plain Old Data) classes or structs must not be modified
automatically, because they might participate in code such as

    struct A { int a, b; };
    void Serialize(ostream & out, const A& a)
    {
        out.write(reinterpret_cast<const char*>(&a), sizeof(a));
    }

This is well defined only if A is a POD. If you add a Count<A> instance,
then the program:
  1. becomes undefined
  2. outputs wrong stuff to the stream.

> One nice thing about this vs. the memory corruption detection tools is
> that it catches all instance creation, not just stuff on the heap.
>
And that's one of the reasons it is still used despite other tools.

>> It is imperfect since it can't reliably detect all
>> non-POD classes, and it does not work for std::***
>> classes. To make it 100% reliable, I had to write
>> a complete C++ parser!.
>>
>
> Why isn't this reliable for all classes in your source code (at least
> if you add a search for "struct")? Are you referring to cases that
> use pre-processor magic, or is there something else?
>
Read my comment above about the importance of non-POD checks.
There is no preprocessor magic, only a trivial Perl script that inserts
instances into non-POD classes. The inaccuracy comes from

1. It is difficult to find all definitions of all classes. The script
   has to ignore occurrences of the word "class" in strings and in
   /* */ comments, and it has to find "class" keywords that are produced
   by user macros. This is virtually impossible to get right without
   at least running a preprocessor.
2.
   It is difficult to be certain that some class is a non-POD class,
   since its POD-ness also depends on its parents and members. Also,
   user macros may hide things such as virtual methods. Templates
   make it even more difficult.

I simply made some dumb assumptions in my script: all "class"
definitions are non-POD (or at least never used as POD), and all
"struct" definitions are suspected to be POD. Fortunately, this
assumption is correct most of the time due to common coding conventions.

> Another approach is in libcwd, which records the class along with the
> memory allocation if you insert a call to AllocTag() after the call to
> new. AllocTag uses templates under the hood to get the type of the
> pointer automatically, and libcwd provides a macro NEW that will take
> care of it automatically, e.g., A *pA = NEW(A()).
>
So you have to write your code this way, or write a preprocessor script
to modify
    A *pA = new A(new B, new C[5]);
to
    A *pA = NEW(A(NEW(B), NEW_ARRAY(C, 5)));
(I am only guessing how new over an array is done; I don't really use
libcwd.)

> This problem was driving me crazy enough that I was looking into
> putting hooks in gcc, which has the advantage of providing a C++
> parser. That approach has a few disadvantages. Aside from being a
> big project (at least for someone like myself who knows little about
> compilers), it would be impractical to redistribute.
>
I don't think there should be any problem with redistributing such a
tool. I know of at least one such project.

-- 
Michael Veksler
http://tx.technion.ac.il/~mveksler
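
For concreteness, here is a minimal sketch of the kind of counting
member discussed in this message. It is not the actual tool: the members
live() and created(), the struct CountedA and the helper
live_while_three_exist() are hypothetical names chosen for illustration,
and the static_assert checks assume a C++11 compiler (std::is_trivially_copyable
is used here as a rough stand-in for the "safe to write as raw bytes"
property that the POD discussion is about).

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical instance counter: one pair of static tallies per monitored type T.
template <typename T>
class Count {
public:
    Count() { ++live_; ++created_; }
    Count(const Count&) { ++live_; ++created_; }      // a copy is a new object too
    Count& operator=(const Count&) { return *this; }  // assignment creates nothing
    ~Count() { --live_; }

    static std::size_t live() { return live_; }        // objects currently alive
    static std::size_t created() { return created_; }  // objects ever constructed

private:
    static std::size_t live_;
    static std::size_t created_;
};

template <typename T> std::size_t Count<T>::live_ = 0;
template <typename T> std::size_t Count<T>::created_ = 0;

// A non-POD class with the counter inserted as its last member.
class MyFoo {
public:
    MyFoo() : value_(0) {}
private:
    int value_;
    Count<MyFoo> m_count_for_MyFoo;
};

// The POD from the Serialize() example, and what inserting a counter does to it.
struct A { int a, b; };
struct CountedA { int a, b; Count<CountedA> m_count; };

static_assert(std::is_trivially_copyable<A>::value,
              "A may be written to a stream as raw bytes");
static_assert(!std::is_trivially_copyable<CountedA>::value,
              "with a Count member the raw-byte write is no longer valid");

// Creates one stack object, one heap object and one copy, and reports
// how many MyFoo instances are alive while all three exist.
inline std::size_t live_while_three_exist() {
    MyFoo on_stack;
    MyFoo* on_heap = new MyFoo;
    MyFoo copy(on_stack);
    std::size_t n = Count<MyFoo>::live();
    delete on_heap;
    return n;
}
```

Because the counter lives inside the object, stack, heap and copied
instances are all tallied the same way, which is the advantage over
heap-only tools mentioned above. The two static_assert lines show why
the insertion must be restricted to non-POD classes.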