Subject: Re: linker debug info editing
From: Daniel Berlin
To: Jim Blandy
Cc: binutils@sourceware.org, gdb-patches@sourceware.org
In-Reply-To: <1142217868.20401.15.camel@linux.site>
References: <20060310124921.GN6777@bubble.grove.modra.org> <8f2776cb0603101744w3dd59741s4ad8e17b7069a6fa@mail.gmail.com> <1142217868.20401.15.camel@linux.site>
Date: Tue, 14 Mar 2006 15:10:00 -0000
Message-Id: <1142281149.20401.84.camel@dyn9002219123>

> I spent a large amount of time when we were implementing
> -feliminate-dwarf2-dups measuring the cost of various schemes for
> deciding what to try to split and what not.

Oh, I forgot the other half of the problems.

If you don't do something like split every DIE, then you get into
situations where you have to name the linkonce sections by their
content (using MD5 hashes or something), so that you don't
accidentally eliminate one set of DIEs that really isn't an exact
duplicate of some other.

Once you do this, you get hit with ordering issues: if the include
files in something else are in a slightly different order, the MD5s
come out different.
*or* you hit the problem that you may not split exactly the same way
(due to ordering, more DIEs, etc.), and thus the DIEs in one split CU
may differ from those in another and will not be commoned (because
they will have different linkonce section names from the content
hash).

If you do something obvious like split every type/subprogram DIE into
its own section, and linkonce those, the cost in .o files is enormous.

Then, for each type DIE, you have:

12 bytes for a debug_info header
~10-20 bytes of extra overhead from using DW_FORM_ref_addr references
(assuming it is referenced ~3-5 times)
30 bytes for the new section name (.debug_info.linkonce.<16 byte md5>)
10+ bytes (I forget the number) for the extra section table entry
you've added
10+ bytes for the extra symbol you need to reference the thing
everywhere else

It ends up being about 100 bytes per DIE you put into its own CU.

If you have a C++ app with 5000 types, you've just wasted 500k.

I had tried this scheme once, and it usually doubled (or more) the
size of the .o files.

IOW, it's always a better idea to eliminate in the linker if possible,
which has *none* of these issues. It's hard work, but straightforward
there if you have all the info you need.
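
As a rough check on the per-DIE arithmetic above, here is a small
Python sketch. It is only an illustration: the dictionary keys and
function names are made up for this example, the "~10-20" range is
taken at its midpoint, and the two "10+" entries are taken at their
lower bound, all of which are assumptions rather than measured values.

```python
# Rough tally of the per-DIE overhead figures quoted in the message.
# Names here are hypothetical; values marked "assumption" are picked
# from the ranges given, not measured.
OVERHEAD_BYTES = {
    "debug_info header": 12,
    "DW_FORM_ref_addr references": 15,  # assumption: midpoint of ~10-20
    "section name": 30,
    "section table entry": 10,          # assumption: lower bound of "10+"
    "symbol": 10,                       # assumption: lower bound of "10+"
}


def itemized_overhead(costs=OVERHEAD_BYTES):
    """Sum the itemized per-DIE costs in bytes."""
    return sum(costs.values())


def total_waste(num_dies, per_die_bytes=100):
    """Total overhead in bytes, using the ~100 bytes/DIE estimate by default."""
    return num_dies * per_die_bytes


if __name__ == "__main__":
    print("itemized lower bound per DIE:", itemized_overhead(), "bytes")
    print("5000 types at ~100 bytes each:", total_waste(5000), "bytes")
```

With these assumed values the itemized entries sum to less than the
"about 100 bytes" figure, which would fit with the two "10+" entries
being lower bounds; the 5000-type total uses the 100-byte estimate
directly.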