From: Tom Tromey
To: Siva Chandra
Cc: gdb-patches@sourceware.org, Doug Evans
Subject: Re: [RFC - Python Scripting] New method Objfile.symtabs () - docs included
Date: Mon, 09 Apr 2012 17:56:00 -0000
Message-ID: <87vcl894da.fsf@fleche.redhat.com>

>>>>> "Siva" == Siva Chandra writes:

Tom> How about just the plain Objfile.iterator(REGEX)?
Tom> It would return an iterator that lazily instantiates symtabs.
Tom> (Or, at least conceptually lazily instantiates them, as it isn't clear
Tom> that this is efficiently implementable with the current
Tom> quick_symbol_functions API.)

Siva> Is this equivalent to what Paul Koning suggests and has the same
Siva> problem which Doug Evans points out (source file matching with a REGEX
Siva> aside)?

More or less.  The key thing is to pass a kind of selector to the
iterator so that it can pick which symtabs to expand.  This lets you
keep a cap on memory use.

Ultimately if a script asks for all the CUs, memory and CPU are going to
go through the roof.  I think it is good to warn about this and provide
ways to work around it, but in the end I don't have a problem with it.
If you want to look at all the symbols in a big program, well, that's
just a lot of work.

Doug> I can think of a simple(quick) solution: create the symtab object when
Doug> the psymtab object is created (suitably modified to clean up anything
Doug> obvious), but not expand it.  [btw, Psymtabs at the moment aren't as
Doug> much of an implementation detail as we want them to be (IMO).]  It's
Doug> not perfect: Depending on what you want from the symtab you may
Doug> ultimately end up expanding everything anyway.  But it's a step.  One
Doug> way to go would be to build into gdb the ability to discard the
Doug> expansion say when memory gets tight.  I think there's room for
Doug> improvement before we get to that though.

FWIW, I have a 2 part plan for lazy CU expansion that, I think, would
help this situation quite a bit, for DWARF anyway.

The first part is to change the DWARF reader not to fully read symbols,
then change symtab.c to finish filling in a function symbol just before
it is returned, and change SYMBOL_TYPE to lazily read the type.
This is not extremely hard to do, and it both speeds up CU expansion
(just skipping function bodies is worth ~30%) and reduces memory use in
the common case where most symbols in a CU are not needed.  I have a
starter patch for the function body part if anybody is curious.

Part 2 is more involved -- change partial symtabs to also record the DIE
offsets, and then change the psymtab -> symtab expander to just copy
this information over.  The idea here is that we then don't need to
re-scan .debug_info at all, we're just shuffling some bits around.  Part
of this could also be to introduce some memory savings by letting (some)
symbols point to their corresponding partial symbol, instead of
effectively having a copy of the data.

I like the idea of being able to discard symtabs when memory is tight.

Tom
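
For illustration only, the selector idea above (an iterator that takes a
REGEX and expands only the matching symtabs) might be sketched in plain
Python roughly as follows.  Nothing here is the gdb API: PartialSymtab,
Symtab, and iter_symtabs are hypothetical stand-ins, and the expansion
counter exists only to make the laziness visible.

```python
import re
from typing import Iterator, List


class PartialSymtab:
    """Hypothetical stand-in for a psymtab: cheap, knows only its file name."""

    def __init__(self, filename: str) -> None:
        self.filename = filename


class Symtab:
    """Hypothetical stand-in for an expanded symtab (the expensive object)."""

    expansions = 0  # number of expansions performed so far, for illustration

    def __init__(self, partial: PartialSymtab) -> None:
        Symtab.expansions += 1
        self.filename = partial.filename


def iter_symtabs(partials: List[PartialSymtab], regex: str) -> Iterator[Symtab]:
    """Yield expanded symtabs only for partial symtabs whose name matches."""
    pattern = re.compile(regex)
    for partial in partials:
        if pattern.search(partial.filename):
            # Expansion happens here, one symtab at a time, on demand.
            yield Symtab(partial)


partials = [PartialSymtab(n) for n in ("main.c", "util.c", "parse.y", "lex.l")]

# Only the two .c files are expanded; the other psymtabs stay untouched.
matched = [symtab.filename for symtab in iter_symtabs(partials, r"\.c$")]
```

Because iter_symtabs is a generator, a script that stops after the first
few results never pays for the rest, which is the sense in which the
selector keeps a cap on memory use.  A script that asks for everything
still expands everything.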