From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-89619-listarch-gdb-patches=sources.redhat.com@sourceware.org>
From: Tom Tromey <tromey@redhat.com>
To: Siva Chandra <sivachandra@google.com>
Cc: gdb-patches@sourceware.org, Doug Evans <dje@google.com>
Subject: Re: [RFC - Python Scripting] New method Objfile.symtabs () - docs included
References: <CAGyQ6gxzmufX7CyKpCw=81EM-4m=jaN5LZGf1kmPBAHnGpFX6A@mail.gmail.com>	<09787EF419216C41A903FD14EE5506DD0313D5C22C@AUSX7MCPC103.AMER.DELL.COM>	<CADPb22QhVip8xafOSPffWch-FUp+aY6J6+Ak-oyxQX01zAZadQ@mail.gmail.com>	<CAGyQ6gxnOEjbhtM9k0hv1Hy7cgS9+vwdOPyozuBtCgvNsYa3jg@mail.gmail.com>
Date: Mon, 09 Apr 2012 17:56:00 -0000
In-Reply-To: <CAGyQ6gxnOEjbhtM9k0hv1Hy7cgS9+vwdOPyozuBtCgvNsYa3jg@mail.gmail.com>	(Siva Chandra's message of "Fri, 6 Apr 2012 22:11:45 +0530")
Message-ID: <87vcl894da.fsf@fleche.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.95 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2012-04/txt/msg00154.txt.bz2

>>>>> "Siva" == Siva Chandra <sivachandra@google.com> writes:

Tom> How about just the plain Objfile.iterator(REGEX)?
Tom> It would return an iterator that lazily instantiates symtabs.
Tom> (Or, at least conceptually lazily instantiates them, as it isn't clear
Tom> that this is efficiently implementable with the current
Tom> quick_symbol_functions API.)

Siva> Is this equivalent to what Paul Koning suggests and has the same
Siva> problem which Doug Evans points out (source file matching with a REGEX
Siva> aside)?

More or less.  The key thing is to pass a kind of selector to the
iterator so that it can pick which symtabs to expand.  This lets you
keep a cap on memory use.
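
A minimal sketch of the selector idea, in plain Python rather than
against gdb itself.  Everything here is hypothetical: FakeObjfile, its
iterator() method, and _expand() are stand-ins for illustration, not the
gdb.Objfile API.  The point is only that the regex is applied to the
cheap file names first, so a symtab is expanded only when the caller
actually pulls it from the iterator.

```python
import re


class FakeObjfile:
    """Stand-in for an objfile; this is NOT the gdb.Objfile API."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        self.expanded = []  # records which symtabs were actually expanded

    def _expand(self, filename):
        # Placeholder for the expensive psymtab -> symtab expansion.
        self.expanded.append(filename)
        return {"filename": filename}

    def iterator(self, regex):
        """Yield symtabs whose file name matches regex, expanding on demand."""
        pattern = re.compile(regex)
        for name in self.filenames:
            if pattern.search(name):
                yield self._expand(name)


objfile = FakeObjfile(["main.c", "util.c", "parse.y", "lex.c"])
symtabs = objfile.iterator(r"\.c$")
first = next(symtabs)  # only main.c has been expanded at this point
```

A script that stops after the first match pays for one expansion; a
script that drains the iterator pays for every matching CU, which is the
"through the roof" case.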

Ultimately if a script asks for all the CUs, memory and CPU are going to
go through the roof.  I think it is good to warn about this and provide
ways to work around it, but in the end I don't have a problem with it.
If you want to look at all the symbols in a big program, well, that's
just a lot of work.

Doug> I can think of a simple(quick) solution: create the symtab object when
Doug> the psymtab object is created (suitably modified to clean up anything
Doug> obvious), but not expand it.  [btw, Psymtabs at the moment aren't as
Doug> much of an implementation detail as we want them to be (IMO).] It's
Doug> not perfect: Depending on what you want from the symtab you may
Doug> ultimately end up expanding everything anyway.  But it's a step.  One
Doug> way to go would be to build into gdb the ability to discard the
Doug> expansion say when memory gets tight.  I think there's room for
Doug> improvement before we get to that though.

FWIW, I have a two-part plan for lazy CU expansion that, I think, would
help this situation quite a bit, for DWARF anyway.


The first part is to change the DWARF reader not to fully read symbols,
then change symtab.c to finish filling in a function symbol just before
it is returned, and change SYMBOL_TYPE to lazily read the type.

This is not extremely hard to do, and it both speeds up CU expansion
(just skipping function bodies is worth ~30%) and reduces memory use in
the common case where most symbols in a CU are not needed.
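
As an illustration of the lazy SYMBOL_TYPE part only, here is a rough
Python model, not gdb code.  LazySymbol and read_type_from_debug_info
are made-up names; the reader function stands in for whatever the DWARF
reader would do to decode a type.  The type is read the first time it is
asked for and cached afterwards, so symbols nobody looks at never pay
the cost.

```python
from functools import cached_property


class LazySymbol:
    """Hypothetical symbol whose type is read only on first access."""

    def __init__(self, name, type_reader):
        self.name = name
        self._type_reader = type_reader  # callable doing the expensive read

    @cached_property
    def type(self):
        # Runs once per symbol; the result is cached on the instance.
        return self._type_reader(self.name)


reads = []  # names whose types were actually decoded


def read_type_from_debug_info(name):
    # Placeholder for decoding the type from debug info.
    reads.append(name)
    return "int (void)"


symbols = [LazySymbol(n, read_type_from_debug_info)
           for n in ("main", "helper", "unused")]
main_type = symbols[0].type   # triggers one read
again = symbols[0].type       # served from the cache, no second read
```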

I have a starter patch for the function body part if anybody is curious.


Part 2 is more involved -- change partial symtabs to also record the DIE
offsets, and then change the psymtab -> symtab expander to just copy
this information over.  The idea here is that we then don't need to
re-scan .debug_info at all, we're just shuffling some bits around.

Part of this could also be to introduce some memory savings by letting
(some) symbols point to their corresponding partial symbol, instead of
effectively having a copy of the data.
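
A toy model of part 2, again in plain Python and with invented names
(PartialSymbol, Symbol, expand) and made-up DIE offsets.  It assumes the
partial symbol already carries the DIE offset, so "expansion" is just
wrapping each partial symbol, and the full symbol refers back to it
rather than copying its fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialSymbol:
    name: str
    die_offset: int  # offset of the DIE in .debug_info (made-up values below)


@dataclass
class Symbol:
    partial: PartialSymbol  # shared reference instead of copied fields

    @property
    def name(self):
        return self.partial.name

    @property
    def die_offset(self):
        return self.partial.die_offset


def expand(psymtab):
    """Build a symtab from a psymtab without re-scanning debug info."""
    return [Symbol(p) for p in psymtab]


psymtab = [PartialSymbol("main", 0x2D), PartialSymbol("helper", 0x91)]
symtab = expand(psymtab)
```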


I like the idea of being able to discard symtabs when memory is tight.
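
One way to picture discarding, sketched as a small least-recently-used
cache.  This is an assumption-laden toy: SymtabCache and expand are
hypothetical names, and a fixed count of expanded symtabs stands in for
a real "memory is tight" signal, which gdb would have to define.

```python
from collections import OrderedDict


class SymtabCache:
    """Keep at most max_expanded symtabs; discard the least recently used."""

    def __init__(self, expand, max_expanded):
        self._expand = expand
        self._max = max_expanded
        self._cache = OrderedDict()

    def get(self, filename):
        if filename in self._cache:
            self._cache.move_to_end(filename)  # mark as recently used
            return self._cache[filename]
        symtab = self._expand(filename)
        self._cache[filename] = symtab
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)  # discard least recently used
        return symtab

    def expanded(self):
        return list(self._cache)


expansions = []  # every expansion performed, including re-expansions


def expand(filename):
    # Placeholder for the expensive psymtab -> symtab expansion.
    expansions.append(filename)
    return {"filename": filename}


cache = SymtabCache(expand, max_expanded=2)
for name in ("a.c", "b.c", "a.c", "c.c", "b.c"):
    cache.get(name)
```

The trade-off shows up in the expansions list: a discarded symtab that
is needed again has to be expanded a second time.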

Tom