From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-32252-listarch-gdb-patches=sources.redhat.com@sources.redhat.com>
Received: (qmail 12190 invoked by alias); 13 Apr 2004 05:20:23 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 12181 invoked from network); 13 Apr 2004 05:20:21 -0000
Received: from unknown (HELO takamaka.act-europe.fr) (142.179.108.108)
  by sources.redhat.com with SMTP; 13 Apr 2004 05:20:21 -0000
Received: by takamaka.act-europe.fr (Postfix, from userid 507)
	id 3508F47D63; Mon, 12 Apr 2004 22:20:21 -0700 (PDT)
Date: Tue, 13 Apr 2004 05:20:00 -0000
From: Joel Brobecker <brobecker@gnat.com>
To: Jim Blandy <jimb@redhat.com>
Cc: Daniel Jacobowitz <drow@mvista.com>,
	Eli Zaretskii <eliz@elta.co.il>, gdb-patches@sources.redhat.com
Subject: Re: [RFC/dwarf-2] Add support for included files
Message-ID: <20040413052021.GA1173@gnat.com>
References: <vt2d66jdbd8.fsf@zenia.home>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <vt2d66jdbd8.fsf@zenia.home>
User-Agent: Mutt/1.4i
X-SW-Source: 2004-04/txt/msg00267.txt.bz2

> Picking up a dropped thread from January:
> 
> http://sources.redhat.com/ml/gdb-patches/2004-01/msg00015.html

Thank you Jim.

> Some comments:
> 
> It's fine for GDB core code to assume that the psymtabs mention every
> source file in the program.  They're supposed to.  The whole purpose
> of the psymtabs is to act as an index for the full symbol information,
> so that without taking the time/memory to read full symbols, we can at
> least find the compilation unit we do need to read.  If that index
> doesn't mention a source file, then it's incomplete.

Cool.

> I'm not too enthusiastic about Eli's suggestion that we put off
> scanning for source file names until we get a filename we can't find.
> It would help: we can find source files most of the time already, and
> when the second scan was needed, it wouldn't be any more work than
> what Joel's proposing we do every time.  But it's adding a third phase
> to debug reading; I'd prefer to fix this problem without architectural
> changes like that, if it's practical.

That's my feeling too.

> The original motivation for introducing psymtabs at all was speed:
> GDB's startup time was, believe it or not, CPU-bound.  By constructing
> only partial symtabs at startup time, we avoided parsing types
> (essentially).  How does this patch affect the time GDB takes to start
> up on itself?  (If you had some monster program lying around, that
> would be better, but if all we've got is a brute...)

Unfortunately, I don't have a monster program. I did some experimenting
with GDB, though. Here is how I did it: I measured the total elapsed
time it takes GDB in batch mode to load the symbols and then exit, and
repeated that ten times in a row inside a for loop. I then compared an
unmodified GDB against a modified one. Without the change, the total
elapsed time for the 10 consecutive runs is 9.453 secs. With my change,
it is 9.604 secs. That is an increase of roughly 1.6%, or put
differently, about 1 sec of extra startup time for each minute. All of
this assumes that GDB is a fairly typical program.
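
For anyone who wants to redo the arithmetic or repeat the measurement on
another binary, here is a minimal sketch in Python. The two totals are
the ones quoted above; the helper names and the example command line are
assumptions for illustration, not the exact loop that produced those
numbers.

```python
import subprocess
import time


def total_elapsed(cmd, runs=10):
    """Run `cmd` `runs` times in a row and return the total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def relative_increase_pct(before, after):
    """Percentage by which `after` exceeds `before`."""
    return (after - before) / before * 100.0


def extra_seconds_per_minute(before, after):
    """Extra seconds added to each 60 seconds of the original run time."""
    return (after - before) / before * 60.0


# Totals for the 10 consecutive runs, as reported in this message.
UNMODIFIED_SECS = 9.453
MODIFIED_SECS = 9.604

pct = relative_increase_pct(UNMODIFIED_SECS, MODIFIED_SECS)
per_minute = extra_seconds_per_minute(UNMODIFIED_SECS, MODIFIED_SECS)
print(f"increase: {pct:.1f}%, about {per_minute:.2f} extra secs per minute")

# Hypothetical usage on a real binary (command line is an assumption):
#   total_elapsed(["gdb", "-batch", "./gdb"], runs=10)
```

Ten runs in a row smooth out some noise, but a single total per build is
still a rough figure; the comparison is only meaningful if both builds
are timed on the same otherwise idle machine.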

If somebody could send me a large binary, something like mozilla
all built with dwarf-2 for instance, I'd be happy to rerun my
experiment.

> If it makes a big speed difference, then let's try ignoring
> DW_LNE_define_file, not scanning the line number program, and just
> entering the files given in the file_names table as psymtabs.  GCC
> doesn't generate DW_LNE_define_file.  If that is much faster, then I
> think it'll be okay.

I tried that, but unfortunately, I've seen cases where the compiler
emits an entry for an included file, but then never mentions it in
the line table. For instance with an Ada program, I see that the
compiler includes an entry for file "system.ads" (not sure why,
is it because it's in the direct closure of my program?) in the Line
Number Program Header, but then never uses it to provide line numbers.
The latter is expected, since this file is a runtime file containing
only declarations, so no statements causing code to be generated.

If we were to skip parsing the line table, we would end up creating
a psymtab for that system.ads file, and I think that it could cause
an internal error later down the road when we try to promote that
psymtab to a full symtab. The current code, I think, is not prepared
for that situation. I can give it a try, but that's what I remember from
my analysis of the current code back when I started working on this.

I think it would be reasonable to argue that mentioning a file in the
header without using it in the line table is a compiler bug. I am not
sure about this as the spec is vague, but I certainly wouldn't object.
However, if we decide that yes it is a compiler bug, and yes we should
create the psymtab only based on the line table header, then we also
have to accept the fact that we'll still have the problem should the
compiler decide to rely on DW_LNE_define_file rather than the header to
build the list of included files.  Given the small startup time increase
that I have measured so far, I think it's fine to scan the line table.
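
To make the two approaches concrete, here is a toy model in Python. It
is not GDB code and it does not decode real DWARF; the tuple-based
"opcodes" are a simplification, and the file name "extra.adb" is made
up for the example. It only shows why the header's file_names table and
the set of files that actually receive line entries can differ in both
directions.

```python
def files_from_header(file_names):
    """Header-only approach: treat every file_names entry as a source file."""
    return set(file_names)


def files_from_scan(file_names, ops):
    """Scan a simplified line number program and return the files that
    actually get at least one row in the line table."""
    table = list(file_names)   # grows when a define_file op is seen
    current = 1                # the `file` register starts at 1 (1-based index)
    used = set()
    for op, arg in ops:
        if op == "define_file":        # stands in for DW_LNE_define_file
            table.append(arg)
        elif op == "set_file":         # stands in for DW_LNS_set_file
            current = arg
        elif op == "emit_row":         # any opcode that appends a row
            used.add(table[current - 1])
        else:
            raise ValueError(f"unknown op: {op!r}")
    return used


# Header lists system.ads, but no row ever refers to it.
header = ["main.adb", "system.ads"]
# The program also defines a file that the header does not mention.
program = [
    ("emit_row", 3),
    ("define_file", "extra.adb"),
    ("set_file", 3),
    ("emit_row", 10),
]

header_only = files_from_header(header)
scanned = files_from_scan(header, program)
print("header only:", sorted(header_only))
print("after scan: ", sorted(scanned))
```

In this model the header-only approach creates an entry for system.ads
that has no line information behind it, and misses extra.adb entirely,
while the scan yields exactly the files that have rows.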

But I am happy either way. I hope I have provided sufficient information
for us to make a decision.

> (It's interesting to note that, since DW_LNE_define_file exists, the
> mechanisms Dwarf 3 provides for "accelerated access" are insufficient:
> there's no way to build a complete index of source file names without
> reading the line tables.)

I really wondered about the usefulness of this opcode. Now, I am
wondering even more...

-- 
Joel