Lazy CU expansion (Was: Will therefore GDB utilize C++ or not?)

Mirror of the gdb mailing list

From: Tom Tromey <tromey@redhat.com>
To: John Gilmore <gnu@toad.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>,
	       Pedro Alves <palves@redhat.com>,
	gdb@sourceware.org
Subject: Lazy CU expansion (Was: Will therefore GDB utilize C++ or not?)
Date: Fri, 18 May 2012 18:51:00 -0000
Message-ID: <87fwaxgw5q.fsf_-_@fleche.redhat.com>
In-Reply-To: <201204182309.q3IN9FcF019607@new.toad.com> (John Gilmore's message of "Wed, 18 Apr 2012 16:09:15 -0700")

>>>>> "John" == John Gilmore <gnu@toad.com> writes:

Some comments on your comments about lazy CU expansion.

John> The whole design of partial_symbols was that they're only needed when
John> the real symbols haven't been read in.  This is well documented.  In
John> fact the partial_symtab for a file can be (or used to be able to be)
John> thrown away when the real symtab is created, and many symbol-readers
John> never bothered to create partial_symbols.

I don't think it's been possible to discard a partial symtab for many
years now.

It doesn't seem very worthwhile to do it, given that the bulk of the
memory is in the psymbols themselves, and these can't ever be deleted.
But, maybe it would be worth trying.

John> Partial symtabs were only a
John> speed optimization to avoid parsing Stabs debugging info when host
John> machines ran at 20 megahertz.  You could probably get rid of them
John> entirely nowadays.

This seems unlikely to me, though because of memory use rather than CPU
time.

Partial symbols take a lot less memory than full symbols, partly because
they are smaller, but more importantly because they can be put into the
bcache, and this is quite effective in practice.
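
To illustrate why a byte cache helps, here is a minimal sketch of the
idea: identical byte sequences are stored once, and every caller gets a
pointer to the shared copy.  This is not GDB's bcache code; every name
below (toy_bcache, toy_bcache_intern, and so on) is hypothetical, and
the hash is plain 32-bit FNV-1a chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte cache, for illustration only.  */
#define TOY_BCACHE_BUCKETS 1024

struct toy_bcache_entry
{
  struct toy_bcache_entry *next;   /* next entry in the same bucket */
  size_t length;                   /* number of bytes in DATA */
  unsigned char data[];            /* C99 flexible array member */
};

struct toy_bcache
{
  struct toy_bcache_entry *buckets[TOY_BCACHE_BUCKETS];
  size_t unique_bytes;      /* bytes actually stored */
  size_t requested_bytes;   /* bytes callers asked to store */
};

/* 32-bit FNV-1a hash over the raw bytes.  */
uint32_t
toy_hash (const void *bytes, size_t length)
{
  const unsigned char *p = bytes;
  uint32_t h = 2166136261u;
  size_t i;

  for (i = 0; i < length; i++)
    {
      h ^= p[i];
      h *= 16777619u;
    }
  return h;
}

/* Return a pointer to the cached copy of BYTES, adding one if it is
   not already present.  Returns NULL if allocation fails.  */
const void *
toy_bcache_intern (struct toy_bcache *cache, const void *bytes,
                   size_t length)
{
  size_t slot = toy_hash (bytes, length) % TOY_BCACHE_BUCKETS;
  struct toy_bcache_entry *e;

  assert (length > 0);
  cache->requested_bytes += length;

  /* An identical blob already stored?  Share it.  */
  for (e = cache->buckets[slot]; e != NULL; e = e->next)
    if (e->length == length && memcmp (e->data, bytes, length) == 0)
      return e->data;

  e = malloc (sizeof (*e) + length);
  if (e == NULL)
    return NULL;
  e->length = length;
  memcpy (e->data, bytes, length);
  e->next = cache->buckets[slot];
  cache->buckets[slot] = e;
  cache->unique_bytes += length;
  return e->data;
}
```

Because lookup compares raw bytes, a caller that interns whole structs
would need to zero any padding first so that equal records really are
byte-identical.  The gap between requested_bytes and unique_bytes is the
memory the cache saves.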

John> The GDB Internals manual (which I originated when I discovered that
John> there was no internals documentation) makes it clear that there are
John> only a few ways to look up a symbol.  Has that nice clean bit of
John> modular C programming been retained over the last decade?

No, the symbol tables are a total mess, and the internals manual is out
of date.

John> So how is this idea of pointing to psymbols going to save any
John> memory?

'struct symbol' starts with a 'general_symbol_info', and also includes
'domain' and 'aclass' fields -- all of which are duplicated in the
partial symbol.

So, pointing to the partial symbol will save at least
sizeof(general_symbol_info) - sizeof(void*) bytes per symbol.  On x86-64
that is 32 bytes.  Maybe it could save more memory with more packing.
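
The arithmetic can be sketched with toy structs.  The layouts below are
hypothetical stand-ins, not GDB's real definitions, so the byte counts
they produce will differ from the 32 quoted above; only the
relationship between the two symbol shapes is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the header shared by both symbol kinds.
   The fields are illustrative, NOT GDB's actual layout.  */
struct toy_general_symbol_info
{
  const char *name;
  union
  {
    long ivalue;
    const void *address;
  } value;
  const void *language_specific;
  unsigned int language : 8;
  unsigned short section;
};

struct toy_partial_symbol
{
  struct toy_general_symbol_info ginfo;
  unsigned int domain : 6;
  unsigned int aclass : 6;
};

/* Full symbol that carries its own copy of the shared fields.  */
struct toy_symbol_embedded
{
  struct toy_general_symbol_info ginfo;
  unsigned int domain : 6;
  unsigned int aclass : 6;
  void *type;
};

/* Full symbol that points at the partial symbol instead.  */
struct toy_symbol_shared
{
  const struct toy_partial_symbol *psym;
  void *type;
};

/* Bytes saved per full symbol by pointing at the partial symbol.  */
size_t
toy_bytes_saved_per_symbol (void)
{
  return sizeof (struct toy_symbol_embedded)
         - sizeof (struct toy_symbol_shared);
}
```

With these toy layouts the saving is never less than
sizeof (struct toy_general_symbol_info) - sizeof (void *), since the
pointer replaces the embedded header and the duplicated domain and
aclass fields disappear as well.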

More importantly, this sort of thing would allow instantiation of a full
symtab without re-parsing the DWARF.  Re-parsing is slow, and also
mostly pointless, as most symbols in a given CU are never used.

John> And if you're going to have to allocate all the memory for the
John> struct symbol, then why not populate it with the real information
John> for the symbol, instead of just a psymbol pointer?

Reading the remaining information is slow and uses memory, but the
results are often not used.  So it would be preferable to fill in the
details on demand.

Just skipping function bodies alone saves ~30% of the CU expansion time.

John> It's much simpler to read all the symbols in a symbol file, in
John> order, and once you're doing that anyway, you might as well save
John> them all.

Yes, it is simpler.  This is what is done now.  I don't think it scales
very well, though: Jan has dug up some C++ libraries where there is one
enormous CU, which sucks up a lot of time if you happen to have to
expand it.

Tom> Full symbols are already reasonably C++y, what with the ops vector.

John> It looks to me like the "ops vector" in symbols in gdb-7.4 is pretty
John> minimal, only applying to a tiny number of symbol categories (and the
John> comments in findvar.c -- from 2004 -- report that DWARF2 symbols screw
John> up the ops vector anyway).  Large parts of GDB touch symbols; is the
John> idea that all of these will be rewritten to indirect through an
John> ops-table (either explicitly in C, or implicitly in C++) without ever
John> accessing fields (like SYMBOL_CLASS(sym)) directly?  Do you think this
John> will make GDB faster and smaller?  I don't.

I doubt it would be smaller.  History indicates that size is of zero
importance.

It would probably be faster.  At least for lazy CU expansion, the
changes are of the form:

#define SYMBOL_TYPE(sym) \
  ((sym)->type ? (sym)->type : compute_symbol_type (sym))

... or moral equivalent.

Rewriting is not necessary; you can redefine the macros.
But rewriting the uses would be better if we were moving to C++.
That is easy, though.
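
A self-contained sketch of redefining an accessor macro so that a field
is filled in on first use.  The toy_ names are hypothetical and the
"expensive" step is faked; in a real lazy scheme it would go back to the
debug info.  The sketch assumes the compute function stores its result
in the symbol, so later accesses take the fast path.

```c
#include <assert.h>
#include <stddef.h>

struct toy_type
{
  const char *name;
};

struct toy_symbol
{
  const char *name;
  struct toy_type *type;   /* NULL until the type is first requested */
};

/* Counts how often the slow path runs; for demonstration only.  */
int toy_compute_calls = 0;

struct toy_type toy_int_type = { "int" };

/* Stand-in for the expensive step.  It caches the answer in SYM so
   the work is done at most once per symbol.  */
struct toy_type *
toy_compute_symbol_type (struct toy_symbol *sym)
{
  assert (sym != NULL);
  toy_compute_calls++;
  sym->type = &toy_int_type;
  return sym->type;
}

/* Existing callers keep using the accessor unchanged; only its
   definition changes.  Note that SYM is evaluated more than once, so
   it must not have side effects.  */
#define TOY_SYMBOL_TYPE(sym) \
  ((sym)->type ? (sym)->type : toy_compute_symbol_type (sym))
```

Reading TOY_SYMBOL_TYPE on a fresh symbol runs the compute function
once; every later read returns the cached pointer without calling it
again.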

John> (There's a comment in symtab.c from 2003 that says address classes and
John> ops vectors should be merged.  But clearly nobody has felt like doing
John> that work in the last 9 years -- probably because so many places in the
John> code would need to be touched.)

I'm not sure I trust that comment.  I find that in general, comments in
GDB relating to future maintenance issues are often questionable.

Tom
