From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Berlin
To: Jim Blandy
Cc: Daniel Berlin, gdb@sources.redhat.com
Subject: Re: improving psymtab generation memory usage
Date: Thu, 19 Jul 2001 08:52:00 -0000
Message-id: <87wv54apc5.fsf@cgsoftware.com>
References: <20010718225933.4186C5E9D8@zwingli.cygnus.com> <87ae21bhyg.fsf@cgsoftware.com>
X-SW-Source: 2001-07/msg00278.html

Jim Blandy writes:

> Daniel Berlin writes:
>> I gave up on playing language specific games, and now just MD5 the
>> partial (and full die) attributes (for partial, we md5 the tag, name,
>> lowpc, and highpc), we keep a splay tree of md5 checksums, and skip
>> any dies (and their children) at the top level if its md5 is in the
>> tree.
>>
>> You'd have to have something at the same name, same tag, same
>> namespace, and same location in memory for the partial die to match.
>
> Okay, so, just considering partial symbol table generation:
>
> Dwarf 2 partial symbol table production is driven by a nested loop.
>
> - The outer loop is in dwarf2_build_psymtabs_hard, handling one
>   compilation unit per iteration. If the CU die has any children,
>   then we call scan_partial_symbols to read them.
>
> - scan_partial_symbols contains the inner loop, with one iteration per
>   die (although we sometimes skip dies using sibling pointers, if
>   they're present). This inner loop calls read_partial_die to read in
>   a die and some of its attributes. If the die represents something
>   that needs a partial symbol table entry, it calls add_partial_symbol
>   on the die.
>
> - add_partial_symbol looks at the die and makes the appropriate call
>   to add_psymbol_to_list.
>
> - add_psymbol_to_list (in symfile.c now) bcaches the symbol name, and
>   then bcaches the `struct partial_symbol' object referring to the
>   name. So all the data the psymtab needs has been copied out of the
>   objects dwarf2read.c created.
>
> If I'm following the code correctly, this means that, after every
> iteration of the inner loop, there are no references to the die data
> we just read. This had better be true, because all the psymtab's
> memory needs to be owned by the objfile, and not disappear when
> dwarf2_build_psymtabs_hard returns.
>
> While we're building the partial symbol table, all the attribute
> values are allocated in dwarf2_tmp_obstack. In fact, except for the
> abbrev table, all the memory consumed is in dwarf2_tmp_obstack. (And
> the abbrev table could go there too, it seems.)
>
> This means that we could free dwarf2_tmp_obstack after every iteration
> in scan_partial_symbols. (At least, we could free it back to the
> point it was at when we entered the function.) If the dies were
> small, this wouldn't even cause us to call free. The maximum memory
> consumption by the Dwarf 2 reader at any one time would be two dies:
> the CU, and the current die.
>
> Now, if I see a duplicate die, I'm still going to process it. But the
> bcache should be eliminating duplicate psymtab entries anyway ---
> you'll get another entry in the global_psymbols or static_psymbols
> array, but that's just a pointer, not a whole object.
>
> This approach isn't sensitive to changes in attribute offsets
> (although you point out that that isn't as big a deal as one would
> expect).
>
> So what I'm saying is, we should be able to cut partial symbol table
> generation memory usage to a minimum by adding one call to
> obstack_free in scan_partial_symbols.

Sure, for psymbols, this will work fine.
For full symbols, it won't.
I did the full symbol elimination first, then psymbols, so I just used
the full symbol approach scaled down.

>
> I know I'm punting on the full symbol stuff, but for partial symbols,
> does this sound right?

-- 
"In Vegas, I got into a long argument with the man at the roulette
wheel over what I considered to be an odd number." -Steven Wright
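
The mark-and-release idea quoted above (free the temporary obstack back
to its entry point after each die) can be sketched roughly as follows.
This is only an illustration, not GDB code: struct tmp_arena,
struct psym_store, arena_alloc, store_add and scan_dies are all
hypothetical names. The arena stands in for dwarf2_tmp_obstack, the
store stands in for the objfile-owned copies the bcache makes, and
resetting `used' plays the part that a call to obstack_free with a
saved object pointer would play on a real obstack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for dwarf2_tmp_obstack: a fixed-size bump
   allocator holding only character data, so alignment is not a concern.  */
struct tmp_arena
{
  char buf[4096];
  size_t used;   /* bytes currently in use */
  size_t peak;   /* largest value `used' has ever reached */
};

static char *
arena_alloc (struct tmp_arena *a, size_t n)
{
  char *p;

  assert (n <= sizeof a->buf - a->used);
  p = a->buf + a->used;
  a->used += n;
  if (a->used > a->peak)
    a->peak = a->used;
  return p;
}

/* Hypothetical stand-in for objfile-owned storage (the bcache's role):
   it keeps its own copy of each distinct name.  */
struct psym_store
{
  char names[16][32];
  size_t count;
};

static void
store_add (struct psym_store *s, const char *name)
{
  size_t i;

  for (i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return;                   /* duplicate: one stored copy is enough */
  assert (s->count < 16 && strlen (name) < 32);
  strcpy (s->names[s->count++], name);
}

/* Inner loop: read one "die" into temporary memory, copy out what must
   survive, then release the temporary memory back to the entry point.  */
static void
scan_dies (struct tmp_arena *a, struct psym_store *s,
           const char **die_names, size_t n)
{
  size_t mark = a->used;        /* arena position on entry */
  size_t i;

  for (i = 0; i < n; i++)
    {
      size_t len = strlen (die_names[i]) + 1;
      char *tmp = arena_alloc (a, len);   /* temporary die data */

      memcpy (tmp, die_names[i], len);
      store_add (s, tmp);       /* the store copies; tmp is now dead */
      a->used = mark;           /* release everything since `mark' */
    }
}
```

Under these assumptions, scanning the names "main", "helper", "main"
leaves two entries in the store, and the arena's peak usage is the size
of the largest single die rather than the sum of all of them.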