From: Elena Zannoni
To: Jim Blandy
Cc: David Carlton, Daniel Jacobowitz, gdb@sources.redhat.com
Subject: Re: suggestion for dictionary representation
Date: Tue, 24 Sep 2002 20:37:00 -0000
Message-ID: <15761.12017.210026.925891@localhost.redhat.com>
References: <200209230244.g8N2ieo21741@zenia.red-bean.com> <20020923031056.GA26307@nevyn.them.org>
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Sender: gdb-owner@sources.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
X-SW-Source: 2002-09/txt/msg00395.txt.bz2

On this topic, see the following thread. DanielJ and I had the same
discussion then.

http://sources.redhat.com/ml/gdb-patches/2001-11/msg00130.html

Elena


Jim Blandy writes:
>
> David Carlton writes:
> > > I'm tempted to whack the block special case for function arguments.  It
> > > may make name lookup a little more complicated but I think it will make
> > > everything clearer.  We could, of course, try this on the branch and
> > > see if we like the results :)
> >
> > Would it be reasonable to break up function blocks into two separate
> > blocks: a linear block that only defines the parameters for the
> > function and a non-linear block that contains the actual local
> > variables?
> > Not that I think Jim's scheme is a bad one - I agree that
> > it's better than the current scheme - but given the possibility of
> > local variables shadowing function parameters, it seems to me to be
> > conceptually cleaner to have two separate blocks appear anyways, and
> > it also solves this problem.
>
> The issue is a bit more tangled than you think, I think.  Splitting
> the function's body and its formals into two separate blocks is a good
> idea, but it isn't going to get rid of all your duplicates.  A single
> formal parameter can have two symbols in a function's block that
> describe it.  Try this out on a Pentium.  (The `-O2' and `-gstabs+'
> are required.)
>
>   $ cat func.c
>   #include <stdio.h>
>
>   int
>   main (int argc, char **argv)
>   {
>     static int local = 3;
>     printf ("%d\n", argc * local);
>   }
>   $ gcc -O2 -gstabs+ func.c -o func
>
> Then start up GDB on GDB on `func':
>
>   (top-gdb) run
>   The program being debugged has been started already.
>   Start it from the beginning? (y or n) y
>
>   Starting program: gdb -nw func
>   GNU gdb 2002-09-16-cvs
>   Copyright 2002 Free Software Foundation, Inc.
>   GDB is free software, covered by the GNU General Public License, and you are
>   welcome to change it and/or distribute copies of it under certain conditions.
>   Type "show copying" to see the conditions.
>   There is absolutely no warranty for GDB.  Type "show warranty" for details.
>   This GDB was configured as "i686-pc-linux-gnu"...
>   (gdb)
>
> Set a breakpoint in main, just to get the symbols read:
>
>   (gdb) break main
>   Breakpoint 1 at 0x804834c: file func.c, line 7.
>   (gdb)
>
> Drop out to the enclosing GDB:
>
>   (gdb) info
>   (top-gdb)
>
> It just so happens that `func.c' is the first compilation unit of the
> first executable file in GDB's list:
>
>   (top-gdb) print object_files->symtabs->filename
>   $177 = 0x82fdbf8 "func.c"
>   (top-gdb)
>
> If that's not so for you, you'll need to walk `symtabs' to find the
> right symtab.
> Anyway, let's check out this symtab's blockvector.  I'm
> just using [0] as a postfix dereferencing operator here:
>
>   (top-gdb) print object_files->symtabs->blockvector[0]
>   $178 = {nblocks = 3, block = {0x82f8b74}}
>   (top-gdb)
>
> The first and second blocks are the global and static blocks, so the
> third one is probably for `main':
>
>   (top-gdb) print object_files->symtabs->blockvector->block[2]
>   $179 = (struct block *) 0x82f8ab4
>   (top-gdb) p *$179
>   $180 = {startaddr = 134513472, endaddr = 134513513, function = 0x82f8988,
>     superblock = 0x82f8ae4, gcc_compile_flag = 2 '\002', hashtable = 0 '\0',
>     nsyms = 4, sym = {0x82f89c4}}
>   (top-gdb) p *$179->function
>   $181 = {ginfo = {name = 0x82f89bc "main", value = {ivalue = 137333428,
>         block = 0x82f8ab4,
>         bytes = 0x82f8ab4 "@\203\004\bi\203\004\b\210\211/\bä\212/\b\002",
>         address = 137333428, chain = 0x82f8ab4}, language_specific = {
>         cplus_specific = {demangled_name = 0x0}}, language = language_c,
>       section = 11, bfd_section = 0x82d4fc0}, type = 0x82faaa8,
>     namespace = VAR_NAMESPACE, aclass = LOC_BLOCK, line = 5, aux_value = {
>       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
>   (top-gdb)
>
> And it was!  Let's look at those four symbols:
>
>   (top-gdb) p *$179->sym[0]
>   $182 = {ginfo = {name = 0x82f89f8 "argc", value = {ivalue = 8, block = 0x8,
>         bytes = 0x8, address = 8, chain = 0x8},
>       language_specific = {cplus_specific = {demangled_name = 0x0}},
>       language = language_c, section = 0, bfd_section = 0x0}, type = 0x82df828,
>     namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
>       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
>   (top-gdb) p *$179->sym[1]
>   $183 = {ginfo = {name = 0x82f8a34 "argv", value = {ivalue = 12, block = 0xc,
>         bytes = 0xc, address = 12, chain = 0xc},
>       language_specific = {cplus_specific = {demangled_name = 0x0}},
>       language = language_c, section = 0, bfd_section = 0x0}, type = 0x82faaf4,
>     namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
>       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
>   (top-gdb) p *$179->sym[2]
>   $184 = {ginfo = {name = 0x82f8a70 "argc", value = {ivalue = 0, block = 0x0,
>         bytes = 0x0, address = 0, chain = 0x0}, language_specific = {
>         cplus_specific = {demangled_name = 0x0}}, language = language_c,
>       section = 0, bfd_section = 0x0}, type = 0x82df828,
>     namespace = VAR_NAMESPACE, aclass = LOC_REGISTER, line = 4, aux_value = {
>       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
>   (top-gdb) p *$179->sym[3]
>   $185 = {ginfo = {name = 0x82f8aac "local", value = {ivalue = 134517720,
>         block = 0x80493d8, bytes = 0x80493d8 "É\f", address = 134517720,
>         chain = 0x80493d8}, language_specific = {cplus_specific = {
>           demangled_name = 0x0}}, language = language_c, section = 14,
>       bfd_section = 0x0}, type = 0x82df828, namespace = VAR_NAMESPACE,
>     aclass = LOC_STATIC, line = 6, aux_value = {basereg = 0}, aliases = 0x0,
>     ranges = 0x0, hash_next = 0x0}
>   (top-gdb)
>
> Hey!  Why are there two entries for argc?  (This is the extra tangle I
> was referring to.  If you know all about this, you can stop reading
> now.)
>
> The two `argc' symbols have different address classes: one has an
> address class that indicates it's an argument, and the other doesn't.
> The argument symbol describes where the variable is passed on the
> stack (eight bytes after %ebp), whereas the non-argument symbol
> describes where the variable lives in the block of the function:
> register zero, or %eax.
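
To make the duplicate concrete, here is a minimal sketch of what a
by-name lookup in that block runs into.  The `toy_symbol' and
`toy_block' structures and `toy_lookup_all' are hypothetical, cut-down
stand-ins that I made up for illustration; they are not GDB's real
definitions, and the enum values do not match GDB's.  The table just
copies the names, address classes and values from the four symbols
printed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GDB's symbol and block structures.
   Only the fields discussed above are modeled.  */
enum address_class { LOC_STATIC, LOC_REGISTER, LOC_ARG, LOC_BLOCK };

struct toy_symbol
{
  const char *name;
  enum address_class aclass;
  long value;           /* frame offset, register number, or address */
};

struct toy_block
{
  int nsyms;
  const struct toy_symbol *sym;
};

/* The four symbols shown in the dump of main's block.  */
static const struct toy_symbol main_syms[] = {
  { "argc",  LOC_ARG,      8 },
  { "argv",  LOC_ARG,      12 },
  { "argc",  LOC_REGISTER, 0 },
  { "local", LOC_STATIC,   134517720 },
};

static const struct toy_block main_block = { 4, main_syms };

/* Store up to MAX symbols named NAME from BLOCK into OUT, in block
   order.  Return the total number of matches, which may exceed MAX.  */
static int
toy_lookup_all (const struct toy_block *block, const char *name,
                const struct toy_symbol **out, int max)
{
  int found = 0;

  assert (block != NULL && name != NULL);
  for (int i = 0; i < block->nsyms; i++)
    if (strcmp (block->sym[i].name, name) == 0)
      {
        if (found < max)
          out[found] = &block->sym[i];
        found++;
      }
  return found;
}
```

Calling `toy_lookup_all (&main_block, "argc", hits, 4)' finds two
entries, the LOC_ARG one first and the LOC_REGISTER one second, so any
lookup that returns a single symbol has to pick between them.  The
sketch deliberately does not pick one, since which entry GDB prefers is
exactly the open question here.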
>
> As a sanity check, let's look at the IA-32 code for main:
>
>   (top-gdb) c
>   Continuing.
>   (gdb) disass main
>   Dump of assembler code for function main:
>   0x8048340:    push   %ebp
>   0x8048341:    mov    %esp,%ebp
>   0x8048343:    sub    $0x8,%esp
>   0x8048346:    mov    0x8(%ebp),%eax
>   0x8048349:    and    $0xfffffff0,%esp
>   0x804834c:    mov    0x80493d8,%edx
>   0x8048352:    movl   $0x80483c8,(%esp,1)
>   0x8048359:    imul   %edx,%eax
>   0x804835c:    mov    %eax,0x4(%esp,1)
>   0x8048360:    call   0x8048268
>   0x8048365:    mov    %ebp,%esp
>   0x8048367:    pop    %ebp
>   0x8048368:    ret
>   End of assembler dump.
>   (gdb)
>
> So, yes, the compiler did copy `argc' from the stack into %eax.
> Check.
>
> But *why* does GDB do this?  I have no idea.  It seems to me that,
> with prologue skipping et al, simply having a single LOC_REGPARM would
> be the Right Thing.  I don't really know when GDB will prefer the
> argument entry, and when it'll prefer the non-argument entry.
>
> I suspect it's historical.  If you look at the stabs spec, you'll see
> that it actually emits two stabs for arguments that are passed in one
> place, but get moved somewhere else:
>
>   $ objdump --stabs func
>   ...
>   329    FUN    0      5      08048340 12145  main:F(0,1)
>   330    PSYM   0      4      00000008 12157  argc:p(0,1)
>   331    PSYM   0      4      0000000c 12169  argv:p(1,1)=*(7,36)
>   332    SLINE  0      5      00000000 0
>   333    SLINE  0      7      0000000c 0
>   334    SLINE  0      8      00000025 0
>   335    RSYM   0      4      00000000 12189  argc:r(0,1)
>   336    STSYM  0      6      080493d8 12201  local:V(0,1)
>   337    LBRAC  0      0      0000000c 0
>   338    RBRAC  0      0      00000029 0
>   339    FUN    0      0      00000029 0
>   ...
>   $
>
> The PSYM accounts for the argument symbol, and the RSYM accounts for
> the internal symbol.  A lot of GDB's data structures very closely
> match what's provided in STABS.  (The partial symbol tables are a
> good example of this: they correspond exactly to the EXCL links.)
>
> But anyway, all this could be handled much better nowadays using Dwarf
> 2 CFA and location lists.  I've been saying that for years, but it
> hasn't happened yet.  Andrew has the CFI done now (I think?), and
> Daniel B. has submitted a patch for location expressions (but not
> location lists, tho they would be easy to add), but it's awaiting
> revision while he works on law school.
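
For anyone who hasn't looked at location lists, here is a rough sketch
of the idea as it would apply to `argc': one symbol, with a list of PC
ranges that each say where the value lives, instead of a PSYM-style and
an RSYM-style symbol side by side.  Everything here is hypothetical:
`toy_loc_entry' and `toy_loc_at' are names I made up, the layout is not
the Dwarf 2 encoding, and the two ranges are my own reading of the
disassembly of main above, not something a compiler emitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical location kinds for this sketch only.  */
enum toy_loc_kind { TOY_LOC_FRAME_OFFSET, TOY_LOC_REGISTER };

struct toy_loc_entry
{
  unsigned long low_pc;   /* first address covered */
  unsigned long high_pc;  /* first address NOT covered */
  enum toy_loc_kind kind;
  long value;             /* offset from %ebp, or register number */
};

/* Assumed ranges for argc, read off the disassembly:
   once %ebp is set up, argc is at 8(%ebp); after the load at
   0x8048346 completes it is also in register 0 (%eax), until the
   imul at 0x8048359 overwrites %eax.  */
static const struct toy_loc_entry argc_locs[] = {
  { 0x8048343, 0x8048349, TOY_LOC_FRAME_OFFSET, 8 },
  { 0x8048349, 0x804835c, TOY_LOC_REGISTER,     0 },
};

/* Return the entry of LIST (N entries) whose range contains PC,
   or NULL if no entry covers PC.  */
static const struct toy_loc_entry *
toy_loc_at (const struct toy_loc_entry *list, size_t n, unsigned long pc)
{
  assert (list != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i];
  return NULL;
}
```

With a table like this, `toy_loc_at (argc_locs, 2, 0x8048346)' gives
the frame-offset entry and `toy_loc_at (argc_locs, 2, 0x8048359)' gives
the register entry, so the block would need only one `argc' symbol and
the duplicate-name question would not come up.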