From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Henderson
To: Jim Blandy
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 13:15:00 -0000
Message-id: <19990726131518.D2954@cygnus.com>
References: <9500.931826533@upchuck.cygnus.com> <19990719234100.C17063@cygnus.com>
X-SW-Source: 1999-q3/msg00101.html

On Mon, Jul 26, 1999 at 01:42:41PM -0500, Jim Blandy wrote:
> Okay. So the current plan is:
>
> 1) EGCS folks get regstack to actually propagate the variable/register
>    mapping to its output accurately.

Yes, this is a reasonable plan. As to a time frame... I don't know.

What we need is a good infrastructure for handling LRS debug info in
a convenient form within the compiler, whether this gets emitted as
dwarf2 location ranges, or stabs live ranges, or as a set of virtual
lexical blocks.

What Cygnus has now for LRS just won't work when the size of live
ranges is on the order of instructions.

If someone has brilliant ideas how to manage and manipulate this stuff
in a way that won't make things an absolute nightmare, I'd be glad to
hear it.

I'll throw out one idea -- live ranges must always begin with a store.
What if we put a REG_START_RANGE note in places where a user variable
moves to a new location, and a REG_END_RANGE note in places where a
user variable truly dies and its register gets re-allocated to some
temp? The value of the note would be some canonical token that lets
us match up with the DECL in question.

Just before, or in the process of, emitting debug info, we use the CFG
and the REG_*_RANGE notes to transform this somewhat fluid form into
something that the debugger can use.

A particularly major open issue is how the optimizers can easily know
when they need to create these notes, and which variables the notes
are being created for. I could easily see it being a not insignificant
task to find this, given that it's all scattered about.
But the fact that it's scattered about is crucial to being resilient
against ever-changing insn sequences.

The extra memory and time might even call for it only being enabled at
-g3 or something. Something we'll have to examine.

Thoughts?

r~

>From jimb@cygnus.com Mon Jul 26 18:50:00 1999
From: Jim Blandy
To: Sekaran Nanja
Cc: Rajiv Mirani, Mike Vermeulen, gdb@sourceware.cygnus.com
Subject: Re: Regarding Range Table
Date: Mon, 26 Jul 1999 18:50:00 -0000
Message-id:
References: <37990D39.B393D61B@cup.hp.com> <379CCBBD.6E3FC2F3@cup.hp.com>
X-SW-Source: 1999-q3/msg00102.html
Content-length: 6697

[To gdb@sourceware readers: Sekaran is working on helping GDB debug
optimized code. He would like to extend the way GDB represents
variables that live in different places at different times --- for
example, a variable that mostly lives on the stack, except in the body
of a particular loop, where it is brought into a register.

GDB represents such variables using multiple struct symbol objects,
one for each home the variable occupies. The `ranges' element of the
`struct symbol' lists the addresses at which that `struct symbol' is
appropriate.

This is a weird representation, since you can have several struct
symbol objects corresponding to a single variable. It would be more
intuitive to identify each variable with a single `struct symbol'
object, and have that object carry a mapping from code addresses to
homes. But here we are.]

Sekaran Nanja writes:

Jim> Would it be okay if we had this discussion on gdb@sourceware.cygnus.com?

Sekaran> I think it is OK.

Jim> It's fine with me to extend the range_list struct, as long as the new
Jim> semantics are very clear. If they're not well-explained, or they're
Jim> muddy, then we won't be able to maintain it.
Jim>
Jim> So I need to understand exactly what you want to do to symbols and
Jim> their range lists, and how to interpret the resulting structures.
Sekaran> Please note that HP captures the following information for live
Sekaran> ranges of variables: Symbol Address (Register - but this can
Sekaran> possibly change for different code ranges)

Jim> This is just a mapping from code addresses onto variable homes, right?
Jim> To represent this information, no changes to GDB's symbol structures
Jim> are necessary, right?

Sekaran> We don't need to change the symbol struct but we need to
Sekaran> change range_list struct to support this. Please note that
Sekaran> Register value for the assigned symbol may change in code
Sekaran> ranges and so we need to keep the address info. in
Sekaran> range_list. I am planning to change the struct range_list as
Sekaran> follows:
Sekaran>
Sekaran> struct range_list {
Sekaran>   unsigned int set_early : 1;
Sekaran>   unsigned int set_late : 1;
Sekaran>   unsigned int unknown : 1;
Sekaran>   unsigned int reserved : 29;
Sekaran>
Sekaran>   enum address_class aclass BYTE_BITFIELD;
Sekaran>   CORE_ADDR symbol_address;
Sekaran>   int comes_from_line;
Sekaran>   int moved_to_line;
Sekaran>
Sekaran>   CORE_ADDR start;
Sekaran>   CORE_ADDR end;
Sekaran>   struct range_list *next;
Sekaran> };

Sekaran> Set Early/ Set Late - Flag for critical point assignment (To provide
Sekaran> useful warning to the user) and possibly associated line numbers for
Sekaran> these.

Jim> I don't understand what this is. Could you explain it in more
Jim> detail?

Sekaran> Please note that due to optimization, critical assignment
Sekaran> statement(s) can possibly be moved around. In these cases, the
Sekaran> user needs to be warned regarding this optimization by
Sekaran> displaying warning to the user that either the assignment has
Sekaran> taken place earlier at a specific line or it is going to
Sekaran> happen at a specific line. Please note that the following
Sekaran> newly added fields to range_list are used to support this
Sekaran> feature:
Sekaran>
Sekaran>   set_early
Sekaran>   set_late
Sekaran>   comes_from_line
Sekaran>   moved_to_line.
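
One possible reading of the proposed flags, sketched in C purely for
illustration. The pairing of set_early with comes_from_line and of
set_late with moved_to_line is an assumption, not something the
proposal states; `struct range_note', `core_addr' and
`range_warning' are hypothetical names and are not part of GDB.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for CORE_ADDR. */
typedef unsigned long core_addr;

/* Hypothetical cut-down version of the proposed range_list entry,
   keeping only the fields that drive the user-visible warning. */
struct range_note {
  unsigned int set_early : 1;  /* assignment was hoisted above this point */
  unsigned int set_late : 1;   /* assignment was sunk below this point */
  int comes_from_line;         /* ASSUMED: source line of the hoisted assignment */
  int moved_to_line;           /* ASSUMED: source line the assignment was moved to */
  core_addr start;
  core_addr end;
};

/* Format a warning for variable VAR into BUF (at most LEN bytes).
   Returns 1 if a warning was written, 0 if the range carries no flag. */
int range_warning(const struct range_note *r, const char *var,
                  char *buf, size_t len)
{
  if (r->set_early) {
    snprintf(buf, len,
             "warning: assignment to `%s' from line %d has already taken place",
             var, r->comes_from_line);
    return 1;
  }
  if (r->set_late) {
    snprintf(buf, len,
             "warning: assignment to `%s' was moved to line %d and has not happened yet",
             var, r->moved_to_line);
    return 1;
  }
  if (len > 0)
    buf[0] = '\0';
  return 0;
}
```

A debugger front end could call something like range_warning() just
before printing a variable's value, so the value and the caveat appear
together.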
I am still not clear on what set_early, set_late, comes_from_line and
moved_to_line would mean, but I am guessing that they are meant to
handle cases like these:

- a variable which is live in the source code at a given point is not
  actually live in the optimized code which the user is debugging,
  because the compiler has moved an assignment to the variable from
  before the current source line to after the current source line.

- a variable is live in the source code at a given point, but all its
  uses have been moved above it, along with *another* assignment to
  the variable. So the variable is live, but its value isn't going to
  go where you'd expect.

(Here, "before" and "after" refer to control flow, not to source
positions).

Is that correct?

GDB's representation for split live ranges is not very intuitive, and
I don't think you're using it correctly.

GDB uses a separate struct symbol for each home a variable might have.
A struct symbol's `ranges' list lists those address ranges over which
the variable home given in that struct symbol applies.

So, for example, if a variable `foo' usually lives on the stack, but
lives in register 5 from code addresses 0x1020 to 0x1030, and
register 6 from code addresses 0x1040 to 0x1050, then you would have:

- the primary struct symbol, whose aclass is LOC_LOCAL, and whose
  SYMBOL_VALUE is the offset from the frame base, whose `ranges' list
  is empty, and whose `aliases' list contains two struct symbols:

- the first alias symbol, whose aclass is LOC_REGISTER and whose
  SYMBOL_VALUE is 5, and whose `ranges' list contains the single
  element, {start=0x1020, end=0x1030, next=0}

- the second alias symbol, whose aclass is LOC_REGISTER and whose
  SYMBOL_VALUE is 6, and whose `ranges' list contains the single
  element, {start=0x1040, end=0x1050, next=0}.

All of these struct symbols have the name `foo', and all refer to the
same variable. (I think this is crummy, but that's the way it is.)
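
The `foo' example can be sketched in C roughly as follows. This is a
simplified model and not GDB's actual code: `struct sym', `struct
range', `sym_for_pc' and the HOME_* constants are hypothetical
stand-ins for struct symbol, struct range_list and the LOC_* address
classes; the frame offset -8 is made up; and treating `end' as
inclusive is an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-in for CORE_ADDR. */
typedef unsigned long core_addr;

/* Hypothetical stand-ins for LOC_LOCAL and LOC_REGISTER. */
enum home_class { HOME_LOCAL, HOME_REGISTER };

/* One address range over which a symbol's home applies. */
struct range {
  core_addr start;
  core_addr end;             /* ASSUMED inclusive */
  const struct range *next;
};

/* One symbol object per home; the primary symbol lists its aliases. */
struct sym {
  const char *name;
  enum home_class aclass;
  long value;                /* frame offset or register number */
  const struct range *ranges;
  const struct sym *const *aliases;
  size_t n_aliases;
};

/* Return nonzero if PC falls inside any element of the range list R. */
int in_ranges(const struct range *r, core_addr pc)
{
  for (; r != NULL; r = r->next)
    if (pc >= r->start && pc <= r->end)
      return 1;
  return 0;
}

/* Pick the symbol object whose home is valid at PC: the first alias
   whose ranges cover PC, otherwise the primary symbol itself. */
const struct sym *sym_for_pc(const struct sym *primary, core_addr pc)
{
  size_t i;
  for (i = 0; i < primary->n_aliases; i++)
    if (in_ranges(primary->aliases[i]->ranges, pc))
      return primary->aliases[i];
  return primary;
}

/* The `foo' example: stack home plus two register homes. */
const struct range foo_r5 = { 0x1020, 0x1030, NULL };
const struct range foo_r6 = { 0x1040, 0x1050, NULL };
const struct sym foo_in_r5 = { "foo", HOME_REGISTER, 5, &foo_r5, NULL, 0 };
const struct sym foo_in_r6 = { "foo", HOME_REGISTER, 6, &foo_r6, NULL, 0 };
const struct sym *const foo_aliases[] = { &foo_in_r5, &foo_in_r6 };
const struct sym foo_primary = { "foo", HOME_LOCAL, -8, NULL, foo_aliases, 2 };
```

With this data, sym_for_pc(&foo_primary, 0x1024) selects the register 5
alias, and an address outside both ranges, such as 0x1038, falls back to
the primary stack-based symbol.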
In other words, when a variable has several homes, GDB doesn't
represent this using a single struct symbol with several elements in
its `ranges' list, each one specifying a particular home for that
range. Instead, GDB represents this using multiple `struct symbol'
objects, each of which probably has a single element in its `ranges'
list.

The only case where a symbol's `ranges' list would have more than one
element is if the variable has the same home in several distinct
ranges.

An OP_VAR_VALUE node of a GDB `struct expression' will contain a
reference to the symbol object appropriate for the location at which
the expression will be evaluated --- it could be the primary or an
alias. The expression evaluator assumes that the home in the struct
symbol is correct.

So instead of adding `aclass' and `symbol_address' to struct
range_list, it seems to me that you should instead build a separate
struct symbol for each home the variable might have.

It makes more sense to me to add your new info to the struct symbol
itself, so that value_of_variable can check it and complain
appropriately. It would be nice to do it in a way which doesn't use
much space if we don't have this kind of information for the symbol.

What do you think?

>From jimb@cygnus.com Mon Jul 26 19:09:00 1999
From: Jim Blandy
To: Richard Henderson
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 19:09:00 -0000
Message-id:
References: <9500.931826533@upchuck.cygnus.com> <19990719234100.C17063@cygnus.com> <19990726131518.D2954@cygnus.com>
X-SW-Source: 1999-q3/msg00103.html
Content-length: 179

Would it be possible to re-derive some of the LRS info by analyzing
the final code? The less information you need to carry through the
optimizations, the better, it seems to me.
>From jimb@cygnus.com Mon Jul 26 19:12:00 1999
From: Jim Blandy
To: Richard Henderson
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 19:12:00 -0000
Message-id:
References: <9500.931826533@upchuck.cygnus.com> <19990719234100.C17063@cygnus.com> <19990726131518.D2954@cygnus.com>
X-SW-Source: 1999-q3/msg00104.html
Content-length: 322

> > 1) EGCS folks get regstack to actually propagate the variable/register
> >    mapping to its output accurately.
>
> Yes, this is a reasonable plan. As to a time frame... I don't know.

We have no customer complaining about this. So there's no external
pressure. It's just lame when GDB can't reliably print `foo'.

>From jimb@cygnus.com Mon Jul 26 19:15:00 1999
From: Jim Blandy
To: Roger Cruz
Cc: gdb@sourceware.cygnus.com
Subject: Re: How do I type cast an array of strings in GDB?
Date: Mon, 26 Jul 1999 19:15:00 -0000
Message-id:
References: <379CBFBC.6164B1CA@ignitus.com>
X-SW-Source: 1999-q3/msg00105.html
Content-length: 164

>From an operational perspective, you want to prevent GDB from fetching
the value of semTypeMsg as a machine word.
You might try:

    print ((char **) &semTypeMsg)[1]

>From snanja@cup.hp.com Mon Jul 26 19:36:00 1999
From: Sekaran Nanja
To: Jim Blandy
Cc: Rajiv Mirani, Mike Vermeulen, gdb@sourceware.cygnus.com
Subject: Re: Regarding Range Table
Date: Mon, 26 Jul 1999 19:36:00 -0000
Message-id: <379D1B3A.32B9121B@cup.hp.com>
References: <37990D39.B393D61B@cup.hp.com> <379CCBBD.6E3FC2F3@cup.hp.com>
X-SW-Source: 1999-q3/msg00106.html
Content-length: 8692

Jim> I am still not clear on what set_early, set_late, comes_from_line and
Jim> moved_to_line would mean, but I am guessing that they are meant to
Jim> handle cases like these:

Jim> - a variable which is live in the source code at a given point is not
Jim>   actually live in the optimized code which the user is debugging,
Jim>   because the compiler has moved an assignment to the variable from
Jim>   before the current source line to after the current source line.

Jim> - a variable is live in the source code at a given point, but all its
Jim>   uses have been moved above it, along with *another* assignment to
Jim>   the variable. So the variable is live, but its value isn't going to
Jim>   go where you'd expect.

Jim> (Here, "before" and "after" refer to control flow, not to source
Jim> positions).

Jim> Is that correct?

Yes. This is correct.

Jim> So instead of adding `aclass' and `symbol_address' to struct
Jim> range_list, it seems to me that you should instead build a separate
Jim> struct symbol for each home the variable might have.

Yes. I agree.

Jim> It makes more sense to me to add your new info to the struct symbol
Jim> itself, so that value_of_variable can check it and complain
Jim> appropriately. It would be nice to do it in a way which doesn't use
Jim> much space if we don't have this kind of information for the symbol.

Jim> What do you think?

Yes. It makes sense to add this info to struct symbol to keep things
clean, but then we lose the new info for an alias symbol containing
multiple ranges (e.g., when a variable has the same home in several
distinct ranges).
I think it is better to save this new info in the range_list struct.
Please let me know what you think about this.

Thanks,
Sekaran.