From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Henderson <rth@cygnus.com>
To: Jim Blandy <jimb@cygnus.com>
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 13:15:00 -0000
Message-id: <19990726131518.D2954@cygnus.com>
References: <9500.931826533@upchuck.cygnus.com> <np1zeci6tm.fsf@zwingli.cygnus.com> <npn1ws2xp1.fsf@zwingli.cygnus.com> <19990719234100.C17063@cygnus.com> <npemhv45i6.fsf@zwingli.cygnus.com>
X-SW-Source: 1999-q3/msg00101.html

On Mon, Jul 26, 1999 at 01:42:41PM -0500, Jim Blandy wrote:
> Okay.  So the current plan is:
> 
> 1) EGCS folks get regstack to actually propagate the variable/register
>    mapping to its output accurately.

Yes, this is a reasonable plan.  As to a time frame... I don't know.

What we need is a good infrastructure for handling LRS debug info in
a convenient form within the compiler, whether this gets emitted as
dwarf2 location ranges, or stabs live ranges, or as a set of virtual
lexical blocks.  What Cygnus has now for LRS just won't work when the
size of the live ranges is on the order of instructions.

If someone has brilliant ideas how to manage and manipulate this stuff
in a way that won't make things an absolute nightmare, I'd be glad to
hear it.

I'll throw out one idea -- live ranges must always begin with a store.
What if we put a REG_START_RANGE note in places where a user variable
moves to a new location, and a REG_END_RANGE note in places where a user
variable truly dies and its register gets re-allocated to some temp?
The value of the note would be some canonical token that lets us match
up with the DECL in question.

Just before, or in the process of, emitting debug info, we use the
CFG and the REG_*_RANGE notes to transform this somewhat fluid form
into something that the debugger can use.
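
A rough sketch of that last step, for illustration only.  It assumes a
straight-line insn sequence (no CFG merging), and every name in it
(note_kind, struct insn, struct live_range, collect_ranges) is
hypothetical rather than anything that exists in the compiler today:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical note kinds; not the compiler's actual reg-note list. */
enum note_kind { NOTE_NONE, NOTE_START_RANGE, NOTE_END_RANGE };

struct insn {
  unsigned long addr;   /* code address of the instruction */
  enum note_kind note;  /* range note attached to this insn, if any */
  int decl_token;       /* canonical token identifying the user variable */
  int regno;            /* new home, meaningful for NOTE_START_RANGE */
};

struct live_range {
  int decl_token;
  int regno;
  unsigned long start;  /* address of the insn carrying the START note */
  unsigned long end;    /* address of the insn that closed the range */
};

/* Walk a straight-line insn sequence and collect ranges for one variable.
   A START note closes any range still open (the variable moved), and an
   END note closes the open range.  Returns the number of ranges written. */
size_t
collect_ranges (const struct insn *insns, size_t n_insns, int decl_token,
                struct live_range *out, size_t max_out)
{
  size_t n_out = 0;
  int open = 0;
  struct live_range cur = { 0, 0, 0, 0 };

  assert (insns != NULL && out != NULL);

  for (size_t i = 0; i < n_insns; i++)
    {
      const struct insn *in = &insns[i];

      if (in->note == NOTE_NONE || in->decl_token != decl_token)
        continue;

      /* Either kind of note ends whatever range is currently open. */
      if (open && n_out < max_out)
        {
          cur.end = in->addr;
          out[n_out++] = cur;
        }
      open = 0;

      if (in->note == NOTE_START_RANGE)
        {
          cur.decl_token = decl_token;
          cur.regno = in->regno;
          cur.start = in->addr;
          open = 1;
        }
    }

  /* A range still open at the end of the sequence runs to the last insn. */
  if (open && n_insns > 0 && n_out < max_out)
    {
      cur.end = insns[n_insns - 1].addr;
      out[n_out++] = cur;
    }
  return n_out;
}
```

A real version would have to walk the CFG and merge ranges across
basic blocks; this only shows the note-to-range bookkeeping.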

A particularly major open issue is how the optimizers can easily know
when they need to create these notes, and which variables the notes 
are being created for.  I could easily see it being a not insignificant
task to find this, given that it's all scattered about.  But the fact
that it's scattered about is crucial to being resilient against
ever-changing insn sequences.

The extra memory and time might even call for it only being enabled
at -g3 or something.  Something we'll have to examine.

Thoughts?


r~
>From jimb@cygnus.com Mon Jul 26 18:50:00 1999
From: Jim Blandy <jimb@cygnus.com>
To: Sekaran Nanja <snanja@cup.hp.com>
Cc: Rajiv Mirani <mirani@cup.hp.com>, Mike Vermeulen <mev@hpclhayb.cup.hp.com>, gdb@sourceware.cygnus.com
Subject: Re: Regarding Range Table
Date: Mon, 26 Jul 1999 18:50:00 -0000
Message-id: <npn1wi3m05.fsf@zwingli.cygnus.com>
References: <37990D39.B393D61B@cup.hp.com> <np908343yh.fsf@zwingli.cygnus.com> <379CCBBD.6E3FC2F3@cup.hp.com>
X-SW-Source: 1999-q3/msg00102.html
Content-length: 6697

[To gdb@sourceware readers: Sekaran is working on helping GDB debug
optimized code.  He would like to extend the way GDB represents
variables that live in different places at different times --- for
example, a variable that mostly lives on the stack, except in the body
of a particular loop, where it is brought into a register.

GDB represents such variables using multiple struct symbol objects,
one for each home the variable occupies.  The `ranges' element of the
`struct symbol' lists the addresses at which that `struct symbol' is
appropriate.

This is a weird representation, since you can have several struct
symbol objects corresponding to a single variable.  It would be more
intuitive to identify each variable with a single `struct symbol'
object, and have that object carry a mapping from code addresses to
homes.  But here we are.]


Sekaran Nanja <snanja@cup.hp.com> writes:
 
Jim> Would it be okay if we had this discussion on gdb@sourceware.cygnus.com?

Sekaran> I think it is OK.

Jim> It's fine with me to extend the range_list struct, as long as the new
Jim> semantics are very clear.  If they're not well-explained, or they're
Jim> muddy, then we won't be able to maintain it.
Jim>
Jim> So I need to understand exactly what you want to do to symbols and
Jim> their range lists, and how to interpret the resulting structures.

Sekaran> Please note that HP captures the following information for live
Sekaran> ranges of variables: Symbol Address (Register - but this can
Sekaran> possibly change for different code ranges)

Jim> This is just a mapping from code addresses onto variable homes, right?
Jim> To represent this information, no changes to GDB's symbol structures
Jim> are necessary, right?

Sekaran> We don't need to change the symbol struct but we need to
Sekaran> change range_list struct to support this.  Please note that
Sekaran> Register value for the assigned symbol may change in code
Sekaran> ranges and so we need to keep the address info. in
Sekaran> range_list. I am planning to change the struct range_list as
Sekaran> follows:
Sekaran> 
Sekaran> struct range_list {
Sekaran>         unsigned int set_early : 1;
Sekaran>         unsigned int set_late : 1;
Sekaran>         unsigned int unknown : 1;
Sekaran>         unsigned int reserved : 29;
Sekaran>
Sekaran>         enum address_class aclass BYTE_BITFIELD;
Sekaran>         CORE_ADDR symbol_address;
Sekaran>         int comes_from_line;
Sekaran>         int moved_to_line;
Sekaran>
Sekaran>         CORE_ADDR start;
Sekaran>         CORE_ADDR end;
Sekaran>         struct range_list *next;
Sekaran> };

Sekaran> Set Early/ Set Late - Flag for critical point assignment (To provide
Sekaran> useful warning to the user) and possibly associated line numbers for
Sekaran> these.
 
Jim> I don't understand what this is.  Could you explain it in more
Jim> detail?

Sekaran> Please note that due to optimization, critical assignment
Sekaran> statement(s) can possibly be moved around.  In these cases, the
Sekaran> user needs to be warned about this optimization with a
Sekaran> message saying that either the assignment has already
Sekaran> taken place earlier at a specific line or it is going to
Sekaran> happen at a specific line.  Please note that the following
Sekaran> newly added fields to range_list are used to support this
Sekaran> feature:
Sekaran>
Sekaran> set_early
Sekaran> set_late
Sekaran> comes_from_line
Sekaran> moved_to_line.


I am still not clear on what set_early, set_late, comes_from_line and
moved_to_line would mean, but I am guessing that they are meant to
handle cases like these:

- a variable which is live in the source code at a given point is not
  actually live in the optimized code which the user is debugging,
  because the compiler has moved an assignment to the variable from
  before the current source line to after the current source line.

- a variable is live in the source code at a given point, but all its
  uses have been moved above it, along with *another* assignment to
  the variable.  So the variable is live, but its value isn't going to
  go where you'd expect.

(Here, "before" and "after" refer to control flow, not to source
positions).

Is that correct?

GDB's representation for split live ranges is not very intuitive, and
I don't think you're using it correctly.

GDB uses a separate struct symbol for each home a variable might have.
A struct symbol's `ranges' list lists those address ranges over which
the variable home given in that struct symbol applies.

So, for example, if a variable `foo' usually lives on the stack, but
lives in register 5 from code addresses 0x1020 to 0x1030, and
register 6 from code addresses 0x1040 to 0x1050, then you would have:

- the primary struct symbol, whose aclass is LOC_LOCAL, and whose
  SYMBOL_VALUE is the offset from the frame base, whose `ranges' list
  is empty, and whose `aliases' list contains two struct symbols:

  - the first alias symbol, whose aclass is LOC_REGISTER and whose
    SYMBOL_VALUE is 5, and whose `ranges' list contains the single
    element, {start=0x1020, end=0x1030, next=0}

  - the second alias symbol, whose aclass is LOC_REGISTER and whose
    SYMBOL_VALUE is 6, and whose `ranges' list contains the single
    element, {start=0x1040, end=0x1050, next=0}.

All of these struct symbols have the name `foo', and all refer to the
same variable.  (I think this is crummy, but that's the way it is.)

In other words, when a variable has several homes, GDB doesn't
represent this using a single struct symbol with several elements in
its `ranges' list, each one specifying a particular home for that
range.  Instead, GDB represents this using multiple `struct symbol'
objects, each of which probably has a single element in its `ranges'
list.

The only case where a symbol's `ranges' list would have more than one
element is if the variable has the same home in several distinct
ranges.
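
To make the layout above concrete, here is a minimal sketch of how one
might pick the right struct symbol for a given pc.  The structures are
heavily simplified stand-ins, not GDB's real definitions: core_addr,
the `value' field, symbol_for_pc and ranges_contain are names made up
for this example, and treating `end' as inclusive is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long core_addr;   /* stand-in for GDB's CORE_ADDR */

enum address_class { LOC_LOCAL, LOC_REGISTER };

struct range_list {
  core_addr start;
  core_addr end;
  struct range_list *next;
};

struct alias_list;

/* Heavily simplified; the real struct symbol has many more fields. */
struct symbol {
  const char *name;
  enum address_class aclass;
  long value;                  /* frame offset or register number */
  struct range_list *ranges;   /* empty for the primary symbol */
  struct alias_list *aliases;  /* other homes of the same variable */
};

struct alias_list {
  struct symbol *sym;
  struct alias_list *next;
};

/* Return nonzero if PC falls in any range on the list
   (assumption: both ends are inclusive). */
static int
ranges_contain (const struct range_list *r, core_addr pc)
{
  for (; r != NULL; r = r->next)
    if (pc >= r->start && pc <= r->end)
      return 1;
  return 0;
}

/* Return the alias whose ranges cover PC, or the primary symbol
   if no alias applies at PC. */
const struct symbol *
symbol_for_pc (const struct symbol *primary, core_addr pc)
{
  assert (primary != NULL);
  for (const struct alias_list *a = primary->aliases; a != NULL; a = a->next)
    if (ranges_contain (a->sym->ranges, pc))
      return a->sym;
  return primary;
}
```

With the `foo' example above, a pc of 0x1024 would select the register 5
alias, 0x1048 the register 6 alias, and anything else the primary
LOC_LOCAL symbol.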

An OP_VAR_VALUE node of a GDB `struct expression' will contain a
reference to the symbol object appropriate for the location at which
the expression will be evaluated --- it could be the primary or an
alias.  The expression evaluator assumes that the home in the struct
symbol is correct.

So instead of adding `aclass' and `symbol_address' to struct
range_list, it seems to me that you should instead build a separate
struct symbol for each home the variable might have.

It makes more sense to me to add your new info to the struct symbol
itself, so that value_of_variable can check it and complain
appropriately.  It would be nice to do it in a way which doesn't use
much space if we don't have this kind of information for the symbol.
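
One possible shape for this, as a sketch only: hang the new info off an
optional pointer, so symbols without it pay for one pointer and nothing
more.  The names struct assignment_motion, `motion' and describe_motion
are hypothetical, and the meaning given to comes_from_line and
moved_to_line is my guess from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical side record, allocated only for symbols that need it. */
struct assignment_motion {
  unsigned int set_early : 1;
  unsigned int set_late : 1;
  int comes_from_line;   /* guessed meaning: source line of the early store */
  int moved_to_line;     /* guessed meaning: line the store was moved to */
};

/* Heavily simplified stand-in for GDB's struct symbol. */
struct symbol {
  const char *name;
  struct assignment_motion *motion;  /* NULL when there is no such info */
};

/* Format a warning into BUF.  Returns 1 if a warning was written,
   0 if there is nothing to report for SYM. */
int
describe_motion (const struct symbol *sym, char *buf, size_t buflen)
{
  const struct assignment_motion *m;

  assert (sym != NULL && buf != NULL && buflen > 0);
  buf[0] = '\0';
  m = sym->motion;
  if (m == NULL)
    return 0;
  if (m->set_early)
    snprintf (buf, buflen,
              "warning: `%s' was already assigned by code from line %d",
              sym->name, m->comes_from_line);
  else if (m->set_late)
    snprintf (buf, buflen,
              "warning: assignment to `%s' has been moved to line %d",
              sym->name, m->moved_to_line);
  else
    return 0;
  return 1;
}
```

Something like value_of_variable could call a check of this kind before
fetching the value.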

What do you think?
>From jimb@cygnus.com Mon Jul 26 19:09:00 1999
From: Jim Blandy <jimb@cygnus.com>
To: Richard Henderson <rth@cygnus.com>
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 19:09:00 -0000
Message-id: <nphfmq3ktz.fsf@zwingli.cygnus.com>
References: <9500.931826533@upchuck.cygnus.com> <np1zeci6tm.fsf@zwingli.cygnus.com> <npn1ws2xp1.fsf@zwingli.cygnus.com> <19990719234100.C17063@cygnus.com> <npemhv45i6.fsf@zwingli.cygnus.com> <19990726131518.D2954@cygnus.com>
X-SW-Source: 1999-q3/msg00103.html
Content-length: 179

Would it be possible to re-derive some of the LRS info by analyzing
the final code?  The less information you need to carry through the
optimizations, the better, it seems to me.
>From jimb@cygnus.com Mon Jul 26 19:12:00 1999
From: Jim Blandy <jimb@cygnus.com>
To: Richard Henderson <rth@cygnus.com>
Cc: gcc@gcc.gnu.org, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Mon, 26 Jul 1999 19:12:00 -0000
Message-id: <npg12a3kom.fsf@zwingli.cygnus.com>
References: <9500.931826533@upchuck.cygnus.com> <np1zeci6tm.fsf@zwingli.cygnus.com> <npn1ws2xp1.fsf@zwingli.cygnus.com> <19990719234100.C17063@cygnus.com> <npemhv45i6.fsf@zwingli.cygnus.com> <19990726131518.D2954@cygnus.com>
X-SW-Source: 1999-q3/msg00104.html
Content-length: 322

> > 1) EGCS folks get regstack to actually propagate the variable/register
> >    mapping to its output accurately.
> 
> Yes, this is a reasonable plan.  As to a time frame... I don't know.

We have no customer complaining about this.  So there's no external
pressure.  It's just lame when GDB can't reliably print `foo'.
>From jimb@cygnus.com Mon Jul 26 19:15:00 1999
From: Jim Blandy <jimb@cygnus.com>
To: Roger Cruz <rogerc@ignitus.com>
Cc: gdb@sourceware.cygnus.com
Subject: Re: How do I type cast an array of strings in GDB?
Date: Mon, 26 Jul 1999 19:15:00 -0000
Message-id: <npemhu3kk4.fsf@zwingli.cygnus.com>
References: <379CBFBC.6164B1CA@ignitus.com>
X-SW-Source: 1999-q3/msg00105.html
Content-length: 164

From an operational perspective, you want to prevent GDB from fetching
the value of semTypeMsg as a machine word.  You might try:

print ((char **) &semTypeMsg)[1]
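
For reference, here is what that expression does in plain C.  This
assumes semTypeMsg is declared as an array of char pointers; the
original declaration isn't shown in this thread, so both the
declaration and the strings below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed declaration; the strings are placeholders. */
char *semTypeMsg[] = { "binary", "mutex", "counting" };

/* Same expression as the GDB command, parameterised on the index:
   take the address of the array, view it as a pointer to its first
   char * element, then index into it. */
const char *
sem_type_name (size_t i)
{
  assert (i < sizeof semTypeMsg / sizeof semTypeMsg[0]);
  return ((char **) &semTypeMsg)[i];
}
```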
>From snanja@cup.hp.com Mon Jul 26 19:36:00 1999
From: Sekaran Nanja <snanja@cup.hp.com>
To: Jim Blandy <jimb@cygnus.com>
Cc: Rajiv Mirani <mirani@cup.hp.com>, Mike Vermeulen <mev@hpclhayb.cup.hp.com>, gdb@sourceware.cygnus.com
Subject: Re: Regarding Range Table
Date: Mon, 26 Jul 1999 19:36:00 -0000
Message-id: <379D1B3A.32B9121B@cup.hp.com>
References: <37990D39.B393D61B@cup.hp.com> <np908343yh.fsf@zwingli.cygnus.com> <379CCBBD.6E3FC2F3@cup.hp.com> <npn1wi3m05.fsf@zwingli.cygnus.com>
X-SW-Source: 1999-q3/msg00106.html
Content-length: 8692

Jim> I am still not clear on what set_early, set_late, comes_from_line and
Jim> moved_to_line would mean, but I am guessing that they are meant to
Jim> handle cases like these:
Jim>
Jim> - a variable which is live in the source code at a given point is not
Jim>   actually live in the optimized code which the user is debugging,
Jim>   because the compiler has moved an assignment to the variable from
Jim>   before the current source line to after the current source line.
Jim>
Jim> - a variable is live in the source code at a given point, but all its
Jim>   uses have been moved above it, along with *another* assignment to
Jim>   the variable.  So the variable is live, but its value isn't going to
Jim>   go where you'd expect.
Jim>
Jim> (Here, "before" and "after" refer to control flow, not to source
Jim> positions).
Jim>
Jim> Is that correct?

Yes. This is correct.

Jim> So instead of adding `aclass' and `symbol_address' to struct
Jim> range_list, it seems to me that you should instead build a separate
Jim> struct symbol for each home the variable might have.

Yes. I agree.

Jim> It makes more sense to me to add your new info to the struct symbol
Jim> itself, so that value_of_variable can check it and complain
Jim> appropriately.  It would be nice to do it in a way which doesn't use
Jim> much space if we don't have this kind of information for the symbol.
Jim>
Jim> What do you think?

Yes. It makes sense to add this info to struct symbol to keep things clean, but
then we lose the new info for an alias symbol that contains multiple ranges
(e.g. a variable that has the same home in several distinct ranges).  I think it
is better to save this new info in the range_list struct.  Please let me know
what you think about this.

Thanks,
Sekaran.
