[PATCH] Add support for tracking/evaluating dwarf2 location expressions

Mirror of the gdb-patches mailing list

* [PATCH] Add support for tracking/evaluating dwarf2 location expressions
@ 2001-03-30 11:11 Daniel Berlin
  2001-03-30 15:36 ` Andrew Cagney
  2001-05-21 14:46 ` [PATCH] Add support for tracking/evaluating dwarf2 location expressions Jim Blandy
  0 siblings, 2 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-03-30 11:11 UTC (permalink / raw)
  To: gdb-patches

This patch adds a place in the symbol structure to store the dwarf2
location (an upcoming dwarf2read huge rewrite patch uses it), adds an
evaluation machine to findvar.c, and has read_var_value use the dwarf2
location if it's available.

Note that in struct d2_location, in the symtab.h patch, framelocdesc
cannot be unioned with locdesc, because the symbol providing the frame
location description usually has its own real location.

Currently, because GDB makes no use of the frame unwind info, etc., I
evaluate the frame pointer location expression of whatever provides the
frame base for the thing we are trying to evaluate, rather than rely on
what GDB tells us the frame pointer is.

Yes, it's currently pointless to do this; it's just future-proofing, and
it's what we are supposed to do anyway.

This patch will have no effect whatsoever without the dwarf2read.c
rewrite, but it won't break anything either, so I separated it out and
am submitting it now.
--Dan
2001-03-30  Daniel Berlin  <dberlin@redhat.com>

	* symtab.h (address_class): Add LOC_DWARF2_EXPRESSION.
	(struct symbol): Add struct d2_location to store dwarf2
	location expressions.
	(SYMBOL_DWARF2_LOC, SYMBOL_DWARF2_FRAME_LOC): New macros.

	* findvar.c (read_leb128): New function, read leb128 data from a
	buffer.
	(evaluate_dwarf2_locdesc): New function, evaluate a given
	dwarf2 location description.
	(read_var_value): Use the dwarf2 location expression if it's available.

Index: findvar.c
===================================================================
RCS file: /cvs/src/src/gdb/findvar.c,v
retrieving revision 1.19
diff -c -3 -p -w -B -b -r1.19 findvar.c
*** findvar.c	2001/03/06 08:21:07	1.19
--- findvar.c	2001/03/30 18:17:34
***************
*** 31,36 ****
--- 31,37 ----
  #include "gdb_string.h"
  #include "floatformat.h"
  #include "symfile.h"		/* for overlay functions */
+ #include "elf/dwarf2.h"
  #include "regcache.h"

  /* This is used to indicate that we don't know the format of the floating point
*************** symbol_read_needs_frame (struct symbol *
*** 501,507 ****
--- 502,807 ----
      }
    return 1;
  }
+ struct dwarf_block
+ {
+ 	unsigned int size;
+ 	unsigned char *data;
+ };
+ /* Read a LEB128 value from the buffer given by data */
+ static unsigned long int
+ read_leb128 (unsigned char *data, unsigned int *length_return, int sign)
+ {
+   unsigned long int result = 0;
+   unsigned int      num_read = 0;
+   int               shift = 0;
+   unsigned char     byte;
+
+   do
+     {
+       byte = * data ++;
+       num_read ++;
+
+       result |= ((unsigned long int) (byte & 0x7f)) << shift;
+
+       shift += 7;
+
+     }
+   while (byte & 0x80);
+
+   if (length_return != NULL)
+     *length_return = num_read;
+
+   if (sign && (shift < (int) (8 * sizeof (result))) && (byte & 0x40))
+     result |= ~0UL << shift;
+
+   return result;
+ }
+
+ /* Evaluate a dwarf2 location description, given in THEBLOCK, in the
+    context of frame FRAME. */
+ value_ptr
+ evaluate_dwarf2_locdesc (register struct symbol *var, struct frame_info *frame,
+ 			 struct dwarf_block *theblock, struct type *type)
+ {
+   int i;
+   int stacki;
+   unsigned char op;
+   unsigned int bytes_read;
+   value_ptr stack[64];
+
+   i=0;
+   stacki=0;
+   stack[stacki]=0;
+   bytes_read = 0;
+
+   /* Loop until we are done with the location expression we were asked
+      to evaluate */
+   while (i < theblock->size)
+     {
+       /* Get the opcode */
+       op = theblock->data[i++];
+
+       if (op >= DW_OP_reg0 && op <= DW_OP_reg31)
+ 	{
+ 	  /* Push value of register specified by opcode onto stack */
+ 	  stack[++stacki] = value_from_register (type, op - DW_OP_reg0, frame);
+ 	}
+       else if (op == DW_OP_regx)
+ 	{
+ 	  unsigned int unsnd;
+
+ 	  /* Get the register number operand */
+ 	  unsnd = read_leb128 (theblock->data + i, &bytes_read, 0);
+ 	  i += bytes_read;
+
+ 	  /* Push the value of that register onto the stack */
+ 	  stack[++stacki] = value_from_register (type, unsnd, frame);
+ 	}
+       else if (op >= DW_OP_breg0 && op <= DW_OP_breg31)
+ 	{
+ 	  unsigned int basereg = op - DW_OP_breg0;
+ 	  unsigned int basereg_addr;
+ 	  unsigned int addr;
+
+ 	  /* Allocate a buffer to store the register data */
+ 	  char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);
+
+ 	  /* Read the signed LEB128 offset from the base register */
+ 	  basereg_addr = read_leb128 (theblock->data + i, &bytes_read, 1);
+ 	  i += bytes_read;
+
+ 	  /* Get the base register */
+ 	  get_saved_register (buf, NULL, NULL, frame, basereg, NULL);
+
+ 	  /* Extract the address from the base register */
+ 	  addr = extract_address (buf, REGISTER_RAW_SIZE (basereg));
+ 	  addr += basereg_addr;
+
+ 	  /* Push the value of *(basereg + offset) onto the stack */
+ 	  stack[++stacki] = value_ind (value_from_pointer (lookup_pointer_type (type), addr));
+ 	}
+       else if (op == DW_OP_bregx)
+ 	{
+
+ 	  unsigned int basereg;
+ 	  unsigned int addr;
+ 	  unsigned int basereg_addr;
+
+ 	  /* Allocate a buffer to store the register data */
+ 	  char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);
+
+ 	  /* Read the base register */
+ 	  basereg = read_leb128 (theblock->data + i, &bytes_read, 0);
+ 	  i += bytes_read;
+
+ 	  /* Read the signed LEB128 offset from the base register */
+ 	  basereg_addr = read_leb128 (theblock->data + i, &bytes_read, 1);
+ 	  i += bytes_read;
+
+ 	  /* Get the base register */
+ 	  get_saved_register (buf, NULL, NULL, frame, basereg, NULL);
+
+ 	  /* Extract address from the base register */
+ 	  addr = extract_address (buf, REGISTER_RAW_SIZE (basereg));
+ 	  addr += basereg_addr;
+
+ 	  /* Push the value of *(basereg + offset) onto the stack */
+ 	  stack[++stacki] = value_ind (value_from_pointer (lookup_pointer_type (type), addr));
+ 	}
+       else if (op == DW_OP_fbreg)
+ 	{
+ 	  value_ptr framepointer;
+ 	  unsigned int basereg_addr;
+ 	  struct symbol *framefunc;
+
+ 	  /* Get our current function's symbol, so we can get the
+ 	     frame location (we don't do this using gdb's frame, since
+ 	     it doesn't yet support dwarf2 frame unwinding, etc, and
+ 	     thus, the frame it says may not *really* be what we
+ 	     want) */
+ 	  framefunc = get_frame_function (frame);
+
+ 	  /* Read the signed LEB128 offset from the frame pointer */
+ 	  basereg_addr = read_leb128 (theblock->data + i, &bytes_read, 1);
+ 	  i+= bytes_read;
+
+ 	  /* Make sure we have a valid frame function, and that it has
+ 	     a dwarf2 frame location for us */
+ 	  if (framefunc == NULL)
+ 	    error ("Cannot find function for frame, and therefore, can't figure out framepointer");
+ 	  if (SYMBOL_DWARF2_FRAME_LOC (framefunc) == NULL)
+ 	    error ("Frame base location for function is NULL, therefore, can't figure out framepointer");
+
+ 	  /* Recursively evaluate the location of the frame pointer */
+ 	  framepointer = evaluate_dwarf2_locdesc (var, frame, SYMBOL_DWARF2_FRAME_LOC (framefunc), builtin_type_CORE_ADDR);
+
+ 	  /* Add the offset */
+ 	  framepointer = value_add (framepointer, value_from_longest (builtin_type_CORE_ADDR, basereg_addr));
+
+ 	  /* Push the value of *(fp + offset) onto the stack */
+ 	  stack[++stacki] = value_ind (value_from_pointer (lookup_pointer_type (type), value_as_pointer (framepointer)));
+ 	}
+       else if (op == DW_OP_addr)
+ 	{
+ 	  unsigned int addr;
+ 	  /* Extract the address from the data */
+ 	  addr = extract_address (theblock->data + i, TYPE_LENGTH (builtin_type_ptr));
+ 	  i += TYPE_LENGTH (builtin_type_ptr);

+ 	  /* Push the value at *addr onto the stack */
+ 	  stack[++stacki] = value_ind (value_from_pointer (lookup_pointer_type (type), addr));
+ 	}
+       else if (op >= DW_OP_const1u && op <= DW_OP_const4s)
+ 	{
+ 	  unsigned int readsize;
+ 	  unsigned int sign=0;
+
+ 	  /* Determine the size to read, and whether it's signed or
+ 	     not */
+ 	  switch (op)
+ 	    {
+ 	    case DW_OP_const1s:
+ 	      sign = 1;
+ 	    case DW_OP_const1u:
+ 	      readsize = 1;
+ 	      break;
+ 	    case DW_OP_const2s:
+ 	      sign = 1;
+ 	    case DW_OP_const2u:
+ 	      readsize = 2;
+ 	      break;
+ 	    case DW_OP_const4s:
+ 	      sign = 1;
+ 	    case DW_OP_const4u:
+ 	      readsize = 4;
+ 	      break;
+ 	    default:
+ 	      error ("Unknown constant size while trying to evaluate dwarf2 expression");
+ 	    }
+ 	  if (sign)
+ 	    {
+ 	      LONGEST value;
+
+ 	      /* Extract the signed integer from the data */
+ 	      value = extract_signed_integer (theblock->data + i, readsize);
+
+ 	      /* FIXME: we have no builtin_type_LONGEST, so i can't use that here */
+ 	      stack[++stacki] = value_from_longest (builtin_type_long, value);
+ 	    }
+ 	  else
+ 	    {
+ 	      ULONGEST value;
+
+ 	      /* Extract the unsigned integer from the data */
+ 	      value = extract_unsigned_integer (theblock->data + i, readsize);
+
+ 	      /* FIXME: we have no builtin_type_ULONGEST, so i can't use that here */
+ 	      stack[++stacki] = value_from_longest (builtin_type_unsigned_long, value);
+ 	    }
+ 	  i += readsize;
+ 	}
+       else if (op == DW_OP_constu)
+ 	{
+ 	  unsigned int bytes_read;
+ 	  ULONGEST value;
+
+ 	  /* Read in the uleb128 value for the constant */
+ 	  value = read_leb128 (theblock->data + i, &bytes_read, 0);
+ 	  i += bytes_read;
+
+ 	  /*FIXME: we have no builtin_type_ULONGEST, so i can't use that here */
+ 	  stack[++stacki] = value_from_longest (builtin_type_unsigned_long, value);
+ 	}
+       else if (op == DW_OP_consts)
+ 	{
+ 	  unsigned int bytes_read;
+ 	  LONGEST value;
+
+ 	  /* Read in the sleb128 value for the constant */
+ 	  value = read_leb128 (theblock->data + i, &bytes_read, 1);
+ 	  i += bytes_read;
+
+ 	  /*FIXME: we have no builtin_type_LONGEST, so i can't use that here */
+ 	  stack[++stacki] = value_from_longest (builtin_type_long, value);
+ 	}
+       else if (op == DW_OP_plus)
+ 	{
+ 	  /* Add the value of the top of the stack, to the item at
+ 	     top-1, and pop top item */
+ 	  if (VALUE_LVAL (stack[stacki]) == lval_memory)
+ 	    VALUE_ADDRESS (stack[stacki - 1]) += value_as_long (stack[stacki]);
+ 	  else
+ 	    stack[stacki - 1] = value_add (stack[stacki - 1], stack[stacki]);
+ 	  stacki--;
+ 	}
+       else if (op == DW_OP_plus_uconst)
+ 	{
+ 	  unsigned int bytes_read;
+ 	  ULONGEST value;
+
+ 	  /* Read in uleb128 value for constant to add */
+ 	  value = read_leb128 (theblock->data + i, &bytes_read, 0);
+ 	  i += bytes_read;
+
+ 	  /* Add the constant to the value at the top of the stack */
+ 	  if (VALUE_LVAL (stack[stacki]) == lval_memory)
+ 	    VALUE_ADDRESS (stack[stacki]) += value;
+ 	  else
+ 	    stack[stacki] = value_add (stack[stacki], value_from_longest (builtin_type_unsigned_long, value));
+ 	}
+       else if (op == DW_OP_minus)
+ 	{
+ 	  /* Subtract the topofstack-1 value from the topofstack, put
+ 	     it in topofstack-1, and pop the top of the stack.
+
+ 	  I.E. if stack is
+ 	  10
+ 	  4
+ 	  then act like we popped both, and pushed 6 onto the stack.
+ 	  */
+
+ 	  if (VALUE_LVAL (stack[stacki]) == lval_memory)
+ 	    VALUE_ADDRESS (stack[stacki - 1]) = VALUE_ADDRESS (stack[stacki]) - value_as_long (stack[stacki - 1]);
+ 	  else
+ 	    stack[stacki - 1] = value_sub (stack[stacki], stack[stacki - 1]);
+ 	  stacki--;
+ 	}
+       else if (op == DW_OP_deref)
+ 	{
+ 	  /* Dereference the address at the top of the stack */
+ 	  stack[stacki] = value_ind (value_at (lookup_pointer_type (type),
+ 					       value_as_pointer (value_addr (stack[stacki])),
+ 					       VALUE_BFD_SECTION (stack[stacki])));
+ 	}
+       else
+ 	{
+ 	  break;
+ 	}
+     }
+   /* Return whatever the top of the stack is */
+   return stack[stacki];
+ }
+
  /* Given a struct symbol for a variable,
     and a stack frame id, read the value of the variable
     and return a (pointer to a) struct value containing the value.
*************** read_var_value (register struct symbol *
*** 524,530 ****

    if (frame == NULL)
      frame = selected_frame;
!
    switch (SYMBOL_CLASS (var))
      {
      case LOC_CONST:
--- 824,831 ----

    if (frame == NULL)
      frame = selected_frame;
!   if (SYMBOL_DWARF2_LOC (var) != NULL)
!     return evaluate_dwarf2_locdesc (var, frame, (struct dwarf_block *) SYMBOL_DWARF2_LOC (var), SYMBOL_TYPE (var));
    switch (SYMBOL_CLASS (var))
      {
      case LOC_CONST:
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.18
diff -c -3 -p -w -B -b -r1.18 symtab.h
*** symtab.h	2001/02/08 06:03:54	1.18
--- symtab.h	2001/03/30 18:17:35
*************** enum address_class
*** 653,660 ****
       * with a level of indirection.
       */

!     LOC_INDIRECT

    };

  /* Linked list of symbol's live ranges. */
--- 653,663 ----
       * with a level of indirection.
       */

!     LOC_INDIRECT,

+     /* Location is a dwarf2 expression */
+     LOC_DWARF2_EXPRESSION
+
    };

  /* Linked list of symbol's live ranges. */
*************** struct symbol
*** 712,717 ****
--- 715,725 ----
  	short basereg;
        }
      aux_value;
+     struct
+     {
+ 	    void *locdesc;
+ 	    void *framelocdesc;
+     } d2_location;


      /* Link to a list of aliases for this symbol.
*************** struct symbol
*** 729,734 ****
--- 737,744 ----
  #define SYMBOL_TYPE(symbol)		(symbol)->type
  #define SYMBOL_LINE(symbol)		(symbol)->line
  #define SYMBOL_BASEREG(symbol)		(symbol)->aux_value.basereg
+ #define SYMBOL_DWARF2_LOC(symbol)  	(symbol)->d2_location.locdesc
+ #define SYMBOL_DWARF2_FRAME_LOC(symbol) (symbol)->d2_location.framelocdesc
  #define SYMBOL_ALIASES(symbol)		(symbol)->aliases
  #define SYMBOL_RANGES(symbol)		(symbol)->ranges




* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 11:11 [PATCH] Add support for tracking/evaluating dwarf2 location expressions Daniel Berlin
@ 2001-03-30 15:36 ` Andrew Cagney
  2001-03-30 18:53   ` Daniel Berlin
  2001-05-21 14:46 ` [PATCH] Add support for tracking/evaluating dwarf2 location expressions Jim Blandy
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-03-30 15:36 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches

Dan, just some things to tweak,

State machines are normally implemented using a switch and not a chain
of ifs.  Check dwarf2read.c:decode_locdesc() as an example (hmm, slight
déjà vu :-).

While fall through switches such as:

+         /* Determine the size to read, and whether it's signed or
+            not */
+         switch (op)
+           {
+           case DW_OP_const1s:
+             sign = 1;
+           case DW_OP_const1u:
+             readsize = 1;
+             break;

might be cute, they tend to make life painful for those that follow.
As they say, dumb the code down.
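
One way to spell the same decision out without fall-through, as a rough
sketch only: give every opcode its own complete case.  The opcode values
below are the DWARF 2 ones, redefined locally so the fragment compiles
without elf/dwarf2.h; const_operand_format is a hypothetical helper name,
not an existing GDB function.

```c
#include <assert.h>
#include <stddef.h>

/* Opcode values from the DWARF 2 specification, redefined locally so
   the sketch compiles on its own.  */
enum
{
  OP_const1u = 0x08, OP_const1s = 0x09,
  OP_const2u = 0x0a, OP_const2s = 0x0b,
  OP_const4u = 0x0c, OP_const4s = 0x0d
};

/* Hypothetical helper: describe the operand of a fixed-size constant
   opcode.  Returns 1 and fills in *SIZE and *IS_SIGNED on success,
   0 if OP is not one of the opcodes handled here.  */
static int
const_operand_format (unsigned char op, size_t *size, int *is_signed)
{
  assert (size != NULL && is_signed != NULL);

  switch (op)
    {
    case OP_const1u: *size = 1; *is_signed = 0; return 1;
    case OP_const1s: *size = 1; *is_signed = 1; return 1;
    case OP_const2u: *size = 2; *is_signed = 0; return 1;
    case OP_const2s: *size = 2; *is_signed = 1; return 1;
    case OP_const4u: *size = 4; *is_signed = 0; return 1;
    case OP_const4s: *size = 4; *is_signed = 1; return 1;
    default:
      return 0;
    }
}
```

Each case is then readable in isolation, and adding the 8-byte variants
later is a matter of two more lines rather than another fall-through pair.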

Don't forget ``a = b'' not ``a=b''.

Just use ``struct value *'', I've every intention of zapping
``value_ptr''! :-)

  value_ptr stack[64];
Is there a constant for this?  A quick glance at decode_locdesc() and it
has the same hardwired constant.

From memory, the dwarf2 state machine doco states that you should return
the top-of-stack when the machine has finished executing.  This means
that the stack may not be empty and that the code could potentially leak
``struct value *''s.

	enjoy,
		Andrew


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-03-30 15:36 ` Andrew Cagney
@ 2001-03-30 18:53   ` Daniel Berlin
  2001-04-06 11:53     ` Andrew Cagney
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-03-30 18:53 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: gdb-patches

Andrew Cagney <ac131313@cygnus.com> writes:

> Dan, just some things to tweak,
> 
> State machines are normally implemented using a switch and not a chain
> of ifs.  Check dwarf2read.c:decode_locdesc() as an example (hmm, slight
> déjà vu :-).

I actually just ripped mine out, copied the one from gcc's
unwind-dw2.c, and modified it to use the right gdb functions instead.
It's implemented as a switch.

It would be nice to just directly share it with gcc, but I doubt
that's feasible.

> Don't forget ``a = b'' not ``a=b''.
> 
> Just use ``struct value *'', I've every intention of zapping
> ``value_ptr''! :-)

I just turned it into a shell around execute_stack_op (see
unwind-dw2.c in the gcc source; it does just what I did, without using
values, just pointers), which uses a stack of CORE_ADDRs.  We then
take whatever it gives us, turn it into a value, once, and return that.

> 
>   value_ptr stack[64];
> Is there a constant for this?  A quick glance at decode_locdesc() and it
> has the same hardwired constant.
Nobody has ever produced location expressions that need more.

I wouldn't worry too much about it; the dwarf2 unwinder gcc uses to
handle exceptions has the exact same limitation.  So if they aren't
worried about mission-critical/multi-threaded code that tries to
catch/throw exceptions breaking, I'm not too worried about the debugger
used to debug them breaking. :)

I can't think of anything that needs more than 64 stack slots to
describe in dwarf2 location expressions.  I'm pretty sure I could
implement some pretty useful *real* programs in that limit, let alone
simple descriptions of locations.
> 
> From memory, the dwarf2 state machine doco states that you should return
> the top-of-stack when the machine has finished executing.  This means
> that the stack may not be empty and that the code could potentially leak
> ``struct value *''s.

This is fixed by the change to a stack of CORE_ADDR's.

> 
> 	enjoy,
> 		Andrew

-- 
The other night I came home late, and tried to unlock my house
with my car keys.  I started the house up.  So, I drove it
around for a while.  I was speeding, and a cop pulled me over.
He asked where I lived.  I said, "right here, officer".  Later,


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-03-30 18:53   ` Daniel Berlin
@ 2001-04-06 11:53     ` Andrew Cagney
  2001-04-06 12:10       ` Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-04-06 11:53 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches

> >   value_ptr stack[64];
> > Is there a constant for this?  A quick glance at decode_locdesc() and it
> > has the same hardwired constant.
> Nobody has ever produced location expressions that need more.

The problem typically isn't with what people are doing intentionally but
rather unintentionally.  The code opens the way for an input file to
cause gdb to overflow a buffer and trash its stack.

Since we're trying to lessen the likelihood of GDB corrupting its stack
and dumping core, I think the code should include some sort of stack
range check.
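
A range check of that kind could look roughly like the sketch below.  It
is not code from the patch: EVAL_STACK_DEPTH, struct eval_stack, eval_push
and eval_pop are illustrative names, and a plain unsigned long stands in
for whatever the evaluator really stores.

```c
#include <assert.h>
#include <stddef.h>

/* Same limit as the patch; the name is illustrative.  */
#define EVAL_STACK_DEPTH 64

struct eval_stack
{
  unsigned long slots[EVAL_STACK_DEPTH];
  size_t depth;			/* number of slots in use */
};

/* Push VALUE.  Returns 1 on success, 0 if the stack is already full.  */
static int
eval_push (struct eval_stack *s, unsigned long value)
{
  assert (s != NULL);
  if (s->depth >= EVAL_STACK_DEPTH)
    return 0;
  s->slots[s->depth++] = value;
  return 1;
}

/* Pop the top entry into *VALUE.  Returns 1 on success, 0 if the
   stack is empty.  */
static int
eval_pop (struct eval_stack *s, unsigned long *value)
{
  assert (s != NULL && value != NULL);
  if (s->depth == 0)
    return 0;
  *value = s->slots[--s->depth];
  return 1;
}
```

Inside GDB the failure paths would more likely call error () with a
message about a malformed location expression than return a status; the
point is only that every push and pop is checked against the array bounds,
underflow included.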

	Andrew


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-04-06 11:53     ` Andrew Cagney
@ 2001-04-06 12:10       ` Daniel Berlin
  2001-04-06 12:36         ` Kevin Buettner
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-04-06 12:10 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Daniel Berlin, gdb-patches

On Fri, 6 Apr 2001, Andrew Cagney wrote:

>
> > >   value_ptr stack[64];
> > > Is there a constant for this?  A quick glance at decode_locdesc() and it
> > > has the same hardwired constant.
> > Nobody has ever produced location expressions that need more.
>
> The problem typically isn't with what people are doing intentionally but
> rather unintentionally.  The code opens the way for an input file to
> cause gdb to overflow a buffer and trash its stack.

Well, as I said, it will trash GCC as well, since they do no range
checking, and have the exact same limit.
But I'll range check it, just the same.

>
> Since we're trying to lessen the likelihood of GDB corrupting its stack
> and dumping core, I think the code should include some sort of stack
> range check.

As you wish, done in the latest version.
I moved it all into dwarf2eval.c as well.

>
> 	Andrew
>



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-04-06 12:10       ` Daniel Berlin
@ 2001-04-06 12:36         ` Kevin Buettner
  2001-04-06 23:18           ` [PATCH] Add support for tracking/evaluating dwarf2 locationexpressions Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: Kevin Buettner @ 2001-04-06 12:36 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Andrew Cagney, gdb-patches, Daniel Berlin

On Apr 6,  3:10pm, Daniel Berlin wrote:

> On Fri, 6 Apr 2001, Andrew Cagney wrote:
> 
> > > >   value_ptr stack[64];
> > > > Is there a constant for this?  A quick glance at decode_locdesc() and it
> > > > has the same hardwired constant.
> > > Nobody has ever produced location expressions that need more.
> >
> > The problem typically isn't with what people are doing intentionally but
> > rather unintentionally.  The code opens the way for an input file to
> > cause gdb to overflow a buffer and trash its stack.
> 
> Well, as I said, it will trash  GCC as well, since they do no range
> checking, and have the exact same limit.
> But i'll range check it, just the same.

Maybe GCC has been designed so that it'll never need a bigger stack. 
But keep in mind that GDB needs to accept as input the output of
compilers other than GCC.  Perhaps some other compiler, through either
a bug or a feature, will produce more complicated location expressions
than GCC.

Anyway, I'm glad you've added the range check.

Kevin



* Re: [PATCH] Add support for tracking/evaluating dwarf2 locationexpressions
  2001-04-06 12:36         ` Kevin Buettner
@ 2001-04-06 23:18           ` Daniel Berlin
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-04-06 23:18 UTC (permalink / raw)
  To: Kevin Buettner; +Cc: Andrew Cagney, gdb-patches, Daniel Berlin

On Fri, 6 Apr 2001, Kevin Buettner wrote:

> On Apr 6,  3:10pm, Daniel Berlin wrote:
>
> > On Fri, 6 Apr 2001, Andrew Cagney wrote:
> >
> > > > >   value_ptr stack[64];
> > > > > Is there a constant for this?  A quick glance at decode_locdesc() and it
> > > > > has the same hardwired constant.
> > > > Nobody has ever produced location expressions that need more.
> > >
> > > The problem typically isn't with what people are doing intentionally but
> > > rather unintentionally.  The code opens the way for an input file to
> > > cause gdb to overflow a buffer and trash its stack.
> >
> > Well, as I said, it will trash  GCC as well, since they do no range
> > checking, and have the exact same limit.
> > But i'll range check it, just the same.
>
> Maybe GCC has been designed so that it'll never need a bigger stack.
No, it hasn't.
It's a FIXME that's never been fixed :)

> But keep in mind that GDB needs to accept as input the output of
> compilers other than GCC.

Of course.

I think you aren't getting what I mean by "break GCC".  I mean that all of
the STL classes, and anything that *can* throw exceptions, would fail
miserably, and segfault, at runtime, if something ever went above that limit.

That's a much greater risk than the one we are facing, by having gdb maybe
dump core, no?

> Perhaps some other compiler, through either
> a bug or a feature, will produce more complicated location expressions
> than GCC.

Yeah, but a single location expression requiring 64 things on the stack,
just to evaluate? Remember, adds/removes/etc reduce the number of things
on the stack, or keep it constant, they don't add.

And also, location expressions are for a single web (unioned live ranges
of a variable), location lists are used to describe where it is over a
given range of PC's, which consist of multiple location expressions.
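
To make the distinction concrete, here is a small sketch of a location
list as a table of PC ranges, each carrying one location expression.
struct dwarf_block is the structure from the patch; struct loclist_entry
and loclist_lookup are hypothetical names used only for illustration, and
the on-disk encoding of location lists is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* A chunk of location expression bytes, as in the patch.  */
struct dwarf_block
{
  unsigned int size;
  unsigned char *data;
};

/* Hypothetical in-memory form of one location list entry: EXPR
   describes the variable while LOW_PC <= pc < HIGH_PC.  */
struct loclist_entry
{
  unsigned long low_pc;
  unsigned long high_pc;	/* exclusive */
  struct dwarf_block expr;
};

/* Return the expression valid at PC, or NULL if the variable has no
   location there.  */
static const struct dwarf_block *
loclist_lookup (const struct loclist_entry *list, size_t count,
		unsigned long pc)
{
  size_t i;

  assert (list != NULL || count == 0);
  for (i = 0; i < count; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i].expr;
  return NULL;
}
```

A variable that lives in a register for the first part of a function and
in a frame slot afterwards would have two entries, each with a very short
expression, so the per-expression stack depth stays small no matter how
many ranges the list has.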

I can't even fathom a way to use more than 64 stack entries.  You could
split a variable into an almost infinite number of places at once (i.e.,
say the first byte is in a register, the next byte is at this given memory
address, etc.), and *still* not hit the limit.
But anyway, I added the range check, so this is all moot.  Just rambling. :)

> Anyway, I'm glad you've added the range check.
>
> Kevin
>


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 11:11 [PATCH] Add support for tracking/evaluating dwarf2 location expressions Daniel Berlin
  2001-03-30 15:36 ` Andrew Cagney
@ 2001-05-21 14:46 ` Jim Blandy
  2001-05-21 18:49   ` Daniel Berlin
  2001-06-06  9:07   ` Andrew Cagney
  1 sibling, 2 replies; 28+ messages in thread
From: Jim Blandy @ 2001-05-21 14:46 UTC (permalink / raw)
  To: gdb-patches; +Cc: Daniel Berlin

Daniel Berlin <dberlin@redhat.com> writes:
> This patch adds a place in the symbol structure to store the dwarf2
> location (an upcoming dwarf2read huge rewrite patch uses it), an
> evaluation machine to findvar.c, and has read_var_value use the dwarf2
> location if it's available.

I think this is the wrong approach to the right idea.

I think it would be a great idea to give GDB more powerful ways to
describe the locations of variables, structure fields, and so on.
This would be a clean alternative to including special-purpose,
ABI-specific logic in GDB.

For example, one of the nice properties of the way GDB's `struct type'
represents a structure type in the present code is that it's mostly
ABI-independent.  Since the debug info spells out the exact position
of each structure member, GDB doesn't have to know the ABI's structure
layout rules, alignment requirements, and so on.

However, if we want to support virtual base classes, it's not clear
how to extend that structure appropriately.  In the V2 C++ ABI, each
object contained pointers to its virtual bases.  In the V3 C++ ABI, an
object's virtual table provides offsets to its virtual bases.  At the
moment, our `solution' is to provide ABI-specific code to find your
virtual bases.

If GDB supported a fully general location expression language, like
Dwarf 2 location expressions, then `struct type' could simply provide
a location expression to find each member or base class of the object.
In most cases, these location expressions would be trivial, but we
could easily encode both the V2 and V3 C++ ABI behaviors in this
manner.  It would then become the compiler's responsibility to
describe the ABI it was using.  We could take a similar approach to
finding virtual functions.

That said, I think the core of GDB should be independent of the
debugging information format.  If GDB does adopt a fully general
location expression language, then it should have a self-contained
description, which can be designed to suit GDB's needs.  Then we can
translate Dwarf2 expressions into that form.


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-21 14:46 ` [PATCH] Add support for tracking/evaluating dwarf2 location expressions Jim Blandy
@ 2001-05-21 18:49   ` Daniel Berlin
  2001-05-22 12:46     ` Jim Blandy
  2001-06-06  9:07   ` Andrew Cagney
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-05-21 18:49 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dberlin@redhat.com> writes:
> > This patch adds a place in the symbol structure to store the dwarf2
> > location (an upcoming dwarf2read huge rewrite patch uses it), an
> > evaluation machine to findvar.c, and has read_var_value use the dwarf2
> > location if it's available.
> 
> I think this is the wrong approach to the right idea.
> 
> I think it would be a great idea to give GDB more powerful ways to
> describe the locations of variables, structure fields, and so on.
> This would be a clean alternative to including special-purpose,
> ABI-specific logic in GDB.
Yup.

> 
> For example, one of the nice properties of the way GDB's `struct type'
> represents a structure type in the present code is that it's mostly
> ABI-independent.  Since the debug info spells out the exact position
> of each structure member, GDB doesn't have to know the ABI's structure
> layout rules, alignment requirements, and so on.
> 
> However, if we want to support virtual base classes, it's not clear
> how to extend that structure appropriately.  In the V2 C++ ABI, each
> object contained pointers to its virtual bases.  In the V3 C++ ABI, an
> object's virtual table provides offsets to its virtual bases.  At the
> moment, our `solution' is to provide ABI-specific code to find your
> virtual bases.
> 
> If GDB supported a fully general location expression language, like
> Dwarf 2 location expressions, then `struct type' could simply provide
> a location expression to find each member or base class of the object.
> In most cases, these location expressions would be trivial, but we
> could easily encode both the V2 and V3 C++ ABI behaviors in this
> manner.  It would then become the compiler's responsibility to
> describe the ABI it was using.  We could take a similar approach to
> finding virtual functions.
> 
> 
> That said, I think the core of GDB should be independent of the
> debugging information format.  If GDB does adopt a fully general
> location expression language, then it should have a self-contained
> description, which can be designed to suit GDB's needs.  Then we can
> translate Dwarf2 expressions into that form.

Sure.

However, IMHO, the location expression language that dwarf2 uses is
self-contained, and suited for GDB's needs.  Perfectly, in fact.  It
provides no more and no less than necessary to describe our locations,
except where it saves a little space in real use (see below), and
it's trivial to convert other expression languages into it.  The
location list is simply a list of ranges and location expressions,
which is what we would have come up with anyway, judging from GDB's
structures.

In fact, I'd wager if we tried, we'd come up with something almost
exactly like d2 location expressions:

We would want some kind of simple, stack-based language, with normal
binary and unary arithmetic operations and normal stack operations
(rotate, pop, drop, etc.).  Past these, we'd want:

  - a way to dereference the value on top of the stack as if it were a
    memory address;
  - a way to load constant values onto the stack;
  - a way to load register values onto the stack;
  - an efficient encoding, able to handle differing sizes for data types
    on different platforms.

Well, there ya go, you now have dwarf2 location expressions, possibly
with different opcode names, and maybe a few more or a few fewer
arithmetic operations, and a few more or a few fewer built-in opcodes
for common operations, to save space or not save space (the base
register+offset opcodes, the literal 0-31 opcodes, the register 0-31
opcodes, the fbreg opcode).
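
As a rough illustration of how little such a language needs, here is a
toy evaluator for a handful of opcodes: literals, a one-byte constant,
base register plus signed offset, plus, minus, and deref.  The opcode
values are the DWARF 2 ones; everything else (eval_expr, the fake_read_*
stand-ins for target access, the use of uint64_t for addresses) is an
assumption made for the sketch and is not the evaluator posted in this
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values from the DWARF 2 specification.  */
enum
{
  OP_deref = 0x06, OP_const1u = 0x08, OP_minus = 0x1c, OP_plus = 0x22,
  OP_lit0 = 0x30, OP_lit31 = 0x4f, OP_breg0 = 0x70, OP_breg31 = 0x8f
};

#define STACK_SLOTS 64

/* Hypothetical stand-ins for target access: register 6 holds 0x1000
   and the word at 0xff8 holds 42; everything else reads as 0.  */
static uint64_t
fake_read_reg (int regnum)
{
  return regnum == 6 ? 0x1000 : 0;
}

static uint64_t
fake_read_mem (uint64_t addr)
{
  return addr == 0xff8 ? 42 : 0;
}

/* Evaluate EXPR[0..LEN) and store the final top of stack in *RESULT.
   Returns 1 on success, 0 on a malformed or unsupported expression.  */
static int
eval_expr (const unsigned char *expr, size_t len, uint64_t *result)
{
  uint64_t stack[STACK_SLOTS];
  size_t sp = 0;		/* number of entries in use */
  size_t i = 0;

  assert (expr != NULL && result != NULL);
  while (i < len)
    {
      unsigned char op = expr[i++];
      uint64_t value = 0;	/* every opcode here pushes one value */

      if (op >= OP_lit0 && op <= OP_lit31)
	value = op - OP_lit0;
      else if (op >= OP_breg0 && op <= OP_breg31)
	{
	  /* The operand is a signed LEB128 offset; decode it in
	     unsigned arithmetic and sign-extend by hand.  */
	  uint64_t offset = 0;
	  unsigned int shift = 0;
	  unsigned char byte;

	  do
	    {
	      if (i >= len || shift >= 64)
		return 0;
	      byte = expr[i++];
	      offset |= (uint64_t) (byte & 0x7f) << shift;
	      shift += 7;
	    }
	  while (byte & 0x80);
	  if (shift < 64 && (byte & 0x40))
	    offset |= ~(uint64_t) 0 << shift;
	  value = fake_read_reg (op - OP_breg0) + offset;
	}
      else
	switch (op)
	  {
	  case OP_const1u:
	    if (i >= len)
	      return 0;
	    value = expr[i++];
	    break;
	  case OP_deref:
	    if (sp < 1)
	      return 0;
	    value = fake_read_mem (stack[--sp]);
	    break;
	  case OP_plus:
	  case OP_minus:
	    if (sp < 2)
	      return 0;
	    sp -= 2;		/* stack[sp] is second, stack[sp + 1] is top */
	    value = (op == OP_plus
		     ? stack[sp] + stack[sp + 1]
		     : stack[sp] - stack[sp + 1]);
	    break;
	  default:
	    return 0;
	  }

      if (sp >= STACK_SLOTS)
	return 0;
      stack[sp++] = value;
    }
  if (sp == 0)
    return 0;
  *result = stack[sp - 1];
  return 1;
}
```

With the fake target above, the five bytes 0x76 0x78 0x06 0x31 0x22
(breg6 with offset -8, deref, lit1, plus) read the word at 0xff8 and add
one to it.  Note that minus follows the DWARF definition: the former top
of stack is subtracted from the former second entry.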

Why not just use something we already have a standard for, if
it's going to be 99.9% the same?

For fun, I converted the stabs reader to generate these expressions,
in fact.  Didn't take long at all.

For proof it's self-contained, it's the exception handling format used
by gcc 2.95 and 3 (on a non-SJLJ platform, of course).

libgcc in 2.95 (it was moved to libsupc++ in 3.0,  so that it only
gets linked to c++ programs), in fact, contains the evaluator, and
it's linked with every gcc compiled program.

The only thing DWARF2'ish about the location expressions is that they
are defined in the DWARF2 spec.  They depend in no way on dwarf2 at
all. 

In fact, here's an evaluator for them for GDB. 
Feel free to point out any dwarf2 specific parts (struct dwarf_block,
which besides a prototype for evaluate_dwarf2_locdesc, is the only
thing in dwarf2eval.h, is just:

struct dwarf_block
{
        unsigned int size;
        unsigned char *data;
};

Hardly dwarf2 specific, it's just the chunk of location expression
data, and the size, so we know where it ends.

LEB128 (little endian base 128) is also used by Intel's IA64 unwind
format, which uses slightly different, incompatible opcodes to save
space, so LEB128 isn't a dwarf2 specific thing either).
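
As a standalone illustration of the LEB128 encoding itself, here is a minimal sketch in plain C with no GDB dependencies. The function names are made up for this example, and fixed-width uint64_t/int64_t stand in for ULONGEST/LONGEST; it assumes the encoded value fits in 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GDB code: decode an unsigned LEB128 value
   starting at BUF, store it in *OUT, and return the number of bytes
   consumed.  */
static size_t
uleb128_decode (const unsigned char *buf, uint64_t *out)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t n = 0;
  unsigned char byte;

  do
    {
      byte = buf[n++];
      /* Widen before shifting so groups above bit 31 are not lost.  */
      result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);          /* High bit set means more bytes follow.  */

  *out = result;
  return n;
}

/* Same idea for signed LEB128.  */
static size_t
sleb128_decode (const unsigned char *buf, int64_t *out)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t n = 0;
  unsigned char byte;

  do
    {
      byte = buf[n++];
      result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);

  /* Bit 6 of the last byte is the sign; extend it upward.  */
  if (shift < 64 && (byte & 0x40))
    result |= ~(uint64_t) 0 << shift;

  *out = (int64_t) result;
  return n;
}
```

For example, the bytes e5 8e 26 decode as unsigned 624485, and c0 bb 78 decode as signed -123456.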

This code handles the complete language.



#include "defs.h"
#include "symtab.h"
#include "frame.h"
#include "value.h"
#include "gdbcore.h"
#include "elf/dwarf2.h"
#include "dwarf2eval.h"

/* Decode the unsigned LEB128 constant at BUF into the variable pointed to
   by R, and return the new value of BUF.  */

static unsigned char *
read_uleb128 (unsigned char *buf, ULONGEST *r)
{
  unsigned shift = 0;
  ULONGEST result = 0;

  while (1)
    {
      unsigned char byte = *buf++;
      result |= (ULONGEST) (byte & 0x7f) << shift;
      if ((byte & 0x80) == 0)
	break;
      shift += 7;
    }
  *r = result;
  return buf;
}

/* Decode the signed LEB128 constant at BUF into the variable pointed to
   by R, and return the new value of BUF.  */

static unsigned char *
read_sleb128 (unsigned char *buf, LONGEST *r)
{
  unsigned shift = 0;
  LONGEST result = 0;
  unsigned char byte;
  
  while (1)
    {
      byte = *buf++;
      result |= (ULONGEST) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
	break;
    }
  if (shift < (sizeof (*r) * 8) && (byte & 0x40) != 0)
    result |= ~(ULONGEST) 0 << shift;
  
  *r = result;
  return buf;
}
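
/* Execute the location expression bytes in [OP_PTR, OP_END) in the
   context of FRAME, with INITIAL as the first stack entry, and return
   the value left on top of the stack.  */
static CORE_ADDR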
execute_stack_op (struct symbol *var,  unsigned char *op_ptr, unsigned char *op_end,
		  struct frame_info *frame, CORE_ADDR initial, value_ptr *currval)
{
  CORE_ADDR stack[64];	/* ??? Assume this is enough. */
  int stack_elt;
  stack[0] = initial;
  stack_elt = 1;
  while (op_ptr < op_end)
    {
      enum dwarf_location_atom op = *op_ptr++;
      ULONGEST result, reg;
      LONGEST offset;
      switch (op)
	{
	case DW_OP_lit0:
	case DW_OP_lit1:
	case DW_OP_lit2:
	case DW_OP_lit3:
	case DW_OP_lit4:
	case DW_OP_lit5:
	case DW_OP_lit6:
	case DW_OP_lit7:
	case DW_OP_lit8:
	case DW_OP_lit9:
	case DW_OP_lit10:
	case DW_OP_lit11:
	case DW_OP_lit12:
	case DW_OP_lit13:
	case DW_OP_lit14:
	case DW_OP_lit15:
	case DW_OP_lit16:
	case DW_OP_lit17:
	case DW_OP_lit18:
	case DW_OP_lit19:
	case DW_OP_lit20:
	case DW_OP_lit21:
	case DW_OP_lit22:
	case DW_OP_lit23:
	case DW_OP_lit24:
	case DW_OP_lit25:
	case DW_OP_lit26:
	case DW_OP_lit27:
	case DW_OP_lit28:
	case DW_OP_lit29:
	case DW_OP_lit30:
	case DW_OP_lit31:
	  result = op - DW_OP_lit0;
	  break;
	  
	case DW_OP_addr:
	  result = (CORE_ADDR) extract_unsigned_integer (op_ptr, TARGET_PTR_BIT / TARGET_CHAR_BIT);
	  op_ptr += TARGET_PTR_BIT / TARGET_CHAR_BIT;
	  break;
	  
	case DW_OP_const1u:
	  result = extract_unsigned_integer (op_ptr, 1);
	  op_ptr += 1;
	  break;
	case DW_OP_const1s:
	  result = extract_signed_integer (op_ptr, 1);
	  op_ptr += 1;
	  break;
	case DW_OP_const2u:
	  result = extract_unsigned_integer (op_ptr, 2);
	  op_ptr += 2;
	  break;
	case DW_OP_const2s:
	  result = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  break;
	case DW_OP_const4u:
	  result = extract_unsigned_integer (op_ptr, 4);
	  op_ptr += 4;
	  break;
	case DW_OP_const4s:
	  result = extract_signed_integer (op_ptr, 4);
	  op_ptr += 4;
	  break;
	case DW_OP_const8u:
	  result = extract_unsigned_integer (op_ptr, 8);
	  op_ptr += 8;
	  break;
	case DW_OP_const8s:
	  result = extract_signed_integer (op_ptr, 8);
	  op_ptr += 8;
	  break;
	case DW_OP_constu:
	  op_ptr = read_uleb128 (op_ptr, &result);
	  break;
	case DW_OP_consts:
	  op_ptr = read_sleb128 (op_ptr, &offset);
	  result = offset;
	  break;

	case DW_OP_reg0:
	case DW_OP_reg1:
	case DW_OP_reg2:
	case DW_OP_reg3:
	case DW_OP_reg4:
	case DW_OP_reg5:
	case DW_OP_reg6:
	case DW_OP_reg7:
	case DW_OP_reg8:
	case DW_OP_reg9:
	case DW_OP_reg10:
	case DW_OP_reg11:
	case DW_OP_reg12:
	case DW_OP_reg13:
	case DW_OP_reg14:
	case DW_OP_reg15:
	case DW_OP_reg16:
	case DW_OP_reg17:
	case DW_OP_reg18:
	case DW_OP_reg19:
	case DW_OP_reg20:
	case DW_OP_reg21:
	case DW_OP_reg22:
	case DW_OP_reg23:
	case DW_OP_reg24:
	case DW_OP_reg25:
	case DW_OP_reg26:
	case DW_OP_reg27:
	case DW_OP_reg28:
	case DW_OP_reg29:
	case DW_OP_reg30:
	case DW_OP_reg31:	
	  {
	    /* Allocate a buffer to store the register data */
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);	   	    

	    get_saved_register (buf, NULL, NULL, frame, op - DW_OP_reg0, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (op - DW_OP_reg0));
	    if (currval != NULL)
	      {
		store_typed_address (VALUE_CONTENTS_RAW (*currval), SYMBOL_TYPE (var), result);
		VALUE_LVAL (*currval) = not_lval;
	      }
	  }
	  break;
	case DW_OP_regx:   
	  {
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);    

	    op_ptr = read_uleb128 (op_ptr, &reg);
	    get_saved_register (buf, NULL, NULL, frame, reg, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (reg));
	  }
	  break;
	case DW_OP_breg0:
	case DW_OP_breg1:
	case DW_OP_breg2:
	case DW_OP_breg3:
	case DW_OP_breg4:
	case DW_OP_breg5:
	case DW_OP_breg6:
	case DW_OP_breg7:
	case DW_OP_breg8:
	case DW_OP_breg9:
	case DW_OP_breg10:
	case DW_OP_breg11:
	case DW_OP_breg12:
	case DW_OP_breg13:
	case DW_OP_breg14:
	case DW_OP_breg15:
	case DW_OP_breg16:
	case DW_OP_breg17:
	case DW_OP_breg18:
	case DW_OP_breg19:
	case DW_OP_breg20:
	case DW_OP_breg21:
	case DW_OP_breg22:
	case DW_OP_breg23:
	case DW_OP_breg24:
	case DW_OP_breg25:
	case DW_OP_breg26:
	case DW_OP_breg27:
	case DW_OP_breg28:
	case DW_OP_breg29:
	case DW_OP_breg30:
	case DW_OP_breg31:
	  {
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);
	    
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    get_saved_register (buf, NULL, NULL, frame, op - DW_OP_breg0, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (op - DW_OP_breg0));
	    result += offset;
	  }
	  break;
	case DW_OP_bregx:
	  {
	    char *buf = (char *) alloca (MAX_REGISTER_RAW_SIZE);
	   
	    op_ptr = read_uleb128 (op_ptr, &reg);
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    get_saved_register (buf, NULL, NULL, frame, reg, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (reg));
	    result += offset;
	  }
	  break;
	case DW_OP_fbreg:
	  {
	    struct symbol *framefunc;
	    unsigned char *datastart;
	    unsigned char *dataend;
	    struct dwarf_block *theblock;

	    framefunc = get_frame_function (frame);
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    theblock = SYMBOL_FRAME_LOC_EXPR (framefunc);
	    datastart = theblock->data;
	    dataend = theblock->data + theblock->size;
	    result = execute_stack_op (var, datastart, dataend, frame, 0, NULL) + offset; 
	  }
	  break;
	case DW_OP_dup:
	  if (stack_elt < 1)
	    abort ();
	  result = stack[stack_elt - 1];
	  break;
	  
	case DW_OP_drop:
	  if (--stack_elt < 0)
	    abort ();
	  goto no_push;
	  
	case DW_OP_pick:
	  offset = *op_ptr++;
	  if (offset >= stack_elt - 1)
	    abort ();
	  result = stack[stack_elt - 1 - offset];
	  break;
	  
	case DW_OP_over:
	  if (stack_elt < 2)
	    abort ();
	  result = stack[stack_elt - 2];
	  break;

	case DW_OP_rot:
	  {
	    CORE_ADDR t1, t2, t3;
	    
	    if (stack_elt < 3)
	      abort ();
	    t1 = stack[stack_elt - 1];
	    t2 = stack[stack_elt - 2];
	    t3 = stack[stack_elt - 3];
	    stack[stack_elt - 1] = t2;
	    stack[stack_elt - 2] = t3;
	    stack[stack_elt - 3] = t1;
	    goto no_push;
	  }
	  
	case DW_OP_deref:
	case DW_OP_deref_size:
	case DW_OP_abs:
	case DW_OP_neg:
	case DW_OP_not:
	case DW_OP_plus_uconst:
	  /* Unary operations.  */
	  if (--stack_elt < 0)
	    abort ();
	  result = stack[stack_elt];
	  
	  switch (op)
	    {
	    case DW_OP_deref:
	      {
		result = (CORE_ADDR) read_memory_unsigned_integer (result, TARGET_PTR_BIT / TARGET_CHAR_BIT);
	      }
	      break;
	      
	    case DW_OP_deref_size:
	      {
		switch (*op_ptr++)
		  {
		  case 1:
		    result = read_memory_unsigned_integer (result, 1);
		    break;
		  case 2:
		    result = read_memory_unsigned_integer (result, 2);
		    break;
		  case 4:
		    result = read_memory_unsigned_integer (result, 4);
		    break;
		  case 8:
		    result = read_memory_unsigned_integer (result, 8);
		    break;
		  default:
		    abort ();
		  }
	      }
	      break;
	      
	    case DW_OP_abs:
	      if ((LONGEST) result < 0)
		result = -result;
	      break;
	    case DW_OP_neg:
	      result = -result;
	      break;
	    case DW_OP_not:
	      result = ~result;
	      break;
	    case DW_OP_plus_uconst:
	      op_ptr = read_uleb128 (op_ptr, &reg);
	      result += reg;
	      break;
	    }
	  break;
	  
	case DW_OP_and:
	case DW_OP_div:
	case DW_OP_minus:
	case DW_OP_mod:
	case DW_OP_mul:
	case DW_OP_or:
	case DW_OP_plus:
	case DW_OP_shl:
	case DW_OP_shr:
	case DW_OP_shra:
	case DW_OP_xor:
	case DW_OP_le:
	case DW_OP_ge:
	case DW_OP_eq:
	case DW_OP_lt:
	case DW_OP_gt:
	case DW_OP_ne:
	  {
	    /* Binary operations.  */
	    CORE_ADDR first, second;
	    if ((stack_elt -= 2) < 0)
	      abort ();
	    second = stack[stack_elt];
	    first = stack[stack_elt + 1];
	    
	    switch (op)
	      {
	      case DW_OP_and:
		result = second & first;
		break;
	      case DW_OP_div:
		result = (LONGEST)second / (LONGEST)first;
		break;
	      case DW_OP_minus:
		result = second - first;
		break;
	      case DW_OP_mod:
		result = (LONGEST)second % (LONGEST)first;
		break;
	      case DW_OP_mul:
		result = second * first;
		break;
	      case DW_OP_or:
		result = second | first;
		break;
	      case DW_OP_plus:
		result = second + first;
		break;
	      case DW_OP_shl:
		result = second << first;
		break;
	      case DW_OP_shr:
		result = second >> first;
		break;
	      case DW_OP_shra:
		result = (LONGEST)second >> first;
		break;
	      case DW_OP_xor:
		result = second ^ first;
		break;
	      case DW_OP_le:
		result = (LONGEST)first <= (LONGEST)second;
		break;
	      case DW_OP_ge:
		result = (LONGEST)first >= (LONGEST)second;
		break;
	      case DW_OP_eq:
		result = (LONGEST)first == (LONGEST)second;
		break;
	      case DW_OP_lt:
		result = (LONGEST)first < (LONGEST)second;
		break;
	      case DW_OP_gt:
		result = (LONGEST)first > (LONGEST)second;
		break;
	      case DW_OP_ne:
		result = (LONGEST)first != (LONGEST)second;
		break;
	      }
	  }
	  break;
	  
	case DW_OP_skip:
	  offset = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  op_ptr += offset;
	  goto no_push;
	  
	case DW_OP_bra:
	  if (--stack_elt < 0)
	    abort ();
	  offset = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  if (stack[stack_elt] != 0)
	    op_ptr += offset;
	  goto no_push;
	  
	case DW_OP_nop:
	  goto no_push;
	  
	default:
	  abort ();
	}
      
      /* Most things push a result value.  */
      if ((size_t) stack_elt >= sizeof(stack)/sizeof(*stack))
	abort ();
      stack[stack_elt++] = result;
    no_push:;
    }
  
  /* We were executing this program to get a value.  It should be
     at top of stack.  */
  if (stack_elt < 1)
    abort ();
  return stack[stack_elt - 1];
}

/* Evaluate a location description, given in THEBLOCK, in the
   context of frame FRAME. */
value_ptr 
evaluate_loc_desc (struct symbol *var, struct frame_info *frame, 
			 struct dwarf_block *theblock, struct type *type)
{
  CORE_ADDR result;
  value_ptr retval;
  retval = allocate_value(type);
  VALUE_LVAL (retval) = lval_memory;
  VALUE_BFD_SECTION (retval) = SYMBOL_BFD_SECTION (var);
  result = execute_stack_op (var, theblock->data, theblock->data+theblock->size, frame, 0, &retval);
  if (VALUE_LVAL (retval) == lval_memory)
    {
      VALUE_LAZY (retval) = 1;
      VALUE_ADDRESS (retval) = result;
    }
  return retval;
}



-- 
"What do batteries run on?
"-Steven Wright



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-21 18:49   ` Daniel Berlin
@ 2001-05-22 12:46     ` Jim Blandy
  2001-05-22 13:51       ` Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: Jim Blandy @ 2001-05-22 12:46 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches

Daniel Berlin <dan@cgsoftware.com> writes:
> However, IMHO, the location expression language that dwarf2 uses is
> self-contained, and suited for GDB's needs. Perfectly in fact. It
> provides no more or less than necessary to describe our locations,
> except where it saved a little space in real use (see below) and
> it's trivial to convert other expression languages into it. The
> location list is simply a list of ranges and location expressions,
> which is what we would have come up with anyway, judging from GDB structures.
> 
> In fact, I'd wager if we tried, we'd come up with something almost
> exactly like d2 location expressions:
> 
> We would want some kind of simple, stack based language.
> Normal binary and unary arithmetic operations, and normal stack
> operations (rotate, pop, drop, etc).
> Past these, we'd want a way of:
> dereferencing the value on top of the stack as if it was a memory address.
> A way to load constant values onto the stack
> A way to load register values onto the stack
> An efficient encoding, able to handle differing sizes for data types
> on different platforms.

Yes, you're right --- what we'd come up with would be very similar to
Dwarf 2 location expressions.

> Why bother not just using something we already have a standard for,
> if it's going to be 99.9% the same?

There's nothing technically wrong with using Dwarf 2 location
expressions and location lists internally in GDB.  We could do a
perfectly correct implementation that would work fine.  When I say
`self-contained', I'm not talking about the implementation; I'm more
concerned with modularity, and who has control of the definition.

I'm not sure how to put this convincingly.  Let me try to draw an
analogy with another problematic area.  GDB stores the machine's
registers as a big block of bytes.  The layout of GDB's register file
is the same as the layout of the 'G' packet.  While there's nothing
technically wrong with this, it's ended up being a technical
nightmare, because whenever we want to do something reasonable with
GDB's internal register file, we have to worry about the remote
protocol.

Using a common structure there simplified the remote protocol code,
but it also introduced a binding (in the mechanical sense, of things
rubbing against each other in a way that prevents the machine from
operating smoothly) between the remote protocol and GDB's internals
that has been a pain in the neck.

I don't want to introduce this sort of binding between GDB's debug
readers and its expression evaluator, by requiring that they use the
same representation for location expressions.

I admit that this is a very touchy-feely argument.  If the other
maintainers don't share my sense of unease with using Dwarf 2 location
expressions directly in GDB, then I'll drop the objection.


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-22 12:46     ` Jim Blandy
@ 2001-05-22 13:51       ` Daniel Berlin
  2001-05-22 23:14         ` Jim Blandy
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-05-22 13:51 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Daniel Berlin, gdb-patches

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dan@cgsoftware.com> writes:
> > However, IMHO, the location expression language that dwarf2 uses is
> > self-contained, and suited for GDB's needs. Perfectly in fact. It
> > provides no more or less than necessary to describe our locations,
> > except where it saved a little space in real use (see below) and
> > it's trivial to convert other expression languages into it. The
> > location list is simply a list of ranges and location expressions,
> > which is what we would have come up with anyway, judging from GDB structures.
> > 
> > In fact, I'd wager if we tried, we'd come up with something almost
> > exactly like d2 location expressions:
> > 
> > We would want some kind of simple, stack based language.
> > Normal binary and unary arithmetic operations, and normal stack
> > operations (rotate, pop, drop, etc).
> > Past these, we'd want a way of:
> > dereferencing the value on top of the stack as if it was a memory address.
> > A way to load constant values onto the stack
> > A way to load register values onto the stack
> > An efficient encoding, able to handle differing sizes for data types
> > on different platforms.
> 
> Yes, you're right --- what we'd come up with would be very similar to
> Dwarf 2 location expressions.
> 
> > Why bother not just using something we already have a standard for,
> > if it's going to be 99.9% the same?
> 
> There's nothing technically wrong with using Dwarf 2 location
> expressions and location lists internally in GDB.  We could do a
> perfectly correct implementation that would work fine.  When I say
> `self-contained', I'm not talking about the implementation; I'm more
> concerned with modularity, and who has control of the definition.
Okey dokey.
But then again, remember, the people who have control of the dwarf2
location expression definition are those who know what they need to be
able to represent, and be able to read. Namely, the compiler writers
and debugger writers.
That's all the dwarf2 committee is made up of: compiler and debugger
vendors.

However, if it helps make you feel we have more control over it, I'll
happily rewrite the location expression section of the dwarf2 doc,
replacing DWARF2 with GDB, throw it in the docs dir as a description
of our location expression format, and we can use it as a starting
point, and not be beholden to anyone at all, or even feel the
slightest twinge of guilt about adding/removing/changing opcodes.

> 
> I'm not sure how to put this convincingly.  Let me try to draw an
> analogy with another problematic area.  GDB stores the machine's
> registers as a big block of bytes.  The layout of GDB's register file
> is the same as the layout of the 'G' packet.  While there's nothing
> technically wrong with this, it's ended up being a technical
> nightmare, because whenever we want to do something reasonable with
> GDB's internal register file, we have to worry about the remote
> protocol.

Okay, but the solution to this, IMHO, is to make the structures
opaque, and provide accessor routines, so that the only thing that can
touch them is a few routines in a file that knows the actual internal
format, and how to do the various ops we want to do (like adding a
dereference to the top of the stack, etc).

Then, we aren't dependent in the debug readers on the actual format
involved, and the only thing that is dependent on the format is the
actual evaluator for location expressions.
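
A minimal sketch of the opaque-structure idea, shown as one file for brevity. Every name here (struct loc_expr and the loc_expr_* accessors) is hypothetical and is not the interface proposed later in this thread; the fixed capacity is only there to keep the example short.

```c
#include <stdlib.h>

/* What a header might expose: an incomplete type plus accessors.
   All names here are hypothetical.  */
struct loc_expr;

struct loc_expr *loc_expr_new (void);
void loc_expr_free (struct loc_expr *e);
int loc_expr_push (struct loc_expr *e, int opcode, long arg);
unsigned loc_expr_length (const struct loc_expr *e);
int loc_expr_opcode_at (const struct loc_expr *e, unsigned i);
long loc_expr_arg_at (const struct loc_expr *e, unsigned i);

/* What only the one implementation file would see.  Code that includes
   just the declarations above cannot write e->len; the compiler
   rejects a member access through an incomplete type.  */
#define LOC_EXPR_MAX 32

struct loc_expr
{
  unsigned len;
  int opcodes[LOC_EXPR_MAX];
  long args[LOC_EXPR_MAX];
};

struct loc_expr *
loc_expr_new (void)
{
  return calloc (1, sizeof (struct loc_expr));
}

void
loc_expr_free (struct loc_expr *e)
{
  free (e);
}

/* Append one operation; returns 0 on success, -1 if the sketch's fixed
   capacity is exhausted.  */
int
loc_expr_push (struct loc_expr *e, int opcode, long arg)
{
  if (e->len >= LOC_EXPR_MAX)
    return -1;
  e->opcodes[e->len] = opcode;
  e->args[e->len] = arg;
  e->len++;
  return 0;
}

unsigned
loc_expr_length (const struct loc_expr *e)
{
  return e->len;
}

int
loc_expr_opcode_at (const struct loc_expr *e, unsigned i)
{
  return e->opcodes[i];
}

long
loc_expr_arg_at (const struct loc_expr *e, unsigned i)
{
  return e->args[i];
}
```

With this split, a debug reader only ever calls loc_expr_push, so the storage layout can change without touching any reader.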

> 
> Using a common structure there simplified the remote protocol code,
> but it also introduced a binding (in the mechanical sense, of things
> rubbing against each other in a way that prevents the machine from
> operating smoothly) between the remote protocol and GDB's internals
> that has been a pain in the neck.

I believe the binding was introduced because of an assumption about
internal structure.  If you can't make these assumptions (attempts to
indirect through the opaque structure would give you compiler errors,
because it would have an incomplete type, or be a void *, or
something, depending on how you went about making it opaque), then you
are forced to do it the right way.

> 
> I don't want to introduce this sort of binding between GDB's debug
> readers and its expression evaluator, by requiring that they use the
> same representation for location expressions.

There is no such binding introduced; the expression evaluator language
has always been separate, and will continue to be.  Unless you mean
the symbol expression evaluator we would be implementing, in which
case, it would just use the accessor functions itself to get the
opcodes and arguments. By doing this, the actual gdb internal
representation could be different from what the language says as
well. The accessors would just transform it on the fly to the language
we have defined for the symbol expressions.

> 
> I admit that this is a very touchy-feely argument.  If the other
> maintainers don't share my sense of unease with using Dwarf 2 location
> expressions directly in GDB, then I'll drop the objection.

I don't think it should be *visible* that they are d2 location
expressions, to anything but a small number of routines that deal with
the internal gdb representation of a symbol location expression.  I'll
happily make the necessary accessors and opaque types and whatnot to
enforce this.

-- 
"There's a pizza place near where I live that sells only slices.
in the back you can see a guy tossing a triangle in the air.
"-Steven Wright


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-22 13:51       ` Daniel Berlin
@ 2001-05-22 23:14         ` Jim Blandy
  2001-05-23  8:51           ` Daniel Berlin
  2001-05-23 11:53           ` Daniel Berlin
  0 siblings, 2 replies; 28+ messages in thread
From: Jim Blandy @ 2001-05-22 23:14 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches

Daniel Berlin <dan@cgsoftware.com> writes:
> However, if it helps make you feel we have more control over it,
> I'll happily rewrite the location expression section of the dwarf2
> doc, replacing DWARF2 with GDB, throw it in the docs dir as a
> description of our location expression format, and we can use it as a
> starting point, and not be beholden to anyone at all, or even feel
> the slightest twinge of guilt about adding/removing/changing
> opcodes.

Yes, that's a step in the right direction.

I don't think it matters whether the representation is opaque or not.
Opacity has its price --- opacity can cause obscurity.

What's important is for the code to take an explicit step to translate
all debug info location information (including Dwarf 2 location
expressions) into the internal form.  The code must not assume that,
say, for Dwarf 2, the translation into internal form is trivial.

One way to enforce this would be to take the opportunity to simplify
the language a little bit.  For example, I don't think we need
forty-two different opcodes for pushing constants (DW_OP_lit*,
DW_OP_addr, DW_OP_const*); I don't think we need the compound opcodes
like DW_OP_breg*, DW_OP_plus_uconst; and so on.  And pushing constants
out into an array on the side of the bytecode stream, with a bytecode
to fetch constants by index, would allow us to bcache bytecode strings
more effectively --- everyone whose location is a constant offset from
a register, say, could share a single bytecode string.
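
A rough sketch of the side-array idea, under stated assumptions: the KOP_* opcodes, struct pooled_loc and pooled_eval are invented here, and operand bytes are indexes into a per-symbol constant array. Two symbols at different register+offset locations then point at the very same bytecode string.

```c
#include <stdint.h>

/* Invented opcodes: each operand byte is an index into the constant
   array rather than an inline value.  */
enum { KOP_REG, KOP_CONST, KOP_PLUS, KOP_END };

struct pooled_loc
{
  const unsigned char *code;    /* May be shared between symbols.  */
  const int64_t *consts;        /* Per-symbol constants.  */
};

/* The one bytecode string every "register + offset" symbol can share:
   push register number consts[0], push consts[1], add.  */
static const unsigned char reg_plus_offset[] =
  { KOP_REG, 0, KOP_CONST, 1, KOP_PLUS, KOP_END };

static int64_t
pooled_eval (const struct pooled_loc *loc, const int64_t *regs)
{
  int64_t stack[8];
  int sp = 0;
  const unsigned char *p = loc->code;

  /* No bounds checking: the sketch trusts well-formed bytecode.  */
  while (*p != KOP_END)
    {
      switch (*p++)
        {
        case KOP_REG:
          stack[sp++] = regs[loc->consts[*p++]];
          break;
        case KOP_CONST:
          stack[sp++] = loc->consts[*p++];
          break;
        case KOP_PLUS:
          sp--;
          stack[sp - 1] += stack[sp];
          break;
        }
    }
  return stack[sp - 1];
}
```

A symbol at register 5 minus 8 carries the constants { 5, -8 }, one at register 6 plus 16 carries { 6, 16 }, and both use reg_plus_offset as their code.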

This would have the side effect (which may seem like make-work, but I
think is actually a good thing) of preventing people from just dumping
blocks of Dwarf2 locexpr bytecode into GDB's internals --- creating
the sort of binding we're trying to avoid.


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-22 23:14         ` Jim Blandy
@ 2001-05-23  8:51           ` Daniel Berlin
  2001-05-23 11:53           ` Daniel Berlin
  1 sibling, 0 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-05-23  8:51 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Daniel Berlin, gdb-patches

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dan@cgsoftware.com> writes:
> > However, if it helps make you feel we have more control over it,
> > I'll happily rewrite the location expression section of the dwarf2
> > doc, replacing DWARF2 with GDB, throw it in the docs dir as a
> > description of our location expression format, and we can use it as a
> > starting point, and not be beholden to anyone at all, or even feel
> > the slightest twinge of guilt about adding/removing/changing
> > opcodes.
> 
> Yes, that's a step in the right direction.
> 
> I don't think it matters whether the representation is opaque or not.
> Opacity has its price --- opacity can cause obscurity.

> 
> What's important is for the code to take an explicit step to translate
> all debug info location information (including Dwarf 2 location
> expressions) into the internal form.  The code must not assume that,
> say, for Dwarf 2, the translation into internal form is trivial.
> 
> One way to enforce this would be to take the opportunity to simplify
> the language a little bit.  For example, I don't think we need
> forty-two different opcodes for pushing constants (DW_OP_lit*,
> DW_OP_addr, DW_OP_const*); I don't think we need the compound opcodes
> like DW_OP_breg*, DW_OP_plus_uconst; and so on.  And pushing constants
> out into an array on the side of the bytecode stream, with a bytecode
> to fetch constants by index, would allow us to bcache bytecode strings
> more effectively --- everyone whose location is a constant offset from
> a register, say, could share a single bytecode string.

Funny you mention it; I had just done the first of these (removing a
lot of space-saving opcodes) last night.

That's when I noticed we can't properly pass LONGEST values (long long
int on my powerbook) through vararg functions.
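
The post doesn't say what exactly went wrong. One well-known hazard with wide integers and varargs in C, which may or may not be the one meant here, is that the callee's va_arg type must match what the caller actually passed, so plain int literals need an explicit cast. The helper below is a hypothetical sketch, not code from the patch:

```c
#include <stdarg.h>

/* Hypothetical varargs helper: sum COUNT operands, each of which the
   caller must pass as long long.  */
static long long
sum_operands (int count, ...)
{
  va_list ap;
  long long total = 0;
  int i;

  va_start (ap, count);
  for (i = 0; i < count; i++)
    total += va_arg (ap, long long);  /* Reads a full long long.  */
  va_end (ap);
  return total;
}
```

A call such as sum_operands (2, (long long) 1, (long long) -8) is well defined. Writing sum_operands (2, 1, -8) passes two ints instead, which is undefined behaviour and tends to fail differently from platform to platform.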


> 
> This would have the side effect (which may seem like make-work, but I
> think is actually a good thing) of preventing people from just dumping
> blocks of Dwarf2 locexpr bytecode into GDB's internals --- creating
> the sort of binding we're trying to avoid.
It's not possible anyway.
The opcode numbers are different already.


The enum used to look like:

enum dwarf_location_atom
{
        DW_OP_addr = 0x03,
        DW_OP_xxx = 0x49,
        DW_OP_yyy = 0x42,
        /* etc. */
};

It now looks like:

enum locexpr_opcode
{
        GLO_addr,
        GLO_xxx,
        GLO_yyy
};
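
With renumbered opcodes, a reader has to translate rather than copy bytes. A hedged sketch of such a step follows: the opcode names are borrowed from the GLO_ naming in this thread, the DWARF 2 encodings (0x30 for DW_OP_lit0, 0x70 for DW_OP_breg0, 0x22 for DW_OP_plus) are the standard ones, and struct glo_insn and translate_dwarf2 are hypothetical. Only those three opcode families are handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Internal opcodes, named after the GLO_ enum in this thread; the
   struct and function below are hypothetical.  */
enum glo_op { GLO_consts, GLO_regx, GLO_plus };

struct glo_insn { enum glo_op op; int64_t arg; };

/* Translate the DWARF 2 bytes in [P, END) into OUT, which has room for
   MAX instructions.  Returns the number of instructions written, or -1
   on an unsupported opcode, truncated input, or lack of room.  */
static int
translate_dwarf2 (const unsigned char *p, const unsigned char *end,
                  struct glo_insn *out, int max)
{
  int n = 0;

  while (p < end)
    {
      unsigned char op = *p++;

      if (op >= 0x30 && op <= 0x4f)     /* DW_OP_lit0 .. DW_OP_lit31 */
        {
          if (n + 1 > max)
            return -1;
          out[n].op = GLO_consts;
          out[n++].arg = op - 0x30;
        }
      else if (op >= 0x70 && op <= 0x8f) /* DW_OP_breg0 .. DW_OP_breg31 */
        {
          /* Decode the SLEB128 offset operand.  */
          uint64_t u = 0;
          unsigned shift = 0;
          unsigned char b;

          do
            {
              if (p >= end)
                return -1;
              b = *p++;
              u |= (uint64_t) (b & 0x7f) << shift;
              shift += 7;
            }
          while ((b & 0x80) && shift < 64);
          if (shift < 64 && (b & 0x40))
            u |= ~(uint64_t) 0 << shift;

          /* One compound opcode becomes three simple ones.  */
          if (n + 3 > max)
            return -1;
          out[n].op = GLO_regx;
          out[n++].arg = op - 0x70;
          out[n].op = GLO_consts;
          out[n++].arg = (int64_t) u;
          out[n].op = GLO_plus;
          out[n++].arg = 0;
        }
      else if (op == 0x22)              /* DW_OP_plus */
        {
          if (n + 1 > max)
            return -1;
          out[n].op = GLO_plus;
          out[n++].arg = 0;
        }
      else
        return -1;
    }
  return n;
}
```

For instance, the two bytes 75 78 (DW_OP_breg5 with offset -8) come out as GLO_regx 5, GLO_consts -8, GLO_plus.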



-- 
"My girlfriend does her nails with white-out.  When she's asleep,
I go over there and write misspelled words on them.
"-Steven Wright



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-22 23:14         ` Jim Blandy
  2001-05-23  8:51           ` Daniel Berlin
@ 2001-05-23 11:53           ` Daniel Berlin
  2001-05-23 21:53             ` Jim Blandy
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-05-23 11:53 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Daniel Berlin, gdb-patches

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dan@cgsoftware.com> writes:
> > However, if it helps make you feel we have more control over it,
> > I'll happily rewrite the location expression section of the dwarf2
> > doc, replacing DWARF2 with GDB, throw it in the docs dir as a
> > description of our location expression format, and we can use it as a
> > starting point, and not be beholden to anyone at all, or even feel
> > the slightest twinge of guilt about adding/removing/changing
> > opcodes.
> 
> Yes, that's a step in the right direction.
> 
> I don't think it matters whether the representation is opaque or not.
> Opacity has its price --- opacity can cause obscurity.
> 
> What's important is for the code to take an explicit step to translate
> all debug info location information (including Dwarf 2 location
> expressions) into the internal form.  The code must not assume that,
> say, for Dwarf 2, the translation into internal form is trivial.
> 
> One way to enforce this would be to take the opportunity to simplify
> the language a little bit.  For example, I don't think we need
> forty-two different opcodes for pushing constants (DW_OP_lit*,
> DW_OP_addr, DW_OP_const*); I don't think we need the compound opcodes
> like DW_OP_breg*, DW_OP_plus_uconst; and so on.  And pushing constants
> out into an array on the side of the bytecode stream, with a bytecode
> to fetch constants by index, would allow us to bcache bytecode strings
> more effectively --- everyone whose location is a constant offset from
> a register, say, could share a single bytecode string.

Here is how the language currently stands (this is actually symeval.h pasted): 

#ifndef SYMEVAL_H
#define SYMEVAL_H

enum locexpr_opcode
  {
    GLO_addr,
    GLO_deref,
    GLO_constu,
    GLO_consts,
    GLO_dup,
    GLO_drop,
    GLO_over,
    GLO_pick,
    GLO_swap,
    GLO_rot,
    GLO_xderef,
    GLO_abs,
    GLO_and,
    GLO_div,
    GLO_minus,
    GLO_mod,
    GLO_mul,
    GLO_neg,
    GLO_not,
    GLO_or,
    GLO_plus,
    GLO_shl,
    GLO_shr,
    GLO_shra,
    GLO_xor,
    GLO_bra,
    GLO_eq,
    GLO_ge,
    GLO_gt,
    GLO_le,
    GLO_lt,
    GLO_ne,
    GLO_skip,
    GLO_regx,
    GLO_fbreg,
    GLO_bregx,
    GLO_piece,
    GLO_deref_size,
    GLO_xderef_size
  };


void init_locexpr (locexpr *);
void destroy_locexpr (locexpr *);
void locexpr_push_op  (locexpr, enum locexpr_opcode, const char *, ...);
void locexpr_get_op (locexpr, unsigned int,  enum locexpr_opcode *);
void locexpr_get_longest (locexpr, unsigned int, LONGEST *);
void locexpr_get_ulongest (locexpr, unsigned int, ULONGEST *);
void locexpr_get_uchar (locexpr, unsigned int, unsigned char *);
value_ptr evaluate_locexpr (struct symbol *, struct frame_info *, locexpr, struct type *);
#endif

I kept bregx only because we already have a location type for based
registers, so it seemed to make sense (Admittedly, all it does is take
us from two opcodes to one for this).

fbreg we can't get rid of, we don't necessarily know the frame
register at debug reading time (it may itself be using a location list
to say where it is).

xderef we can't get rid of, we have architectures with multiple
address spaces, so it'll be necessary for them.

the *_size we can't get rid of, because we frequently only want a
certain number of bytes of a pointer.

We could get rid of the "default" opcodes deref and xderef, and just
always use xderef_size and deref_size with an explicit size.

I think other than that, I've gotten rid of all of the unneeded
opcodes.

If everyone agrees, I'll write up the docs on these opcodes, and the
functions to manipulate location expressions.

It currently has no dependencies on anything at all, so I can submit
the symeval.[ch] files, and the symtab.h change to add locexpr to the
symbol struct (and to the location types enum), without affecting
anything else.

Eventually, I'll get rid of all the other location types, but this can
be done incrementally, so it's no rush at all.

> 
> This would have the side effect (which may seem like make-work, but I
> think is actually a good thing) of preventing people from just dumping
> blocks of Dwarf2 locexpr bytecode into GDB's internals --- creating
> the sort of binding we're trying to avoid.

-- 
"I was watching the Superbowl with my 92 year old grandfather.
The team scored a touchdown.  They showed the instant replay.
He thought they scored another one.  I was gonna tell him, but I
figured the game *he* was watching was better.
"-Steven Wright



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-23 11:53           ` Daniel Berlin
@ 2001-05-23 21:53             ` Jim Blandy
  2001-05-23 22:56               ` Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: Jim Blandy @ 2001-05-23 21:53 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches, gdb

Hey, everyone.  If folks think the general direction we're going here
(explained in my first post to this thread) is not a good idea, speak
now.  Stuff is happening.

Daniel Berlin <dan@cgsoftware.com> writes:
> void init_locexpr (locexpr *);
> void destroy_locexpr (locexpr *);
> void locexpr_push_op  (locexpr, enum locexpr_opcode, const char *, ...);
> void locexpr_get_op (locexpr, unsigned int,  enum locexpr_opcode *);
> void locexpr_get_longest (locexpr, unsigned int, LONGEST *);
> void locexpr_get_ulongest (locexpr, unsigned int, ULONGEST *);
> void locexpr_get_uchar (locexpr, unsigned int, unsigned char *);
> value_ptr evaluate_locexpr (struct symbol *, struct frame_info *, locexpr, struct type *);

GDB tends not to use typedefs, which I think is a good thing.  I think
`struct locexpr' would be better.

There should be a way to efficiently build locexprs in obstacks.

How big are your stack elements?  I think it's very important that the
semantics of a particular string of bytecodes be independent of word
size issues.  That is, if I construct a particular string, I want it
to compute the same values no matter what sort of host I'm running on.
Have you thought about how to ensure that architecturally?  Think
about right shifts, divides, comparisons --- anything that can bring
information down from the upper bits.

> I kept bregx only because we already have a location type for based
> registers, so it seemed to make sense (Admittedly, all it does is take
> us from two opcodes to one for this).

I think it should be deleted.

> fbreg we can't get rid of, we don't necessarily know the frame
> register at debug reading time (it may itself be using a location list
> to say where it is).

Yep, it's its own animal.

> We could get rid of the "default" opcodes deref and xderef, and just
> always use xderef_size and deref_size with an explicit size.

I think we should do that.

> If everyone agrees, i'll write up the docs on these opcodes, and the
> functions to manipulate location expressions.

Yeah, the interface isn't clear to me.  I'm especially curious about
that constructor.  Varargs give me the heebie-jeebies.  I think Andrew
has said he feels the same way.

> Eventually, I'll get rid of all the other location types, but this can
> be done incrementally, so it's no rush at all.

Right, that's where I want to go with this too.  Note that if we do
that, the LOC_* enum encodes information about whether something is an
argument or not (i.e., LOC_ARG vs. LOC_LOCAL, LOC_REGISTER
vs. LOC_REGPARM).  We'll need to encode that information otherhow.


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-23 21:53             ` Jim Blandy
@ 2001-05-23 22:56               ` Daniel Berlin
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-05-23 22:56 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Daniel Berlin, gdb-patches, gdb

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Hey, everyone.  If folks think the general direction we're going here
> (explained in my first post to this thread) is not a good idea, speak
> now.  Stuff is happening.
> 
> Daniel Berlin <dan@cgsoftware.com> writes:
> > void init_locexpr (locexpr *);
> > void destroy_locexpr (locexpr *);
> > void locexpr_push_op  (locexpr, enum locexpr_opcode, const char *, ...);
> > void locexpr_get_op (locexpr, unsigned int,  enum locexpr_opcode *);
> > void locexpr_get_longest (locexpr, unsigned int, LONGEST *);
> > void locexpr_get_ulongest (locexpr, unsigned int, ULONGEST *);
> > void locexpr_get_uchar (locexpr, unsigned int, unsigned char *);
> > value_ptr evaluate_locexpr (struct symbol *, struct frame_info *, locexpr, struct type *);
> 
> GDB tends not to use typedefs, which I think is a good thing.  I think
> `struct locexpr' would be better.
The only thing is that I made locexpr a typedef to locexpr_priv *, so
you'd get a compile error if anything outside symeval.c tried to
access the internal structure.
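
A minimal sketch of that opaque-handle idea, assuming a header that only
forward-declares the private struct. The names locexpr_create,
locexpr_free and locexpr_length are hypothetical and are not the symeval
interface; the two halves are shown in one file only so the example
compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Public part (what a header such as symeval.h might expose).  The
   struct is left incomplete, so code that only sees this part cannot
   name its members: writing e->len there is a compile error.  */
struct locexpr_priv;
typedef struct locexpr_priv *locexpr;

locexpr locexpr_create (void);
void locexpr_free (locexpr e);
size_t locexpr_length (locexpr e);

/* Private part (would live only in symeval.c).  */
struct locexpr_priv
{
  unsigned char *bytes;   /* encoded opcodes and operands */
  size_t len;             /* number of bytes in use */
};

locexpr
locexpr_create (void)
{
  /* calloc zeroes the struct, so a new expression starts out empty.  */
  return calloc (1, sizeof (struct locexpr_priv));
}

void
locexpr_free (locexpr e)
{
  if (e != NULL)
    {
      free (e->bytes);
      free (e);
    }
}

size_t
locexpr_length (locexpr e)
{
  assert (e != NULL);
  return e->len;
}
```

The same protection is available with a plain `struct locexpr *` and no
typedef, as long as the struct body stays out of the header.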

I can change this if we all agree this is a bad idea.
I just think we have a tendency to use internal structure to get
things done quicker, even when we know we shouldn't,  and mark it
"FIXME", and it never gets fixed.

> 
> There should be a way to efficiently build locexprs in obstacks.

I know, I'm on it already.
This was a first pass, so it does the "realloc by one extra byte" for
pushing opcodes.

> 
> How big are your stack elements? 
opcodes are unsigned char.

operands are encoded as one of:
unsigned char
uleb128
sleb128

They are passed into the opcode pusher as LONGEST or ULONGEST or unsigned char.
>  I think it's very important that the
> semantics of a particular string of bytecodes be independent of word
> size issues.  That is, if I construct a particular string, I want it
> to compute the same values no matter what sort of host I'm running on.
> Have you thought about how to ensure that architecturally? 
Yes.

sizeof(char) is guaranteed to be 1 byte always, which is why it's used
for representing the opcode.
uleb128 is host independent and word size independent (in memory it's
a string of bytes. It always encodes to the same string of bytes on
any platform, and decodes to the same value on any platform).
As is sleb128.
This, and the fact that it's efficient, is why it was used for DWARF2
in the first place.
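
To make the host-independence point concrete, here is a minimal sketch
of LEB128 encoding. The function names are made up for illustration and
this is not the symeval code; it only assumes 64-bit values and a caller
buffer of at least 10 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as unsigned LEB128 into BUF: 7 bits per byte, low bits
   first, high bit set on every byte except the last.  Returns the
   number of bytes written (at most 10 for a 64-bit value).  */
size_t
encode_uleb128 (uint64_t value, unsigned char *buf)
{
  size_t n = 0;

  assert (buf != NULL);
  do
    {
      unsigned char byte = value & 0x7f;

      value >>= 7;
      if (value != 0)
        byte |= 0x80;           /* more bytes follow */
      buf[n++] = byte;
    }
  while (value != 0);
  return n;
}

/* Encode VALUE as signed LEB128 into BUF.  Returns bytes written.  */
size_t
encode_sleb128 (int64_t value, unsigned char *buf)
{
  size_t n = 0;
  int more = 1;

  assert (buf != NULL);
  while (more)
    {
      unsigned char byte = value & 0x7f;

      /* Floor-divide by 128 without shifting a negative number.  */
      value = value < 0 ? ~(~value >> 7) : value >> 7;

      /* Stop once the remaining bits are pure sign extension of bit 6.  */
      if ((value == 0 && !(byte & 0x40)) || (value == -1 && (byte & 0x40)))
        more = 0;
      else
        byte |= 0x80;
      buf[n++] = byte;
    }
  return n;
}
```

For example, 624485 encodes as the three bytes e5 8e 26 and -128 as
80 7f, whatever the host's word size or byte order.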

>  Think
> about right shifts, divides, comparisons --- anything that can bring
> information down from the upper bits.
You can only decode into LONGEST, ULONGEST, and unsigned char. All
operations that can bring info down from upper bits already have explicit
casts to LONGEST (inherited from the fact that this is based on the
dwarf2 locexpr evaluator, which was based on unwind-dw2-fde.c from
GCC, which does the same type of casts).
Like so:

	      case GLO_div:
		result = (LONGEST)second / (LONGEST)first;
		break;
	      ...
	      case GLO_shra:
		result = (LONGEST)second >> first;
		break;
	      ...
	      case GLO_le:
		result = (LONGEST)first <= (LONGEST)second;
		break;
	      case GLO_ge:
		result = (LONGEST)first >= (LONGEST)second;
		break;
	      case GLO_eq:
		result = (LONGEST)first == (LONGEST)second;
		break;
	      case GLO_lt:
		result = (LONGEST)first < (LONGEST)second;
		break;
	      case GLO_gt:
		result = (LONGEST)first > (LONGEST)second;
		break;
	      case GLO_ne:
		result = (LONGEST)first != (LONGEST)second;
		break;
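
A small self-contained sketch of why those casts matter. It assumes
fixed 64-bit stand-ins for LONGEST and ULONGEST (the real GDB types
depend on the host, which is exactly the concern raised above), and the
demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width stand-ins for GDB's LONGEST and ULONGEST (an assumption
   of this sketch).  */
typedef int64_t demo_longest;
typedef uint64_t demo_ulongest;

/* Divide treating both stack entries as unsigned.  */
demo_ulongest
demo_div_unsigned (demo_ulongest second, demo_ulongest first)
{
  assert (first != 0);
  return second / first;
}

/* Divide with explicit signed casts, as in the GLO_div case.  The
   unsigned-to-signed conversion is implementation-defined in older C
   standards; it wraps on the usual two's complement hosts.  */
demo_ulongest
demo_div_signed (demo_ulongest second, demo_ulongest first)
{
  assert (first != 0);
  return (demo_ulongest) ((demo_longest) second / (demo_longest) first);
}

/* Compare as unsigned: a stack entry holding -8 looks huge.  */
int
demo_lt_unsigned (demo_ulongest first, demo_ulongest second)
{
  return first < second;
}

/* Compare with explicit signed casts, as in the GLO_lt case.  */
int
demo_lt_signed (demo_ulongest first, demo_ulongest second)
{
  return (demo_longest) first < (demo_longest) second;
}
```

With a stack entry holding -8, the signed divide by 2 gives -4 while
the unsigned one gives 0x7ffffffffffffffc, and "-8 < 1" is true only in
the signed comparison.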



> 
> > I kept bregx only because we already have a location type for based
> > registers, so it seemed to make sense (Admittedly, all it does is take
> > us from two opcodes to one for this).
> 
> I think it should be deleted.
Okeydokey.

> 
> > fbreg we can't get rid of, we don't necessarily know the frame
> > register at debug reading time (it may itself be using a location list
> > to say where it is).
> 
> Yep, it's its own animal.
> 
> > We could get rid of the "default" opcodes deref and xderef, and just
> > always use xderef_size and deref_size with an explicit size.
> 
> I think we should do that.
> 
> > If everyone agrees, i'll write up the docs on these opcodes, and the
> > functions to manipulate location expressions.
> 
> Yeah, the interface isn't clear to me. 

Well, it's built to make it so the only thing that knows the internal,
in-memory structure of the location expression is push and get.  

If I added a cookie, so that you didn't have to pass the
get_opcode/get_operand routines a position (which I'll probably do),
you could make things like "constants being in a separate bcache'd
array" completely independent of the evaluator.  Only the get routine
would know this.
> I'm especially curious about
> that constructor.  Varargs give me the heebie-jeebies.  I think Andrew
> has said he feels the same way.

It was either this, or have one function to push opcodes and three to
push operands, or a bunch to cover each combination of opcode+operands
you have.

If you'd rather, that would be these 9 functions:

locexpr_push_uu
locexpr_push_us
locexpr_push_uc
locexpr_push_ss
locexpr_push_su
locexpr_push_sc
locexpr_push_cc
locexpr_push_cu
locexpr_push_cs

Having just one function to push all of them also has the advantage that
we can type check the arguments to make sure you are giving the right
number and type of operands to the right opcode.

This is currently implemented.

We could also do it with the 9 functions, but not the one to push
opcodes, and three to push operands (without global variables or
something messy, anyway).

Without this, it's harder to track down bugs in converters, since you
won't see a problem until evaluation, and even then, it may be
sporadic (I actually hit a bug in evaluating fbreg when it was a
dwarf2 evaluator, because on powerpc the offsets from it are always
positive, and on x86 they can be negative. I accidentally read the
operand as unsigned, and it took me 3 weeks to track it down to this.)
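
A rough sketch of what such a checked varargs pusher could look like.
Everything here is hypothetical: the demo_* opcodes, the fixed 64-byte
buffer and the signature letters ('c' unsigned char, 'u' uleb128,
's' sleb128) are assumptions for illustration, not the posted symeval
code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_opcode { DEMO_fbreg, DEMO_deref_size, DEMO_plus, DEMO_NUM_OPS };

/* Operand signature each opcode requires.  */
static const char *const demo_operands[DEMO_NUM_OPS] = {
  "s",   /* DEMO_fbreg: signed frame offset */
  "c",   /* DEMO_deref_size: size in bytes */
  ""     /* DEMO_plus: no operands */
};

struct demo_locexpr
{
  unsigned char bytes[64];
  size_t len;
};

static void
put_byte (struct demo_locexpr *e, unsigned char b)
{
  assert (e->len < sizeof e->bytes);
  e->bytes[e->len++] = b;
}

static void
put_uleb128 (struct demo_locexpr *e, uint64_t v)
{
  do
    {
      unsigned char b = v & 0x7f;

      v >>= 7;
      put_byte (e, v != 0 ? (b | 0x80) : b);
    }
  while (v != 0);
}

static void
put_sleb128 (struct demo_locexpr *e, int64_t v)
{
  for (;;)
    {
      unsigned char b = v & 0x7f;

      v = v < 0 ? ~(~v >> 7) : v >> 7;   /* floor-divide by 128 */
      if ((v == 0 && !(b & 0x40)) || (v == -1 && (b & 0x40)))
        {
          put_byte (e, b);
          return;
        }
      put_byte (e, b | 0x80);
    }
}

/* Push OP and its operands.  FMT must match the opcode's signature
   exactly, otherwise nothing is pushed and -1 is returned.  */
int
demo_push_op (struct demo_locexpr *e, enum demo_opcode op,
              const char *fmt, ...)
{
  va_list ap;

  if ((unsigned) op >= DEMO_NUM_OPS || strcmp (fmt, demo_operands[op]) != 0)
    return -1;

  put_byte (e, (unsigned char) op);
  va_start (ap, fmt);
  for (; *fmt != '\0'; fmt++)
    switch (*fmt)
      {
      case 'c': put_byte (e, (unsigned char) va_arg (ap, int)); break;
      case 'u': put_uleb128 (e, va_arg (ap, uint64_t)); break;
      case 's': put_sleb128 (e, va_arg (ap, int64_t)); break;
      }
  va_end (ap);
  return 0;
}
```

A converter that pushed fbreg with a "u" operand would be rejected at
push time instead of misbehaving at evaluation. The check only covers
the format string, though: the caller still has to pass arguments of
the matching C type (hence casts such as `(int64_t) -8`), which is the
usual varargs hazard.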


> 
> > Eventually, I'll get rid of all the other location types, but this can
> > be done incrementally, so it's no rush at all.
> 
> Right, that's where I want to go with this too.  Note that if we do
> that, the LOC_* enum encodes information about whether something is an
> argument or not (i.e., LOC_ARG vs. LOC_LOCAL, LOC_REGISTER
> vs. LOC_REGPARM).  We'll need to encode that information otherhow.
Yes.  

-- 
"My neighbor has a circular driveway...  He can't get out.
"-Steven Wright



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-05-21 14:46 ` [PATCH] Add support for tracking/evaluating dwarf2 location expressions Jim Blandy
  2001-05-21 18:49   ` Daniel Berlin
@ 2001-06-06  9:07   ` Andrew Cagney
  2001-06-06  9:46     ` Daniel Berlin
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-06-06  9:07 UTC (permalink / raw)
  To: Jim Blandy, Daniel Berlin; +Cc: gdb-patches

Two things:

GDB has, well almost has, a byte code interpreter it 100% controls. Jim, 
remember tracepoints?

To initially set the bar very high (I'm sure it will soon come crashing 
down) the byte code interpreter should be implemented as a true state 
machine.  This is significant - instead of assuming that target_read() 
returns the data immediately, it should instead be designed to allow for 
the day when target_read() returns ERETRY.
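
One way to picture such a resumable interpreter, as a sketch only: all
evaluation state lives in a struct, and a read that cannot complete
leaves that state untouched so the caller can call again later. The
opcodes, the eval_* names and the callback signature are invented for
illustration; they are not GDB's target interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum eval_status { EVAL_DONE, EVAL_RETRY, EVAL_ERROR };
enum { OP_const1u = 1, OP_deref = 2 };   /* hypothetical opcodes */

/* Hypothetical memory reader: may return EVAL_RETRY when the data is
   not available yet.  */
typedef enum eval_status (*read_fn) (void *ctx, uint64_t addr,
                                     uint64_t *out);

/* Everything needed to suspend and resume an evaluation.  */
struct eval_state
{
  const unsigned char *code;
  size_t len;
  size_t pc;
  uint64_t stack[16];
  size_t sp;
};

/* Run until the expression ends, fails, or a read asks for a retry.  */
enum eval_status
eval_resume (struct eval_state *s, read_fn read_mem, void *ctx)
{
  assert (s != NULL);
  while (s->pc < s->len)
    {
      switch (s->code[s->pc])
        {
        case OP_const1u:
          if (s->pc + 1 >= s->len || s->sp >= 16)
            return EVAL_ERROR;
          s->stack[s->sp++] = s->code[s->pc + 1];
          s->pc += 2;
          break;
        case OP_deref:
          {
            uint64_t value;
            enum eval_status st;

            if (s->sp == 0)
              return EVAL_ERROR;
            st = read_mem (ctx, s->stack[s->sp - 1], &value);
            if (st != EVAL_DONE)
              return st;   /* pc and stack unchanged: safe to resume */
            s->stack[s->sp - 1] = value;
            s->pc += 1;
            break;
          }
        default:
          return EVAL_ERROR;
        }
    }
  return EVAL_DONE;
}

/* Demo reader: asks for a retry on the first call, then returns
   ADDR + 100 as the "memory contents".  CTX points to a call counter.  */
enum eval_status
flaky_read (void *ctx, uint64_t addr, uint64_t *out)
{
  int *calls = ctx;

  if ((*calls)++ == 0)
    return EVAL_RETRY;
  *out = addr + 100;
  return EVAL_DONE;
}
```

The first call on the program "const1u 5; deref" stops at the deref
with EVAL_RETRY; calling eval_resume again with the same state finishes
with 105 on top of the stack.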

Hmm, no, I lied, three things :-)

To follow up Jim's comment about GDB 100% controlling the byte code 
interpreter.  Am I correct to think that, for tracepoints to continue 
working, GDB will need to translate DWARF2 bytecode expressions into 
tracepoint expressions so that they can be run on the target when 
fetching variables?

	Andrew


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-06-06  9:07   ` Andrew Cagney
@ 2001-06-06  9:46     ` Daniel Berlin
  2001-06-07  7:29       ` Andrew Cagney
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-06-06  9:46 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Jim Blandy, gdb-patches

Andrew Cagney <ac131313@cygnus.com> writes:

> Two things:
> 
> GDB has, well almost has, a byte code interpreter it 100%
> controls. Jim, remember tracepoints?

Uggggh.
I've already sent an implementation of one we would 100% control, that
is easy for debug formats to convert into, and easy for us to
evaluate.

It would also be trivial to use it *as* the agent bytecode if you
wanted to.   I don't think the agent bytecode, however, should be used
as our symbol location language.  It just doesn't seem to fit well at
all in terms of interface, its original purpose, etc.

We seem to have a problem in GDB with shoehorning things to fit where
they don't.

Look at struct type's target_type, and tell me if you don't think it
came about because someone thought "I know, let's just reuse this
thing that is somewhat related but not really".


> 
> 
> To initially set the bar very high (I'm sure it will soon come
> crashing down) the byte code interpreter should be implemented as a
> true state machine. 
>  This is significant - instead of assuming that
> target_read() returns the data immediately, it should instead be
> designed to allow for the day when target_read() returns ERETRY.
> 
We are trying to replace the way locations are described now with
something that is more expressive.  Not change the fundamentals of how
locations work.
Besides getting symeval approved, we still eventually have to replace
all the current uses of the different locations with just this one
location type (i.e., get rid of LOCATION_BASE_REG, etc., in favor
of the bytecoded form). This is, in itself, at least 6 months of work
(because of how far reaching a change it will be). Then we can worry
about merging the tracepoint stuff and the location evaluation
stuff.
Doing that right now would make keeping this an incremental change much
harder.  Right now, symeval requires no changes to other parts of
gdb.  Zero. Everything that worked before keeps working.   We can
incrementally change the symbol readers to start using it.
Then we can incrementally change the reader to not depend on always
having the value actually there.
Etc.


> 
> Hmm, no, I lied, three things :-)
> 
> To follow up Jim's comment about GDB 100% controlling the byte code
> interpreter. 
I don't think anyone is unhappy with the current symeval I posted,
except for the interface.
I.e., the 100% control issue seems to be resolved.
>  Am I correct to think that, for tracepoints to continue
> working, GDB will need to translate DWARF2 bytecode expressions into
> tracepoint expressions so that they can be run on the target when
> fetching variables?

It'll need to translate GDB location expressions (they are different
than the dwarf2 bytecode expressions in subtle ways, and have
different opcode numbers anyway.) into tracepoint expressions.

However, it would actually be easier on the tracepoint expression code
eventually.

Right now, it has to handle all of the location types differently.





> 
> 
> 	Andrew

-- 
"I play the harmonica.  The only way I can play is if I get my
car going really fast, and stick it out the window.  I've been
arrested three times for practicing.
"-Steven Wright



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-06-06  9:46     ` Daniel Berlin
@ 2001-06-07  7:29       ` Andrew Cagney
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Cagney @ 2001-06-07  7:29 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jim Blandy, gdb-patches

>> GDB has, well almost has, a byte code interpreter it 100%
>> controls. Jim, remember tracepoints?
> 
> 
> Uggggh.
> I've already sent an implementation of one we would 100% control, that
> is easy for debug formats to convert into, and easy for us to
> evaluate.


Yes, that makes two implementations.


> It would also be trivial to use it *as* the agent bytecode if you
> wanted to.   I don't think the agent bytecode, however, should be used
> as our symbol location language.  It just doesn't seem to fit well at
> all in terms of interface, its original purpose, etc.


That wouldn't fly.  The agent bytecode is documented and published and a 
sample implementation is somewhere.


> We seem to have a problem in GDB with shoe horning things to fit where
> they don't.



>> To initially set the bar very high (I'm sure it will soon come
>> crashing down) the byte code interpreter should be implemented as a
>> true state machine. 
>> This is significant - instead of assuming that
>> target_read() returns the data immediately, it should instead be
>> designed to allow for the day when target_read() returns ERETRY.
>> 
> 
> We are trying to replace the way locations are described now with
> something that is more expressive.  Not change the fundamentals of how
> locations work.
> We still, besides getting symeval approved, eventually have to replace
> all the current uses for the  different locations with just this one
> type of location type (IE get rid of LOCATION_BASE_REG, etc, in favor
> of the bytecoded form). This is in itself, at least 6 months minimum
> (because of how far reaching a  change it will be). Then we can worry
> about merging the tracepoint stuff and the location evaluation 
> stuff.
> Doing that right now would make making this an incremental change much
> harder.  Right now, symeval requires no changes to other parts of
> gdb.  Zero. Everything that worked before keeps working.   We can
> incrementally change the symbol readers to start using it.
> Then we can incrementally change the reader to not depend on always
> having the value actually there.
> Etc.


Ah, Ok.

Hmm, a reader not depending on always having the value actually there 
would mean having partially evaluated expressions around.

Andrew




* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
@ 2001-03-30 13:49 David Taylor
  2001-03-30 14:42 ` Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: David Taylor @ 2001-03-30 13:49 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb-patches

    Date: Fri, 30 Mar 2001 14:10:46 -0500 (EST)
    From: Daniel Berlin <dberlin@redhat.com>

    This patch adds a place in the symbol structure to store the dwarf2
    location (an upcoming dwarf2read huge rewrite patch uses it), an
    evaluation machine to findvar.c, and has read_var_value use the dwarf2
    location if it's available.

    Note that in struct d2_location, in the symtab.h patch, framlocdesc
    cannot be unioned with locdesc, because the symbol providing the frame
    location description usually has its own real location.

    Currently, because GDB makes no use of the frame unwind info, etc, I
    evaluate the frame pointer location expression for whatever is providing
    it for the thing we are trying to evaluate, rather than rely what GDB
    tells us the frame pointer is.

I have read that "sentence" multiple times and I still  don't know what
you are trying to say.  Sorry.

    Yes, it's currently pointless to do this,  it's just future proofing, and
    what we are supposed to do, anyway.

    This patch will have no effect whatsoever without the dwarf2read.c
    rewrite, but it won't break anything either, so I separated it out and
    am submitting it now.
    --Dan
    2001-03-30  Daniel Berlin  <dberlin@redhat.com>

	    * symtab.h (address_class): Add LOC_DWARF2_EXPRESSION.
	    (struct symbol): Add struct d2_location to store dwarf2
	    location expressions.
	    (SYMBOL_DWARF2_LOCATION, SYMBOL_DWARF2_FRAME_LOCATION): New macros.

	    * findvar.c (read_leb128): New function, read leb128 data from a
	    buffer.

I don't know what leb128 data is (and I doubt that I'm alone).  I
suspect that it is something dwarf or dwarf2 specific and that this
function really belongs in another file (dwarf2read.c maybe?).  At a
minimum there should probably be some sort of explanatory comment.

	    (evaluate_dwarf2_locdesc): New function, evaluate a given
	    dwarf2 location description.

This is a very dwarf2 specific change to findvar.c.  What happens if
something other than dwarf2 is in use?

Also, should these functions be in here?  In dwarf2read.c?  Or in a
new file (dwarf2eval.c?)?

	    (read_var_value): Use the dwarf2 location expression if it's available.

How do you know that the location expression is NULL when you haven't
set it -- in particular, what's to prevent it from having garbage in
it?

And is it likely that some other symbol reader would someday want a
similar hook?  For example, what about dwarf1?  Or is dwarf2 likely to
be the only one?


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 13:49 David Taylor
@ 2001-03-30 14:42 ` Daniel Berlin
  2001-03-30 15:14   ` Andrew Cagney
                     ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-03-30 14:42 UTC (permalink / raw)
  To: David Taylor; +Cc: gdb-patches

(If you aren't familiar, the address_class enum lists the ways you can
describe the location of a symbol. That's what I'm talking about when
I say address classes.)
David Taylor <taylor@cygnus.com> writes:

>     Date: Fri, 30 Mar 2001 14:10:46 -0500 (EST)
>     From: Daniel Berlin <dberlin@redhat.com>
> 
>     This patch adds a place in the symbol structure to store the dwarf2
>     location (an upcoming dwarf2read huge rewrite patch uses it), an
>     evaluation machine to findvar.c, and has read_var_value use the dwarf2
>     location if it's available.
> 
>     Note that in struct d2_location, in the symtab.h patch, framlocdesc
>     cannot be unioned with locdesc, because the symbol providing the frame
>     location description usually has its own real location.
> 
>     Currently, because GDB makes no use of the frame unwind info, etc, I
>     evaluate the frame pointer location expression for whatever is providing
>     it for the thing we are trying to evaluate, rather than rely what GDB
>     tells us the frame pointer is.
> 
> I have read that "sentence" multiple times and I still  don't know what
> you are trying to say.  Sorry.

I'm saying that currently, the frame base in dwarf2 is more descriptive
than the frame pointer in the frame info we have.

> 
>     Yes, it's currently pointless to do this,  it's just future proofing, and
>     what we are supposed to do, anyway.
> 
>     This patch will have no effect whatsoever without the dwarf2read.c
>     rewrite, but it won't break anything either, so I separated it out and
>     am submitting it now.
>     --Dan
>     2001-03-30  Daniel Berlin  <dberlin@redhat.com>
> 
> 	    * symtab.h (address_class): Add LOC_DWARF2_EXPRESSION.
> 	    (struct symbol): Add struct d2_location to store dwarf2
> 	    location expressions.
> 	    (SYMBOL_DWARF2_LOCATION, SYMBOL_DWARF2_FRAME_LOCATION): New macros.
> 
> 	    * findvar.c (read_leb128): New function, read leb128 data from a
> 	    buffer.
> 
> I don't know what leb128 data is (and I doubt that I'm alone).  I
> suspect that it is something dwarf or dwarf2 specific and that this
> function really belongs in another file (dwarf2read.c maybe?).

LEB128 = Little Endian Base 128.
There is a reader for it in dwarf2read.c already, but we can't use it:
it expects to be passed a bfd.
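
For readers who have not met LEB128, a minimal sketch of decoding it
from a plain byte buffer, with no bfd involved. The *_buf function
names and signatures are assumptions for illustration and are not the
read_leb128 from the patch; the input is assumed to be well formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 number starting at BUF.  Each byte carries
   7 payload bits, low bits first; the high bit means "more follows".
   Stores the number of bytes consumed in *BYTES_READ.  */
uint64_t
read_uleb128_buf (const unsigned char *buf, size_t *bytes_read)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char byte;

  assert (buf != NULL && bytes_read != NULL);
  do
    {
      byte = buf[i++];
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  *bytes_read = i;
  return result;
}

/* Decode a signed LEB128 number: same layout, but bit 6 of the last
   byte is the sign and is extended through the upper bits.  */
int64_t
read_sleb128_buf (const unsigned char *buf, size_t *bytes_read)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char byte;

  assert (buf != NULL && bytes_read != NULL);
  do
    {
      byte = buf[i++];
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  if (shift < 64 && (byte & 0x40))
    result |= ~(uint64_t) 0 << shift;   /* sign-extend */
  *bytes_read = i;
  return (int64_t) result;
}
```

The bytes e5 8e 26 decode to 624485 as unsigned, and the single byte 78
decodes to -8 as signed, which is the kind of negative fbreg offset
seen on x86.
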
> At a
> minimum there should probably be some sort of explanatory comment.
Sure.
> 
> 	    (evaluate_dwarf2_locdesc): New function, evaluate a given
> 	    dwarf2 location description.
> 
> This is a very dwarf2 specific change to findvar.c.  What happens if
> something other than dwarf2 is in use?

Um, if it doesn't have a dwarf2 location for the symbol, it just does
what it used to.
> 
> Also, should these functions be in here?  In dwarf2read.c?  Or in a
> new file (dwarf2eval.c?)?
Maybe dwarf2eval, but all the code it's related to is in findvar.c. 

I think you might not quite be understanding what dwarf2 location
expressions/lists are.

They are a way of describing where something is over its
lifetime (lists are; expressions are the location over a specific
range. So location lists are simply lists of location expressions, and
the associated address range where each expression is valid). However,
they are much more descriptive than GDB's current
support for address classes (i.e., LOC_BASEPARM, etc.), and thus,
occasionally we hit ones we can't transform into GDB's other simple
address classes.
It would also be flat out impossible to support location lists, since
gdb's address classes only expect a symbol to be at one location over
its lifetime.  So in order to be able to support optimized code
debugging, etc, you need to be able to evaluate the dwarf2 location
list at runtime, to get the address of a particular symbol, at a
particular time.

In fact, dwarf2 location expressions are actually a Turing complete
language.  They are evaluated by a stack machine.

If you still aren't grasping it, let me give you an example:

Let's say we have a very simple c file

test.c:

int main(int argc, char *argv[])
{
        int a;
        a=5;
}

If you compile this file with dwarf2 info, the location of a will be
described (most likely) as the location expression "DW_OP_fbreg 8".

When we read the dwarf2 info, we currently try to convert "DW_OP_fbreg
8" into a gdb address class/location, which in this case, is
LOC_BASEREG, SYMBOL_BASEREG(x) = FP_REGNUM, SYMBOL_VALUE(x) = 8 (I'm
doing this from memory, it's probably completely wrong, but I'm just
trying to show something, not be absolutely correct).

If all of these location expression opcodes were simple, we wouldn't
have a problem (we also wouldn't be able to describe things anywhere
near as well as we could).  However, you also have opcodes like
DW_OP_deref, which is supposed to take the top of the current
dwarf2 location machine stack, dereference it as if it was a pointer,
and push the resulting value back onto the top of the stack.  Obviously,
since we don't have access to target memory at symbol reading time, we
can't do this (there is currently a hack that lets us do it if
DW_OP_deref is the last opcode in the expression). This is because we
are expected to be able to evaluate them at runtime, but because gdb had
no way of evaluating them until now, we had to settle for trying to
convert them into something gdb could handle, and as they get more
complex, or when you try to do something that needs the expressive
power, you fail.
We can't even come close to the expressive power with the other
address classes.  There are things like DW_OP_piece that let you say
"the first byte of this variable is in this place, the next 2 bytes
are here, the last byte is here", etc.
As I said, it's a Turing complete language.
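
To show the stack machine idea in miniature, here is a sketch that
handles just DW_OP_fbreg and DW_OP_deref (opcode values 0x91 and 0x06
as in the DWARF 2 spec). The demo_target struct and its fake 8-byte
memory cells are assumptions made up for this example; it is not the
evaluator from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_deref 0x06
#define DW_OP_fbreg 0x91

/* Fake target: a frame base plus a few 8-byte memory cells.  */
struct demo_target
{
  uint64_t frame_base;
  uint64_t mem_start;    /* address of words[0] */
  uint64_t words[8];
};

/* Evaluate EXPR and store the top of stack in *RESULT.
   Returns 0 on success, -1 on a malformed expression or bad address.  */
int
demo_eval (const unsigned char *expr, size_t len,
           const struct demo_target *t, uint64_t *result)
{
  uint64_t stack[16];
  size_t sp = 0, pc = 0;

  assert (expr != NULL && t != NULL && result != NULL);
  while (pc < len)
    {
      switch (expr[pc++])
        {
        case DW_OP_fbreg:
          {
            /* Operand: sleb128 offset from the frame base.  */
            uint64_t off = 0;
            unsigned shift = 0;
            unsigned char b;

            do
              {
                if (pc >= len || shift >= 64)
                  return -1;
                b = expr[pc++];
                off |= (uint64_t) (b & 0x7f) << shift;
                shift += 7;
              }
            while (b & 0x80);
            if (shift < 64 && (b & 0x40))
              off |= ~(uint64_t) 0 << shift;   /* sign-extend */
            if (sp >= 16)
              return -1;
            /* Unsigned wraparound makes negative offsets work.  */
            stack[sp++] = t->frame_base + off;
            break;
          }
        case DW_OP_deref:
          {
            uint64_t idx;

            if (sp == 0 || stack[sp - 1] < t->mem_start)
              return -1;
            /* Alignment is ignored in this sketch.  */
            idx = (stack[sp - 1] - t->mem_start) / 8;
            if (idx >= 8)
              return -1;
            stack[sp - 1] = t->words[idx];   /* needs target memory */
            break;
          }
        default:
          return -1;
        }
    }
  if (sp == 0)
    return -1;
  *result = stack[sp - 1];
  return 0;
}
```

With a frame base of 0x1000, "DW_OP_fbreg 8" evaluates to the address
0x1008, and appending DW_OP_deref replaces that address with whatever
the target holds there, which is why it cannot be folded into a static
address class at symbol reading time.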

Keep in mind, the above is just a single location expression. 
Location lists allow you to fully describe the lifetime of even a
variable in optimized code, by associating expressions with ranges of
PC's. 
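
As a sketch of that pairing, assuming an already-decoded in-memory list
(the demo_loclist_* names and layout are hypothetical, not the DWARF
on-disk encoding), picking the expression for a given PC is a range
lookup:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One location list entry: EXPR is valid for LOW_PC <= pc < HIGH_PC.  */
struct demo_loclist_entry
{
  uint64_t low_pc;
  uint64_t high_pc;            /* first address past the range */
  const unsigned char *expr;   /* location expression bytes */
  size_t expr_len;
};

/* Return the entry covering PC, or NULL if the variable has no
   location there (for instance because it was optimized away).  */
const struct demo_loclist_entry *
demo_loclist_find (const struct demo_loclist_entry *list, size_t n,
                   uint64_t pc)
{
  size_t i;

  assert (list != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i];
  return NULL;
}
```

The expression found this way is then handed to the evaluator, so the
same variable can live in a register for one PC range and on the stack
for the next.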

This code evaluates dwarf2 location expressions to get the location of
a variable, if a symbol has one.  This is used in preference to using the
less expressive way gdb currently supports.

All I've done is add a new way to describe the location of a variable,
and the associated way to evaluate that description.

So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
reader. dwarf2 location expressions are a way of describing variable
locations.  The other code to process the current gdb address
classes/locations is in findvar.c, so that's where I put it.

> 
> 	    (read_var_value): Use the dwarf2 location expression if it's available.
> 
> How do you know that the location expression is NULL when you haven't
> set it -- in particular, what's to prevent it from having garbage in
> it?
Because the symbol gets memset to 0.

> 
> And is it likely that some other symbol reader would someday want a
> similar hook?
It's not a hook really.
The dwarf2 reader never calls it; it has no relation to symbol reading.

It is related to describing the location of a variable at runtime.

>  For example, what about dwarf1?
Nope
>   Or is dwarf2 likely to be the only one?

No. DWARF2 is the only symbol format i know of which has support for
something as expressive as location expressions/lists.

GCC uses DWARF2 call frame annotation (which is location lists paired
with registers to tell you where a given register is at a given pc) to
do its exceptions, for instance.

If we ever want to be able to use the unwind info or call frame info
to be able to do optimized code debugging without much trouble, we
need to be able to evaluate the expressions.

If it helps, pretend the word DWARF2 appears nowhere in the
patch. There is no relation to the symbol reader.

I could convert the stabs reader to use this way of describing
locations, there's just no point (since stabs's locations aren't more
expressive than what gdb supports).

-- 
I play the harmonica.  The only way I can play is if I get my
car going really fast, and stick it out the window.  I've been
arrested three times for practicing.


* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 14:42 ` Daniel Berlin
@ 2001-03-30 15:14   ` Andrew Cagney
  2001-03-30 18:44     ` Daniel Berlin
  2001-03-30 15:23   ` Elena Zannoni
  2001-03-30 15:24   ` Andrew Cagney
  2 siblings, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-03-30 15:14 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Taylor, gdb-patches

> No. DWARF2 is the only symbol format i know of which has support for
> something as expressive as location expressions/lists.

It is often referred to as live range splitting.  I've seen an
implementation using stabs.

	Andrew



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-03-30 15:14   ` Andrew Cagney
@ 2001-03-30 18:44     ` Daniel Berlin
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-03-30 18:44 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: David Taylor, gdb-patches

Andrew Cagney <ac131313@cygnus.com> writes:

> > No. DWARF2 is the only symbol format i know of which has support for
> > something as expressive as location expressions/lists.
> 
> > It is often referred to as live range splitting.  I've seen an
> implementation using stabs.
> 
> 	Andrew
No, it's not *just* live range splitting.
I know about the live range splitting extensions to stabs.
I've also reimplemented live range splitting in the new register
allocator I've been working on for gcc (on the gcc
new-regalloc-branch).

--Dan



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 14:42 ` Daniel Berlin
  2001-03-30 15:14   ` Andrew Cagney
@ 2001-03-30 15:23   ` Elena Zannoni
  2001-03-30 15:24   ` Andrew Cagney
  2 siblings, 0 replies; 28+ messages in thread
From: Elena Zannoni @ 2001-03-30 15:23 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Taylor, gdb-patches

Daniel Berlin writes:
 > (If you aren't familiar, the address_class enum lists the ways you can
 > describe the location of a symbol. that's what i'm talking about when
 > i say address classes)
 > David Taylor <taylor@cygnus.com> writes:
 > 
 > >     Date: Fri, 30 Mar 2001 14:10:46 -0500 (EST)
 > >     From: Daniel Berlin <dberlin@redhat.com>
 > > 
 > >     This patch adds a place in the symbol structure to store the dwarf2
 > >     location (an upcoming dwarf2read huge rewrite patch uses it), an
 > >     evaluation machine to findvar.c, and has read_var_value use the dwarf2
 > >     location if it's available.
 > > 
 > >     Note that in struct d2_location, in the symtab.h patch, framlocdesc
 > >     cannot be unioned with locdesc, because the symbol providing the frame
 > >     location description usually has its own real location.
 > > 
 > >     Currently, because GDB makes no use of the frame unwind info, etc, I
 > >     evaluate the frame pointer location expression for whatever is providing
 > >     it for the thing we are trying to evaluate, rather than rely what GDB
 > >     tells us the frame pointer is.
 > > 
 > > I have read that "sentence" multiple times and I still  don't know what
 > > you are trying to say.  Sorry.
 > 
 > I'm saying that currently, the frame base in dwarf2 is more descriptive than
 > the frame pointer in the frame info we have. 
 > 
 > > 
 > >     Yes, it's currently pointless to do this,  it's just future proofing, and
 > >     what we are supposed to do, anyway.
 > > 
 > >     This patch will have no effect whatsoever without the dwarf2read.c
 > >     rewrite, but it won't break anything either, so I separated it out and
 > >     am submitting it now.
 > >     --Dan
 > >     2001-03-30  Daniel Berlin  <dberlin@redhat.com>
 > > 
 > > 	    * symtab.h (address_class): Add LOC_DWARF2_EXPRESSION.
 > > 	    (struct symbol): Add struct d2_location to store dwarf2
 > > 	    location expressions.
 > > 	    (SYMBOL_DWARF2_LOCATION, SYMBOL_DWARF2_FRAME_LOCATION): New macros.
 > > 
 > > 	    * findvar.c (read_leb128): New function, read leb128 data from a
 > > 	    buffer.
 > > 
 > > I don't know what leb128 data is (and I doubt that I'm alone).  I
 > > suspect that it is something dwarf or dwarf2 specific and that this
 > > function really belongs in another file (dwarf2read.c maybe?).
 > 
 > LEB128 = Little Endian Base 128.
 > There is a reader for it in dwarf2read.c, but we can't use it: it
 > expects to be passed a bfd.
 > > At a
 > > minimum there should probably be some sort of explanatory comment.
 > Sure.
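
For readers who have not met LEB128 before: it stores an integer seven
bits per byte, least significant group first, and sets the high bit of a
byte when more bytes follow. Below is a minimal sketch of a decoder that
works on a plain memory buffer rather than a bfd. The names read_uleb128
and read_sleb128 and their signatures are assumptions made for
illustration; this is not the read_leb128 from the patch, and it does no
bounds checking on the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value starting at BUF.  The number of
   bytes consumed is stored in *LEN.  (Hypothetical helper.)  */
uint64_t
read_uleb128 (const unsigned char *buf, size_t *len)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char byte;

  do
    {
      byte = buf[i++];
      /* Each byte contributes its low 7 bits; ignore bits that
         would not fit in 64.  */
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);          /* High bit set: more bytes follow.  */

  *len = i;
  return result;
}

/* Decode a signed LEB128 value starting at BUF.  Same as above, but
   bit 6 of the final byte is the sign bit and is extended upward.  */
int64_t
read_sleb128 (const unsigned char *buf, size_t *len)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char byte;

  do
    {
      byte = buf[i++];
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);

  /* Sign-extend in unsigned arithmetic to avoid undefined shifts.  */
  if (shift < 64 && (byte & 0x40))
    result |= ~(uint64_t) 0 << shift;

  *len = i;
  return (int64_t) result;
}
```

For example, the three bytes e5 8e 26 decode to 624485 as unsigned
LEB128, and c0 bb 78 decode to -123456 as signed LEB128.
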
 > > 
 > > 	    (evaluate_dwarf2_locdesc): New function, evaluate a given
 > > 	    dwarf2 location description.
 > > 
 > > This is a very dwarf2 specific change to findvar.c.  What happens if
 > > something other than dwarf2 is in use?
 > 
 > Um, if it doesn't have a dwarf2 location for the symbol, it just does
 > what it used to.
 > > 
 > > Also, should these functions be in here?  In dwarf2read.c?  Or in a
 > > new file (dwarf2eval.c?)?
 > Maybe dwarf2eval, but all the code it's related to is in findvar.c. 
 > 
 > I think you might not quite be understanding what dwarf2 location
 > expressions/lists are.
 > 
 > They are a way of describing where something is over its lifetime.
 > (Strictly speaking, lists do that; an expression gives the location
 > over a specific range. So location lists are simply lists of location
 > expressions, each with the associated address range where that
 > expression is valid.) However, they are much more descriptive than
 > GDB's current support for address classes (i.e. LOC_BASEPARM, etc.),
 > and thus occasionally we hit ones we can't transform into GDB's other,
 > simple address classes.
 > It would also be flat out impossible to support location lists that
 > way, since gdb's address classes expect a symbol to be at only one
 > location over its lifetime.  So in order to support optimized code
 > debugging, etc., you need to be able to evaluate the dwarf2 location
 > list at runtime, to get the address of a particular symbol at a
 > particular time.
 > 
 > In fact, dwarf2 location expressions are actually a Turing-complete
 > language.  They are evaluated by a stack machine.
 > 
 > If you still aren't grasping it, let me give you an example:
 > 
 > Let's say we have a very simple C file:
 > 
 > test.c:
 > 
 > 
 > int main(int argc, char *argv[])
 > {
 >         int a;
 >         a=5;
 > }
 > 
 > If you compile this file with dwarf2 info, the location of a will be
 > described (most likely) as the location expression "DW_OP_fbreg 8".
 > 
 > When we read the dwarf2 info, we currently try to convert "DW_OP_fbreg
 > 8" into a gdb address class/location, which in this case is
 > LOC_BASEREG, SYMBOL_BASEREG(x) = FP_REGNUM, SYMBOL_VALUE(x) = 8 (I'm
 > doing this from memory, so it's probably completely wrong, but I'm just
 > trying to show something, not be absolutely correct).
 > 
 > If all of these location expression opcodes were simple, we wouldn't
 > have a problem (we also wouldn't be able to describe things anywhere
 > near as well as we can).  However, you also have opcodes like
 > DW_OP_deref, which is supposed to take the top of the current
 > dwarf2 location machine stack, dereference it as if it were a pointer,
 > and push the resulting value back onto the top of the stack.  Obviously,
 > since we don't have access to target memory at symbol reading time, we
 > can't do this (there is currently a hack that lets us do it if
 > DW_OP_deref is the last operation in the expression).  We are expected
 > to be able to evaluate these expressions at runtime, but because gdb
 > had no way of evaluating them until now, we had to settle for trying to
 > convert each one into something gdb could handle; as they get more
 > complex, or when you try to do something that needs the expressive
 > power, that conversion fails.
 > We can't even come close to that expressive power with the other
 > address classes.  There are things like DW_OP_piece that let you say
 > "the first byte of this variable is in this place, the next 2 bytes
 > are here, the last byte is here", etc.
 > As I said, it's a Turing-complete language.
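
The stack machine described above can be sketched in a few dozen lines.
This is an illustration only, not the evaluate_dwarf2_locdesc from the
patch: it handles just four opcodes (the numeric values are those
assigned by the DWARF 2 specification), and eval_locexpr, read_mem_fn,
toy_read_mem and toy_memory are hypothetical names. Target memory is
reached through a callback, which is the piece that is not available at
symbol reading time.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF 2 specification.  */
enum
{
  DW_OP_deref = 0x06,   /* Pop an address, push the word stored there.  */
  DW_OP_constu = 0x10,  /* Push an unsigned LEB128 constant.  */
  DW_OP_plus = 0x22,    /* Pop two entries, push their sum.  */
  DW_OP_fbreg = 0x91    /* Push frame base + signed LEB128 offset.  */
};

#define STACK_DEPTH 64

/* Hypothetical callback: read one word of target memory.  */
typedef uint64_t (*read_mem_fn) (uint64_t addr, void *ctx);

/* Toy "target memory" for demonstration: word I lives at address I * 8.  */
uint64_t toy_memory[4] = { 0, 0x1000, 0, 0 };

uint64_t
toy_read_mem (uint64_t addr, void *ctx)
{
  (void) ctx;
  return toy_memory[(addr / 8) % 4];
}

/* Read a LEB128 operand at *P without running past END.  */
static uint64_t
read_leb (const unsigned char **p, const unsigned char *end, int is_signed)
{
  uint64_t val = 0;
  unsigned shift = 0;
  unsigned char byte = 0;

  while (*p < end)
    {
      byte = *(*p)++;
      if (shift < 64)
        val |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if (!(byte & 0x80))
        break;
    }
  if (is_signed && shift < 64 && (byte & 0x40))
    val |= ~(uint64_t) 0 << shift;      /* Sign-extend.  */
  return val;
}

/* Evaluate EXPR (LEN bytes).  On success return 0 and store the top of
   the stack in *RESULT; return -1 on stack over/underflow or on an
   opcode this sketch does not handle.  */
int
eval_locexpr (const unsigned char *expr, size_t len, uint64_t frame_base,
              read_mem_fn read_mem, void *ctx, uint64_t *result)
{
  uint64_t stack[STACK_DEPTH];
  size_t sp = 0;
  const unsigned char *p = expr;
  const unsigned char *end = expr + len;

  while (p < end)
    {
      switch (*p++)
        {
        case DW_OP_fbreg:
          if (sp == STACK_DEPTH)
            return -1;
          stack[sp++] = frame_base + read_leb (&p, end, 1);
          break;
        case DW_OP_constu:
          if (sp == STACK_DEPTH)
            return -1;
          stack[sp++] = read_leb (&p, end, 0);
          break;
        case DW_OP_deref:
          if (sp < 1)
            return -1;
          stack[sp - 1] = read_mem (stack[sp - 1], ctx);
          break;
        case DW_OP_plus:
          if (sp < 2)
            return -1;
          stack[sp - 2] += stack[sp - 1];
          sp--;
          break;
        default:
          return -1;
        }
    }
  if (sp == 0)
    return -1;
  *result = stack[sp - 1];
  return 0;
}
```

With a frame base of 0x7000, the two-byte expression 91 08
(DW_OP_fbreg 8) evaluates to 0x7008. Appending 06 (DW_OP_deref) makes
the result depend on target memory, which is why that form cannot be
folded into a fixed address class when the symbols are read.
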
 > 
 > Keep in mind, the above is just a single location expression.
 > Location lists allow you to fully describe the lifetime of a
 > variable, even in optimized code, by associating expressions with
 > ranges of PCs.
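
A location list lookup is then just a search for the entry whose range
covers the current PC. The sketch below assumes an in-memory
representation invented for illustration (struct loclist_entry and
find_loc_for_pc are hypothetical names, not GDB's); it treats each range
as half-open, with high_pc being the first address past the range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one location list entry: the
   expression EXPR is valid for LOW_PC <= pc < HIGH_PC.  */
struct loclist_entry
{
  uint64_t low_pc;
  uint64_t high_pc;
  const unsigned char *expr;
  size_t expr_len;
};

/* Return the entry of LIST (N entries) that covers PC, or NULL if the
   variable has no described location at PC.  */
const struct loclist_entry *
find_loc_for_pc (const struct loclist_entry *list, size_t n, uint64_t pc)
{
  for (size_t i = 0; i < n; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i];
  return NULL;
}
```

A NULL result corresponds to a PC at which the variable has no location
at all, something a single fixed address class cannot express.
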
 > 
 > This code evaluates dwarf2 location expressions to get the location of
 > a variable, if a symbol has one.  This is used in preference to using the
 > less expressive way gdb currently supports.
 > 
 > All i've done is add a new way to describe the location of a variable,
 > and the associated way to evaluate that description.
 > 
 > So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
 > reader. dwarf2 location expressions are a way of describing variable
 > locations.  The other code to process the current gdb address
 > classes/locations is in findvar.c, so that's where i put it.
 > 
 > 
 > > 
 > > 	    (read_var_value): Use the dwarf2 location expression if it's available.
 > > 
 > > How do you know that the location expression is NULL when you haven't
 > > set it -- in particular, what's to prevent it from having garbage in
 > > it?
 > Because the symbol gets memset to 0.
 > 
 > > 
 > > And is it likely that some other symbol reader would someday want a
 > > similar hook?
 > It's not a hook really.
 > The dwarf2 reader never calls it; it has no relation to symbol reading.
 > 
 > It is related to describing the location of a variable at runtime.
 > 
 > >  For example, what about dwarf1?
 > Nope
 > >   Or is dwarf2 likely to be the only one?
 > 
 > No. DWARF2 is the only symbol format i know of which has support for
 > something as expressive as location expressions/lists.
 > 
 > GCC uses DWARF2 call frame annotation (which is location lists paired
 > with registers to tell you where a given register is at a given pc) to
 > do its exceptions, for instance.
 > 
 > If we ever want to be able to use the unwind info or call frame info
 > to be able to do optimized code debugging without much trouble, we
 > need to be able to evaluate the expressions.
 > 
 > If it helps, pretend the word DWARF2 appears nowhere in the
 > patch. There is no relation to the symbol reader.
 > 
 > I could convert the stabs reader to use this way of describing
 > locations, there's just no point (since stabs's locations aren't more
 > expressive than what gdb supports).
 > 
 > -- 
 > I play the harmonica.  The only way I can play is if I get my
 > car going really fast, and stick it out the window.  I've been
 > arrested three times for practicing.
 > 



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
  2001-03-30 14:42 ` Daniel Berlin
  2001-03-30 15:14   ` Andrew Cagney
  2001-03-30 15:23   ` Elena Zannoni
@ 2001-03-30 15:24   ` Andrew Cagney
  2001-03-30 18:46     ` Daniel Berlin
  2 siblings, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-03-30 15:24 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Taylor, gdb-patches

> So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
> reader. dwarf2 location expressions are a way of describing variable
> locations.  The other code to process the current gdb address
> classes/locations is in findvar.c, so that's where i put it.

No.

    if (frame == NULL)
      frame = selected_frame;
!   if (SYMBOL_DWARF2_LOC (var) != NULL)
!     return evaluate_dwarf2_locdesc (var, frame, (struct dwarf_block *)
SYMBOL_DWARF2_LOC (var),
SYMBOL_TYPE (var));
    switch (SYMBOL_CLASS (var))
      {
      case LOC_CONST:

I would have expected something like:

	if (this symbol's location is computed dynamically) then
	   return var->compute_location_dynamically (frame);

where that ``compute_location_dynamically'' was a function supplied by
dwarf2read.c.
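
The shape being suggested can be sketched as a function pointer stored
with the symbol. Everything below is hypothetical and is not GDB's
struct symbol: symbol_sketch, frame_sketch, symbol_address and
frame_relative_location are names invented for illustration, with
frame_relative_location standing in for a method that a symbol reader
might supply.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a frame: all we need is its base.  */
struct frame_sketch
{
  uint64_t frame_base;
};

struct symbol_sketch;

/* Method a symbol reader could attach to a symbol it creates.  */
typedef uint64_t (*compute_location_fn) (const struct symbol_sketch *var,
                                         const struct frame_sketch *frame);

/* Hypothetical stand-in for a symbol.  */
struct symbol_sketch
{
  uint64_t static_address;      /* Used when no method is set.  */
  compute_location_fn compute_location_dynamically;
  const void *location_data;    /* Private to whoever set the method.  */
};

/* Example method: the symbol lives at a fixed offset from the frame
   base, with the offset stored in location_data.  */
uint64_t
frame_relative_location (const struct symbol_sketch *var,
                         const struct frame_sketch *frame)
{
  const int64_t *offset = var->location_data;
  return frame->frame_base + (uint64_t) *offset;
}

/* Generic code: dispatch through the method if the symbol's location
   is computed dynamically, otherwise use the fixed address.  */
uint64_t
symbol_address (const struct symbol_sketch *var,
                const struct frame_sketch *frame)
{
  if (var->compute_location_dynamically != NULL)
    return var->compute_location_dynamically (var, frame);
  return var->static_address;
}
```

The generic caller then needs no knowledge of which reader created the
symbol or of how the location is encoded.
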

For this to start to make sense, I suspect it needs to be given some
additional context - namely where in dwarf2read.c things are also going
to be changed.

	Andrew



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-03-30 15:24   ` Andrew Cagney
@ 2001-03-30 18:46     ` Daniel Berlin
  2001-04-06 12:02       ` Andrew Cagney
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Berlin @ 2001-03-30 18:46 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: David Taylor, gdb-patches

Andrew Cagney <ac131313@cygnus.com> writes:

> > So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
> > reader. dwarf2 location expressions are a way of describing variable
> > locations.  The other code to process the current gdb address
> > classes/locations is in findvar.c, so that's where i put it.
> 
> No.
> 
>     if (frame == NULL)
>       frame = selected_frame;
> !   if (SYMBOL_DWARF2_LOC (var) != NULL)
> !     return evaluate_dwarf2_locdesc (var, frame, (struct dwarf_block *)
> SYMBOL_DWARF2_LOC (var),
> SYMBOL_TYPE (var));
>     switch (SYMBOL_CLASS (var))
>       {
>       case LOC_CONST:
> 
> I would have expected something like:
> 
> 	if (this symbol's location is computed dynamically) then
> 	   return var->compute_location_dynamically (frame);
> 
> where that ``compute_location_dynamically'' was a function supplied by
> dwarf2read.c.

Why would dwarf2read.c supply it? It's not part of the symbol
reader. It is the location of the symbol. Anything can fill in that
particular field of the symbol structure.

It's no different a location than LOC_CONST, LOC_REGPARM, etc. It just
happens to be a bunch of operations that get evaluated, rather than 1
or 2 fixed ones.

> 
> For this to start to make sense, I suspect it needs to be given some
> additional context - namely where in dwarf2read.c things are also going
> to be changed.

Make sense?
In what way?

> 
> 	Andrew

-- 
I looked out my apartment window, and I saw a bird wearing
sneakers and a button saying, "I ain't flying no where."  I
said, "What's your problem buddy?"  He said, "I'm sick of this
stuff -- winter here, summer there, winter here, summer there.



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-03-30 18:46     ` Daniel Berlin
@ 2001-04-06 12:02       ` Andrew Cagney
  2001-04-06 12:40         ` Daniel Berlin
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Cagney @ 2001-04-06 12:02 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Taylor, gdb-patches

Daniel Berlin wrote:
> 
> Andrew Cagney <ac131313@cygnus.com> writes:
> 
> > > So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
> > > reader. dwarf2 location expressions are a way of describing variable
> > > locations.  The other code to process the current gdb address
> > > classes/locations is in findvar.c, so that's where i put it.
> >
> > No.
> >
> >     if (frame == NULL)
> >       frame = selected_frame;
> > !   if (SYMBOL_DWARF2_LOC (var) != NULL)
> > !     return evaluate_dwarf2_locdesc (var, frame, (struct dwarf_block *)

Some code, somewhere must be setting SYMBOL_DWARF2_LOC (var).  At a
guess that code is either in dwarf2read.c or in a file using information
obtained from dwarf2read.c.  

If SYMBOL_DWARF2_LOC() was replaced by a more generic method then, I'd
expect that method to live in dwarf2read.c since it would be
dwarf2read.c creating a type that contained that method.

> > For this to start to make sense, I suspect it needs to be given some
> > additional context - namely where in dwarf2read.c things are also going
> > to be changed.
> 
> Make sense?
> In what way?

It might help explain how this patch ties in to the rest of GDB.

	enjoy,
		Andrew



* Re: [PATCH] Add support for tracking/evaluating dwarf2 location  expressions
  2001-04-06 12:02       ` Andrew Cagney
@ 2001-04-06 12:40         ` Daniel Berlin
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Berlin @ 2001-04-06 12:40 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Daniel Berlin, David Taylor, gdb-patches

On Fri, 6 Apr 2001, Andrew Cagney wrote:

> Daniel Berlin wrote:
> >
> > Andrew Cagney <ac131313@cygnus.com> writes:
> >
> > > > So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
> > > > reader. dwarf2 location expressions are a way of describing variable
> > > > locations.  The other code to process the current gdb address
> > > > classes/locations is in findvar.c, so that's where i put it.
> > >
> > > No.
> > >
> > >     if (frame == NULL)
> > >       frame = selected_frame;
> > > !   if (SYMBOL_DWARF2_LOC (var) != NULL)
> > > !     return evaluate_dwarf2_locdesc (var, frame, (struct dwarf_block *)
>
> Some code, somewhere must be setting SYMBOL_DWARF2_LOC (var).  At a
> guess that code is either in dwarf2read.c or in a file using information
> obtained from dwarf2read.c.

Correct. It's currently dwarf2read.c.
But it doesn't actually have to be. As I said, I could easily convert
stabsread.c to make location expressions and set SYMBOL_DWARF2_LOC; there
just wouldn't be much point, because we can already handle everything
STABS throws at us.

>
> If SYMBOL_DWARF2_LOC() was replaced by a more generic method then, I'd
> expect that method to live in dwarf2read.c since it would be
> dwarf2read.c creating a type that contained that method.

I'll probably rename it to SYMBOL_LOCATION_LIST, and replace DWARF2_LOC
everywhere with LOCATION_LIST. This way, people realize the only reason
it's called a dwarf2 location list is because it, and the opcodes, happen
to be described in the dwarf2 spec.  Anything that wants to could create
them; it relies on nothing specific to dwarf2.
Other things use dwarf2 location lists to describe locations.  It's just a
robust way of describing the location of variables.
In fact, if it's not too difficult, I might change one of the other debug
info readers to use them for symbol locations, just to prove the point (I
wouldn't expect the patch to be accepted, of course).

>
> > > For this to start to make sense, I suspect it needs to be given some
> > > additional context - namely where in dwarf2read.c things are also going
> > > to be changed.
> >
> > Make sense?
> > In what way?
>
> It might help explain how this patch ties in to the rest of GDB.
>
As I said, this patch is completely independent of anything that goes on
in the dwarf2 reader. It just defines another way of describing the
location of a variable, a way that happens to be explained in the dwarf2
specification.  There is absolutely nothing dwarf2 specific about it, or
anything that even ties it to dwarf2 as a debugging format.  DWARF2 just
happens to use it as its way of describing locations, much the same way
g++ exception handling happens to use it as its way of describing
locations.

That said, the new dwarf2 reader will use location lists, rather
than try to convert a location list into one of the other ways to
describe the location of symbols.  It will also use location lists in
another place, to describe the location of a register at a given PC (i.e. in
the virtual frame unwinding stuff I'm working on as well).

Here's the history and where everything stands:
As you may or may not know, in order to do dwarf2 duplicate
elimination, it's necessary to comdat dwarf2 DIEs.  So, unlike before, you
have DWARF2 DIEs that may refer to DIEs in another DWARF2 compilation
unit.

The current dwarf2 reader, of course, uses static and global structures
everywhere, and expects to be able to process a single compilation unit at
a time, without having to know anything about *any other* compilation
units.

Which, of course, won't work when you have references to DIEs in other
units (i.e. we may have a DIE representing a variable, and the type
attribute of that DIE may be pointing to a DIE in a different compilation
unit).

These assumptions are so pervasive through dwarf2read that it essentially
required changing, well, everything (the current diff is 85k).  Since it
required so much work to do this, it made sense to put the time in to take
advantage of the other things dwarf2 provides us, that we don't currently
take advantage of, such as the ability to use the virtual frame info to
get at things in the frame/saved registers, rather than rely on target
specific scanning code and such.

The patch I submitted was a first pass at the non-dwarf2 specific parts of
this kind of support.

It's one of those "I wish I had given stuff a different name now" type of
things, since it just seems to confuse people.

Mentioning the virtual frame info reminds me.  Currently, I have patched
generic_get_saved_register to use the virtual frame info to find a
register, if it can, in preference to the target specific code.  Is
this the right place to do this?  This is where it has the target
specific code fill in the saved registers it knows about, and I'm
just doing the same with the virtual frame info, rather than target
specific code.  It simply calls a function that is implemented in
dwarf2eval.c that does all the same dirty work FRAME_INIT_SAVED_REGS does.

Modifying FRAME_INIT_SAVED_REGS for every platform didn't seem to be the
right thing to do, since, as I said, the same procedure works on all
platforms, since the target dependent stuff is on the gcc side.
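
The "in preference to" ordering can be sketched as a lookup that tries
the frame info first and falls back to the target specific result. This
is not the actual change to generic_get_saved_register; the types and
names below (frame_sketch, cfi_find_saved_register, find_saved_register)
are hypothetical, and both sources are reduced to precomputed arrays to
keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 8

/* Hypothetical per-frame data.  cfi_* is what the virtual frame info
   could determine; target_addr is what target specific code (e.g.
   prologue scanning) filled in.  */
struct frame_sketch
{
  int cfi_known[SKETCH_NUM_REGS];
  uint64_t cfi_addr[SKETCH_NUM_REGS];
  uint64_t target_addr[SKETCH_NUM_REGS];
};

/* Return 1 and store the save address in *ADDR if the frame info
   describes where REGNUM was saved; return 0 otherwise.  */
int
cfi_find_saved_register (const struct frame_sketch *frame, int regnum,
                         uint64_t *addr)
{
  if (regnum < 0 || regnum >= SKETCH_NUM_REGS || !frame->cfi_known[regnum])
    return 0;
  *addr = frame->cfi_addr[regnum];
  return 1;
}

/* Find where REGNUM was saved in FRAME, preferring the frame info and
   falling back to the target specific answer.  Returns 0 for an
   out-of-range register number.  */
uint64_t
find_saved_register (const struct frame_sketch *frame, int regnum)
{
  uint64_t addr;

  if (regnum < 0 || regnum >= SKETCH_NUM_REGS)
    return 0;
  if (cfi_find_saved_register (frame, regnum, &addr))
    return addr;
  return frame->target_addr[regnum];
}
```

Because the fallback stays in place, a target without usable frame info
would behave as before under this arrangement.
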
> 	enjoy,
> 		Andrew
>


Thread overview: 28+ messages
2001-03-30 11:11 [PATCH] Add support for tracking/evaluating dwarf2 location expressions Daniel Berlin
2001-03-30 15:36 ` Andrew Cagney
2001-03-30 18:53   ` Daniel Berlin
2001-04-06 11:53     ` Andrew Cagney
2001-04-06 12:10       ` Daniel Berlin
2001-04-06 12:36         ` Kevin Buettner
2001-04-06 23:18           ` [PATCH] Add support for tracking/evaluating dwarf2 locationexpressions Daniel Berlin
2001-05-21 14:46 ` [PATCH] Add support for tracking/evaluating dwarf2 location expressions Jim Blandy
2001-05-21 18:49   ` Daniel Berlin
2001-05-22 12:46     ` Jim Blandy
2001-05-22 13:51       ` Daniel Berlin
2001-05-22 23:14         ` Jim Blandy
2001-05-23  8:51           ` Daniel Berlin
2001-05-23 11:53           ` Daniel Berlin
2001-05-23 21:53             ` Jim Blandy
2001-05-23 22:56               ` Daniel Berlin
2001-06-06  9:07   ` Andrew Cagney
2001-06-06  9:46     ` Daniel Berlin
2001-06-07  7:29       ` Andrew Cagney
2001-03-30 13:49 David Taylor
2001-03-30 14:42 ` Daniel Berlin
2001-03-30 15:14   ` Andrew Cagney
2001-03-30 18:44     ` Daniel Berlin
2001-03-30 15:23   ` Elena Zannoni
2001-03-30 15:24   ` Andrew Cagney
2001-03-30 18:46     ` Daniel Berlin
2001-04-06 12:02       ` Andrew Cagney
2001-04-06 12:40         ` Daniel Berlin
