From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Berlin <dan@cgsoftware.com>
To: Jim Blandy <jimb@zwingli.cygnus.com>
Cc: Daniel Berlin <dan@cgsoftware.com>, gdb-patches@sources.redhat.com, gdb@sources.redhat.com
Subject: Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions
Date: Wed, 23 May 2001 22:56:00 -0000
Message-id: <8766erl29s.fsf@dynamic-addr-83-177.resnet.rochester.edu>
References: <Pine.LNX.4.33.0103301400270.24037-100000@www.cgsoftware.com> <nppud28jdl.fsf@zwingli.cygnus.com> <87heye17ck.fsf@dynamic-addr-83-177.resnet.rochester.edu> <npae452mkb.fsf@zwingli.cygnus.com> <87n1852jnf.fsf@dynamic-addr-83-177.resnet.rochester.edu> <npheyc1tia.fsf@zwingli.cygnus.com> <87wv77zym7.fsf@dynamic-addr-83-177.resnet.rochester.edu> <np4rub1h4t.fsf@zwingli.cygnus.com>
X-SW-Source: 2001-05/msg00325.html

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Hey, everyone.  If folks think the general direction we're going here
> (explained in my first post to this thread) is not a good idea, speak
> now.  Stuff is happening.
> 
> Daniel Berlin <dan@cgsoftware.com> writes:
> > void init_locexpr (locexpr *);
> > void destroy_locexpr (locexpr *);
> > void locexpr_push_op  (locexpr, enum locexpr_opcode, const char *, ...);
> > void locexpr_get_op (locexpr, unsigned int,  enum locexpr_opcode *);
> > void locexpr_get_longest (locexpr, unsigned int, LONGEST *);
> > void locexpr_get_ulongest (locexpr, unsigned int, ULONGEST *);
> > void locexpr_get_uchar (locexpr, unsigned int, unsigned char *);
> > value_ptr evaluate_locexpr (struct symbol *, struct frame_info *, locexpr, struct type *);
> 
> GDB tends not to use typedefs, which I think is a good thing.  I think
> `struct locexpr' would be better.
The only thing is that i made locexpr a typedef to locexpr_priv *, so
you'd get a compile error if anything outside symeval.c tried to
access the internal structure.

I can change this if we all agree this is a bad idea.
I just think we have a tendency to use internal structure to get
things done quicker, even when we know we shouldn't,  and mark it
"FIXME", and it never gets fixed.

> 
> There should be a way to efficiently build locexprs in obstacks.

I know, i'm on it already.
It was a first pass, so it did the "realloc by one extra byte" for
pushing opcodes.

> 
> How big are your stack elements? 
opcodes are unsigned char.

operands are encoded as one of:
unsigned char
uleb128
sleb128

They are passed into the opcode pusher as LONGEST or ULONGEST or unsigned char.
>  I think it's very important that the
> semantics of a particular string of bytecodes be independent of word
> size issues.  That is, if I construct a particular string, I want it
> to compute the same values no matter what sort of host I'm running on.
> Have you thought about how to ensure that architecturally? 
Yes.

sizeof(char) is guaranteed to be 1 byte always, which is why it's used
for representing the opcode.
uleb128 is host independent and word size independent (in memory it's
a string of bytes. It always encodes to teh same string of bytes on
any platform, and decodes to the same value on any platform).
As is sleb128.
This, and the fact that it's efficient, is why it was used for DWARF2
in the first place.

>  Think
> about right shifts, divides, comparisons --- anything that can bring
> information down from the upper bits.
You can only decode into LONGEST, ULONGEST, and unsigned char. All
operations that can bring info down from upper bits already have explicit
casts to LONGEST (inherited from the fact that this is baesd on the
dwarf2 locexpr evaluator, which was based on unwind-dw2-fde.c from
GCC, which does the same type of casts).
Like so:

   case GLO_div:
		result = (LONGEST)second / (LONGEST)first;
		break; case GLO_shra:
....
        case GLO_shra:
		result = (LONGEST)second >> first;
		break;
	  ...
	      case GLO_le:
		result = (LONGEST)first <= (LONGEST)second;
		break;
	      case GLO_ge:
		result = (LONGEST)first >= (LONGEST)second;
		break;
	      case GLO_eq:
		result = (LONGEST)first == (LONGEST)second;
		break;
	      case GLO_lt:
		result = (LONGEST)first < (LONGEST)second;
		break;
	      case GLO_gt:
		result = (LONGEST)first > (LONGEST)second;
		break;
	      case GLO_ne:
		result = (LONGEST)first != (LONGEST)second;
		break;


> 
> > I kept bregx only because we already have a location type for based
> > registers, so it seemed to make sense (Admittedly, all it does is take
> > us from two opcodes to one for this).
> 
> I think it should be deleted.
Okeydokey.

> 
> > fbreg we can't get rid of, we don't necessarily know the frame
> > register at debug reading time (it may itself be using a location list
> > to say where it is).
> 
> Yep, it's its own animal.
> 
> > We could get rid of the "default" opcodes deref and xderef, and just
> > always use xderef_size and deref_size with an explicit size.
> 
> I think we should do that.
> 
> > If everyone agrees, i'll write up the docs on these opcodes, and the
> > functions to manipulate location expressions.
> 
> Yeah, the interface isn't clear to me. 

Well, it's built to make it so the only thing that knows the internal,
in-memory structure of the location expression is push and get.  

If I added a cookie, so that you didn't have to pass the
get_opcode/get_operand routines a position (which i'll probably do),
you could make things like "constants being in a seperate bcache'd
array" completely independent of the evaluator.  Only the get routine
would know this.
> I'm especially curious about
> that constructor.  Varargs give me the heebie-jeebies.  I think Andrew
> has said he feels the same way.

It was either this, or have one to push opcodes, and three to push
operands.
Or a bunch to cover each combination of opcode+operands you have.

or if you'd rather see the 9 functions like:

locexpr_push_uu
locexpr_push_us
locexpr_push_uc
locexpr_push_ss
locexpr_push_su
locexpr_push_sc
locexpr_push_cc
locexpr_push_cu
locexpr_push_cs

Having just one to push all of them also has the advantage that we can type
check the arguments to make sure you are giving the  right number and
type of operands to the right opcode. 

This is currently implemented.

We could also do it with the 9 functions, but not the one to push
opcodes, and three to push operands (without global variables or
something messy, anyway).

Without this, it's harder to track down bugs in converters, since you
won't see a problem until evaluation, and even then, it may be
sporadic (I actually hit a bug in evaluating fbreg when it was a
dwarf2 evaluator because on powerpc, the offsets from it are always
positive, and x86, they can be negative. I read the opcode as unsigned
accidently, and it took me 3 weeks to track it down to this.)


> 
> > Eventually, I'll get rid of all the other location types, but this can
> > be done incrementally, so it's no rush at all.
> 
> Right, that's where I want to go with this too.  Note that if we do
> that, the LOC_* enum encodes information about whether something is an
> argument or not (i.e., LOC_ARG vs. LOC_LOCAL, LOC_REGISTER
> vs. LOC_REGPARM).  We'll need to encode that information otherhow.
Yes.  

-- 
"My neighbor has a circular driveway...  He can't get out.
"-Steven Wright