From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-17065-listarch-gdb-patches=sourceware.cygnus.com@sources.redhat.com>
Received: (qmail 12979 invoked by alias); 23 Jul 2002 23:43:30 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 12972 invoked from network); 23 Jul 2002 23:43:29 -0000
Received: from unknown (HELO zwingli.cygnus.com) (208.245.165.35)
  by sources.redhat.com with SMTP; 23 Jul 2002 23:43:29 -0000
Received: by zwingli.cygnus.com (Postfix, from userid 442)
	id 139775EA11; Tue, 23 Jul 2002 18:43:27 -0500 (EST)
To: david carlton <carlton@math.stanford.edu>
Cc: gdb-patches@sources.redhat.com
Subject: Re: patch for PR gdb/574
References: <15676.21042.125481.349851@jackfruit.Stanford.EDU>
From: Jim Blandy <jimb@redhat.com>
Date: Tue, 23 Jul 2002 16:47:00 -0000
In-Reply-To: <15676.21042.125481.349851@jackfruit.Stanford.EDU>
Message-ID: <npvg766me8.fsf@zwingli.cygnus.com>
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-SW-Source: 2002-07/txt/msg00468.txt.bz2

--=-=-=
Content-length: 280


A while back I spent some time looking at some confusing behavior
related to GDB values with different static and dynamic types; that's
in the same neighborhood as the VALUE_ENCLOSING_TYPE stuff, so this
might be helpful.  I wanted to fix this myself, but I never had the
time.


--=-=-=
Content-Type: multipart/digest; boundary="==-=-="

--==-=-=
Content-Type: text/plain
Content-length: 52

Subject: Topics

Topics:
   the address of a value


--==-=-=
Content-length: 5104

Date: Sun, 20 May 2001 02:19:41 -0500 (EST)
From: Jim Blandy <jimb@zwingli.cygnus.com>
To: gdb@sources.redhat.com, Daniel Berlin <dan@cgsoftware.com>
Subject: the address of a value
Message-Id: <20010520071941.811A35E9D7@zwingli.cygnus.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-length: 4801


I'd like to make what seems to me like a rather modest proposal:
that VALUE_ADDRESS (VALUE) should be VALUE's address.

This isn't true today.  In fact, there is little consensus within GDB
on how one *should* go about finding a value's address.  At bottom
I've included a catalog of the various assumptions currently made in
GDB's code.  Things generally work only because values with non-zero
offsets, and values whose types are different from their enclosing
types, are rare.

Probably the most correct assumption in today's code is that:

- a value v's address is VALUE_ADDRESS (v) + VALUE_OFFSET (v) +
  VALUE_EMBEDDED_OFFSET (v), and 

- for values representing C++ embedded subobjects, VALUE_ADDRESS (v) +
  VALUE_OFFSET (v) is the address of VALUE_CONTENTS_ALL (v).

(These aren't really independent; each is a consequence of the other.)

Code following these assumptions will work nicely with values that
hold both a C++ subobject and its surrounding full object, as well as
more ordinary values.  (If you're unfamiliar with the embedded offset
stuff, see my recent doc fix for value.h, posted to gdb-patches but
not yet committed.)
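
As a rough illustration only, the two assumptions above could be
sketched in C as follows.  This is not GDB code: `struct toy_value'
and both functions are hypothetical stand-ins for the real macros,
and the sketch assumes the embedded offset is never negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GDB's struct value; only the three
   fields discussed above are modeled.  */
struct toy_value
{
  uint64_t address;      /* plays the role of VALUE_ADDRESS (v) */
  int offset;            /* plays the role of VALUE_OFFSET (v) */
  int embedded_offset;   /* plays the role of VALUE_EMBEDDED_OFFSET (v) */
};

/* Address of the start of the full contents buffer, i.e. of what
   VALUE_CONTENTS_ALL (v) holds: VALUE_ADDRESS + VALUE_OFFSET.  */
static uint64_t
toy_contents_all_address (const struct toy_value *v)
{
  return v->address + (uint64_t) v->offset;
}

/* Address of the value itself under the "most correct" current
   convention: VALUE_ADDRESS + VALUE_OFFSET + VALUE_EMBEDDED_OFFSET.  */
static uint64_t
toy_value_address (const struct toy_value *v)
{
  /* Assumption of this sketch: the subobject lies at or after the
     start of the enclosing object.  */
  assert (v->embedded_offset >= 0);
  return toy_contents_all_address (v) + (uint64_t) v->embedded_offset;
}
```

With address 0x1000, offset 8 and embedded offset 16, the full
object starts at 0x1008 and the value itself at 0x1018.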

I'm suggesting that, instead:

- a value v's address should be VALUE_ADDRESS (v), and

- for values representing C++ embedded subobjects, VALUE_ADDRESS (v) -
  VALUE_EMBEDDED_OFFSET (v) should be the address of
  VALUE_CONTENTS_ALL (v).

This seems straightforward and simple.  Code which simply wants to
access the value need not worry about embedded objects, enclosing
types, etc.  The complexity of VALUE_EMBEDDED_OFFSET and
VALUE_ENCLOSING_TYPE is confined to code which actually wants to deal
with such things.
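
A minimal C sketch of the suggested convention, for comparison.
Again this is not GDB code: `struct toy_value' and the two functions
are hypothetical names, and VALUE_OFFSET is left out because
consumers of memory values would no longer consult it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GDB's struct value under the suggested
   convention.  */
struct toy_value
{
  uint64_t address;      /* VALUE_ADDRESS (v): the value's own address */
  int embedded_offset;   /* VALUE_EMBEDDED_OFFSET (v) */
};

/* Code that just wants the value's address reads one field.  */
static uint64_t
toy_value_address (const struct toy_value *v)
{
  return v->address;
}

/* Only code that cares about the enclosing object subtracts the
   embedded offset to find the start of VALUE_CONTENTS_ALL (v).  */
static uint64_t
toy_contents_all_address (const struct toy_value *v)
{
  /* Assumption of this sketch: the embedded offset is never negative.  */
  assert (v->embedded_offset >= 0);
  return v->address - (uint64_t) v->embedded_offset;
}
```

A subobject at 0x1018 with embedded offset 16 has its full object
at 0x1008; callers that ignore embedded subobjects never touch the
second function.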

Implementing that suggestion would be a large change, but given the
state of the code described below, making the code consistent with any
single interpretation will be a large change.

There are some functions (notably, those for accessing structure
members and (sometimes) array elements) which adjust VALUE_OFFSET,
rather than VALUE_ADDRESS.  This allows them to work consistently on
values whose VALUE_LVAL is `lval_memory', `lval_register',
`lval_internalvar_component', or `lval_reg_frame_relative'.  Under my
suggested change, these functions would need conditionals to decide
whether to adjust VALUE_OFFSET or VALUE_ADDRESS, since consumers of
`lval_memory' values would no longer consult VALUE_OFFSET.  However,
another change, independent of the one I'm proposing here, could
eliminate VALUE_OFFSET altogether, and remove this additional
complexity in the process.  So this additional complexity needn't be
permanent.
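
The conditional described above might look roughly like this.  It is
a hedged sketch, not GDB's implementation: the enum, the struct and
`toy_select_member' are hypothetical, and the real member-access
functions do considerably more than adjust one field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the VALUE_LVAL kinds named above.  */
enum toy_lval
{
  toy_lval_memory,
  toy_lval_register,
  toy_lval_internalvar_component,
  toy_lval_reg_frame_relative
};

/* Hypothetical stand-in for GDB's struct value.  */
struct toy_value
{
  enum toy_lval lval;    /* plays the role of VALUE_LVAL (v) */
  uint64_t address;      /* plays the role of VALUE_ADDRESS (v) */
  int offset;            /* plays the role of VALUE_OFFSET (v) */
};

/* Return a copy of V narrowed to a member MEMBER_OFFSET bytes in.
   Under the suggested convention a memory value moves its address;
   every other kind keeps adjusting the offset, as today.  */
static struct toy_value
toy_select_member (struct toy_value v, int member_offset)
{
  assert (member_offset >= 0);
  if (v.lval == toy_lval_memory)
    v.address += (uint64_t) member_offset;
  else
    v.offset += member_offset;
  return v;
}
```

Selecting a member at offset 8 of a memory value at 0x1000 yields
address 0x1008 with the offset untouched, while the same selection
on a register value changes only the offset.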

And yes, I'm volunteering to do the work.  :)


----


Here are the various ways different parts of GDB go about computing a
value's address.  I made this list by grepping for VALUE_ADDRESS.

- The following functions assume that VALUE_ADDRESS gives a value's
  address:

    mips_push_arguments
    print_formatted (when setting next_address)
    x_command (same)
    print_frame_args
    v850-tdep.c (*yawn*)
    c_value_of_variable

- Some code assumes that VALUE_ADDRESS + VALUE_EMBEDDED_OFFSET is the
  value's address:

    c_value_print and cp_print_static_field, at the final calls to val_print
    cp_print_static_field, at the final call to cp_print_value_fields

- Other code suggests that VALUE_ADDRESS + VALUE_OFFSET is a value's
  address:

    insert_breakpoints
    remove_breakpoints
    bpstat_stop_status
    can_use_hardware_watchpoint
    evaluate_subexp_standard (as it creates the `this' parameter for
        a method call, near line 829)
    value_as_pointer
    value_assign (various cases, probably not all)
    value_subscript (when subscripting bitstrings)
    value_subscripted_rvalue
    value_cast (when casting memory lvalues of differing lengths(?))

- Still other code suggests that a value's address is VALUE_ADDRESS +
  VALUE_OFFSET + VALUE_EMBEDDED_OFFSET:

    value_ind
    value_addr
    value_repeat
    value_fetch_lazy
    search_struct_field
    search_struct_method
    find_method_list
    value_full_object
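
To make the disagreement concrete, here is a small C sketch that
computes all four candidate addresses from the catalog above and
reports whether they coincide.  The struct and function are
hypothetical, not taken from GDB, and the sketch assumes both
offsets are non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GDB's struct value.  */
struct toy_value
{
  uint64_t address;      /* plays the role of VALUE_ADDRESS (v) */
  int offset;            /* plays the role of VALUE_OFFSET (v) */
  int embedded_offset;   /* plays the role of VALUE_EMBEDDED_OFFSET (v) */
};

/* Return 1 if the four conventions in the catalog give the same
   address for V, and 0 otherwise.  */
static int
toy_conventions_agree (const struct toy_value *v)
{
  /* Assumption of this sketch: neither offset is negative.  */
  assert (v->offset >= 0 && v->embedded_offset >= 0);

  uint64_t off = (uint64_t) v->offset;
  uint64_t emb = (uint64_t) v->embedded_offset;

  uint64_t by_address = v->address;               /* first group */
  uint64_t by_embedded = v->address + emb;        /* second group */
  uint64_t by_offset = v->address + off;          /* third group */
  uint64_t by_both = v->address + off + emb;      /* fourth group */

  return by_address == by_embedded
         && by_embedded == by_offset
         && by_offset == by_both;
}
```

The function returns 1 only when both offsets are zero, which is
the common case and the reason the inconsistency rarely bites.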

I'm omitting code that seems like an unreliable witness.  For example,
at valops.c:356 we see code manipulating aligner.contents directly,
thereby assuming in one fell swoop that host == target, and that host
sizeof(long) == target sizeof (data *).  This code is crap.

I'm also omitting code that seems to be taking advantage of special
knowledge about the value it's dealing with.  For example, one could
say that c_val_print assumes that VALUE_ADDRESS is the value's
address, or one could say it's assuming that value_at always returns a
value whose VALUE_OFFSET and VALUE_EMBEDDED_OFFSET are zero, so the
adjustments can be omitted.  This happens to be true, but it would be
easy enough not to assume it, and the assumption makes the code
obscure.


--==-=-=--

--=-=-=--