From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kevin Buettner <kevinb@cygnus.com>
To: gdb@sources.redhat.com
Subject: How to add core addresses (or LM_ADDR problems revisited)
Date: Tue, 02 Oct 2001 18:31:00 -0000
Message-id: <1011003013114.ZM22147@ocotillo.lan>
X-SW-Source: 2001-10/msg00036.html

I've been reexamining the following change to solib-svr4.c:

2001-02-20  Martin M. Hunt  <hunt@redhat.com>

	* solib-svr4.c (LM_ADDR): LM_ADDR is a signed offset, so
	extract_signed_integer() should be called instead of
	extract_address().

For reference, Martin's patch may be found at:

    http://sources.redhat.com/ml/gdb-patches/2001-02/msg00391.html

My approval message, which also contains some analysis of why Martin's
change seemed okay to me at the time, may be found here:

    http://sources.redhat.com/ml/gdb-patches/2001-02/msg00399.html

Finally, the following (earlier) messages from Maciej W. Rozycki
ought to be read to understand the context of the LM_ADDR patch:

    http://sources.redhat.com/ml/gdb-patches/2000-07/msg00260.html
    http://sources.redhat.com/ml/gdb-patches/2000-07/msg00294.html

Now, with the references out of the way, I'll dive in...

I recently tested GDB on a 64-bit sparc machine.  These machines
support both 32-bit and 64-bit executables and I happened to be using
a compiler which defaults to 32-bit mode for my testing.  In the
course of my testing, I found that GDB would not properly stop at a
breakpoint on a function in a dlopen()'d shared library...

    18        if (!(handle = dlopen("./dso.so", RTLD_NOW|RTLD_GLOBAL))) {
    (gdb) n
    23        init = dlsym(handle, "dso_init");
    (gdb) b dso_init
    Breakpoint 2 at 0xff360770: file dso.c, line 10.
    (gdb) c
    Continuing.
    Do `add-symbol-file dso.so 0xff36075c`
    Program received signal SIGTRAP, Trace/breakpoint trap.
    0xff360770 in ?? ()

I should note that I rebuilt and tested on a 32-bit sparc and all was
well.

Here's what is happening on the 64-bit machine:

    1) The shared library in question is being loaded at 0xf3606380.

    2) LM_ADDR() is using extract_signed_integer() instead of
       extract_address() to fetch the load address and is sign
       extending the address rather than zero extending it.

    3) This (sign extended) offset is being added to the values of
       the symbols from the shared object.  As a consequence, GDB's
       representation of every symbol from the shared object has
       the pattern 0xffffffff in the upper 32 bits of the 64-bit
       CORE_ADDR.

    4) When a breakpoint is set on the function, it ends up storing
       a sign extended address in the breakpoint data structures.

    5) When the TRAP instruction is encountered, GDB compares the
       trap address (which is properly zero-extended) with the
       sign-extended version in the breakpoint data structure.
       They don't match, so GDB concludes that a breakpoint has
       not been hit, even though the two addresses agree in the
       low 32 bits.
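
To make step 5 concrete, here is a minimal sketch of the two widening
choices.  It is not GDB code: core_addr, widen_zero, widen_signed and
same_address are names made up for illustration, and a 64-bit unsigned
type stands in for CORE_ADDR on a 64-bit host.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit CORE_ADDR.  */
typedef uint64_t core_addr;

/* Widen a 32-bit target value by zero extension: the upper 32 bits
   of the result are always clear.  */
core_addr
widen_zero (uint32_t raw)
{
  return (core_addr) raw;
}

/* Widen a 32-bit target value by sign extension: bit 31 of RAW is
   copied into the upper 32 bits of the result.  */
core_addr
widen_signed (uint32_t raw)
{
  core_addr wide = raw;

  if (raw & UINT32_C (0x80000000))
    wide |= UINT64_C (0xffffffff00000000);
  return wide;
}

/* Full-width comparison, which is how two 64-bit values compare when
   no masking is applied.  Returns nonzero if they are equal.  */
int
same_address (core_addr a, core_addr b)
{
  return a == b;
}
```

With the breakpoint address from the session above, widen_zero
(0xff360770) gives 0x00000000ff360770 while widen_signed (0xff360770)
gives 0xffffffffff360770, so same_address reports a mismatch even
though the low 32 bits are identical.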

It was tempting to change LM_ADDR() back to fetching zero extended
addresses, but after rereading Maciej's comments, I concluded that
neither sign-extending nor zero-extending would always produce the
correct results.  To make this perfectly clear, here are two examples
which illustrate why:

    - Suppose a selected symbol's symbol table address is 0x00020000,
      but that it gets loaded at 0x00010000.  The 32-bit offset
      (which is acquired by LM_ADDR()) needed to relocate the
      symbol to the desired load address is 0xffff0000.  On a 64-bit
      machine, using 64-bit arithmetic, this offset would need to be
      sign extended in order to get the correct result.  If it is
      zero extended instead, the symbol's load address will be computed
      as 0x100010000 instead of 0x00010000 as desired.

    - Suppose a selected symbol's symbol table address is 0x00020000,
      but that it gets loaded at 0xf0020000.  The 32-bit offset
      needed to relocate the symbol to the load address is
      0xf0000000.  On a 64-bit machine, using 64-bit arithmetic, this
      offset would need to be zero extended in order to obtain the
      correct result.  If it is sign extended instead, the symbol's
      load address will be computed as 0xfffffffff0020000 instead of
      0xf0020000 which was desired.
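
The arithmetic in these two examples can be checked with a small
standalone sketch.  The function names relocate_zero_ext and
relocate_sign_ext are invented for this illustration and do not exist
in GDB; each one widens the 32-bit offset a different way and then
performs an ordinary 64-bit addition.

```c
#include <stdint.h>

/* Relocate SYMADDR by a 32-bit OFFSET that has been zero extended
   to 64 bits before the addition.  */
uint64_t
relocate_zero_ext (uint32_t symaddr, uint32_t offset)
{
  return (uint64_t) symaddr + (uint64_t) offset;
}

/* Relocate SYMADDR by a 32-bit OFFSET that has been sign extended
   to 64 bits before the addition.  The 64-bit sum wraps modulo
   2^64, as unsigned arithmetic always does in C.  */
uint64_t
relocate_sign_ext (uint32_t symaddr, uint32_t offset)
{
  uint64_t wide = offset;

  if (offset & UINT32_C (0x80000000))
    wide |= UINT64_C (0xffffffff00000000);
  return (uint64_t) symaddr + wide;
}
```

For the first example (offset 0xffff0000), relocate_sign_ext yields
0x10000 and relocate_zero_ext yields 0x100010000.  For the second
(offset 0xf0000000), relocate_zero_ext yields 0xf0020000 and
relocate_sign_ext yields 0xfffffffff0020000.  Neither widening is
right in both cases.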

Thus no matter how we implement LM_ADDR(), we haven't really solved
the problem.  Either way, there will be times when an overflow out
of the 32-bit address space will occur.  

I believe that the correct solution to this problem lies in changing
how arithmetic is being performed on CORE_ADDRs.  Instead of always
performing N-bit arithmetic on an N-bit CORE_ADDR, we ought to be
performing M-bit arithmetic where M is the size of the target's
pointer.
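
As a rough sketch of what I mean by M-bit arithmetic (the name
add_in_ptr_width and its ptr_bits parameter are hypothetical, and a
64-bit unsigned type again stands in for CORE_ADDR), the sum is
computed at full width and then truncated to the pointer size:

```c
#include <stdint.h>

/* Add OFFSET to ADDR and keep only the low PTR_BITS bits of the
   sum.  PTR_BITS is assumed to be between 1 and 64.  The shift is
   skipped when PTR_BITS is 64 because shifting a 64-bit value by
   64 is undefined in C.  */
uint64_t
add_in_ptr_width (uint64_t addr, uint64_t offset, unsigned ptr_bits)
{
  uint64_t mask;

  if (ptr_bits < 64)
    mask = (UINT64_C (1) << ptr_bits) - 1;
  else
    mask = ~UINT64_C (0);

  return (addr + offset) & mask;
}
```

With ptr_bits set to 32, it no longer matters how the offset was
widened: 0x00020000 plus either 0x00000000ffff0000 or
0xffffffffffff0000 comes out as 0x00010000, and 0x00020000 plus
either 0x00000000f0000000 or 0xfffffffff0000000 comes out as
0xf0020000.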

However, to further complicate matters, GDB distinguishes between
pointers and addresses.  On most sane hardware, this distinction is
unimportant and one can think of these as the same thing.  However,
architectures with multiple disjoint address spaces make life more
complicated.  On such a target, the pointer size is the number of bits
needed by the CPU to address a given address space.  (There could in
fact be several different pointer sizes.) GDB needs to have a way to
refer to a word (or byte) in any of the target's address spaces and
this is what GDB calls an address.  Therefore, in GDB, the address
size is the number of bits needed to represent a pointer (in any given
space) plus some number of extra bits for the address space
identifier.
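
One way to picture this is the sketch below.  The layout is an
assumption made purely for illustration: a made-up target with 16-bit
pointers and 24-bit addresses, with the address space identifier
stored in the bits directly above the pointer.  PTR_BITS, ADDR_BITS
and all of the function names are hypothetical, not GDB interfaces.

```c
#include <stdint.h>

/* Hypothetical target: 16-bit pointers, 24-bit addresses, leaving
   8 bits for the address space identifier.  */
enum { PTR_BITS = 16, ADDR_BITS = 24 };

/* Mask selecting the pointer portion of an address.  */
uint64_t
ptr_mask (void)
{
  return (UINT64_C (1) << PTR_BITS) - 1;
}

/* Mask selecting the address space identifier bits.  */
uint64_t
space_mask (void)
{
  return ((UINT64_C (1) << ADDR_BITS) - 1) & ~ptr_mask ();
}

/* Combine an address space identifier and a pointer into one
   address.  Bits that do not fit in either field are dropped.  */
uint64_t
make_address (uint64_t space_id, uint64_t ptr)
{
  return ((space_id << PTR_BITS) & space_mask ()) | (ptr & ptr_mask ());
}

/* Extract the address space identifier from an address.  */
uint64_t
address_space_id (uint64_t addr)
{
  return (addr & space_mask ()) >> PTR_BITS;
}

/* Extract the pointer portion of an address.  */
uint64_t
address_pointer (uint64_t addr)
{
  return addr & ptr_mask ();
}
```

Under these assumptions, pointer 0x1234 in address space 1 is
represented by the address 0x011234.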

When adding an offset to an address that contains an address space
identifier, care must be taken to not change the address space
identifier.  Also, it is not terribly meaningful to add two addresses
together which have different address space identifiers; in fact, it's
probably an internal error.  It's unclear to me whether it makes sense
to add two addresses together with the same address space identifier,
but I'm inclined to permit it.

Anyway, the code for adding two addresses together is below.  I will
post a patch which incorporates this code to gdb-patches, but I
thought it important to get feedback from the gdb discussion list
first.

/* Compute the two masks needed for doing arithmetic operations on
   a CORE_ADDR.  SPACE_MASKP is a pointer to the CORE_ADDR which will
   be filled in with the mask needed to find the address space
   identifier (if any) typically found on Harvard architectures. 
   PTR_MASKP will be filled in with the mask to be used after the
   arithmetic operation to clear any overflow into the high order bits
   of the CORE_ADDR result.
  
   Note:  These operations will be done a lot for a given target.  It
   may be worthwhile to use the gdbarch swap machinery to compute
   these mask values once so that TARGET_ADDR_BIT and TARGET_PTR_BIT
   won't need to be called over and over again.  */

static void
core_addr_get_masks (CORE_ADDR *space_maskp, CORE_ADDR *ptr_maskp)
{
  CORE_ADDR space_mask, ptr_mask;

  if (sizeof (CORE_ADDR) * 8 > TARGET_PTR_BIT)
    {
      ptr_mask = ((((CORE_ADDR) 1) << TARGET_PTR_BIT) - 1);

      if (TARGET_ADDR_BIT > TARGET_PTR_BIT)
	{
	  if (sizeof (CORE_ADDR) * 8 > TARGET_ADDR_BIT)
	    space_mask = ~ptr_mask & ((((CORE_ADDR) 1) << TARGET_ADDR_BIT) - 1);
	  else
	    space_mask = ~ptr_mask;
	}
      else
	space_mask = 0;

    }
  else
    {
      space_mask = 0;
      ptr_mask = ~(CORE_ADDR) 0;
    }

  *space_maskp = space_mask;
  *ptr_maskp = ptr_mask;
}

/* Add two CORE_ADDR values together and return the sum.  If N is
   the target's pointer size in bits (TARGET_PTR_BIT), the sum is
   obtained by adding the two N-bit addends together and discarding
   any overflow beyond N bits.  If the current target is an
   architecture where the number of address bits, M, is greater than
   N, the "address space" information carried in bits N through M-1
   is preserved in the sum.  An internal error will be generated when
   attempting to compute a sum of two addresses with conflicting
   address space identifiers.

   Two address space identifiers conflict if they're both non-zero and
   unequal.  What this means in practice is that it's okay to add two
   offsets together, an address and an offset, or two addresses in the
   same address space.  */

CORE_ADDR
core_addr_add (CORE_ADDR addr1, CORE_ADDR addr2)
{
  CORE_ADDR space1, space2;	/* address space identifiers */
  CORE_ADDR space_mask;		/* mask to get at address space identifier
			           bits */
  CORE_ADDR ptr_mask;		/* mask used to keep pointer portion */
  CORE_ADDR sum;

  core_addr_get_masks (&space_mask, &ptr_mask);

  space1 = addr1 & space_mask;
  space2 = addr2 & space_mask;

  if (space1 != 0 && space2 != 0 && space1 != space2)
    internal_error (__FILE__, __LINE__,
		    "core_addr_add: conflicting space identifiers");

  /* Add the two addresses together; mask off the overflow and then
     put the address space identifiers back in.  */
  sum = ((addr1 + addr2) & ptr_mask) | (space1 | space2);
  return sum;
}