MIPS sign extension of addresses


* MIPS sign extension of addresses
@ 2002-09-10  8:49 Fred Fish
  2002-09-10  9:25 ` Paul Koning
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Fred Fish @ 2002-09-10  8:49 UTC (permalink / raw)
  To: binutils; +Cc: gdb, fnf

I'm currently working on a mipsisa32-elf based toolchain and was
concerned about the number of failures in the gdb testsuite.

For example, using the latest development versions of gcc, gdb,
binutils, and newlib (latest as of a week ago anyway), the standard
mipsisa32-elf configured toolchain gets the following results just for
"mips-sim-idt32/-msoft-float", excluding the gdb.mi, gdb.gdbtk, and
gdb.threads tests:

  # of expected passes            5689
  # of unexpected failures        281
  # of expected failures          29
  # of unresolved testcases       41
  # of untested testcases         3
  # of unsupported tests          15

After doing some work on gdb and the testsuite over the weekend, my
results for the mips-elf toolchain are:

  # of expected passes            6126
  # of unexpected failures        91
  # of expected failures          39
  # of unresolved testcases       28
  # of untested testcases         3
  # of unsupported tests          6
  # of unexpected successes       1

Most of the problems I fixed had to do with the fact that BFD takes
the 32 bit unsigned addresses from object and executable files, sign
extends them, and then stores the result as a bfd_vma, which is an
unsigned 64 bit type (unsigned long long).  For example, the unsigned
32 bit address 0x80020004 becomes an unsigned 64 bit bfd_vma/CORE_ADDR
of 0xffffffff80020004.  The bfd_vma type is used to define gdb's
CORE_ADDR types.
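
A minimal sketch of the sign extension described above, in C.  The
type name vma64 is a hypothetical stand-in for a 64 bit bfd_vma and
sign_extend_32 is an illustrative helper, not a BFD function.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64 bit bfd_vma; not the real BFD typedef.  */
typedef uint64_t vma64;

/* Sign extend a 32 bit address into an unsigned 64 bit value.
   Flipping bit 31 and then subtracting 2^31 copies bit 31 into
   bits 32..63 using only unsigned arithmetic.  */
static vma64
sign_extend_32 (uint32_t addr)
{
  vma64 v = addr;
  return (v ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}
```

With this helper, 0x80020004 becomes 0xffffffff80020004, while an
address below 0x80000000 such as 0x00400000 is returned unchanged.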

I suppose this might make more sense to me if I either had more
historical knowledge about why it does this, or if bfd_vma/CORE_ADDR
were signed types.

The trigger for this behavior in BFD is the field called
sign_extend_vma in the elf backend data, which apparently is only set
by elf32-mips.c, elf32-sh64.c, and elfn32-mips.c.  This special
treatment for mips seems to also be why we need the regcache to
support a read_signed_register call that is only used in mips-tdep.c.

There are also some ADDR_BITS_REMOVE macros used in mips-tdep.c that
one might think would be there to strip the high bits but really don't
since that macro invokes mips_mask_address_p() which tests
mask_address_var which defaults to AUTO_BOOLEAN_AUTO which uses
MIPS_DEFAULT_MASK_ADDRESS_P which checks the multiarch value of
default_mask_address_p which seems to be set to zero everywhere.

Is all of this just internal gdb handwaving that can be changed/fixed
or are there external reasons for all this sign extending followed by
selective discarding/ignoring of the extended bits?  Are there files
that will break or hardware that will misbehave if BFD stops doing
this sign extension?

After getting a little feedback from some private email exchanges
containing substantially the same info as above, I've modified my
mental picture of this process to think of it as a simple address
translation scheme.  I.E. when running a 32 bit binary in a 64 bit
address space, effectively the 32 bit address space is split in half,
with the lower half (0x00000000-0x7FFFFFFF) mapped to the bottom of
the 64 bit space and the upper half (0x80000000-0xFFFFFFFF) mapped to
the top of the 64 bit space.
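
The split can be sketched as a pair of range checks.  This is only an
illustration of the mental picture above; the macro and function names
are hypothetical and do not come from BFD or gdb.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0x00000000-0x7FFFFFFF stays at the bottom of the 64 bit space.  */
#define LOW_HALF_END    UINT64_C (0x000000007fffffff)
/* 0x80000000-0xFFFFFFFF lands at the top of the 64 bit space.  */
#define HIGH_HALF_START UINT64_C (0xffffffff80000000)

/* True if ADDR is the sign-extended image of some 32 bit address.  */
static bool
is_sign_extended_32 (uint64_t addr)
{
  return addr <= LOW_HALF_END || addr >= HIGH_HALF_START;
}

/* Recover the original 32 bit address by dropping the upper bits.  */
static uint32_t
to_32bit (uint64_t addr)
{
  return (uint32_t) (addr & UINT64_C (0xffffffff));
}
```

Under this picture 0xffffffff80020004 is a valid image (of 0x80020004),
while the zero-extended value 0x0000000080020004 falls in the unused
gap between the two halves.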

I suppose this makes sense in a toolchain where you want the same gdb
to be able to handle both 64 and 32 bit mips.  I'm fairly unfamiliar
with all the different possible mips configurations, but perhaps it
would be less confusing if there was one that was strictly 32 bit
internally, much like when configuring and building a native
i686-pc-linux-gnu toolchain uses an "unsigned long int" type for
bfd_vma instead of an "unsigned long long int".

-Fred

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: MIPS sign extension of addresses
  2002-09-10  8:49 MIPS sign extension of addresses Fred Fish
@ 2002-09-10  9:25 ` Paul Koning
  2002-09-10 10:00   ` Maciej W. Rozycki
  2002-09-10  9:57 ` Maciej W. Rozycki
  2002-09-10 12:07 ` Andrew Cagney
  2 siblings, 1 reply; 10+ messages in thread
From: Paul Koning @ 2002-09-10  9:25 UTC (permalink / raw)
  To: fnf; +Cc: binutils, gdb

>>>>> "Fred" == Fred Fish <fnf@intrinsity.com> writes:

 Fred> I'm currently working on a mipsisa32-elf based toolchain and
 Fred> was concerned about the number of failures in the gdb
 Fred> testsuite.

 Fred> ...Most of the problems I fixed had to do with the fact that BFD
 Fred> takes the 32 bit unsigned addresses from object and executable
 Fred> files, sign extends them, and then stores the result as a
 Fred> bfd_vma, which is an unsigned 64 bit type (unsigned long long).
 Fred> For example, the unsigned 32 bit address 0x80020004 becomes an
 Fred> unsigned 64 bit bfd_vma/CORE_ADDR of 0xffffffff80020004.  The
 Fred> bfd_vma type is used to define gdb's CORE_ADDR types.

 Fred> ...After getting a little feedback from some private email
 Fred> exchanges containing substantially the same info as above, I've
 Fred> modified my mental picture of this process to think of it as a
 Fred> simple address translation scheme.  I.E. when running a 32 bit
 Fred> binary in a 64 bit address space, effectively the 32 bit
 Fred> address space is split in half, with the lower half
 Fred> (0x00000000-0x7FFFFFFF) mapped to the bottom of the 64 bit
 Fred> space and the upper half (0x80000000-0xFFFFFFFF) mapped to the
 Fred> top of the 64 bit space.

That sounds right.  I looked in the MIPS Inc. reference (MIPS64
architecture, part 3, privileged architecture,
MD00091-2B-MIPS64PRA-AFP-00.95.pdf) and it shows exactly the picture
you describe.  For example, kseg0 starts at 0xffffffff80000000 in 64
bit addressing.

   paul


* Re: MIPS sign extension of addresses
  2002-09-10  9:25 ` Paul Koning
@ 2002-09-10 10:00   ` Maciej W. Rozycki
  0 siblings, 0 replies; 10+ messages in thread
From: Maciej W. Rozycki @ 2002-09-10 10:00 UTC (permalink / raw)
  To: Paul Koning; +Cc: fnf, binutils, gdb

On Tue, 10 Sep 2002, Paul Koning wrote:

> That sounds right.  I looked in the MIPS Inc. reference (MIPS64
> architecture, part 3, privileged architecture,
> MD00091-2B-MIPS64PRA-AFP-00.95.pdf) and it shows exactly the picture
> you describe.  For example, kseg0 starts at 0xffffffff80000000 in 64
> bit addressing.

 The definition of the MIPS address space much predates the MIPS64 ISA --
see the R4000 (the first 64-bit MIPS processor released) manual at the
MIPS site for a reference.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +



* Re: MIPS sign extension of addresses
  2002-09-10  8:49 MIPS sign extension of addresses Fred Fish
  2002-09-10  9:25 ` Paul Koning
@ 2002-09-10  9:57 ` Maciej W. Rozycki
  2002-09-12  7:49   ` Fred Fish
  2002-09-10 12:07 ` Andrew Cagney
  2 siblings, 1 reply; 10+ messages in thread
From: Maciej W. Rozycki @ 2002-09-10  9:57 UTC (permalink / raw)
  To: Fred Fish; +Cc: binutils, gdb

On Tue, 10 Sep 2002, Fred Fish wrote:

> Most of the problems I fixed had to do with the fact that BFD takes
> the 32 bit unsigned addresses from object and executable files, sign
> extends them, and then stores the result as a bfd_vma, which is an
> unsigned 64 bit type (unsigned long long).  For example, the unsigned
> 32 bit address 0x80020004 becomes an unsigned 64 bit bfd_vma/CORE_ADDR
> of 0xffffffff80020004.  The bfd_vma type is used to define gdb's
> CORE_ADDR types.

 Well, that seems to be the reason for the trouble -- for MIPS, addresses
in object and executable files should be treated as signed and bfd_vma
should be a signed type since that's how MIPS works. 

 The two addresses you quote are indeed equivalent. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +



* Re: MIPS sign extension of addresses
  2002-09-10  9:57 ` Maciej W. Rozycki
@ 2002-09-12  7:49   ` Fred Fish
  2002-09-12  8:19     ` Andrew Cagney
  0 siblings, 1 reply; 10+ messages in thread
From: Fred Fish @ 2002-09-12  7:49 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Fred Fish, binutils, gdb

>  Well, that seems the reason of the trouble -- for MIPS addresses in
> object and executable files should be treated as signed and bfd_vma should
> be a signed type since that's how MIPS works. 

So does that mean that it would be more desirable if the MIPS ports used
signed long long for bfd_vma/CORE_ADDR instead of unsigned long long?

I'm willing to work on making that happen if that is the consensus for
making MIPS support more consistent with how the hardware works.

I've not yet checked, but are there fundamental reasons why bfd_vma
or CORE_ADDR have to be unsigned?

-Fred


* Re: MIPS sign extension of addresses
  2002-09-12  7:49   ` Fred Fish
@ 2002-09-12  8:19     ` Andrew Cagney
  2002-09-12  8:30       ` Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cagney @ 2002-09-12  8:19 UTC (permalink / raw)
  To: Fred Fish; +Cc: Maciej W. Rozycki, binutils, gdb

>>  Well, that seems the reason of the trouble -- for MIPS addresses in
>> object and executable files should be treated as signed and bfd_vma should
>> be a signed type since that's how MIPS works. 
> 
> 
> So does that mean that it would be more desirable if the MIPS ports used
> signed long long for bfd_vma/CORE_ADDR instead of unsigned long long?
> 
> I'm willing to work on making that happen if that is the consensus for
> making MIPS support more consistent with how the hardware works.
> 
> I've not yet checked, but are there fundamental reasons why bfd_vma
> or CORE_ADDR have to be unsigned?

I don't think it will help.  I think it will also hinder the situation 
where BFD/GDB are supporting multiple architectures - one signed and one 
unsigned.

``signed long long'' and ``unsigned long long'' would only show their 
true colours if there was a ``signed/unsigned long long long'' type.

Anyway, several problems occur:

- code extracts a small value and gets its sign extension wrong
Here, I think all we can do is continue tracking down cases.

- code does boundary arithmetic (0x7fffffff+1 == 0xffffffff80000000)
Here, it happens so rarely that it isn't worth worrying about. 
Eventually someone will implement a CORE_ADDR object and that will 
really fix the problem.

- code doesn't use CORE_ADDR and [U]LONGEST correctly (occurs when 
sizeof (CORE_ADDR) < sizeof ([U]LONGEST), e.g. i386).
A CORE_ADDR object would fix this too.  However, being more careful 
wouldn't hurt.
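
The boundary arithmetic case can be sketched in C as follows.  This is
only an illustration under the assumption of a 64 bit unsigned address
type; wrap_to_32, add_plain and add_wrapped_32 are hypothetical names,
not gdb functions.

```c
#include <stdint.h>

/* Re-normalize a 64 bit value to the sign-extended image of its
   low 32 bits.  */
static uint64_t
wrap_to_32 (uint64_t v)
{
  v &= UINT64_C (0xffffffff);
  return (v ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}

/* Plain 64 bit addition: crossing 0x7fffffff leaves the valid range.  */
static uint64_t
add_plain (uint64_t addr, uint64_t offset)
{
  return addr + offset;
}

/* Addition followed by re-normalization, so the result is again a
   sign-extended 32 bit address.  */
static uint64_t
add_wrapped_32 (uint64_t addr, uint64_t offset)
{
  return wrap_to_32 (addr + offset);
}
```

Here add_plain (0x7fffffff, 1) gives 0x0000000080000000, whereas
add_wrapped_32 (0x7fffffff, 1) gives 0xffffffff80000000.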

Andrew




* Re: MIPS sign extension of addresses
  2002-09-12  8:19     ` Andrew Cagney
@ 2002-09-12  8:30       ` Daniel Jacobowitz
  2002-09-13  0:36         ` Maciej W. Rozycki
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2002-09-12  8:30 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Fred Fish, Maciej W. Rozycki, binutils, gdb

On Thu, Sep 12, 2002 at 11:19:42AM -0400, Andrew Cagney wrote:
> >> Well, that seems the reason of the trouble -- for MIPS addresses in
> >>object and executable files should be treated as signed and bfd_vma should
> >>be a signed type since that's how MIPS works. 
> >
> >
> >So does that mean that it would be more desirable if the MIPS ports used
> >signed long long for bfd_vma/CORE_ADDR instead of unsigned long long?
> >
> >I'm willing to work on making that happen if that is the consensus for
> >making MIPS support more consistent with how the hardware works.
> >
> >I've not yet checked, but are there fundamental reasons why bfd_vma
> >or CORE_ADDR have to be unsigned?
> 
> I don't think it will help.  I think it will also hinder the situation 
> where BFD/GDB are supporting multiple architectures - one signed and one 
> unsigned.

Oh, Andrew's right.  Signed CORE_ADDR isn't viable because other
architectures have and assume an unsigned address space.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer



* Re: MIPS sign extension of addresses
  2002-09-12  8:30       ` Daniel Jacobowitz
@ 2002-09-13  0:36         ` Maciej W. Rozycki
  2002-09-13 11:26           ` Andrew Cagney
  0 siblings, 1 reply; 10+ messages in thread
From: Maciej W. Rozycki @ 2002-09-13  0:36 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Andrew Cagney, Fred Fish, binutils, gdb

On Thu, 12 Sep 2002, Daniel Jacobowitz wrote:

> > >I've not yet checked, but are there fundamental reasons why bfd_vma
> > >or CORE_ADDR have to be unsigned?
> > 
> > I don't think it will help.  I think it will also hinder the situation 
> > where BFD/GDB are supporting multiple architectures - one signed and one 
> > unsigned.
> 
> Oh, Andrew's right.  Signed CORE_ADDR isn't viable because other
> architectures have and assume an unsigned address space.

 Because MIPS is a minority?  Well, I can see some reason here, but care
has to be taken not to lose a sign with casts when operating on MIPS
addresses.  Fortunately the positive and the negative ranges of addresses
are distinct on MIPS so you won't find normal code or data crossing a
boundary (that might lead to wrong results of compare operations).
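
A small C sketch of the comparison issue, assuming addresses are held
in an unsigned 64 bit type.  All names here are hypothetical and only
illustrate the point; none of them are gdb or BFD functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ordering as an unsigned type sees it.  */
static bool
less_unsigned (uint64_t a, uint64_t b)
{
  return a < b;
}

/* Ordering as a signed type would see it.  Flipping the top bit maps
   signed order onto unsigned order without any signed conversion.  */
static bool
less_signed (uint64_t a, uint64_t b)
{
  const uint64_t sign = UINT64_C (1) << 63;
  return (a ^ sign) < (b ^ sign);
}

/* Range test that gives the same answer under either view, as long
   as the block [start, start + len) does not cross a boundary.  */
static bool
in_range (uint64_t start, uint64_t len, uint64_t addr)
{
  return addr - start < len;
}
```

For 0x7fffffff and 0xffffffff80000000 the two orderings disagree, but a
block that lies entirely in one half, such as 0x100 bytes starting at
0xffffffff80020000, is tested the same way by in_range in both views.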

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +



* Re: MIPS sign extension of addresses
  2002-09-13  0:36         ` Maciej W. Rozycki
@ 2002-09-13 11:26           ` Andrew Cagney
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cagney @ 2002-09-13 11:26 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Daniel Jacobowitz, Fred Fish, binutils, gdb

> On Thu, 12 Sep 2002, Daniel Jacobowitz wrote:
> 
> 
>> > >I've not yet checked, but are there fundamental reasons why bfd_vma
>> > >or CORE_ADDR have to be unsigned?
> 
>> > 
>> > I don't think it will help.  I think it will also hinder the situation 
>> > where BFD/GDB are supporting multiple architectures - one signed and one 
>> > unsigned.
> 
>> 
>> Oh, Andrew's right.  Signed CORE_ADDR isn't viable because other
>> architectures have and assume an unsigned address space.
> 
> 
>  Because MIPS is a minority?

Actually no.  GDB supports pure harvard architectures (non-unified 
instruction and data spaces).  Such targets have boundary conditions 
(where sub address spaces should modulo wrap) that make the MIPS case 
look trivial :-)

As with the MIPS, and as you note, it turns out that these boundary 
cases are sufficiently rare to not need an urgent fix.

enjoy,
Andrew




* Re: MIPS sign extension of addresses
  2002-09-10  8:49 MIPS sign extension of addresses Fred Fish
  2002-09-10  9:25 ` Paul Koning
  2002-09-10  9:57 ` Maciej W. Rozycki
@ 2002-09-10 12:07 ` Andrew Cagney
  2 siblings, 0 replies; 10+ messages in thread
From: Andrew Cagney @ 2002-09-10 12:07 UTC (permalink / raw)
  To: fnf; +Cc: binutils, gdb

> Most of the problems I fixed had to do with the fact that BFD takes
> the 32 bit unsigned addresses from object and executable files, sign
> extends them, and then stores the result as a bfd_vma, which is an
> unsigned 64 bit type (unsigned long long).  For example, the unsigned
> 32 bit address 0x80020004 becomes an unsigned 64 bit bfd_vma/CORE_ADDR
> of 0xffffffff80020004.  The bfd_vma type is used to define gdb's
> CORE_ADDR types.

For MIPS, any 32 bit address is signed and should be sign extended.  BFD 
and GDB should both mimic this behaviour.

If, for some reason, you encounter a CORE_ADDR that looks like it wasn't 
sign extended, then failing to sign extend the value will be the most 
likely source of the bug.

> I suppose this might make more sense to me if I either had more
> historical knowledge about why it does this, or if bfd_vma/CORE_ADDR
> were signed types.
> 
> The trigger for this behavior in BFD is the field called
> sign_extend_vma in the elf backend data, which apparently is only set
> by elf32-mips.c, elf32-sh64.c, and elfn32-mips.c.  This special
> treatment for mips seems to also be why we need the regcache to
> support a read_signed_register call that is only used in mips-tdep.c.
> 
> There are also some ADDR_BITS_REMOVE macros used in mips-tdep.c that
> one might think would be there to strip the high bits but really don't
> since that macro invokes mips_mask_address_p() which tests
> mask_address_var which defaults to AUTO_BOOLEAN_AUTO which uses
> MIPS_DEFAULT_MASK_ADDRESS_P which checks the multiarch value of
> default_mask_address_p which seems to be set to zero everywhere.
> 
> Is all of this just internal gdb handwaving that can be changed/fixed
> or are there external reasons for all this sign extending followed by
> selective discarding/ignoring of the extended bits?  Are there files
> that will break or hardware that will misbehave if BFD stops doing
> this sign extension?

In the ``bad old days'', GDB/BFD tried to pretend that MIPS addresses 
were not signed and did not bother to sign extend them.  As a 
consequence, many things simply didn't work.  For instance, given an o32 
executable being run on a 64 bit target, the register would contain 
0xffffffff80020004 yet the symbol table would contain 0x0000000080020004. 
  Most of the ``gdb handwaving'' you see came about because, instead of 
fixing a GDB/BFD design flaw (and sign-extending the address), people 
kept adding yet one more wafer thin fix....  To make matters worse, GDB 
et al. were also trying to debug 64 bit ABIs (o64) while using 32 bit 
debug info.

Anyway, internally the addresses are now always sign extended.  All 
external addresses are converted to/from sign-extended values before 
they get to GDB's core: when displaying, the address is trimmed back to 
TARGET_ADDR_BIT; on the target side (remote.c), where protocol 
restrictions may require the address to be trimmed; and when converting 
between pointers (void*) and addresses (CORE_ADDR).
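
The display-side trimming can be sketched like this.  It is a rough
illustration only: trim_to_addr_bits and format_addr are hypothetical
helpers, and the bit count is passed as a plain parameter rather than
taken from TARGET_ADDR_BIT.

```c
#include <stdint.h>
#include <stdio.h>

/* Keep only the low BITS bits of a sign-extended address.  */
static uint64_t
trim_to_addr_bits (uint64_t addr, unsigned bits)
{
  if (bits >= 64)
    return addr;                /* Avoid an undefined 64 bit shift.  */
  return addr & ((UINT64_C (1) << bits) - 1);
}

/* Format ADDR for display, padded to the width implied by BITS.
   Returns what snprintf returns.  */
static int
format_addr (char *buf, size_t len, uint64_t addr, unsigned bits)
{
  return snprintf (buf, len, "0x%0*llx", (int) ((bits + 3) / 4),
                   (unsigned long long) trim_to_addr_bits (addr, bits));
}
```

With bits set to 32, the internal value 0xffffffff80020004 is shown as
0x80020004, while the untrimmed value stays in use internally.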

As a consequence, a single GDB executable can, on IRIX 6.5, debug -64, 
-n32 and -o32 binaries!

> After getting a little feedback from some private email exchanges
> containing substantially the same info as above, I've modified my
> mental picture of this process to think of it as a simple address
> translation scheme.  I.E. when running a 32 bit binary in a 64 bit
> address space, effectively the 32 bit address space is split in half,
> with the lower half (0x00000000-0x7FFFFFFF) mapped to the bottom of
> the 64 bit space and the upper half (0x80000000-0xFFFFFFFF) mapped to
> the top of the 64 bit space.

Yes.  That is a good way of describing it.

> I suppose this makes sense in a toolchain where you want the same gdb
> to be able to handle both 64 and 32 bit mips.  I'm fairly unfamiliar
> with all the different possible mips configurations, but perhaps it
> would be less confusing if there was one that was strictly 32 bit
> internally, much like when configuring and building a native
> i686-pc-linux-gnu toolchain uses an "unsigned long int" type for
> bfd_vma instead of an "unsigned long long int".

Both bfd_vma and CORE_ADDR are at least as large as TARGET_ADDR_BIT. 
That is the assertion:
	sizeof (CORE_ADDR) * HOST_CHAR_BIT >= TARGET_ADDR_BIT
holds.  Even for i386, an assumption such as sizeof (CORE_ADDR) * 
HOST_CHAR_BIT == 32 is invalid.
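
The assertion can be written as a compile-time check.  This is a
sketch with stand-in names: sketch_core_addr and
SKETCH_TARGET_ADDR_BIT are hypothetical and are not the definitions
gdb uses, and CHAR_BIT from <limits.h> plays the role of
HOST_CHAR_BIT.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for CORE_ADDR and TARGET_ADDR_BIT.  */
typedef uint64_t sketch_core_addr;
#define SKETCH_TARGET_ADDR_BIT 32

/* The address type must be at least as wide as a target address.
   Note sizeof counts bytes, hence the multiplication by CHAR_BIT.  */
_Static_assert (sizeof (sketch_core_addr) * CHAR_BIT
                >= SKETCH_TARGET_ADDR_BIT,
                "address type narrower than target addresses");

/* Width of the address type in bits, for run-time checks.  */
static unsigned
core_addr_bits (void)
{
  return (unsigned) (sizeof (sketch_core_addr) * CHAR_BIT);
}
```

The check only requires "at least as wide", so code that works with a
32 bit target must still cope with a 64 bit address type.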

enjoy,
Andrew


end of thread, other threads:[~2002-09-13 18:26 UTC | newest]
