packing/unpacking 4-octet longs

Mirror of the gdb mailing list

* packing/unpacking 4-octet longs
@ 2001-12-04  5:44 Timothy Wall
  2001-12-05  8:26 ` Andrew Cagney
  0 siblings, 1 reply; 10+ messages in thread
From: Timothy Wall @ 2001-12-04  5:44 UTC (permalink / raw)
  To: gdb

I'm working on a DSP port (yes, I'm the one tossing the OCTETS_PER_BYTE
strangeness about) and TI has introduced yet another Really Strange Thing.

Longs on this chip are packed little-endian.  No wait, they're packed
big-endian.  No wait, it's both!

Basically, words are packed little-endian w/r/t octets, but the two words are
big-endian, e.g.

0xaabbccdd is stored as 0xbb 0xaa 0xdd 0xcc.  This I can account for in the
long/float/double packing/unpacking routines (though I'm not sure how I'm
going to make a clean patch yet...).  The problem is that the word order
switches if the value is accessed at an odd address (apparently this made
sense to the hardware designers).  The MS word of a long is always the first
one accessed (they must have forgotten that the chip was little-endian).
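
A minimal sketch of the aligned case described above, assuming the four
octets are already in a buffer in storage order.  The function names are
hypothetical and are not taken from GDB; the odd-address case is not handled
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack a 4-octet long stored as two 16-bit words: most significant
   word first, each word little-endian with respect to its octets.
   Hypothetical helper, aligned case only.  */
static uint32_t
unpack_mixed_long (const uint8_t *buf)
{
  assert (buf != NULL);
  uint32_t msw = (uint32_t) buf[0] | ((uint32_t) buf[1] << 8);
  uint32_t lsw = (uint32_t) buf[2] | ((uint32_t) buf[3] << 8);
  return (msw << 16) | lsw;
}

/* Inverse of unpack_mixed_long: store VAL into BUF in the same layout.  */
static void
pack_mixed_long (uint8_t *buf, uint32_t val)
{
  assert (buf != NULL);
  buf[0] = (val >> 16) & 0xff;   /* low octet of the MS word */
  buf[1] = (val >> 24) & 0xff;   /* high octet of the MS word */
  buf[2] = val & 0xff;           /* low octet of the LS word */
  buf[3] = (val >> 8) & 0xff;    /* high octet of the LS word */
}
```

With the buffer 0xbb 0xaa 0xdd 0xcc, unpack_mixed_long yields 0xaabbccdd,
matching the example in the message.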

Any ideas of how to make this happen?  I remember doing a fix for something
similar way back, but looking at the code now, by the time the data is
packed/unpacked, the target address is long gone.  I doubt I can reliably do
swapping every time I access a 4-octet block of target data.

T.


* Re: packing/unpacking 4-octet longs
  2001-12-04  5:44 packing/unpacking 4-octet longs Timothy Wall
@ 2001-12-05  8:26 ` Andrew Cagney
  2001-12-05  8:46   ` Richard Earnshaw
  2001-12-05  9:31   ` Timothy Wall
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Cagney @ 2001-12-05  8:26 UTC (permalink / raw)
  To: twall; +Cc: gdb

The short unhelpful answer is: yes, you have a problem.

As you note, GDB very much assumes that bytes laid out in the ``struct 
value *'' are identical to those that appear in memory.

> I'm working on a DSP port (yes, I'm the one tossing the OCTETS_PER_BYTE
> strangeness about) and TI has introduced yet another Really Strange Thing.
> 
> Longs on this chip are packed little-endian.  No wait, they're packed
> big-endian.  No wait, it's both!
> 
> Basically, words are packed little-endian w/r/t octets, but the two words are
> big-endian, e.g.
> 
> 0xaabbccdd is stored as 0xbb 0xaa 0xdd 0xcc.  This I can account for in the
> long/float/double packing/unpacking routines (though I'm not sure how I'm
> going to make a clean patch yet...).  The problem is that the word order
> switches if the value is accessed at an odd address (apparently this made
> sense to the hardware designers).  The MS word of a long is always the first
> one accessed (they must have forgotten that the chip was little-endian).

Mutter something about hardware engineers taking short cuts :-)  I'm 
pretty sure that there is another very old wacko architecture that did 
something similar to this, I'm trying to remember which.  pdp11? 
original m68k?  This problem also shows up on hardware that implemented 
big/little endian using xor bits - Arm?

I think gdb has avoided this issue by documenting it as a feature :-/

Regarding the easy case, the item on the things-to-do-today list is to 
add byte order information to the integer version of ``struct type''. 
That way, just like for the floating-point types, the type and not the 
target determines how the bytes are unpacked.
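
A rough sketch of that idea, assuming a simplified integer type descriptor
that carries its own byte order.  The struct, enum, and function names are
invented for illustration and do not reflect GDB's actual ``struct type''.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-order tags; the third is the layout from this thread.  */
enum sketch_byte_order
{
  SKETCH_BIG,
  SKETCH_LITTLE,
  SKETCH_MIXED_MSW_FIRST
};

/* Hypothetical, much-reduced stand-in for an integer type descriptor.  */
struct sketch_int_type
{
  size_t length;                  /* size in octets, 1..4 in this sketch */
  enum sketch_byte_order order;   /* how the octets are laid out */
};

/* Unpack BUF according to the order recorded in TYPE, not the target.  */
static uint32_t
sketch_unpack (const struct sketch_int_type *type, const uint8_t *buf)
{
  uint32_t val = 0;
  size_t i;

  assert (type != NULL && buf != NULL);
  assert (type->length >= 1 && type->length <= 4);

  switch (type->order)
    {
    case SKETCH_BIG:
      for (i = 0; i < type->length; i++)
        val = (val << 8) | buf[i];
      break;
    case SKETCH_LITTLE:
      for (i = type->length; i > 0; i--)
        val = (val << 8) | buf[i - 1];
      break;
    case SKETCH_MIXED_MSW_FIRST:
      assert (type->length == 4);
      val = ((uint32_t) buf[1] << 24) | ((uint32_t) buf[0] << 16)
            | ((uint32_t) buf[3] << 8) | (uint32_t) buf[2];
      break;
    }
  return val;
}
```

The point of the design is that the caller never consults a global target
byte order; two types of the same length can unpack the same octets
differently.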

Regarding the hard case, when you say that the byte order switches at an 
odd address, is this an odd word address or an odd byte address?

> Any ideas of how to make this happen?  I remember doing a fix for something
> similar way back, but looking at the code now, by the time the data is
> packed/unpacked, the target address is long gone.  I doubt I can reliably do
> swapping every time I access a 4-octet block of target data.
> 
> 


* Re: packing/unpacking 4-octet longs
  2001-12-05  8:26 ` Andrew Cagney
@ 2001-12-05  8:46   ` Richard Earnshaw
  2001-12-05 10:11     ` Lars Brinkhoff
  2001-12-05 10:42     ` Andrew Cagney
  2001-12-05  9:31   ` Timothy Wall
  1 sibling, 2 replies; 10+ messages in thread
From: Richard Earnshaw @ 2001-12-05  8:46 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: twall, gdb, Richard.Earnshaw

> > 0xaabbccdd is stored as 0xbb 0xaa 0xdd 0xcc.  This I can account for in the
> > long/float/double packing/unpacking routines (though I'm not sure how I'm
> > going to make a clean patch yet...).  The problem is that the word order
> > switches if the value is accessed at an odd address (apparently this made
> > sense to the hardware designers).  The MS word of a long is always the first
> > one accessed (they must have forgotten that the chip was little-endian).
> 
> Mutter something about hardware engineers taking short cuts :-) 

As opposed to software engineers making invalid assumptions about the 
world they operate in :-)  In particular, by ignoring bits of the 
standards that are inconveniently difficult to deal with...

> I'm 
> pretty sure that there is another very old wacko architecture that did 
> something similar to this, I'm trying to remember which.  pdp11? 
> original m68k?  

PDP11, I think.  It's a bit before my time, but IIRC it's because earlier 
PDPs (8? 9?) were 16-bit machines, so there wasn't really any concept of 
word-ordering beyond that: 16-bit words were little-endian, but the most 
significant word was always at the lowest address (or the other way 
around).  When the PDP-11 came out, with 32-bit operations, all this 
caused carnage.  I'm told that it was this that convinced Digital that all 
the world would in future be little-endian (and hence, I believe, why MIPS 
processors grew a little-endian mode).

> This problem also shows up on hardware that implemented 
> big/little endian using xor bits - Arm?

I'm not aware of this affecting the ARM (except in that FPA format doubles 
and long doubles always have the word with the exponent at the lowest 
address, but there's nothing in the IEEE FP specs that says this is 
invalid).  In particular, storing a word, or multi-word, at an unaligned 
address does not change the order of bytes in memory, so
	memcpy(unaligned_address, aligned_address, sizeof(some_word))
does not require diddling with the internal order (or have I misunderstood 
the problem?)

R.



* Re: packing/unpacking 4-octet longs
  2001-12-05  8:26 ` Andrew Cagney
  2001-12-05  8:46   ` Richard Earnshaw
@ 2001-12-05  9:31   ` Timothy Wall
  2001-12-05 10:03     ` Andrew Cagney
  1 sibling, 1 reply; 10+ messages in thread
From: Timothy Wall @ 2001-12-05  9:31 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: gdb

Andrew Cagney wrote:

>
> Regarding the hard case, when you say that the byte order switches at an
> odd address.  Is this an odd word address or an odd byte address?
>

In this case the word address is the same as the byte address.  One byte = 16
bits, i.e. octets are not addressable.

So if you read a long (two words) from address 0x1000, the DSP grabs the MSW
from 0x1000, and the LSW from 0x1001.  If you read a long from address 0x2001,
the DSP grabs the MSW from 0x2001, and the LSW from 0x2000.

So in addition to ensuring that the data read by GDB for a long is aligned, I
need to swap words if the original address wasn't aligned.
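
A small sketch of that access rule, assuming target memory is modelled as an
array of 16-bit addressable units.  The function names and the memory model
are illustrative assumptions, not part of GDB or of the DSP's documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a long from a memory modelled as 16-bit addressable units.
   The MSW comes from ADDR itself; the LSW comes from ADDR with its
   low bit toggled, so the pair always sits on an even boundary.  */
static uint32_t
read_target_long (const uint16_t *mem, size_t mem_units, uint32_t addr)
{
  assert (mem != NULL);
  assert ((addr | 1u) < mem_units);   /* both units must be in range */
  uint32_t msw = mem[addr];
  uint32_t lsw = mem[addr ^ 1u];
  return (msw << 16) | lsw;
}

/* Debugger-side variant: PAIR holds the two units fetched from the
   aligned (even) address, in address order.  The halves are swapped
   when the original request was for an odd address.  */
static uint32_t
long_from_aligned_pair (const uint16_t pair[2], uint32_t addr)
{
  assert (pair != NULL);
  uint32_t first = pair[addr & 1u];
  uint32_t second = pair[(addr & 1u) ^ 1u];
  return (first << 16) | second;
}
```

For the addresses in the message, a read at 0x1000 pairs 0x1000 with 0x1001,
and a read at 0x2001 pairs 0x2001 with 0x2000; the second helper shows why
the original address has to survive until the value is unpacked.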

T.


* Re: packing/unpacking 4-octet longs
  2001-12-05  9:31   ` Timothy Wall
@ 2001-12-05 10:03     ` Andrew Cagney
  2001-12-05 10:20       ` Timothy Wall
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cagney @ 2001-12-05 10:03 UTC (permalink / raw)
  To: twall; +Cc: gdb

> Andrew Cagney wrote:
> 
> 
>>
>> Regarding the hard case, when you say that the byte order switches at an
>> odd address.  Is this an odd word address or an odd byte address?
>>
> 
> 
> In this case the word address is the same as the byte address.  One byte = 16
> bits, i.e. octets are not addressable.
> 
> So if you read a long (two words) from address 0x1000, the DSP grabs the MSW
> from 0x1000, and the LSW from 0x1001.  If you read a long from address 0x2001,
> the DSP grabs the MSW from 0x2001, and the LSW from 0x2000.
> 
> So in addition to ensuring that the data read by GDB for a long is aligned, I
> need to swap words if the original address wasn't aligned.

(I'm trying to avoid using the word "byte"; it is dangerous :-)

So the architecture is really this hybrid 8bit/16bit thing where low 
bits of the address are used to imply an interpretation of the data.  Ulgh.

Can this be represented as another hybrid integer type - two 16 bit 
words unpacked in a very strange way?

The other possibility (which I'm not so sure about) is, since GDB must read 
in 16-bit chunks, to modify it to mimic the hardware and 8-bit swap things on 
the way in/out.

On this, what will happen with an 8-bit character?  Won't GDB need to 
read/modify/write such a value?

	Andrew


* Re: packing/unpacking 4-octet longs
  2001-12-05  8:46   ` Richard Earnshaw
@ 2001-12-05 10:11     ` Lars Brinkhoff
  2001-12-05 10:42     ` Andrew Cagney
  1 sibling, 0 replies; 10+ messages in thread
From: Lars Brinkhoff @ 2001-12-05 10:11 UTC (permalink / raw)
  To: Richard.Earnshaw; +Cc: Andrew Cagney, twall, gdb

Richard Earnshaw <rearnsha@arm.com> writes:
> > > 0xaabbccdd is stored as 0xbb 0xaa 0xdd 0xcc.
> > I'm pretty sure that there is another very old wacko architecture
> > that did something similar to this, I'm trying to remember which.
> > pdp11?
> PDP11, I think.

Yes.  That ordering is sometimes called "PDP-endian" (incorrectly
in my opinion, since there are many different PDP architectures).

> It's a bit before my time, but IIRC it's because earlier PDPs (8?
> 9?) were 16-bit machines, so there wasn't really any concept of
> word-ordering beyond that: 16-bit words were little-endian, but the
> most significant word was always at the lowest address (or the other
> way around).

Not quite.  All earlier PDP architectures were word-addressable.  PDP-1
and PDP-4/7/9 had 18-bit words, PDP-5/8 had 12-bit words, and PDP-6/10
had 36-bit words.  I don't know if the other machines had multiple-word
operands, but the PDP-6/10 stored the most significant word first.

PDP-11 was the first 16-bit machine, and the first byte-addressable.

-- 
Lars Brinkhoff          http://lars.nocrew.org/     Linux, GCC, PDP-10
Brinkhoff Consulting    http://www.brinkhoff.se/    programming


* Re: packing/unpacking 4-octet longs
  2001-12-05 10:03     ` Andrew Cagney
@ 2001-12-05 10:20       ` Timothy Wall
  2001-12-05 10:49         ` Andrew Cagney
  0 siblings, 1 reply; 10+ messages in thread
From: Timothy Wall @ 2001-12-05 10:20 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: gdb

See comments below.

Andrew Cagney wrote:

> > Andrew Cagney wrote:
> >
> >
> >>
> >> Regarding the hard case, when you say that the byte order switches at an
> >> odd address.  Is this an odd word address or an odd byte address?
> >>
> >
> >
> > In this case the word address is the same as the byte address.  One byte =
> > 16 bits, i.e. octets are not addressable.
> >
> > So if you read a long (two words) from address 0x1000, the DSP grabs the
> > MSW from 0x1000, and the LSW from 0x1001.  If you read a long from address
> > 0x2001, the DSP grabs the MSW from 0x2001, and the LSW from 0x2000.
> >
> > So in addition to ensuring that the data read by GDB for a long is
> > aligned, I need to swap words if the original address wasn't aligned.
>
> (I'm trying to avoid using the word "byte"; it is dangerous :-)

Yeah, octets and the smallest addressable unit (the technical definition of
byte) are not the same thing.

>
>
> So the architecture is really this hybrid 8bit/16bit thing where low
> bits of the address are used to imply an interpretation of the data.  Ulgh.

I doubt they thought that far.  They probably just made the data transfer units
always interpret the first word the same way, and it was easier for the
addressing logic to make the second word go back one address instead of forward
one (just toggle the low bit of the address to get the second word).

>
>
> Can this be represented as another hybrid integer type - two 16 bit
> words unpacked in a very strange way?

That's certainly a possibility, maybe an xlong/xfloat (following TI's assembly notation
for indicating a long/float need not be aligned).  I'll have to look into that...

>
>
> The other (I'm not so sure about) possibility is, since GDB must read in
> 16bit chunks, modify it to mimic the hardware and 8bit swap things on
> the way in/out.

Except that the swapping is only done on long value accesses (double-word access, as it
were).  If you otherwise sequentially read data, there is no swapping.

>
>
> On this, what will happen with an 8bit character.  Won't GDB need to
> read/modify/write such a value?
>
>         Andrew

Ain't got none of those on this target.  Characters are 16 bits.  I'm going to finally
ditch those sections that refuse TARGET_CHAR_BIT != 8.

For the most part, I'm leaving lengths as octet lengths and adjusting them to target
lengths if necessary at the last possible instant.  My remote server/stub will answer
1-octet requests with the LS octet of the character addressed, though this is not a
recommended request, nor are 1-octet write requests.
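
As an illustration of adjusting octet lengths to target lengths, here is a
hedged sketch.  The macro below is a local stand-in for the port's
OCTETS_PER_BYTE, the helper names are hypothetical, and rounding a partial
unit up to a whole one is an assumption rather than something stated in this
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the port's OCTETS_PER_BYTE: one addressable unit on
   this target holds two octets.  */
#define SKETCH_OCTETS_PER_BYTE 2u

/* Convert an octet count into the number of addressable target units
   that must be transferred.  Rounding up is an assumption.  */
static size_t
octets_to_target_units (size_t octets)
{
  assert (octets <= SIZE_MAX - (SKETCH_OCTETS_PER_BYTE - 1u));
  return (octets + SKETCH_OCTETS_PER_BYTE - 1u) / SKETCH_OCTETS_PER_BYTE;
}

/* Convert a target-unit count back into octets.  */
static size_t
target_units_to_octets (size_t units)
{
  assert (units <= SIZE_MAX / SKETCH_OCTETS_PER_BYTE);
  return units * SKETCH_OCTETS_PER_BYTE;
}
```

Under these assumptions a 4-octet long occupies two target units, and a
1-octet request still touches one whole unit.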

Thanks for the tips.

T.


* Re: packing/unpacking 4-octet longs
  2001-12-05  8:46   ` Richard Earnshaw
  2001-12-05 10:11     ` Lars Brinkhoff
@ 2001-12-05 10:42     ` Andrew Cagney
  2001-12-06  2:03       ` Richard Earnshaw
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cagney @ 2001-12-05 10:42 UTC (permalink / raw)
  To: Richard.Earnshaw; +Cc: twall, gdb


> I'm not aware of this affecting the ARM (except in that FPA format doubles 
> and long doubles always have the word with the exponent at the lowest 
> address, but there's nothing in the IEEE FP specs that says this is 
> invalid).  In particular, storing a word, or multi-word, at an unaligned 
> address does not change the order of bytes in memory, so
> 	memcpy(unaligned_address, aligned_address, sizeof(some_word))
> does not require diddling with the internal order (or have I misunderstood 
> the problem?)

That was a useful manual :-)  See 5-21 where it explains that a 
misaligned 32bit access gets rotated before it is stored :-/

enjoy,
Andrew




* Re: packing/unpacking 4-octet longs
  2001-12-05 10:20       ` Timothy Wall
@ 2001-12-05 10:49         ` Andrew Cagney
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cagney @ 2001-12-05 10:49 UTC (permalink / raw)
  To: twall; +Cc: gdb

> For the most part, I'm leaving lengths as octet lengths and adjusting them to target
> lengths if necessary at the last possible instant.  My remote server/stub will answer
> 1-octet requests with the LS octet of the character addressed, though this is not a
> recommended request, nor are 1-octet write requests.

I think doing this would be a bad move.  GDB shouldn't lie about how the 
target represents its data types.

Best example of this I can think of is with the MIPS.  GDB tried to lie 
and pretend that the MIPS didn't sign-extend addresses.

enjoy,
Andrew




* Re: packing/unpacking 4-octet longs
  2001-12-05 10:42     ` Andrew Cagney
@ 2001-12-06  2:03       ` Richard Earnshaw
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Earnshaw @ 2001-12-06  2:03 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Richard.Earnshaw, twall, gdb

> 
> > I'm not aware of this affecting the ARM (except in that FPA format doubles 
> > and long doubles always have the word with the exponent at the lowest 
> > address, but there's nothing in the IEEE FP specs that says this is 
> > invalid).  In particular, storing a word, or multi-word, at an unaligned 
> > address does not change the order of bytes in memory, so
> > 	memcpy(unaligned_address, aligned_address, sizeof(some_word))
> > does not require diddling with the internal order (or have I misunderstood 
> > the problem?)
> 
> That was a useful manual :-)  See 5-21 where it explains that a 
> misaligned 32bit access gets rotated before it is stored :-/

You're talking about the ARM manual I posted?  If so, please read it 
again, more carefully.  Rotation is *never* done before a store, only on a 
load: it's a side effect of the byte-lane steering used for reading bytes; 
and it's also useful for fetching half-words from memory on those 
machines, since the effect of the rotation means that a ldr (32-bit load) 
from a mis-aligned address will always result in the desired bits being 
placed in bits 0:15 (little-endian mode) or bits 16:31 (big-endian mode) 
of the target register.
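
The load-side rotation can be sketched as follows, assuming little-endian
mode and a byte-array model of memory.  The function name is hypothetical,
and the model covers only the older rotate-on-load behaviour described
above, not cores that handle unaligned loads differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model a 32-bit load from ADDR in little-endian mode: fetch the
   aligned word containing ADDR, then rotate it right by eight bits
   for each byte of misalignment.  */
static uint32_t
sketch_rotated_ldr (const uint8_t *mem, uint32_t addr)
{
  assert (mem != NULL);
  uint32_t base = addr & ~3u;
  uint32_t word = (uint32_t) mem[base]
                  | ((uint32_t) mem[base + 1] << 8)
                  | ((uint32_t) mem[base + 2] << 16)
                  | ((uint32_t) mem[base + 3] << 24);
  unsigned rot = 8u * (addr & 3u);
  /* Avoid an undefined shift by 32 when the address is aligned.  */
  return rot ? (word >> rot) | (word << (32u - rot)) : word;
}
```

With the octets 0x11 0x22 0x33 0x44 in memory, a load from offset 2 returns
0x22114433, so bits 0:15 hold 0x4433, the half-word that starts at that
address.  Memory itself is never reordered, which is why the memcpy rule
still holds.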

Anyway, the issue isn't relevant: the ARM ABI never makes use of 
this behaviour for un-aligned (packed) objects, so the memcpy rule I 
described above applies.

R.
