Below is a worked-through example of little-endian data manipulation on the PowerPC.

Background (Programmer Model):

The PowerPC defines memory as being `byte' addressable. Throughout most of this example, I'll depict memory as being one byte wide. On a few occasions, I'll show the mapping between these bytes and the 64 bit data bus of the PowerPC. Hence memory can be viewed as:

    0:
    1:
    2:
    3:
    4:
    . . .

Some simple examples:

Big-endian store of ABCD at address 4. Big endian is where the most significant byte (A) is stored first (addr 4) in memory.

    3: ?
    4: A
    5: B
    6: C
    7: D
    8: ?

Little-endian store of ABCD at address 4. Little endian is where the least significant byte (D) is stored first (addr 4) in memory.

    3: ?
    4: D
    5: C
    6: B
    7: A
    8: ?

Notice here that if a value is stored using one endian and read back using the other, the order of the bytes will be reversed, viz:

    o Write ABCD in big endian at address 4
    o Reading it as little endian at address 4 gives DCBA

What the PowerPC does:

See the PowerPC manual for an introduction. Below is a second way of looking at what it is doing.

Big endian case:

When in big endian mode, the PowerPC will behave as described above. Nothing strange about this.

Little endian case (see table 2-29):

    Effective Address Modifications

    Data Width (bytes)    EA Mod
            8             No Change
            4             XOR with b'100
            2             XOR with b'110
            1             XOR with b'111

    [PowerPC 601 RISC Microprocessor User's Manual. MPC601UM/AD]

I'm going to look at the storing of the value ABCD at address 4. In doing this I'm going to break the store into two separate cases: a) as individual bytes and b) as a 4 byte word. For the machine to be truly byte addressable and little endian, no difference should be detectable between the two cases.

a) Individual bytes

Using table 2-29 (byte access, so XOR with binary 111), I translate the `internal' address for each byte into its corresponding physical address.
(Remember: little endian, so D, the least significant byte, is stored at address 4.)

    Addr  Value  Baddr   Caddr
     4      D    4=100   011=3
     5      C    5=101   010=2
     6      B    6=110   001=1
     7      A    7=111   000=0

Hence each value is stored into memory as:

    0: A
    1: B
    2: C
    3: D
    4: ?

b) As a 4 byte quantity

Again using table 2-29 and the rule `stored in big endian order' we get that:

    4 (= 100) XOR b'100 = 0 (= 000)

and hence again:

    0: A
    1: B
    2: C
    3: D
    4: ?

Notice that in each of these two examples, the data is:

    o Stored in memory in big endian order
    o Stored in different locations (0-3, not 4-7)

Mapping this model onto the 64 bit wide data path of the PowerPC:

The next step in trying to understand this is to look at the mapping of this byte addressable programming model onto the real 64 bit PowerPC data path. Again, to emphasize the mapping between the above model and real hardware, I'll depict the data path vertically.

The PowerPC has a 64 bit data path. This data path can be broken into 8 data lanes, each 8 bits wide.

    Lane   Bits
     0:     0 -  7
     1:     8 - 15
     2:    16 - 23
     3:    24 - 31
     4:    32 - 39
     5:    40 - 47
     6:    48 - 55
     7:    56 - 63

When the PowerPC performs a big endian store to address 4 with data ABCD, it puts onto the address/data busses:

    Address:    4 (in binary 100)
    Data Lane:
        0:
        1:
        2:
        3:
        4: A
        5: B
        6: C
        7: D

For a little endian store (same address and data) it would output (onto the address/data bus):

    Address:    0 (binary 000)
    Data Lane:
        0: A
        1: B
        2: C
        3: D
        4:
        5:
        6:
        7:

Or, for a single byte, consider a store of `*' at address 6. 6 (binary 110) XOR 111 (as it is a byte access) is 1 (001). So for the two cases (big endian on the left, little endian on the right), the data would appear on the byte lanes as shown below:

    Address:    6(110)    1(001)
    Data Lane:
        0:
        1:                  *
        2:
        3:
        4:
        5:
        6:        *
        7:

Impact on PowerPC <-> IO operations:

If the PowerPC is only making transfers between itself and memory, the above translation doesn't matter. If, however, the transfer is between the PowerPC and a real IO device, then there is a serious problem.
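To make the problem concrete, here is a minimal sketch that models which address and byte lanes the processor drives for a store, following table 2-29. The names `EA_MOD` and `bus_cycle` are mine, not from the manual, and the model assumes aligned accesses that stay inside one 8 byte doubleword, with the first byte lane taken from the low three bits of the bus address.

```python
# Hypothetical model of table 2-29; names are illustrative, not from the manual.
EA_MOD = {8: 0b000, 4: 0b100, 2: 0b110, 1: 0b111}  # data width (bytes) -> XOR mask


def bus_cycle(addr, width, little_endian):
    """Return (bus_address, lanes) for an aligned store of `width` bytes.

    `lanes` lists the byte lanes (0-7) carrying the data, most significant
    byte first, following the `stored in big endian order' rule used above.
    """
    if addr % width != 0:
        raise ValueError("this sketch only models aligned accesses")
    # In little endian mode the effective address is XORed with the mask
    # for the access width; in big endian mode it is used unchanged.
    bus_addr = addr ^ EA_MOD[width] if little_endian else addr
    first_lane = bus_addr & 0b111
    return bus_addr, list(range(first_lane, first_lane + width))


print(bus_cycle(4, 4, little_endian=False))  # (4, [4, 5, 6, 7])
print(bus_cycle(4, 4, little_endian=True))   # (0, [0, 1, 2, 3])
print(bus_cycle(6, 1, little_endian=True))   # (1, [1])
```

A device wired to respond at address 6 on lane 6 would never see the last of these cycles.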
Consider, for instance, an IO controller with one register at `address 6'. A programmer would assume that a byte store at address 6 would go to the IO controller. Further, the controller would expect the data to be put onto data lane `6' of the data bus. In big endian mode, this assumption is correct. However, from the above example, it can be seen that in little endian mode this is no longer the case:

    o The address output by the processor would be 1, not 6
    o The data would appear on lane 1, not 6

Following on from this, consider now a 2 byte register at address 6. The register could be accessed either as A@6 and B@7 or as a 2 byte pair. Clearly, using our definition of little endian and big endian stores, a write of AB should produce:

                  Big   Little
    Data lane:
        5:         ?      ?
        6:         A      B
        7:         B      A
        8:         ?      ?

With the PowerPC in little endian mode, however, this doesn't happen. Instead of putting on the bus:

    Address:    6
    Data lane:
        6: B
        7: A

the PowerPC outputs onto its address & data lines:

    Address:    0 (000 = 6(110) XOR 110)
    Data lane:
        0: A
        1: B

Oops!

Impact on DMA operations:

Consider a longer example of a DMA device that is to store the bytes ABCDEF into memory starting at address 6. The endianness of the machine shouldn't matter as we're considering a byte stream. So in each case, the DMA transfer should result in:

    5: ?
    6: A
    7: B
    8: C
    9: D
    a: E
    b: F
    c: ?

However, applying the rules defined by table 2-29, the PowerPC in little endian mode would look for the values at addresses:

    Value  Addr  PPC looks in:
      A     6       1
      B     7       0
      C     8       f
      D     9       e
      E     a       d
      F     b       c

Hence, for a little endian PowerPC to be able to correctly access the bytes brought in via DMA, they would need to appear in memory as:

    0: B
    1: A
    2: ?
    3: ?
    4: ?
    5: ?
    6: ?
    7: ?
    8: ?
    9: ?
    a: ?
    b: ?
    c: F
    d: E
    e: D
    f: C

Yuck!
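The DMA layout above can be reproduced with a short sketch. It assumes, as the example does, that the processor reads the stream one byte at a time in little endian mode, so every address is XORed with b'111. The helper names `ppc_byte_address` and `memory_image_for_ppc` are hypothetical.

```python
def ppc_byte_address(addr):
    """Physical address used for a byte access in little endian mode
    (table 2-29: XOR the effective address with b'111)."""
    return addr ^ 0b111


def memory_image_for_ppc(data, start):
    """Map each byte of a stream to the physical address where a
    little-endian-mode PowerPC would look for it."""
    return {ppc_byte_address(start + i): byte for i, byte in enumerate(data)}


image = memory_image_for_ppc("ABCDEF", start=6)
for addr in sorted(image):
    print(f"{addr:x}: {image[addr]}")
# Prints:
# 0: B
# 1: A
# c: F
# d: E
# e: D
# f: C
```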
Mapping these examples onto more traditional ways of drawing 64 bit busses:

If you wish to re-draw these examples using more traditional diagrams (e.g. a 64 bit data bus), be certain to mark, in each case, the mapping between the physical data lanes and the logical PowerPC addresses.
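As an aid for labelling such a diagram, the following sketch lists, for each physical byte lane, the logical byte address that lands on it in each mode. It covers byte accesses only (the b'111 row of table 2-29) and assumes `base` is the 8 byte aligned address of the doubleword being drawn. `lane_labels` is a hypothetical name.

```python
def lane_labels(base=0):
    """For each physical byte lane 0-7, return (lane, big, little): the
    logical byte address seen on that lane in big endian and in little
    endian mode, for byte accesses within the doubleword at `base`."""
    if base % 8 != 0:
        raise ValueError("base must be 8 byte aligned")
    # Big endian: lane n carries address base + n.
    # Little endian: address L is driven at L XOR b'111, so lane n carries
    # the logical address base + (n XOR b'111).
    return [(lane, base + lane, base + (lane ^ 0b111)) for lane in range(8)]


for lane, big, little in lane_labels():
    print(f"lane {lane}: big endian addr {big:x}, little endian addr {little:x}")
```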