[PATCH] [Arm] Fix endianness handling for arm record self tests

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [PATCH] [Arm] Fix endianness handling for arm record self tests
@ 2022-08-08 10:12 Luis Machado via Gdb-patches
  2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-08 10:12 UTC (permalink / raw)
  To: gdb-patches; +Cc: mark

The arm record tests handle 16-bit and 32-bit thumb instructions, but the
code is laid out in a way that handles the 32-bit thumb instructions as
two 16-bit parts.

This is fine, but it is prone to host-endianness issues given how the two
16-bit parts are stored and how they are accessed later on. Arm is
little-endian by default, so running this test with a GDB built with
--enable-targets=all and on a big endian host will run into the following:

Running selftest arm-record.
Process record and replay target doesn't support syscall number -2036195
Process record does not support instruction 0x7f70ee1d at address 0x0.
Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482

Investigating this a bit further, there seems to be a chance to do a simple
fix through a type template, using uint16_t for 16-bit thumb instructions
and uint32_t for 32-bit thumb instructions.

This patch implements this.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
---
 gdb/arm-tdep.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index cf8b610a381..57b865a0819 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -14387,14 +14387,18 @@ decode_insn (abstract_memory_reader &reader,
 #if GDB_SELF_TEST
 namespace selftests {
 
-/* Provide both 16-bit and 32-bit thumb instructions.  */
+/* Provide both 16-bit and 32-bit thumb instructions.
 
+   For 16-bit Thumb instructions, an array of uint16_t should be used.
+   For 32-bit Thumb instructions, an array of uint32_t should be used.  */
+
+template<typename T>
 class instruction_reader_thumb : public abstract_memory_reader
 {
 public:
   template<size_t SIZE>
   instruction_reader_thumb (enum bfd_endian endian,
-			    const uint16_t (&insns)[SIZE])
+			    const T (&insns)[SIZE])
     : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
   {}
 
@@ -14404,18 +14408,14 @@ class instruction_reader_thumb : public abstract_memory_reader
     SELF_CHECK (memaddr % 2 == 0);
     SELF_CHECK ((memaddr / 2) < m_insns_size);
 
-    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
-    if (len == 4)
-      {
-	store_unsigned_integer (&buf[2], 2, m_endian,
-				m_insns[memaddr / 2 + 1]);
-      }
+    store_unsigned_integer (buf, sizeof (T), m_endian, m_insns[memaddr / 2]);
+
     return true;
   }
 
 private:
   enum bfd_endian m_endian;
-  const uint16_t *m_insns;
+  const T *m_insns;
   size_t m_insns_size;
 };
 
@@ -14436,6 +14436,8 @@ arm_record_test (void)
     memset (&arm_record, 0, sizeof (arm_insn_decode_record));
     arm_record.gdbarch = gdbarch;
 
+    /* Use the endian-free representation of the instructions here.  The test
+       will handle endianness conversions.  */
     static const uint16_t insns[] = {
       /* db b2	uxtb	r3, r3 */
       0xb2db,
@@ -14444,7 +14446,7 @@ arm_record_test (void)
     };
 
     enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
+    instruction_reader_thumb<uint16_t> reader (endian, insns);
     int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
 			   THUMB_INSN_SIZE_BYTES);
 
@@ -14470,13 +14472,15 @@ arm_record_test (void)
     memset (&arm_record, 0, sizeof (arm_insn_decode_record));
     arm_record.gdbarch = gdbarch;
 
-    static const uint16_t insns[] = {
-      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
-      0xee1d, 0x7f70,
+    /* Use the endian-free representation of the instruction here.  The test
+       will handle endianness conversions.  */
+    static const uint32_t insns[] = {
+      /* mrc	15, 0, r7, cr13, cr0, {3} */
+      0x7f70ee1d,
     };
 
     enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
+    instruction_reader_thumb<uint32_t> reader (endian, insns);
     int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
 			   THUMB2_INSN_SIZE_BYTES);
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-08 10:12 [PATCH] [Arm] Fix endianness handling for arm record self tests Luis Machado via Gdb-patches
@ 2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
  2022-08-10  8:47   ` Andrew Burgess via Gdb-patches
  2022-08-09  9:43 ` Tom de Vries via Gdb-patches
  2022-08-23 20:32 ` [PATCH,v2] " Luis Machado via Gdb-patches
  2 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-08-08 12:30 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark

Luis Machado via Gdb-patches <gdb-patches@sourceware.org> writes:

> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
> code is laid out in a way that handles the 32-bit thumb instructions as
> two 16-bit parts.
>
> This is fine, but it is prone to host-endianness issues given how the two
> 16-bit parts are stored and how they are accessed later on. Arm is
> little-endian by default, so running this test with a GDB built with
> --enable-targets=all and on a big endian host will run into the following:
>
> Running selftest arm-record.
> Process record and replay target doesn't support syscall number -2036195
> Process record does not support instruction 0x7f70ee1d at address 0x0.
> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>
> Investigating this a bit further, there seems to be a chance to do a simple
> fix through a type template, using uint16_t for 16-bit thumb instructions
> and uint32_t for 32-bit thumb instructions.
>
> This patch implements this.
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
> ---
>  gdb/arm-tdep.c | 32 ++++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index cf8b610a381..57b865a0819 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -14387,14 +14387,18 @@ decode_insn (abstract_memory_reader &reader,
>  #if GDB_SELF_TEST
>  namespace selftests {
>  
> -/* Provide both 16-bit and 32-bit thumb instructions.  */
> +/* Provide both 16-bit and 32-bit thumb instructions.
>  
> +   For 16-bit Thumb instructions, an array of uint16_t should be used.
> +   For 32-bit Thumb instructions, an array of uint32_t should be used.  */
> +
> +template<typename T>
>  class instruction_reader_thumb : public abstract_memory_reader
>  {
>  public:
>    template<size_t SIZE>
>    instruction_reader_thumb (enum bfd_endian endian,
> -			    const uint16_t (&insns)[SIZE])
> +			    const T (&insns)[SIZE])
>      : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
>    {}
>  
> @@ -14404,18 +14408,14 @@ class instruction_reader_thumb : public abstract_memory_reader
>      SELF_CHECK (memaddr % 2 == 0);
>      SELF_CHECK ((memaddr / 2) < m_insns_size);

I was expecting this '/ 2' to need updating here.  If memaddr is an octet
address, then the '/ 2' converts to a 16-bit chunk address, which is
fine if T is uint16_t, but surely is wrong when T is uint32_t...

>  
> -    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
> -    if (len == 4)
> -      {
> -	store_unsigned_integer (&buf[2], 2, m_endian,
> -				m_insns[memaddr / 2 + 1]);
> -      }
> +    store_unsigned_integer (buf, sizeof (T), m_endian, m_insns[memaddr / 2]);

And the same here.

> +
>      return true;
>    }
>  
>  private:
>    enum bfd_endian m_endian;
> -  const uint16_t *m_insns;
> +  const T *m_insns;
>    size_t m_insns_size;
>  };
>  
> @@ -14436,6 +14436,8 @@ arm_record_test (void)
>      memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>      arm_record.gdbarch = gdbarch;
>  
> +    /* Use the endian-free representation of the instructions here.  The test
> +       will handle endianness conversions.  */
>      static const uint16_t insns[] = {
>        /* db b2	uxtb	r3, r3 */
>        0xb2db,
> @@ -14444,7 +14446,7 @@ arm_record_test (void)
>      };
>  
>      enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> -    instruction_reader_thumb reader (endian, insns);
> +    instruction_reader_thumb<uint16_t> reader (endian, insns);

I wonder if there's an alternative fix here?
gdbarch_byte_order_for_code returns a value such that READER will
correctly read instructions from arm instruction memory, right?  Which
happens to be little-endian.

However, we're not reading from arm instruction memory, but we are
instead reading from host memory.

On many targets, host memory also happens to be little-endian, thus
gdbarch_byte_order_for_code is still correct.

But, could we not instead pass in a value here that represents the host
memory order instead, then maybe READER will just do the right thing?

Thanks,
Andrew

>      int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>  			   THUMB_INSN_SIZE_BYTES);
>  
> @@ -14470,13 +14472,15 @@ arm_record_test (void)
>      memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>      arm_record.gdbarch = gdbarch;
>  
> -    static const uint16_t insns[] = {
> -      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
> -      0xee1d, 0x7f70,
> +    /* Use the endian-free representation of the instruction here.  The test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mrc	15, 0, r7, cr13, cr0, {3} */
> +      0x7f70ee1d,
>      };
>  
>      enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> -    instruction_reader_thumb reader (endian, insns);
> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
>      int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>  			   THUMB2_INSN_SIZE_BYTES);
>  
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
@ 2022-08-10  8:47   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-08-10  8:47 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark

Andrew Burgess <aburgess@redhat.com> writes:

> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> writes:
>
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> Investigating this a bit further, there seems to be a chance to do a simple
>> fix through a type template, using uint16_t for 16-bit thumb instructions
>> and uint32_t for 32-bit thumb instructions.
>>
>> This patch implements this.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
>> ---
>>  gdb/arm-tdep.c | 32 ++++++++++++++++++--------------
>>  1 file changed, 18 insertions(+), 14 deletions(-)
>>
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index cf8b610a381..57b865a0819 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -14387,14 +14387,18 @@ decode_insn (abstract_memory_reader &reader,
>>  #if GDB_SELF_TEST
>>  namespace selftests {
>>  
>> -/* Provide both 16-bit and 32-bit thumb instructions.  */
>> +/* Provide both 16-bit and 32-bit thumb instructions.
>>  
>> +   For 16-bit Thumb instructions, an array of uint16_t should be used.
>> +   For 32-bit Thumb instructions, an array of uint32_t should be used.  */
>> +
>> +template<typename T>
>>  class instruction_reader_thumb : public abstract_memory_reader
>>  {
>>  public:
>>    template<size_t SIZE>
>>    instruction_reader_thumb (enum bfd_endian endian,
>> -			    const uint16_t (&insns)[SIZE])
>> +			    const T (&insns)[SIZE])
>>      : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
>>    {}
>>  
>> @@ -14404,18 +14408,14 @@ class instruction_reader_thumb : public abstract_memory_reader
>>      SELF_CHECK (memaddr % 2 == 0);
>>      SELF_CHECK ((memaddr / 2) < m_insns_size);
>
> I was expecting this '/ 2' to need updating here.  If memaddr is an octet
> address, then the '/ 2' converts to a 16-bit chunk address, which is
> fine if T is uint16_t, but surely is wrong when T is uint32_t...

Yesterdays activity on this thread reminded me of something else I
thought of.  If you plan to rework this patch then maybe this feedback
is irrelevant, but I just wanted to put it on the record anyway.

At the start of instruction_reader_thumb::read we have this assert:

    SELF_CHECK (len == 4 || len == 2);

However, after your change you always read 'sizeof (T)' bytes, which
might be 4, so, I think that if 'len == 2' you will be overflowing BUF.

I guess thumb instructions can be 2-byte or 4-byte, so the thumb decoder
probably reads an initial 2-bytes, and then follows up with either a
full 4-byte read, or a read of the second 2-bytes only when necessary.
I don't think the code as proposed here will handle this use case
correctly.

Thanks,
Andrew



>
>>  
>> -    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
>> -    if (len == 4)
>> -      {
>> -	store_unsigned_integer (&buf[2], 2, m_endian,
>> -				m_insns[memaddr / 2 + 1]);
>> -      }
>> +    store_unsigned_integer (buf, sizeof (T), m_endian, m_insns[memaddr / 2]);
>
> And the same here.
>
>> +
>>      return true;
>>    }
>>  
>>  private:
>>    enum bfd_endian m_endian;
>> -  const uint16_t *m_insns;
>> +  const T *m_insns;
>>    size_t m_insns_size;
>>  };
>>  
>> @@ -14436,6 +14436,8 @@ arm_record_test (void)
>>      memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>      arm_record.gdbarch = gdbarch;
>>  
>> +    /* Use the endian-free representation of the instructions here.  The test
>> +       will handle endianness conversions.  */
>>      static const uint16_t insns[] = {
>>        /* db b2	uxtb	r3, r3 */
>>        0xb2db,
>> @@ -14444,7 +14446,7 @@ arm_record_test (void)
>>      };
>>  
>>      enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> -    instruction_reader_thumb reader (endian, insns);
>> +    instruction_reader_thumb<uint16_t> reader (endian, insns);
>
> I wonder if there's an alternative fix here?
> gdbarch_byte_order_for_code returns a value such that READER will
> correctly read instructions from arm instruction memory, right?  Which
> happens to be little-endian.
>
> However, we're not reading from arm instruction memory, but we are
> instead reading from host memory.
>
> On many targets, host memory also happens to be little-endian, thus
> gdbarch_byte_order_for_code is still correct.
>
> But, could we not instead pass in a value here that represents the host
> memory order instead, then maybe READER will just do the right thing?
>
> Thanks,
> Andrew
>
>>      int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>>  			   THUMB_INSN_SIZE_BYTES);
>>  
>> @@ -14470,13 +14472,15 @@ arm_record_test (void)
>>      memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>      arm_record.gdbarch = gdbarch;
>>  
>> -    static const uint16_t insns[] = {
>> -      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
>> -      0xee1d, 0x7f70,
>> +    /* Use the endian-free representation of the instruction here.  The test
>> +       will handle endianness conversions.  */
>> +    static const uint32_t insns[] = {
>> +      /* mrc	15, 0, r7, cr13, cr0, {3} */
>> +      0x7f70ee1d,
>>      };
>>  
>>      enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> -    instruction_reader_thumb reader (endian, insns);
>> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
>>      int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>>  			   THUMB2_INSN_SIZE_BYTES);
>>  
>> -- 
>> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-08 10:12 [PATCH] [Arm] Fix endianness handling for arm record self tests Luis Machado via Gdb-patches
  2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
@ 2022-08-09  9:43 ` Tom de Vries via Gdb-patches
  2022-08-09  9:57   ` Tom de Vries via Gdb-patches
                     ` (2 more replies)
  2022-08-23 20:32 ` [PATCH,v2] " Luis Machado via Gdb-patches
  2 siblings, 3 replies; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-08-09  9:43 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark

On 8/8/22 12:12, Luis Machado wrote:
> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
> code is laid out in a way that handles the 32-bit thumb instructions as
> two 16-bit parts.
> 
> This is fine, but it is prone to host-endianness issues given how the two
> 16-bit parts are stored and how they are accessed later on. Arm is
> little-endian by default, so running this test with a GDB built with
> --enable-targets=all and on a big endian host will run into the following:
> 
> Running selftest arm-record.
> Process record and replay target doesn't support syscall number -2036195
> Process record does not support instruction 0x7f70ee1d at address 0x0.
> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
> 
> Investigating this a bit further, there seems to be a chance to do a simple
> fix through a type template, using uint16_t for 16-bit thumb instructions
> and uint32_t for 32-bit thumb instructions.
> 
> This patch implements this.
> 

Hi Luis,

LGTM.

I noticed btw that this:
...
diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index 57b865a0819..1e6d9ba65be 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -14489,6 +14489,29 @@ arm_record_test (void)
      SELF_CHECK (arm_record.reg_rec_count == 1);
      SELF_CHECK (arm_record.arm_regs[0] == 7);
    }
+
+
+  /* 32-bit instructions.  */
+  {
+    arm_insn_decode_record arm_record;
+
+    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
+    arm_record.gdbarch = gdbarch;
+
+    /* Use the endian-free representation of the instruction here.  The 
test
+       will handle endianness conversions.  */
+    static const uint32_t insns[] = {
+      /* mov     r5, r0 */
+      0xe1a05000,
+    };
+
+    enum bfd_endian endian = gdbarch_byte_order_for_code 
(arm_record.gdbarch);
+    instruction_reader_thumb<uint32_t> reader (endian, insns);
+    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
+                          ARM_INSN_SIZE_BYTES);
+
+    SELF_CHECK (ret == 0);
+  }
  }

  /* Instruction reader from manually cooked instruction sequences.  */
...
works fine, so I wonder if instruction_reader_thumb is a misnomer, 
perhaps we call it instruction_reader_selftest or some such, and add the 
arm32 insn to cover that case?

Also I wondered if these checks
...
     SELF_CHECK (len == 4 || len == 2);
     SELF_CHECK (memaddr % 2 == 0);
     SELF_CHECK ((memaddr / 2) < m_insns_size);
...
can be made more specific based on the template argument T.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09  9:43 ` Tom de Vries via Gdb-patches
@ 2022-08-09  9:57   ` Tom de Vries via Gdb-patches
  2022-08-15 12:13     ` Luis Machado via Gdb-patches
  2022-08-09 11:31   ` Luis Machado via Gdb-patches
  2022-08-15 12:10   ` Luis Machado via Gdb-patches
  2 siblings, 1 reply; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-08-09  9:57 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark

On 8/9/22 11:43, Tom de Vries wrote:
> On 8/8/22 12:12, Luis Machado wrote:
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the 
>> following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at 
>> ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> Investigating this a bit further, there seems to be a chance to do a 
>> simple
>> fix through a type template, using uint16_t for 16-bit thumb instructions
>> and uint32_t for 32-bit thumb instructions.
>>
>> This patch implements this.
>>
> 
> Hi Luis,
> 
> LGTM.
> 
> I noticed btw that this:
> ...
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index 57b865a0819..1e6d9ba65be 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -14489,6 +14489,29 @@ arm_record_test (void)
>       SELF_CHECK (arm_record.reg_rec_count == 1);
>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>     }
> +
> +
> +  /* 32-bit instructions.  */
> +  {
> +    arm_insn_decode_record arm_record;
> +
> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
> +    arm_record.gdbarch = gdbarch;
> +
> +    /* Use the endian-free representation of the instruction here.  The 
> test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mov     r5, r0 */
> +      0xe1a05000,
> +    };
> +
> +    enum bfd_endian endian = gdbarch_byte_order_for_code 
> (arm_record.gdbarch);
> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
> +                          ARM_INSN_SIZE_BYTES);
> +
> +    SELF_CHECK (ret == 0);
> +  }
>   }
> 
>   /* Instruction reader from manually cooked instruction sequences.  */
> ...
> works fine, so I wonder if instruction_reader_thumb is a misnomer, 
> perhaps we call it instruction_reader_selftest or some such, and add the 
> arm32 insn to cover that case?
> 
> Also I wondered if these checks
> ...
>      SELF_CHECK (len == 4 || len == 2);
>      SELF_CHECK (memaddr % 2 == 0);
>      SELF_CHECK ((memaddr / 2) < m_insns_size);
> ...
> can be made more specific based on the template argument T.

Also, I think this is more clear:
...
-store_unsigned_integer (buf, sizeof (T), m_endian,
                          m_insns[memaddr / 2]);
+store_unsigned_integer (buf, sizeof (T), m_endian,
                          m_insns[memaddr / sizeof (T)]);
...

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09  9:57   ` Tom de Vries via Gdb-patches
@ 2022-08-15 12:13     ` Luis Machado via Gdb-patches
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-15 12:13 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches; +Cc: mark

On 8/9/22 10:57, Tom de Vries wrote:
> On 8/9/22 11:43, Tom de Vries wrote:
>> On 8/8/22 12:12, Luis Machado wrote:
>>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>>> code is laid out in a way that handles the 32-bit thumb instructions as
>>> two 16-bit parts.
>>>
>>> This is fine, but it is prone to host-endianness issues given how the two
>>> 16-bit parts are stored and how they are accessed later on. Arm is
>>> little-endian by default, so running this test with a GDB built with
>>> --enable-targets=all and on a big endian host will run into the following:
>>>
>>> Running selftest arm-record.
>>> Process record and replay target doesn't support syscall number -2036195
>>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>>
>>> Investigating this a bit further, there seems to be a chance to do a simple
>>> fix through a type template, using uint16_t for 16-bit thumb instructions
>>> and uint32_t for 32-bit thumb instructions.
>>>
>>> This patch implements this.
>>>
>>
>> Hi Luis,
>>
>> LGTM.
>>
>> I noticed btw that this:
>> ...
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index 57b865a0819..1e6d9ba65be 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -14489,6 +14489,29 @@ arm_record_test (void)
>>       SELF_CHECK (arm_record.reg_rec_count == 1);
>>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>>     }
>> +
>> +
>> +  /* 32-bit instructions.  */
>> +  {
>> +    arm_insn_decode_record arm_record;
>> +
>> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>> +    arm_record.gdbarch = gdbarch;
>> +
>> +    /* Use the endian-free representation of the instruction here.  The test
>> +       will handle endianness conversions.  */
>> +    static const uint32_t insns[] = {
>> +      /* mov     r5, r0 */
>> +      0xe1a05000,
>> +    };
>> +
>> +    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
>> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
>> +                          ARM_INSN_SIZE_BYTES);
>> +
>> +    SELF_CHECK (ret == 0);
>> +  }
>>   }
>>
>>   /* Instruction reader from manually cooked instruction sequences.  */
>> ...
>> works fine, so I wonder if instruction_reader_thumb is a misnomer, perhaps we call it instruction_reader_selftest or some such, and add the arm32 insn to cover that case?

I suppose the code was originally made to test thumb 16-bit instructions. I've incorporated this change in the patch. Thanks!

>>
>> Also I wondered if these checks
>> ...
>>      SELF_CHECK (len == 4 || len == 2);
>>      SELF_CHECK (memaddr % 2 == 0);
>>      SELF_CHECK ((memaddr / 2) < m_insns_size);
>> ...
>> can be made more specific based on the template argument T.

Indeed.

> 
> Also, I think this is more clear:
> ...
> -store_unsigned_integer (buf, sizeof (T), m_endian,
>                           m_insns[memaddr / 2]);
> +store_unsigned_integer (buf, sizeof (T), m_endian,
>                           m_insns[memaddr / sizeof (T)]);
> ...
> 
> Thanks,
> - Tom


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09  9:43 ` Tom de Vries via Gdb-patches
  2022-08-09  9:57   ` Tom de Vries via Gdb-patches
@ 2022-08-09 11:31   ` Luis Machado via Gdb-patches
  2022-08-09 11:48     ` Tom de Vries via Gdb-patches
  2022-08-15 12:10   ` Luis Machado via Gdb-patches
  2 siblings, 1 reply; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-09 11:31 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches; +Cc: mark

Hi Tom,

On 8/9/22 10:43, Tom de Vries wrote:
> On 8/8/22 12:12, Luis Machado wrote:
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> Investigating this a bit further, there seems to be a chance to do a simple
>> fix through a type template, using uint16_t for 16-bit thumb instructions
>> and uint32_t for 32-bit thumb instructions.
>>
>> This patch implements this.
>>
> 
> Hi Luis,
> 
> LGTM.
> 
> I noticed btw that this:
> ...
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index 57b865a0819..1e6d9ba65be 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -14489,6 +14489,29 @@ arm_record_test (void)
>       SELF_CHECK (arm_record.reg_rec_count == 1);
>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>     }
> +
> +
> +  /* 32-bit instructions.  */
> +  {
> +    arm_insn_decode_record arm_record;
> +
> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
> +    arm_record.gdbarch = gdbarch;
> +
> +    /* Use the endian-free representation of the instruction here.  The test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mov     r5, r0 */
> +      0xe1a05000,
> +    };
> +
> +    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
> +                          ARM_INSN_SIZE_BYTES);
> +
> +    SELF_CHECK (ret == 0);
> +  }
>   }
> 
>   /* Instruction reader from manually cooked instruction sequences.  */
> ...
> works fine, so I wonder if instruction_reader_thumb is a misnomer, perhaps we call it instruction_reader_selftest or some such, and add the arm32 insn to cover that case?

Good point.

> 
> Also I wondered if these checks
> ...
>      SELF_CHECK (len == 4 || len == 2);
>      SELF_CHECK (memaddr % 2 == 0);
>      SELF_CHECK ((memaddr / 2) < m_insns_size);
> ...
> can be made more specific based on the template argument T.

I guess, though I was going to rework the patch based on Andrew's comments. Didn't get to it yet though.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09 11:31   ` Luis Machado via Gdb-patches
@ 2022-08-09 11:48     ` Tom de Vries via Gdb-patches
  2022-08-09 12:08       ` Tom de Vries via Gdb-patches
  0 siblings, 1 reply; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-08-09 11:48 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark, Andrew Burgess

On 8/9/22 13:31, Luis Machado wrote:
> I guess, though I was going to rework the patch based on Andrew's 
> comments. Didn't get to it yet though.

Ah, I hadn't seen Andrew's review.

For some reason I wasn't cc-ed, perhaps that infuriating mailman CC 
rewriting again.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09 11:48     ` Tom de Vries via Gdb-patches
@ 2022-08-09 12:08       ` Tom de Vries via Gdb-patches
  2022-08-09 12:09         ` Luis Machado via Gdb-patches
  0 siblings, 1 reply; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-08-09 12:08 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark, Andrew Burgess

On 8/9/22 13:48, Tom de Vries wrote:
> On 8/9/22 13:31, Luis Machado wrote:
>> I guess, though I was going to rework the patch based on Andrew's 
>> comments. Didn't get to it yet though.
> 
> Ah, I hadn't seen Andrew's review.
> 
> For some reason I wasn't cc-ed, perhaps that infuriating mailman CC 
> rewriting again.
> 

I've logged into the "Gdb-patches mailing list membership configuration" 
  and switched off "Avoid duplicate copies of messages", hopefully that 
will help.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09 12:08       ` Tom de Vries via Gdb-patches
@ 2022-08-09 12:09         ` Luis Machado via Gdb-patches
  2022-08-09 12:13           ` Tom de Vries via Gdb-patches
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-09 12:09 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches; +Cc: mark, Andrew Burgess

On 8/9/22 13:08, Tom de Vries wrote:
> On 8/9/22 13:48, Tom de Vries wrote:
>> On 8/9/22 13:31, Luis Machado wrote:
>>> I guess, though I was going to rework the patch based on Andrew's comments. Didn't get to it yet though.
>>
>> Ah, I hadn't seen Andrew's review.
>>
>> For some reason I wasn't cc-ed, perhaps that infuriating mailman CC rewriting again.
>>
> 
> I've logged into the "Gdb-patches mailing list membership configuration"  and switched off "Avoid duplicate copies of messages", hopefully that will help.

Oh, is that the setting that makes things work OK? I'll have to try that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09 12:09         ` Luis Machado via Gdb-patches
@ 2022-08-09 12:13           ` Tom de Vries via Gdb-patches
  2022-08-09 15:24             ` Mark Wielaard
  0 siblings, 1 reply; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-08-09 12:13 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark, Andrew Burgess

On 8/9/22 14:09, Luis Machado wrote:
> On 8/9/22 13:08, Tom de Vries wrote:
>> On 8/9/22 13:48, Tom de Vries wrote:
>>> On 8/9/22 13:31, Luis Machado wrote:
>>>> I guess, though I was going to rework the patch based on Andrew's 
>>>> comments. Didn't get to it yet though.
>>>
>>> Ah, I hadn't seen Andrew's review.
>>>
>>> For some reason I wasn't cc-ed, perhaps that infuriating mailman CC 
>>> rewriting again.
>>>
>>
>> I've logged into the "Gdb-patches mailing list membership 
>> configuration"  and switched off "Avoid duplicate copies of messages", 
>> hopefully that will help.
> 
> Oh, is that the setting that makes things work OK? I'll have to try that.

I think this ( 
https://bugs.launchpad.net/mailman/+bug/1845751/comments/2 ) is a good 
description of the problem, and what the setting does.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09 12:13           ` Tom de Vries via Gdb-patches
@ 2022-08-09 15:24             ` Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2022-08-09 15:24 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Andrew Burgess, gdb-patches

Hi,

On Tue, Aug 09, 2022 at 02:13:53PM +0200, Tom de Vries wrote:
> On 8/9/22 14:09, Luis Machado wrote:
> >On 8/9/22 13:08, Tom de Vries wrote:
> >>On 8/9/22 13:48, Tom de Vries wrote:
> >>>For some reason I wasn't cc-ed, perhaps that infuriating
> >>>mailman CC rewriting again.
> >>
> >>I've logged into the "Gdb-patches mailing list membership
> >>configuration"  and switched off "Avoid duplicate copies of
> >>messages", hopefully that will help.
> >
> >Oh, is that the setting that makes things work OK? I'll have to try that.
> 
> I think this (
> https://bugs.launchpad.net/mailman/+bug/1845751/comments/2 ) is a
> good description of the problem, and what the setting does.

Yeah, that is a good description of the problem with the "Filter out
duplicate messages" option. Unfortunately it seems to be on for almost
all list members because it is the default for new list members. It
might be a good idea to as the gdb/gdb-patches list owner (or
overseers) to turn that off by default (and maybe for all list
members, although I am not sure if some people have actively opted in
to this).

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
  2022-08-09  9:43 ` Tom de Vries via Gdb-patches
  2022-08-09  9:57   ` Tom de Vries via Gdb-patches
  2022-08-09 11:31   ` Luis Machado via Gdb-patches
@ 2022-08-15 12:10   ` Luis Machado via Gdb-patches
  2 siblings, 0 replies; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-15 12:10 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches; +Cc: mark

On 8/9/22 10:43, Tom de Vries wrote:
> On 8/8/22 12:12, Luis Machado wrote:
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> Investigating this a bit further, there seems to be a chance to do a simple
>> fix through a type template, using uint16_t for 16-bit thumb instructions
>> and uint32_t for 32-bit thumb instructions.
>>
>> This patch implements this.
>>
> 
> Hi Luis,
> 
> LGTM.
> 
> I noticed btw that this:
> ...
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index 57b865a0819..1e6d9ba65be 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -14489,6 +14489,29 @@ arm_record_test (void)
>       SELF_CHECK (arm_record.reg_rec_count == 1);
>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>     }
> +
> +
> +  /* 32-bit instructions.  */
> +  {
> +    arm_insn_decode_record arm_record;
> +
> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
> +    arm_record.gdbarch = gdbarch;
> +
> +    /* Use the endian-free representation of the instruction here.  The test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mov     r5, r0 */
> +      0xe1a05000,
> +    };
> +
> +    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> +    instruction_reader_thumb<uint32_t> reader (endian, insns);
> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
> +                          ARM_INSN_SIZE_BYTES);
> +
> +    SELF_CHECK (ret == 0);
> +  }
>   }
> 
>   /* Instruction reader from manually cooked instruction sequences.  */
> ...
> works fine, so I wonder if instruction_reader_thumb is a misnomer, perhaps we call it instruction_reader_selftest or some such, and add the arm32 insn to cover that case?
> 
> Also I wondered if these checks
> ...
>      SELF_CHECK (len == 4 || len == 2);
>      SELF_CHECK (memaddr % 2 == 0);
>      SELF_CHECK ((memaddr / 2) < m_insns_size);
> ...
> can be made more specific based on the template argument T.

Indeed. I've adjusted the patch now.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH,v2] [Arm] Fix endianness handling for arm record self tests
  2022-08-08 10:12 [PATCH] [Arm] Fix endianness handling for arm record self tests Luis Machado via Gdb-patches
  2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
  2022-08-09  9:43 ` Tom de Vries via Gdb-patches
@ 2022-08-23 20:32 ` Luis Machado via Gdb-patches
  2022-09-01  9:29   ` [PING][PATCH,v2] " Luis Machado via Gdb-patches
  2 siblings, 1 reply; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-08-23 20:32 UTC (permalink / raw)
  To: gdb-patches; +Cc: mark, tdevries

v2:

- Add 32-bit Arm instruction selftest
- Refactored abstract memory reader into abstract instruction reader
- Adjusted code to use templated type and to use host endianness as
  opposed to target endianness.

The arm record tests handle 16-bit and 32-bit thumb instructions, but the
code is laid out in a way that handles the 32-bit thumb instructions as
two 16-bit parts.

This is fine, but it is prone to host-endianness issues given how the two
16-bit parts are stored and how they are accessed later on. Arm is
little-endian by default, so running this test with a GDB built with
--enable-targets=all and on a big endian host will run into the following:

Running selftest arm-record.
Process record and replay target doesn't support syscall number -2036195
Process record does not support instruction 0x7f70ee1d at address 0x0.
Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482

It turns out the abstract memory reader class is more generic than it needs to
be, and we can simplify the code a bit by assuming we have a simple instruction
reader that only reads up to 4 bytes, which is the length of a 32-bit
instruction.

Instead of returning a bool, we return instead the instruction that has been
read. This way we avoid having to deal with the endianness conversion, and use
the host endianness instead. The Arm selftests can be executed on non-Arm
hosts.

While at it, Tom suggested adding a 32-bit Arm instruction selftest to increase
the coverage of the selftests.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432

Co-authored-by: Tom de Vries <tdevries@suse.de>
---
 gdb/arm-tdep.c | 150 ++++++++++++++++++++++---------------------------
 1 file changed, 68 insertions(+), 82 deletions(-)

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index 0091d14708f..ead9bbf46c5 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -14252,59 +14252,39 @@ thumb2_record_decode_insn_handler (arm_insn_decode_record *thumb2_insn_r)
 }
 
 namespace {
-/* Abstract memory reader.  */
+/* Abstract instruction reader.  */
 
-class abstract_memory_reader
+class abstract_instruction_reader
 {
 public:
-  /* Read LEN bytes of target memory at address MEMADDR, placing the
-     results in GDB's memory at BUF.  Return true on success.  */
+  /* Read one instruction of size LEN from address MEMADDR and using
+     BYTE_ORDER endianness.  */
 
-  virtual bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) = 0;
+  virtual ULONGEST read (CORE_ADDR memaddr, const size_t len,
+			 enum bfd_endian byte_order) = 0;
 };
 
 /* Instruction reader from real target.  */
 
-class instruction_reader : public abstract_memory_reader
+class instruction_reader : public abstract_instruction_reader
 {
  public:
-  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
+  ULONGEST read (CORE_ADDR memaddr, const size_t len,
+		 enum bfd_endian byte_order) override
   {
-    if (target_read_memory (memaddr, buf, len))
-      return false;
-    else
-      return true;
+    return read_code_unsigned_integer (memaddr, len, byte_order);
   }
 };
 
 } // namespace
 
-/* Extracts arm/thumb/thumb2 insn depending on the size, and returns 0 on success 
-and positive val on failure.  */
-
-static int
-extract_arm_insn (abstract_memory_reader& reader,
-		  arm_insn_decode_record *insn_record, uint32_t insn_size)
-{
-  gdb_byte buf[insn_size];
-
-  memset (&buf[0], 0, insn_size);
-  
-  if (!reader.read (insn_record->this_addr, buf, insn_size))
-    return 1;
-  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
-			   insn_size, 
-			   gdbarch_byte_order_for_code (insn_record->gdbarch));
-  return 0;
-}
-
 typedef int (*sti_arm_hdl_fp_t) (arm_insn_decode_record*);
 
 /* Decode arm/thumb insn depending on condition cods and opcodes; and
    dispatch it.  */
 
 static int
-decode_insn (abstract_memory_reader &reader,
+decode_insn (abstract_instruction_reader &reader,
 	     arm_insn_decode_record *arm_record,
 	     record_type_t record_type, uint32_t insn_size)
 {
@@ -14339,20 +14319,12 @@ decode_insn (abstract_memory_reader &reader,
 
   uint32_t ret = 0;    /* return value: negative:failure   0:success.  */
   uint32_t insn_id = 0;
+  enum bfd_endian code_endian
+    = gdbarch_byte_order_for_code (arm_record->gdbarch);
+  arm_record->arm_insn
+    = reader.read (arm_record->this_addr, insn_size, code_endian);
 
-  if (extract_arm_insn (reader, arm_record, insn_size))
-    {
-      if (record_debug)
-	{
-	  gdb_printf (gdb_stdlog,
-		      _("Process record: error reading memory at "
-			"addr %s len = %d.\n"),
-		      paddress (arm_record->gdbarch,
-				arm_record->this_addr), insn_size);
-	}
-      return -1;
-    }
-  else if (ARM_RECORD == record_type)
+  if (ARM_RECORD == record_type)
     {
       arm_record->cond = bits (arm_record->arm_insn, 28, 31);
       insn_id = bits (arm_record->arm_insn, 25, 27);
@@ -14412,36 +14384,35 @@ decode_insn (abstract_memory_reader &reader,
 #if GDB_SELF_TEST
 namespace selftests {
 
-/* Provide both 16-bit and 32-bit thumb instructions.  */
+/* Instruction reader class for selftests.
+
+   For 16-bit Thumb instructions, an array of uint16_t should be used.
 
-class instruction_reader_thumb : public abstract_memory_reader
+   For 32-bit Thumb instructions and regular 32-bit Arm instructions, an array
+   of uint32_t should be used.  */
+
+template<typename T>
+class instruction_reader_selftest : public abstract_instruction_reader
 {
 public:
   template<size_t SIZE>
-  instruction_reader_thumb (enum bfd_endian endian,
-			    const uint16_t (&insns)[SIZE])
-    : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
+  instruction_reader_selftest (const T (&insns)[SIZE])
+    : m_insns (insns), m_insns_size (SIZE)
   {}
 
-  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
+  ULONGEST read (CORE_ADDR memaddr, const size_t length,
+		 enum bfd_endian byte_order) override
   {
-    SELF_CHECK (len == 4 || len == 2);
-    SELF_CHECK (memaddr % 2 == 0);
-    SELF_CHECK ((memaddr / 2) < m_insns_size);
+    SELF_CHECK (length == sizeof (T));
+    SELF_CHECK (memaddr % sizeof (T) == 0);
+    SELF_CHECK ((memaddr / sizeof (T)) < m_insns_size);
 
-    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
-    if (len == 4)
-      {
-	store_unsigned_integer (&buf[2], 2, m_endian,
-				m_insns[memaddr / 2 + 1]);
-      }
-    return true;
+    return m_insns[memaddr / sizeof (T)];
   }
 
 private:
-  enum bfd_endian m_endian;
-  const uint16_t *m_insns;
-  size_t m_insns_size;
+  const T *m_insns;
+  const size_t m_insns_size;
 };
 
 static void
@@ -14461,6 +14432,8 @@ arm_record_test (void)
     memset (&arm_record, 0, sizeof (arm_insn_decode_record));
     arm_record.gdbarch = gdbarch;
 
+    /* Use the endian-free representation of the instructions here.  The test
+       will handle endianness conversions.  */
     static const uint16_t insns[] = {
       /* db b2	uxtb	r3, r3 */
       0xb2db,
@@ -14468,8 +14441,7 @@ arm_record_test (void)
       0x58cd,
     };
 
-    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
+    instruction_reader_selftest<uint16_t> reader (insns);
     int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
 			   THUMB_INSN_SIZE_BYTES);
 
@@ -14495,13 +14467,14 @@ arm_record_test (void)
     memset (&arm_record, 0, sizeof (arm_insn_decode_record));
     arm_record.gdbarch = gdbarch;
 
-    static const uint16_t insns[] = {
-      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
-      0xee1d, 0x7f70,
+    /* Use the endian-free representation of the instruction here.  The test
+       will handle endianness conversions.  */
+    static const uint32_t insns[] = {
+      /* mrc	15, 0, r7, cr13, cr0, {3} */
+      0x7f70ee1d,
     };
 
-    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
+    instruction_reader_selftest<uint32_t> reader (insns);
     int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
 			   THUMB2_INSN_SIZE_BYTES);
 
@@ -14510,6 +14483,27 @@ arm_record_test (void)
     SELF_CHECK (arm_record.reg_rec_count == 1);
     SELF_CHECK (arm_record.arm_regs[0] == 7);
   }
+
+  /* 32-bit instructions.  */
+  {
+    arm_insn_decode_record arm_record;
+
+    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
+    arm_record.gdbarch = gdbarch;
+
+    /* Use the endian-free representation of the instruction here.  The test
+       will handle endianness conversions.  */
+    static const uint32_t insns[] = {
+      /* mov     r5, r0 */
+      0xe1a05000,
+    };
+
+    instruction_reader_selftest<uint32_t> reader (insns);
+    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
+			   ARM_INSN_SIZE_BYTES);
+
+    SELF_CHECK (ret == 0);
+  }
 }
 
 /* Instruction reader from manually cooked instruction sequences.  */
@@ -14609,18 +14603,10 @@ arm_process_record (struct gdbarch *gdbarch, struct regcache *regcache,
     }
 
   instruction_reader reader;
-  if (extract_arm_insn (reader, &arm_record, 2))
-    {
-      if (record_debug)
-	{
-	  gdb_printf (gdb_stdlog,
-		      _("Process record: error reading memory at "
-			"addr %s len = %d.\n"),
-		      paddress (arm_record.gdbarch,
-				arm_record.this_addr), 2);
-	}
-      return -1;
-    }
+  enum bfd_endian code_endian
+    = gdbarch_byte_order_for_code (arm_record.gdbarch);
+  arm_record.arm_insn
+    = reader.read (arm_record.this_addr, 2, code_endian);
 
   /* Check the insn, whether it is thumb or arm one.  */
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PING][PATCH,v2] [Arm] Fix endianness handling for arm record self tests
  2022-08-23 20:32 ` [PATCH,v2] " Luis Machado via Gdb-patches
@ 2022-09-01  9:29   ` Luis Machado via Gdb-patches
  2022-09-06 10:39     ` Tom de Vries via Gdb-patches
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-09-01  9:29 UTC (permalink / raw)
  To: gdb-patches; +Cc: mark, tdevries

On 8/23/22 21:32, Luis Machado via Gdb-patches wrote:
> v2:
> 
> - Add 32-bit Arm instruction selftest
> - Refactored abstract memory reader into abstract instruction reader
> - Adjusted code to use templated type and to use host endianness as
>    opposed to target endianness.
> 
> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
> code is laid out in a way that handles the 32-bit thumb instructions as
> two 16-bit parts.
> 
> This is fine, but it is prone to host-endianness issues given how the two
> 16-bit parts are stored and how they are accessed later on. Arm is
> little-endian by default, so running this test with a GDB built with
> --enable-targets=all and on a big endian host will run into the following:
> 
> Running selftest arm-record.
> Process record and replay target doesn't support syscall number -2036195
> Process record does not support instruction 0x7f70ee1d at address 0x0.
> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
> 
> It turns out the abstract memory reader class is more generic than it needs to
> be, and we can simplify the code a bit by assuming we have a simple instruction
> reader that only reads up to 4 bytes, which is the length of a 32-bit
> instruction.
> 
> Instead of returning a bool, we return instead the instruction that has been
> read. This way we avoid having to deal with the endianness conversion, and use
> the host endianness instead. The Arm selftests can be executed on non-Arm
> hosts.
> 
> While at it, Tom suggested adding a 32-bit Arm instruction selftest to increase
> the coverage of the selftests.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
> 
> Co-authored-by: Tom de Vries <tdevries@suse.de>
> ---
>   gdb/arm-tdep.c | 150 ++++++++++++++++++++++---------------------------
>   1 file changed, 68 insertions(+), 82 deletions(-)
> 
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index 0091d14708f..ead9bbf46c5 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -14252,59 +14252,39 @@ thumb2_record_decode_insn_handler (arm_insn_decode_record *thumb2_insn_r)
>   }
>   
>   namespace {
> -/* Abstract memory reader.  */
> +/* Abstract instruction reader.  */
>   
> -class abstract_memory_reader
> +class abstract_instruction_reader
>   {
>   public:
> -  /* Read LEN bytes of target memory at address MEMADDR, placing the
> -     results in GDB's memory at BUF.  Return true on success.  */
> +  /* Read one instruction of size LEN from address MEMADDR and using
> +     BYTE_ORDER endianness.  */
>   
> -  virtual bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) = 0;
> +  virtual ULONGEST read (CORE_ADDR memaddr, const size_t len,
> +			 enum bfd_endian byte_order) = 0;
>   };
>   
>   /* Instruction reader from real target.  */
>   
> -class instruction_reader : public abstract_memory_reader
> +class instruction_reader : public abstract_instruction_reader
>   {
>    public:
> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
> +  ULONGEST read (CORE_ADDR memaddr, const size_t len,
> +		 enum bfd_endian byte_order) override
>     {
> -    if (target_read_memory (memaddr, buf, len))
> -      return false;
> -    else
> -      return true;
> +    return read_code_unsigned_integer (memaddr, len, byte_order);
>     }
>   };
>   
>   } // namespace
>   
> -/* Extracts arm/thumb/thumb2 insn depending on the size, and returns 0 on success
> -and positive val on failure.  */
> -
> -static int
> -extract_arm_insn (abstract_memory_reader& reader,
> -		  arm_insn_decode_record *insn_record, uint32_t insn_size)
> -{
> -  gdb_byte buf[insn_size];
> -
> -  memset (&buf[0], 0, insn_size);
> -
> -  if (!reader.read (insn_record->this_addr, buf, insn_size))
> -    return 1;
> -  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
> -			   insn_size,
> -			   gdbarch_byte_order_for_code (insn_record->gdbarch));
> -  return 0;
> -}
> -
>   typedef int (*sti_arm_hdl_fp_t) (arm_insn_decode_record*);
>   
>   /* Decode arm/thumb insn depending on condition cods and opcodes; and
>      dispatch it.  */
>   
>   static int
> -decode_insn (abstract_memory_reader &reader,
> +decode_insn (abstract_instruction_reader &reader,
>   	     arm_insn_decode_record *arm_record,
>   	     record_type_t record_type, uint32_t insn_size)
>   {
> @@ -14339,20 +14319,12 @@ decode_insn (abstract_memory_reader &reader,
>   
>     uint32_t ret = 0;    /* return value: negative:failure   0:success.  */
>     uint32_t insn_id = 0;
> +  enum bfd_endian code_endian
> +    = gdbarch_byte_order_for_code (arm_record->gdbarch);
> +  arm_record->arm_insn
> +    = reader.read (arm_record->this_addr, insn_size, code_endian);
>   
> -  if (extract_arm_insn (reader, arm_record, insn_size))
> -    {
> -      if (record_debug)
> -	{
> -	  gdb_printf (gdb_stdlog,
> -		      _("Process record: error reading memory at "
> -			"addr %s len = %d.\n"),
> -		      paddress (arm_record->gdbarch,
> -				arm_record->this_addr), insn_size);
> -	}
> -      return -1;
> -    }
> -  else if (ARM_RECORD == record_type)
> +  if (ARM_RECORD == record_type)
>       {
>         arm_record->cond = bits (arm_record->arm_insn, 28, 31);
>         insn_id = bits (arm_record->arm_insn, 25, 27);
> @@ -14412,36 +14384,35 @@ decode_insn (abstract_memory_reader &reader,
>   #if GDB_SELF_TEST
>   namespace selftests {
>   
> -/* Provide both 16-bit and 32-bit thumb instructions.  */
> +/* Instruction reader class for selftests.
> +
> +   For 16-bit Thumb instructions, an array of uint16_t should be used.
>   
> -class instruction_reader_thumb : public abstract_memory_reader
> +   For 32-bit Thumb instructions and regular 32-bit Arm instructions, an array
> +   of uint32_t should be used.  */
> +
> +template<typename T>
> +class instruction_reader_selftest : public abstract_instruction_reader
>   {
>   public:
>     template<size_t SIZE>
> -  instruction_reader_thumb (enum bfd_endian endian,
> -			    const uint16_t (&insns)[SIZE])
> -    : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
> +  instruction_reader_selftest (const T (&insns)[SIZE])
> +    : m_insns (insns), m_insns_size (SIZE)
>     {}
>   
> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
> +  ULONGEST read (CORE_ADDR memaddr, const size_t length,
> +		 enum bfd_endian byte_order) override
>     {
> -    SELF_CHECK (len == 4 || len == 2);
> -    SELF_CHECK (memaddr % 2 == 0);
> -    SELF_CHECK ((memaddr / 2) < m_insns_size);
> +    SELF_CHECK (length == sizeof (T));
> +    SELF_CHECK (memaddr % sizeof (T) == 0);
> +    SELF_CHECK ((memaddr / sizeof (T)) < m_insns_size);
>   
> -    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
> -    if (len == 4)
> -      {
> -	store_unsigned_integer (&buf[2], 2, m_endian,
> -				m_insns[memaddr / 2 + 1]);
> -      }
> -    return true;
> +    return m_insns[memaddr / sizeof (T)];
>     }
>   
>   private:
> -  enum bfd_endian m_endian;
> -  const uint16_t *m_insns;
> -  size_t m_insns_size;
> +  const T *m_insns;
> +  const size_t m_insns_size;
>   };
>   
>   static void
> @@ -14461,6 +14432,8 @@ arm_record_test (void)
>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>       arm_record.gdbarch = gdbarch;
>   
> +    /* Use the endian-free representation of the instructions here.  The test
> +       will handle endianness conversions.  */
>       static const uint16_t insns[] = {
>         /* db b2	uxtb	r3, r3 */
>         0xb2db,
> @@ -14468,8 +14441,7 @@ arm_record_test (void)
>         0x58cd,
>       };
>   
> -    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> -    instruction_reader_thumb reader (endian, insns);
> +    instruction_reader_selftest<uint16_t> reader (insns);
>       int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>   			   THUMB_INSN_SIZE_BYTES);
>   
> @@ -14495,13 +14467,14 @@ arm_record_test (void)
>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>       arm_record.gdbarch = gdbarch;
>   
> -    static const uint16_t insns[] = {
> -      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
> -      0xee1d, 0x7f70,
> +    /* Use the endian-free representation of the instruction here.  The test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mrc	15, 0, r7, cr13, cr0, {3} */
> +      0x7f70ee1d,
>       };
>   
> -    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
> -    instruction_reader_thumb reader (endian, insns);
> +    instruction_reader_selftest<uint32_t> reader (insns);
>       int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>   			   THUMB2_INSN_SIZE_BYTES);
>   
> @@ -14510,6 +14483,27 @@ arm_record_test (void)
>       SELF_CHECK (arm_record.reg_rec_count == 1);
>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>     }
> +
> +  /* 32-bit instructions.  */
> +  {
> +    arm_insn_decode_record arm_record;
> +
> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
> +    arm_record.gdbarch = gdbarch;
> +
> +    /* Use the endian-free representation of the instruction here.  The test
> +       will handle endianness conversions.  */
> +    static const uint32_t insns[] = {
> +      /* mov     r5, r0 */
> +      0xe1a05000,
> +    };
> +
> +    instruction_reader_selftest<uint32_t> reader (insns);
> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
> +			   ARM_INSN_SIZE_BYTES);
> +
> +    SELF_CHECK (ret == 0);
> +  }
>   }
>   
>   /* Instruction reader from manually cooked instruction sequences.  */
> @@ -14609,18 +14603,10 @@ arm_process_record (struct gdbarch *gdbarch, struct regcache *regcache,
>       }
>   
>     instruction_reader reader;
> -  if (extract_arm_insn (reader, &arm_record, 2))
> -    {
> -      if (record_debug)
> -	{
> -	  gdb_printf (gdb_stdlog,
> -		      _("Process record: error reading memory at "
> -			"addr %s len = %d.\n"),
> -		      paddress (arm_record.gdbarch,
> -				arm_record.this_addr), 2);
> -	}
> -      return -1;
> -    }
> +  enum bfd_endian code_endian
> +    = gdbarch_byte_order_for_code (arm_record.gdbarch);
> +  arm_record.arm_insn
> +    = reader.read (arm_record.this_addr, 2, code_endian);
>   
>     /* Check the insn, whether it is thumb or arm one.  */
>   


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PING][PATCH,v2] [Arm] Fix endianness handling for arm record self tests
  2022-09-01  9:29   ` [PING][PATCH,v2] " Luis Machado via Gdb-patches
@ 2022-09-06 10:39     ` Tom de Vries via Gdb-patches
  2022-09-07  8:19       ` Luis Machado via Gdb-patches
  0 siblings, 1 reply; 17+ messages in thread
From: Tom de Vries via Gdb-patches @ 2022-09-06 10:39 UTC (permalink / raw)
  To: Luis Machado, gdb-patches; +Cc: mark

On 9/1/22 11:29, Luis Machado wrote:
> On 8/23/22 21:32, Luis Machado via Gdb-patches wrote:
>> v2:
>>
>> - Add 32-bit Arm instruction selftest
>> - Refactored abstract memory reader into abstract instruction reader
>> - Adjusted code to use templated type and to use host endianness as
>>    opposed to target endianness.
>>
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the 
>> following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at 
>> ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> It turns out the abstract memory reader class is more generic than it 
>> needs to
>> be, and we can simplify the code a bit by assuming we have a simple 
>> instruction
>> reader that only reads up to 4 bytes, which is the length of a 32-bit
>> instruction.
>>
>> Instead of returning a bool, we return instead the instruction that 
>> has been
>> read. This way we avoid having to deal with the endianness conversion, 
>> and use
>> the host endianness instead. The Arm selftests can be executed on non-Arm
>> hosts.
>>

LGTM.

Thanks,
- Tom

>> While at it, Tom suggested adding a 32-bit Arm instruction selftest to 
>> increase
>> the coverage of the selftests.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
>>
>> Co-authored-by: Tom de Vries <tdevries@suse.de>
>> ---
>>   gdb/arm-tdep.c | 150 ++++++++++++++++++++++---------------------------
>>   1 file changed, 68 insertions(+), 82 deletions(-)
>>
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index 0091d14708f..ead9bbf46c5 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -14252,59 +14252,39 @@ thumb2_record_decode_insn_handler 
>> (arm_insn_decode_record *thumb2_insn_r)
>>   }
>>   namespace {
>> -/* Abstract memory reader.  */
>> +/* Abstract instruction reader.  */
>> -class abstract_memory_reader
>> +class abstract_instruction_reader
>>   {
>>   public:
>> -  /* Read LEN bytes of target memory at address MEMADDR, placing the
>> -     results in GDB's memory at BUF.  Return true on success.  */
>> +  /* Read one instruction of size LEN from address MEMADDR and using
>> +     BYTE_ORDER endianness.  */
>> -  virtual bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t 
>> len) = 0;
>> +  virtual ULONGEST read (CORE_ADDR memaddr, const size_t len,
>> +             enum bfd_endian byte_order) = 0;
>>   };
>>   /* Instruction reader from real target.  */
>> -class instruction_reader : public abstract_memory_reader
>> +class instruction_reader : public abstract_instruction_reader
>>   {
>>    public:
>> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) 
>> override
>> +  ULONGEST read (CORE_ADDR memaddr, const size_t len,
>> +         enum bfd_endian byte_order) override
>>     {
>> -    if (target_read_memory (memaddr, buf, len))
>> -      return false;
>> -    else
>> -      return true;
>> +    return read_code_unsigned_integer (memaddr, len, byte_order);
>>     }
>>   };
>>   } // namespace
>> -/* Extracts arm/thumb/thumb2 insn depending on the size, and returns 
>> 0 on success
>> -and positive val on failure.  */
>> -
>> -static int
>> -extract_arm_insn (abstract_memory_reader& reader,
>> -          arm_insn_decode_record *insn_record, uint32_t insn_size)
>> -{
>> -  gdb_byte buf[insn_size];
>> -
>> -  memset (&buf[0], 0, insn_size);
>> -
>> -  if (!reader.read (insn_record->this_addr, buf, insn_size))
>> -    return 1;
>> -  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
>> -               insn_size,
>> -               gdbarch_byte_order_for_code (insn_record->gdbarch));
>> -  return 0;
>> -}
>> -
>>   typedef int (*sti_arm_hdl_fp_t) (arm_insn_decode_record*);
>>   /* Decode arm/thumb insn depending on condition cods and opcodes; and
>>      dispatch it.  */
>>   static int
>> -decode_insn (abstract_memory_reader &reader,
>> +decode_insn (abstract_instruction_reader &reader,
>>            arm_insn_decode_record *arm_record,
>>            record_type_t record_type, uint32_t insn_size)
>>   {
>> @@ -14339,20 +14319,12 @@ decode_insn (abstract_memory_reader &reader,
>>     uint32_t ret = 0;    /* return value: negative:failure   
>> 0:success.  */
>>     uint32_t insn_id = 0;
>> +  enum bfd_endian code_endian
>> +    = gdbarch_byte_order_for_code (arm_record->gdbarch);
>> +  arm_record->arm_insn
>> +    = reader.read (arm_record->this_addr, insn_size, code_endian);
>> -  if (extract_arm_insn (reader, arm_record, insn_size))
>> -    {
>> -      if (record_debug)
>> -    {
>> -      gdb_printf (gdb_stdlog,
>> -              _("Process record: error reading memory at "
>> -            "addr %s len = %d.\n"),
>> -              paddress (arm_record->gdbarch,
>> -                arm_record->this_addr), insn_size);
>> -    }
>> -      return -1;
>> -    }
>> -  else if (ARM_RECORD == record_type)
>> +  if (ARM_RECORD == record_type)
>>       {
>>         arm_record->cond = bits (arm_record->arm_insn, 28, 31);
>>         insn_id = bits (arm_record->arm_insn, 25, 27);
>> @@ -14412,36 +14384,35 @@ decode_insn (abstract_memory_reader &reader,
>>   #if GDB_SELF_TEST
>>   namespace selftests {
>> -/* Provide both 16-bit and 32-bit thumb instructions.  */
>> +/* Instruction reader class for selftests.
>> +
>> +   For 16-bit Thumb instructions, an array of uint16_t should be used.
>> -class instruction_reader_thumb : public abstract_memory_reader
>> +   For 32-bit Thumb instructions and regular 32-bit Arm instructions, 
>> an array
>> +   of uint32_t should be used.  */
>> +
>> +template<typename T>
>> +class instruction_reader_selftest : public abstract_instruction_reader
>>   {
>>   public:
>>     template<size_t SIZE>
>> -  instruction_reader_thumb (enum bfd_endian endian,
>> -                const uint16_t (&insns)[SIZE])
>> -    : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
>> +  instruction_reader_selftest (const T (&insns)[SIZE])
>> +    : m_insns (insns), m_insns_size (SIZE)
>>     {}
>> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) 
>> override
>> +  ULONGEST read (CORE_ADDR memaddr, const size_t length,
>> +         enum bfd_endian byte_order) override
>>     {
>> -    SELF_CHECK (len == 4 || len == 2);
>> -    SELF_CHECK (memaddr % 2 == 0);
>> -    SELF_CHECK ((memaddr / 2) < m_insns_size);
>> +    SELF_CHECK (length == sizeof (T));
>> +    SELF_CHECK (memaddr % sizeof (T) == 0);
>> +    SELF_CHECK ((memaddr / sizeof (T)) < m_insns_size);
>> -    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
>> -    if (len == 4)
>> -      {
>> -    store_unsigned_integer (&buf[2], 2, m_endian,
>> -                m_insns[memaddr / 2 + 1]);
>> -      }
>> -    return true;
>> +    return m_insns[memaddr / sizeof (T)];
>>     }
>>   private:
>> -  enum bfd_endian m_endian;
>> -  const uint16_t *m_insns;
>> -  size_t m_insns_size;
>> +  const T *m_insns;
>> +  const size_t m_insns_size;
>>   };
>>   static void
>> @@ -14461,6 +14432,8 @@ arm_record_test (void)
>>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>       arm_record.gdbarch = gdbarch;
>> +    /* Use the endian-free representation of the instructions here.  
>> The test
>> +       will handle endianness conversions.  */
>>       static const uint16_t insns[] = {
>>         /* db b2    uxtb    r3, r3 */
>>         0xb2db,
>> @@ -14468,8 +14441,7 @@ arm_record_test (void)
>>         0x58cd,
>>       };
>> -    enum bfd_endian endian = gdbarch_byte_order_for_code 
>> (arm_record.gdbarch);
>> -    instruction_reader_thumb reader (endian, insns);
>> +    instruction_reader_selftest<uint16_t> reader (insns);
>>       int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>>                  THUMB_INSN_SIZE_BYTES);
>> @@ -14495,13 +14467,14 @@ arm_record_test (void)
>>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>       arm_record.gdbarch = gdbarch;
>> -    static const uint16_t insns[] = {
>> -      /* 1d ee 70 7f     mrc    15, 0, r7, cr13, cr0, {3} */
>> -      0xee1d, 0x7f70,
>> +    /* Use the endian-free representation of the instruction here.  
>> The test
>> +       will handle endianness conversions.  */
>> +    static const uint32_t insns[] = {
>> +      /* mrc    15, 0, r7, cr13, cr0, {3} */
>> +      0x7f70ee1d,
>>       };
>> -    enum bfd_endian endian = gdbarch_byte_order_for_code 
>> (arm_record.gdbarch);
>> -    instruction_reader_thumb reader (endian, insns);
>> +    instruction_reader_selftest<uint32_t> reader (insns);
>>       int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>>                  THUMB2_INSN_SIZE_BYTES);
>> @@ -14510,6 +14483,27 @@ arm_record_test (void)
>>       SELF_CHECK (arm_record.reg_rec_count == 1);
>>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>>     }
>> +
>> +  /* 32-bit instructions.  */
>> +  {
>> +    arm_insn_decode_record arm_record;
>> +
>> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>> +    arm_record.gdbarch = gdbarch;
>> +
>> +    /* Use the endian-free representation of the instruction here.  
>> The test
>> +       will handle endianness conversions.  */
>> +    static const uint32_t insns[] = {
>> +      /* mov     r5, r0 */
>> +      0xe1a05000,
>> +    };
>> +
>> +    instruction_reader_selftest<uint32_t> reader (insns);
>> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
>> +               ARM_INSN_SIZE_BYTES);
>> +
>> +    SELF_CHECK (ret == 0);
>> +  }
>>   }
>>   /* Instruction reader from manually cooked instruction sequences.  */
>> @@ -14609,18 +14603,10 @@ arm_process_record (struct gdbarch *gdbarch, 
>> struct regcache *regcache,
>>       }
>>     instruction_reader reader;
>> -  if (extract_arm_insn (reader, &arm_record, 2))
>> -    {
>> -      if (record_debug)
>> -    {
>> -      gdb_printf (gdb_stdlog,
>> -              _("Process record: error reading memory at "
>> -            "addr %s len = %d.\n"),
>> -              paddress (arm_record.gdbarch,
>> -                arm_record.this_addr), 2);
>> -    }
>> -      return -1;
>> -    }
>> +  enum bfd_endian code_endian
>> +    = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> +  arm_record.arm_insn
>> +    = reader.read (arm_record.this_addr, 2, code_endian);
>>     /* Check the insn, whether it is thumb or arm one.  */
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PING][PATCH,v2] [Arm] Fix endianness handling for arm record self tests
  2022-09-06 10:39     ` Tom de Vries via Gdb-patches
@ 2022-09-07  8:19       ` Luis Machado via Gdb-patches
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado via Gdb-patches @ 2022-09-07  8:19 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches; +Cc: mark

On 9/6/22 11:39, Tom de Vries wrote:
> On 9/1/22 11:29, Luis Machado wrote:
>> On 8/23/22 21:32, Luis Machado via Gdb-patches wrote:
>>> v2:
>>>
>>> - Add 32-bit Arm instruction selftest
>>> - Refactored abstract memory reader into abstract instruction reader
>>> - Adjusted code to use templated type and to use host endianness as
>>>    opposed to target endianness.
>>>
>>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>>> code is laid out in a way that handles the 32-bit thumb instructions as
>>> two 16-bit parts.
>>>
>>> This is fine, but it is prone to host-endianness issues given how the two
>>> 16-bit parts are stored and how they are accessed later on. Arm is
>>> little-endian by default, so running this test with a GDB built with
>>> --enable-targets=all and on a big endian host will run into the following:
>>>
>>> Running selftest arm-record.
>>> Process record and replay target doesn't support syscall number -2036195
>>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>>
>>> It turns out the abstract memory reader class is more generic than it needs to
>>> be, and we can simplify the code a bit by assuming we have a simple instruction
>>> reader that only reads up to 4 bytes, which is the length of a 32-bit
>>> instruction.
>>>
>>> Instead of returning a bool, we return instead the instruction that has been
>>> read. This way we avoid having to deal with the endianness conversion, and use
>>> the host endianness instead. The Arm selftests can be executed on non-Arm
>>> hosts.
>>>
> 
> LGTM.
>> Thanks,
> - Tom

Pushed now. Thanks!

> 
>>> While at it, Tom suggested adding a 32-bit Arm instruction selftest to increase
>>> the coverage of the selftests.
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
>>>
>>> Co-authored-by: Tom de Vries <tdevries@suse.de>
>>> ---
>>>   gdb/arm-tdep.c | 150 ++++++++++++++++++++++---------------------------
>>>   1 file changed, 68 insertions(+), 82 deletions(-)
>>>
>>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>>> index 0091d14708f..ead9bbf46c5 100644
>>> --- a/gdb/arm-tdep.c
>>> +++ b/gdb/arm-tdep.c
>>> @@ -14252,59 +14252,39 @@ thumb2_record_decode_insn_handler (arm_insn_decode_record *thumb2_insn_r)
>>>   }
>>>   namespace {
>>> -/* Abstract memory reader.  */
>>> +/* Abstract instruction reader.  */
>>> -class abstract_memory_reader
>>> +class abstract_instruction_reader
>>>   {
>>>   public:
>>> -  /* Read LEN bytes of target memory at address MEMADDR, placing the
>>> -     results in GDB's memory at BUF.  Return true on success.  */
>>> +  /* Read one instruction of size LEN from address MEMADDR and using
>>> +     BYTE_ORDER endianness.  */
>>> -  virtual bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) = 0;
>>> +  virtual ULONGEST read (CORE_ADDR memaddr, const size_t len,
>>> +             enum bfd_endian byte_order) = 0;
>>>   };
>>>   /* Instruction reader from real target.  */
>>> -class instruction_reader : public abstract_memory_reader
>>> +class instruction_reader : public abstract_instruction_reader
>>>   {
>>>    public:
>>> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
>>> +  ULONGEST read (CORE_ADDR memaddr, const size_t len,
>>> +         enum bfd_endian byte_order) override
>>>     {
>>> -    if (target_read_memory (memaddr, buf, len))
>>> -      return false;
>>> -    else
>>> -      return true;
>>> +    return read_code_unsigned_integer (memaddr, len, byte_order);
>>>     }
>>>   };
>>>   } // namespace
>>> -/* Extracts arm/thumb/thumb2 insn depending on the size, and returns 0 on success
>>> -and positive val on failure.  */
>>> -
>>> -static int
>>> -extract_arm_insn (abstract_memory_reader& reader,
>>> -          arm_insn_decode_record *insn_record, uint32_t insn_size)
>>> -{
>>> -  gdb_byte buf[insn_size];
>>> -
>>> -  memset (&buf[0], 0, insn_size);
>>> -
>>> -  if (!reader.read (insn_record->this_addr, buf, insn_size))
>>> -    return 1;
>>> -  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
>>> -               insn_size,
>>> -               gdbarch_byte_order_for_code (insn_record->gdbarch));
>>> -  return 0;
>>> -}
>>> -
>>>   typedef int (*sti_arm_hdl_fp_t) (arm_insn_decode_record*);
>>>   /* Decode arm/thumb insn depending on condition cods and opcodes; and
>>>      dispatch it.  */
>>>   static int
>>> -decode_insn (abstract_memory_reader &reader,
>>> +decode_insn (abstract_instruction_reader &reader,
>>>            arm_insn_decode_record *arm_record,
>>>            record_type_t record_type, uint32_t insn_size)
>>>   {
>>> @@ -14339,20 +14319,12 @@ decode_insn (abstract_memory_reader &reader,
>>>     uint32_t ret = 0;    /* return value: negative:failure 0:success.  */
>>>     uint32_t insn_id = 0;
>>> +  enum bfd_endian code_endian
>>> +    = gdbarch_byte_order_for_code (arm_record->gdbarch);
>>> +  arm_record->arm_insn
>>> +    = reader.read (arm_record->this_addr, insn_size, code_endian);
>>> -  if (extract_arm_insn (reader, arm_record, insn_size))
>>> -    {
>>> -      if (record_debug)
>>> -    {
>>> -      gdb_printf (gdb_stdlog,
>>> -              _("Process record: error reading memory at "
>>> -            "addr %s len = %d.\n"),
>>> -              paddress (arm_record->gdbarch,
>>> -                arm_record->this_addr), insn_size);
>>> -    }
>>> -      return -1;
>>> -    }
>>> -  else if (ARM_RECORD == record_type)
>>> +  if (ARM_RECORD == record_type)
>>>       {
>>>         arm_record->cond = bits (arm_record->arm_insn, 28, 31);
>>>         insn_id = bits (arm_record->arm_insn, 25, 27);
>>> @@ -14412,36 +14384,35 @@ decode_insn (abstract_memory_reader &reader,
>>>   #if GDB_SELF_TEST
>>>   namespace selftests {
>>> -/* Provide both 16-bit and 32-bit thumb instructions.  */
>>> +/* Instruction reader class for selftests.
>>> +
>>> +   For 16-bit Thumb instructions, an array of uint16_t should be used.
>>> -class instruction_reader_thumb : public abstract_memory_reader
>>> +   For 32-bit Thumb instructions and regular 32-bit Arm instructions, an array
>>> +   of uint32_t should be used.  */
>>> +
>>> +template<typename T>
>>> +class instruction_reader_selftest : public abstract_instruction_reader
>>>   {
>>>   public:
>>>     template<size_t SIZE>
>>> -  instruction_reader_thumb (enum bfd_endian endian,
>>> -                const uint16_t (&insns)[SIZE])
>>> -    : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
>>> +  instruction_reader_selftest (const T (&insns)[SIZE])
>>> +    : m_insns (insns), m_insns_size (SIZE)
>>>     {}
>>> -  bool read (CORE_ADDR memaddr, gdb_byte *buf, const size_t len) override
>>> +  ULONGEST read (CORE_ADDR memaddr, const size_t length,
>>> +         enum bfd_endian byte_order) override
>>>     {
>>> -    SELF_CHECK (len == 4 || len == 2);
>>> -    SELF_CHECK (memaddr % 2 == 0);
>>> -    SELF_CHECK ((memaddr / 2) < m_insns_size);
>>> +    SELF_CHECK (length == sizeof (T));
>>> +    SELF_CHECK (memaddr % sizeof (T) == 0);
>>> +    SELF_CHECK ((memaddr / sizeof (T)) < m_insns_size);
>>> -    store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
>>> -    if (len == 4)
>>> -      {
>>> -    store_unsigned_integer (&buf[2], 2, m_endian,
>>> -                m_insns[memaddr / 2 + 1]);
>>> -      }
>>> -    return true;
>>> +    return m_insns[memaddr / sizeof (T)];
>>>     }
>>>   private:
>>> -  enum bfd_endian m_endian;
>>> -  const uint16_t *m_insns;
>>> -  size_t m_insns_size;
>>> +  const T *m_insns;
>>> +  const size_t m_insns_size;
>>>   };
>>>   static void
>>> @@ -14461,6 +14432,8 @@ arm_record_test (void)
>>>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>>       arm_record.gdbarch = gdbarch;
>>> +    /* Use the endian-free representation of the instructions here. The test
>>> +       will handle endianness conversions.  */
>>>       static const uint16_t insns[] = {
>>>         /* db b2    uxtb    r3, r3 */
>>>         0xb2db,
>>> @@ -14468,8 +14441,7 @@ arm_record_test (void)
>>>         0x58cd,
>>>       };
>>> -    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>>> -    instruction_reader_thumb reader (endian, insns);
>>> +    instruction_reader_selftest<uint16_t> reader (insns);
>>>       int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>>>                  THUMB_INSN_SIZE_BYTES);
>>> @@ -14495,13 +14467,14 @@ arm_record_test (void)
>>>       memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>>       arm_record.gdbarch = gdbarch;
>>> -    static const uint16_t insns[] = {
>>> -      /* 1d ee 70 7f     mrc    15, 0, r7, cr13, cr0, {3} */
>>> -      0xee1d, 0x7f70,
>>> +    /* Use the endian-free representation of the instruction here. The test
>>> +       will handle endianness conversions.  */
>>> +    static const uint32_t insns[] = {
>>> +      /* mrc    15, 0, r7, cr13, cr0, {3} */
>>> +      0x7f70ee1d,
>>>       };
>>> -    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>>> -    instruction_reader_thumb reader (endian, insns);
>>> +    instruction_reader_selftest<uint32_t> reader (insns);
>>>       int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>>>                  THUMB2_INSN_SIZE_BYTES);
>>> @@ -14510,6 +14483,27 @@ arm_record_test (void)
>>>       SELF_CHECK (arm_record.reg_rec_count == 1);
>>>       SELF_CHECK (arm_record.arm_regs[0] == 7);
>>>     }
>>> +
>>> +  /* 32-bit instructions.  */
>>> +  {
>>> +    arm_insn_decode_record arm_record;
>>> +
>>> +    memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>>> +    arm_record.gdbarch = gdbarch;
>>> +
>>> +    /* Use the endian-free representation of the instruction here. The test
>>> +       will handle endianness conversions.  */
>>> +    static const uint32_t insns[] = {
>>> +      /* mov     r5, r0 */
>>> +      0xe1a05000,
>>> +    };
>>> +
>>> +    instruction_reader_selftest<uint32_t> reader (insns);
>>> +    int ret = decode_insn (reader, &arm_record, ARM_RECORD,
>>> +               ARM_INSN_SIZE_BYTES);
>>> +
>>> +    SELF_CHECK (ret == 0);
>>> +  }
>>>   }
>>>   /* Instruction reader from manually cooked instruction sequences.  */
>>> @@ -14609,18 +14603,10 @@ arm_process_record (struct gdbarch *gdbarch, struct regcache *regcache,
>>>       }
>>>     instruction_reader reader;
>>> -  if (extract_arm_insn (reader, &arm_record, 2))
>>> -    {
>>> -      if (record_debug)
>>> -    {
>>> -      gdb_printf (gdb_stdlog,
>>> -              _("Process record: error reading memory at "
>>> -            "addr %s len = %d.\n"),
>>> -              paddress (arm_record.gdbarch,
>>> -                arm_record.this_addr), 2);
>>> -    }
>>> -      return -1;
>>> -    }
>>> +  enum bfd_endian code_endian
>>> +    = gdbarch_byte_order_for_code (arm_record.gdbarch);
>>> +  arm_record.arm_insn
>>> +    = reader.read (arm_record.this_addr, 2, code_endian);
>>>     /* Check the insn, whether it is thumb or arm one.  */
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-09-07  8:20 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-08 10:12 [PATCH] [Arm] Fix endianness handling for arm record self tests Luis Machado via Gdb-patches
2022-08-08 12:30 ` Andrew Burgess via Gdb-patches
2022-08-10  8:47   ` Andrew Burgess via Gdb-patches
2022-08-09  9:43 ` Tom de Vries via Gdb-patches
2022-08-09  9:57   ` Tom de Vries via Gdb-patches
2022-08-15 12:13     ` Luis Machado via Gdb-patches
2022-08-09 11:31   ` Luis Machado via Gdb-patches
2022-08-09 11:48     ` Tom de Vries via Gdb-patches
2022-08-09 12:08       ` Tom de Vries via Gdb-patches
2022-08-09 12:09         ` Luis Machado via Gdb-patches
2022-08-09 12:13           ` Tom de Vries via Gdb-patches
2022-08-09 15:24             ` Mark Wielaard
2022-08-15 12:10   ` Luis Machado via Gdb-patches
2022-08-23 20:32 ` [PATCH,v2] " Luis Machado via Gdb-patches
2022-09-01  9:29   ` [PING][PATCH,v2] " Luis Machado via Gdb-patches
2022-09-06 10:39     ` Tom de Vries via Gdb-patches
2022-09-07  8:19       ` Luis Machado via Gdb-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox