From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
To: Luis Machado <luis.machado@arm.com>, gdb-patches@sourceware.org
Cc: mark@klomp.org
Subject: Re: [PATCH] [Arm] Fix endianness handling for arm record self tests
Date: Wed, 10 Aug 2022 09:47:15 +0100
Message-ID: <874jyk1o4s.fsf@redhat.com>
In-Reply-To: <87czda2a0m.fsf@redhat.com>
Andrew Burgess <aburgess@redhat.com> writes:
> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> writes:
>
>> The arm record tests handle 16-bit and 32-bit thumb instructions, but the
>> code is laid out in a way that handles the 32-bit thumb instructions as
>> two 16-bit parts.
>>
>> This is fine, but it is prone to host-endianness issues given how the two
>> 16-bit parts are stored and how they are accessed later on. Arm is
>> little-endian by default, so running this test with a GDB built with
>> --enable-targets=all and on a big endian host will run into the following:
>>
>> Running selftest arm-record.
>> Process record and replay target doesn't support syscall number -2036195
>> Process record does not support instruction 0x7f70ee1d at address 0x0.
>> Self test failed: self-test failed at ../../binutils-gdb/gdb/arm-tdep.c:14482
>>
>> Investigating this a bit further, there seems to be a chance to do a simple
>> fix through a type template, using uint16_t for 16-bit thumb instructions
>> and uint32_t for 32-bit thumb instructions.
>>
>> This patch implements this.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29432
>> ---
>> gdb/arm-tdep.c | 32 ++++++++++++++++++--------------
>> 1 file changed, 18 insertions(+), 14 deletions(-)
>>
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index cf8b610a381..57b865a0819 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -14387,14 +14387,18 @@ decode_insn (abstract_memory_reader &reader,
>> #if GDB_SELF_TEST
>> namespace selftests {
>>
>> -/* Provide both 16-bit and 32-bit thumb instructions. */
>> +/* Provide both 16-bit and 32-bit thumb instructions.
>>
>> + For 16-bit Thumb instructions, an array of uint16_t should be used.
>> + For 32-bit Thumb instructions, an array of uint32_t should be used. */
>> +
>> +template<typename T>
>> class instruction_reader_thumb : public abstract_memory_reader
>> {
>> public:
>> template<size_t SIZE>
>> instruction_reader_thumb (enum bfd_endian endian,
>> - const uint16_t (&insns)[SIZE])
>> + const T (&insns)[SIZE])
>> : m_endian (endian), m_insns (insns), m_insns_size (SIZE)
>> {}
>>
>> @@ -14404,18 +14408,14 @@ class instruction_reader_thumb : public abstract_memory_reader
>> SELF_CHECK (memaddr % 2 == 0);
>> SELF_CHECK ((memaddr / 2) < m_insns_size);
>
> I was expecting this '/ 2' to need updating here. If memaddr is an octet
> address, then the '/ 2' converts to a 16-bit chunk address, which is
> fine if T is uint16_t, but surely is wrong when T is uint32_t...

Yesterday's activity on this thread reminded me of something else I
thought of. If you plan to rework this patch then maybe this feedback
is irrelevant, but I just wanted to put it on the record anyway.

At the start of instruction_reader_thumb::read we have this assert:

  SELF_CHECK (len == 4 || len == 2);

However, after your change you always store 'sizeof (T)' bytes into BUF,
which might be 4, so I think that if 'len == 2' you will be overflowing
BUF.

I guess thumb instructions can be 2-byte or 4-byte, so the thumb decoder
probably reads an initial 2-bytes, and then follows up with either a
full 4-byte read, or a read of the second 2-bytes only when necessary.
I don't think the code as proposed here will handle this use case
correctly.
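
To make the concern concrete, here is a rough, self-contained sketch of
a reader that writes exactly LEN bytes for both the 2-byte and 4-byte
cases.  This is not GDB code: byte_order, store_uint and read_thumb are
hypothetical stand-ins for bfd_endian, store_unsigned_integer and
instruction_reader_thumb::read, and the sketch only illustrates the
buffer-size and indexing points, not a fix for the endianness bug
itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for GDB's bfd_endian.
enum class byte_order { little, big };

// Store the low LEN bytes of VALUE into BUF using ORDER.  The result
// depends only on VALUE and ORDER, never on the host's byte order.
static void
store_uint (unsigned char *buf, std::size_t len, byte_order order,
            std::uint32_t value)
{
  assert (len <= sizeof (value));
  for (std::size_t i = 0; i < len; ++i)
    {
      std::size_t pos = (order == byte_order::little) ? i : len - 1 - i;
      buf[pos] = static_cast<unsigned char> (value >> (8 * i));
    }
}

// Serve a LEN-byte read at octet address MEMADDR from COUNT 16-bit
// halfwords.  Writes exactly LEN bytes to BUF, so a 2-byte request
// never touches buf[2] or buf[3].  Returns false on a bad request.
static bool
read_thumb (const std::uint16_t *insns, std::size_t count,
            byte_order order, std::size_t memaddr,
            unsigned char *buf, std::size_t len)
{
  if ((len != 2 && len != 4) || memaddr % 2 != 0)
    return false;

  // MEMADDR is an octet address; the array is indexed in halfwords.
  std::size_t first = memaddr / 2;
  if (first + len / 2 > count)
    return false;

  for (std::size_t i = 0; i < len / 2; ++i)
    store_uint (buf + 2 * i, 2, order, insns[first + i]);
  return true;
}
```

With the halfwords { 0xee1d, 0x7f70 } and little-endian order, a 2-byte
read at address 0 followed by a 2-byte read at address 2 yields the same
four bytes (1d ee 70 7f) as a single 4-byte read at address 0, which is
the access pattern I would expect the decoder to need.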
Thanks,
Andrew
>
>>
>> - store_unsigned_integer (buf, 2, m_endian, m_insns[memaddr / 2]);
>> - if (len == 4)
>> - {
>> - store_unsigned_integer (&buf[2], 2, m_endian,
>> - m_insns[memaddr / 2 + 1]);
>> - }
>> + store_unsigned_integer (buf, sizeof (T), m_endian, m_insns[memaddr / 2]);
>
> And the same here.
>
>> +
>> return true;
>> }
>>
>> private:
>> enum bfd_endian m_endian;
>> - const uint16_t *m_insns;
>> + const T *m_insns;
>> size_t m_insns_size;
>> };
>>
>> @@ -14436,6 +14436,8 @@ arm_record_test (void)
>> memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>> arm_record.gdbarch = gdbarch;
>>
>> + /* Use the endian-free representation of the instructions here. The test
>> + will handle endianness conversions. */
>> static const uint16_t insns[] = {
>> /* db b2 uxtb r3, r3 */
>> 0xb2db,
>> @@ -14444,7 +14446,7 @@ arm_record_test (void)
>> };
>>
>> enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> - instruction_reader_thumb reader (endian, insns);
>> + instruction_reader_thumb<uint16_t> reader (endian, insns);
>
> I wonder if there's an alternative fix here?
> gdbarch_byte_order_for_code returns a value such that READER will
> correctly read instructions from arm instruction memory, right? Which
> happens to be little-endian.
>
> However, we're not reading from arm instruction memory, but we are
> instead reading from host memory.
>
> On many targets, host memory also happens to be little-endian, thus
> gdbarch_byte_order_for_code is still correct.
>
> But, could we not instead pass in a value here that represents the host
> memory order instead, then maybe READER will just do the right thing?
>
> Thanks,
> Andrew
>
>> int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
>> THUMB_INSN_SIZE_BYTES);
>>
>> @@ -14470,13 +14472,15 @@ arm_record_test (void)
>> memset (&arm_record, 0, sizeof (arm_insn_decode_record));
>> arm_record.gdbarch = gdbarch;
>>
>> - static const uint16_t insns[] = {
>> - /* 1d ee 70 7f mrc 15, 0, r7, cr13, cr0, {3} */
>> - 0xee1d, 0x7f70,
>> + /* Use the endian-free representation of the instruction here. The test
>> + will handle endianness conversions. */
>> + static const uint32_t insns[] = {
>> + /* mrc 15, 0, r7, cr13, cr0, {3} */
>> + 0x7f70ee1d,
>> };
>>
>> enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
>> - instruction_reader_thumb reader (endian, insns);
>> + instruction_reader_thumb<uint32_t> reader (endian, insns);
>> int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
>> THUMB2_INSN_SIZE_BYTES);
>>
>> --
>> 2.25.1