From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-bounces@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id 6G8jDMm8rl9idgAAWB0awg
	(envelope-from <gdb-patches-bounces@sourceware.org>)
	for <public-inbox@simark.ca>; Fri, 13 Nov 2020 12:05:13 -0500
Received: by simark.ca (Postfix, from userid 112)
	id 2DE871F08D; Fri, 13 Nov 2020 12:05:13 -0500 (EST)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI,
	URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id 7AD6B1EFBB
	for <public-inbox@simark.ca>; Fri, 13 Nov 2020 12:05:11 -0500 (EST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 3E5AF3944413;
	Fri, 13 Nov 2020 17:05:11 +0000 (GMT)
Received: from simark.ca (simark.ca [158.69.221.121])
 by sourceware.org (Postfix) with ESMTPS id C0E7F385481D
 for <gdb-patches@sourceware.org>; Fri, 13 Nov 2020 17:05:08 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C0E7F385481D
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca
Received: from [172.16.0.95] (192-222-181-218.qc.cable.ebox.net
 [192.222.181.218])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by simark.ca (Postfix) with ESMTPSA id 85ACE1E58F;
 Fri, 13 Nov 2020 12:05:06 -0500 (EST)
Subject: Re: [PATCH] gdb/arm: avoid undefined behavior shift when decoding
 immediate value
To: Simon Marchi <simon.marchi@efficios.com>,
 Alan Hayward <Alan.Hayward@arm.com>
References: <20201109204945.1313866-1-simon.marchi@efficios.com>
 <735CB03B-2500-4CAE-8FB3-FA253DA4E414@arm.com>
 <66ee1922-d144-29bc-a8c7-7eceb0b345ce@efficios.com>
From: Simon Marchi <simark@simark.ca>
Message-ID: <6d6a7930-9863-0649-5b98-29ea8c09a61d@simark.ca>
Date: Fri, 13 Nov 2020 12:05:05 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <66ee1922-d144-29bc-a8c7-7eceb0b345ce@efficios.com>
Content-Type: text/plain; charset=utf-8
Content-Language: tl
Content-Transfer-Encoding: 8bit
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Cc: nd <nd@arm.com>,
 "gdb-patches\\@sourceware.org" <gdb-patches@sourceware.org>
Errors-To: gdb-patches-bounces@sourceware.org
Sender: "Gdb-patches" <gdb-patches-bounces@sourceware.org>

On 2020-11-11 3:44 p.m., Simon Marchi via Gdb-patches wrote:
> On 2020-11-11 5:00 a.m., Alan Hayward wrote:
>>
>>
>>> On 9 Nov 2020, at 20:49, Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> wrote:
>>>
>>> From: Simon Marchi <simon.marchi@polymtl.ca>
>>>
>>> When loading the code file provided in PR 26828 and GDB is build with
>>> UBSan, we get:
>>>
>>>    Core was generated by `./Foo'.
>>>    Program terminated with signal SIGABRT, Aborted.
>>>    #0  0xb6c3809c in pthread_cond_wait () from /home/simark/build/binutils-gdb/gdb/repo/lib/libpthread.so.0
>>>    [Current thread is 1 (LWP 29367)]
>>>    (gdb) bt
>>>    /home/simark/src/binutils-gdb/gdb/arm-tdep.c:1551:30: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int'
>>>
>>
>> This error is because the code is technically wrong - but in reality
>> compilers will plant what we really wanted?
> 
> Well, we don't really know, different things can happen on different
> systems.  I did this program for fun:
> 
>     #include <stdlib.h>
>     #include <stdio.h>
> 
>     int main(int argc, char **argv) {
>             unsigned int foo = 0xabababab;
>             unsigned int shift = atoi(argv[1]);
>             printf("Shifting by %d\n", shift);
>             foo = foo >> shift;
>             printf("%x\n", foo);
>             return 0;
>     }
> 
> Running `./a.out 32`, I get this on AMD64:
> 
>     Shifting by 32
>     abababab
> 
> And I get this on PowerPC and ARM:
> 
>     Shifting by 32
>     0
> 
>>> The sequence of instructions at pthread_cond_wait, in the
>>> libpthread.so.0 library, contains this instruction with an immediate
>>> constant with a "rotate amount" of 0:
>>>
>>>    e24dd044        sub     sp, sp, #68     ; 0x44
>>>
>>> Since it shifts by "32 - rotate amount", arm_analyze_prologue does a 32
>>> bit shift of a 32 bit type, which is caught by UBSan.
>>>
>>
>> Minor nit - it took me a while to figure out that the “it” in “Since it”
>> is the following arm_analyze_prologue and not something in the previous
>> sentence.
> 
> Good point, I swapped "it" and "arm_analyze_prologue" to make sure
> "arm_analyze_prologue" comes before.
> 
>>
>>> Fix it by factoring out the decoding of immediates in a new function,
>>> arm_expand_immediate.
>>>
>>> I added a selftest for arm_analyze_prologue that replicates the
>>> instruction sequence.  Without the fix, it crashes GDB if it is build
>>> with --enable-ubsan.
>>>
>>> I initially wanted to re-use the abstract_memory_reader class already in
>>> arm-tdep.c, used to make arm_process_record testable.  However,
>>> arm_process_record and arm_analyze_prologue don't use the same kind of
>>> memory reading functions.  arm_process_record uses a function that
>>> returns an error status on failure while arm_analyze_prologue uses one
>>> that throws an exception.  Since i didn't want to introduce any other
>>> behavior change, I decided to just introduce a separate interface
>>> (arm_instruction_reader).  It is derived from
>>> abstract_instruction_reader in aarch64-tdep.c.
>>
>> I both don’t like this and can’t see a better way of doing it :)
>> So, I’m ok with it
> 
> Same.  I really wanted not to add another class, but I also didn't want
> this simple fix to become a huge refactor.
> 
>>> @@ -1485,6 +1511,26 @@ arm_instruction_restores_sp (unsigned int insn)
>>>   return 0;
>>> }
>>>
>>> +/* Implement immediate value decoding, as described in section A5.2.4
>>> +   (Modified immediate constants in ARM instructions) of the ARM Architecture
>>> +   Reference Manual.  */
>>
>>
>> "of the ARM Architecture Reference Manual (ARMv7-A and ARMv7-R edition)"
>>
>> Just to make sure no one is looking for that section in the v8 doc.
> 
> Done.
> 
>>> @@ -9618,6 +9658,7 @@ vfp - VFP co-processor."),
>>>
>>> #if GDB_SELF_TEST
>>>   selftests::register_test ("arm-record", selftests::arm_record_test);
>>> +  selftests::register_test ("arm_analyze_prologue", selftests::arm_analyze_prologue_test);
>>> #endif
>>
>> Given the size of the tdep files, I wonder if it’s worth putting the tests into separate
>> files, eg arm-tdep-selftests.c
>> (Probably not for this patch though)
> 
> Possibly.  That requires exposing (making non-static) functions that are not usually exposed
> though.
> 
>>> }
>>> @@ -13254,6 +13295,63 @@ arm_record_test (void)
>>>     SELF_CHECK (arm_record.arm_regs[0] == 7);
>>>   }
>>> }
>>> +
>>> +/* Instruction reader from manually cooked instruction sequences.  */
>>> +
>>> +class test_arm_instruction_reader : public arm_instruction_reader
>>> +{
>>> +public:
>>> +  template<size_t SIZE>
>>> +  explicit test_arm_instruction_reader (const uint32_t (&insns)[SIZE])
>>> +    : m_insns (insns), m_insns_size (SIZE)
>>
>> Why the need for a template?
>> Is it just so that m_insns_size can be determined automatically?
> 
> I don't really know, I just copied the AArch64 one :).  But yeah I think
> that's it.  It could also use a gdb::array_view, that would be simpler I
> think.  I would change it for:
> 
>   /* Instruction reader from manually cooked instruction sequences.  */
> 
>   class test_arm_instruction_reader : public arm_instruction_reader
>   {
>   public:
>     explicit test_arm_instruction_reader (gdb::array_view<const uint32_t> insns)
>       : m_insns (insns)
>     {}
> 
>     uint32_t read (CORE_ADDR memaddr, enum bfd_endian byte_order) const override
>     {
>       SELF_CHECK (memaddr % 4 == 0);
>       SELF_CHECK (memaddr / 4 < m_insns.size ());
> 
>       return m_insns[memaddr / 4];
>     }
> 
>   private:
>     const gdb::array_view<const uint32_t> m_insns;
>   };
> 
> Simon
> 

I pushed the patch with the changes mentioned above.

Simon