From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id 6G8jDMm8rl9idgAAWB0awg (envelope-from ) for ; Fri, 13 Nov 2020 12:05:13 -0500 Received: by simark.ca (Postfix, from userid 112) id 2DE871F08D; Fri, 13 Nov 2020 12:05:13 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 7AD6B1EFBB for ; Fri, 13 Nov 2020 12:05:11 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3E5AF3944413; Fri, 13 Nov 2020 17:05:11 +0000 (GMT) Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id C0E7F385481D for ; Fri, 13 Nov 2020 17:05:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C0E7F385481D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [172.16.0.95] (192-222-181-218.qc.cable.ebox.net [192.222.181.218]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 85ACE1E58F; Fri, 13 Nov 2020 12:05:06 -0500 (EST) Subject: Re: [PATCH] gdb/arm: avoid undefined behavior shift when decoding immediate value To: Simon Marchi , Alan Hayward References: <20201109204945.1313866-1-simon.marchi@efficios.com> <735CB03B-2500-4CAE-8FB3-FA253DA4E414@arm.com> <66ee1922-d144-29bc-a8c7-7eceb0b345ce@efficios.com> From: Simon Marchi Message-ID: <6d6a7930-9863-0649-5b98-29ea8c09a61d@simark.ca> Date: Fri, 13 Nov 2020 12:05:05 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <66ee1922-d144-29bc-a8c7-7eceb0b345ce@efficios.com> Content-Type: text/plain; charset=utf-8 Content-Language: tl Content-Transfer-Encoding: 8bit X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: nd , "gdb-patches\\@sourceware.org" Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" On 2020-11-11 3:44 p.m., Simon Marchi via Gdb-patches wrote: > On 2020-11-11 5:00 a.m., Alan Hayward wrote: >> >> >>> On 9 Nov 2020, at 20:49, Simon Marchi via Gdb-patches wrote: >>> >>> From: Simon Marchi >>> >>> When loading the code file provided in PR 26828 and GDB is build with >>> UBSan, we get: >>> >>> Core was generated by `./Foo'. >>> Program terminated with signal SIGABRT, Aborted. >>> #0 0xb6c3809c in pthread_cond_wait () from /home/simark/build/binutils-gdb/gdb/repo/lib/libpthread.so.0 >>> [Current thread is 1 (LWP 29367)] >>> (gdb) bt >>> /home/simark/src/binutils-gdb/gdb/arm-tdep.c:1551:30: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int' >>> >> >> This error is because the code is technically wrong - but in reality >> compilers will plant what we really wanted? > > Well, we don't really know, different things can happen on different > systems. I did this program for fun: > > #include > #include > > int main(int argc, char **argv) { > unsigned int foo = 0xabababab; > unsigned int shift = atoi(argv[1]); > printf("Shifting by %d\n", shift); > foo = foo >> shift; > printf("%x\n", foo); > return 0; > } > > Running `./a.out 32`, I get this on AMD64: > > Shifting by 32 > abababab > > And I get this on PowerPC and ARM: > > Shifting by 32 > 0 > >>> The sequence of instructions at pthread_cond_wait, in the >>> libpthread.so.0 library, contains this instruction with an immediate >>> constant with a "rotate amount" of 0: >>> >>> e24dd044 sub sp, sp, #68 ; 0x44 >>> >>> Since it shifts by "32 - rotate amount", arm_analyze_prologue does a 32 >>> bit shift of a 32 bit type, which is caught by UBSan. >>> >> >> Minor nit - it took me a while to figure out that the “it” in “Since it” >> is the following arm_analyze_prologue and not something in the previous >> sentence. > > Good point, I swapped "it" and "arm_analyze_prologue" to make sure > "arm_analyze_prologue" comes before. > >> >>> Fix it by factoring out the decoding of immediates in a new function, >>> arm_expand_immediate. >>> >>> I added a selftest for arm_analyze_prologue that replicates the >>> instruction sequence. Without the fix, it crashes GDB if it is build >>> with --enable-ubsan. >>> >>> I initially wanted to re-use the abstract_memory_reader class already in >>> arm-tdep.c, used to make arm_process_record testable. However, >>> arm_process_record and arm_analyze_prologue don't use the same kind of >>> memory reading functions. arm_process_record uses a function that >>> returns an error status on failure while arm_analyze_prologue uses one >>> that throws an exception. Since i didn't want to introduce any other >>> behavior change, I decided to just introduce a separate interface >>> (arm_instruction_reader). It is derived from >>> abstract_instruction_reader in aarch64-tdep.c. >> >> I both don’t like this and can’t see a better way of doing it :) >> So, I’m ok with it > > Same. I really wanted not to add another class, but I also didn't want > this simple fix to become a huge refactor. > >>> @@ -1485,6 +1511,26 @@ arm_instruction_restores_sp (unsigned int insn) >>> return 0; >>> } >>> >>> +/* Implement immediate value decoding, as described in section A5.2.4 >>> + (Modified immediate constants in ARM instructions) of the ARM Architecture >>> + Reference Manual. */ >> >> >> "of the ARM Architecture Reference Manual (ARMv7-A and ARMv7-R edition)" >> >> Just to make sure no one is looking for that section in the v8 doc. > > Done. > >>> @@ -9618,6 +9658,7 @@ vfp - VFP co-processor."), >>> >>> #if GDB_SELF_TEST >>> selftests::register_test ("arm-record", selftests::arm_record_test); >>> + selftests::register_test ("arm_analyze_prologue", selftests::arm_analyze_prologue_test); >>> #endif >> >> Given the size of the tdep files, I wonder if it’s worth putting the tests into separate >> files, eg arm-tdep-selftests.c >> (Probably not for this patch though) > > Possibly. That requires exposing (making non-static) functions that are not usually exposed > though. > >>> } >>> @@ -13254,6 +13295,63 @@ arm_record_test (void) >>> SELF_CHECK (arm_record.arm_regs[0] == 7); >>> } >>> } >>> + >>> +/* Instruction reader from manually cooked instruction sequences. */ >>> + >>> +class test_arm_instruction_reader : public arm_instruction_reader >>> +{ >>> +public: >>> + template >>> + explicit test_arm_instruction_reader (const uint32_t (&insns)[SIZE]) >>> + : m_insns (insns), m_insns_size (SIZE) >> >> Why the need for a template? >> Is it just so that m_insns_size can be determined automatically? > > I don't really know, I just copied the AArch64 one :). But yeah I think > that's it. It could also use a gdb::array_view, that would be simpler I > think. I would change it for: > > /* Instruction reader from manually cooked instruction sequences. */ > > class test_arm_instruction_reader : public arm_instruction_reader > { > public: > explicit test_arm_instruction_reader (gdb::array_view insns) > : m_insns (insns) > {} > > uint32_t read (CORE_ADDR memaddr, enum bfd_endian byte_order) const override > { > SELF_CHECK (memaddr % 4 == 0); > SELF_CHECK (memaddr / 4 < m_insns.size ()); > > return m_insns[memaddr / 4]; > } > > private: > const gdb::array_view m_insns; > }; > > Simon > I pushed the patch with the changes mentioned above. Simon