From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id OOiUJ0xNrF8PNgAAWB0awg (envelope-from ) for ; Wed, 11 Nov 2020 15:45:00 -0500 Received: by simark.ca (Postfix, from userid 112) id 94CE71F08B; Wed, 11 Nov 2020 15:45:00 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,RDNS_NONE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.2 Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id F38411E58F for ; Wed, 11 Nov 2020 15:44:59 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3A875385043A; Wed, 11 Nov 2020 20:44:59 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3A875385043A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1605127499; bh=NroJzuzLHTbwJsvc6nfzjf24n0QCgDw3hkORsUh73ls=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=iiR4u2uOlBuC0igPvdJXt71ymAdOmuZGUHxHL1rC0kBCrVyOHx5th+wFawEp8qWaI E1GScKoT+zBl1Z2j/rd/+RUgVWiNV5LDP5H8/L66zbDg5yEIHN5T3eTCiiLI1pCO/B EPRBbwSun3eKFO0mfZzYVl9QkOb6DH6jDlr9mRlw= Received: from mail.efficios.com (mail.efficios.com [167.114.26.124]) by sourceware.org (Postfix) with ESMTPS id 14D333833026 for ; Wed, 11 Nov 2020 20:44:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 14D333833026 Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id CD8B89FA96; Wed, 11 Nov 2020 15:44:56 -0500 (EST) Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id zeNNpjrmI2kW; Wed, 11 Nov 2020 15:44:56 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id 5CF789FA94; Wed, 11 Nov 2020 15:44:56 -0500 (EST) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 5CF789FA94 X-Virus-Scanned: amavisd-new at efficios.com Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ARbazN1IstyR; Wed, 11 Nov 2020 15:44:56 -0500 (EST) Received: from [172.16.0.95] (192-222-181-218.qc.cable.ebox.net [192.222.181.218]) by mail.efficios.com (Postfix) with ESMTPSA id DAD439FA91; Wed, 11 Nov 2020 15:44:55 -0500 (EST) Subject: Re: [PATCH] gdb/arm: avoid undefined behavior shift when decoding immediate value To: Alan Hayward References: <20201109204945.1313866-1-simon.marchi@efficios.com> <735CB03B-2500-4CAE-8FB3-FA253DA4E414@arm.com> Message-ID: <66ee1922-d144-29bc-a8c7-7eceb0b345ce@efficios.com> Date: Wed, 11 Nov 2020 15:44:55 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <735CB03B-2500-4CAE-8FB3-FA253DA4E414@arm.com> Content-Type: text/plain; charset=utf-8 Content-Language: tl Content-Transfer-Encoding: quoted-printable X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Simon Marchi via Gdb-patches Reply-To: Simon Marchi Cc: nd , "gdb-patches\\@sourceware.org" Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" On 2020-11-11 5:00 a.m., Alan Hayward wrote: > > >> On 9 Nov 2020, at 20:49, Simon Marchi via Gdb-patches wrote: >> >> From: Simon Marchi >> >> When loading the code file provided in PR 26828 and GDB is build with >> UBSan, we get: >> >> Core was generated by `./Foo'. >> Program terminated with signal SIGABRT, Aborted. >> #0 0xb6c3809c in pthread_cond_wait () from /home/simark/build/binu= tils-gdb/gdb/repo/lib/libpthread.so.0 >> [Current thread is 1 (LWP 29367)] >> (gdb) bt >> /home/simark/src/binutils-gdb/gdb/arm-tdep.c:1551:30: runtime error= : shift exponent 32 is too large for 32-bit type 'unsigned int' >> > > This error is because the code is technically wrong - but in reality > compilers will plant what we really wanted? Well, we don't really know, different things can happen on different systems. I did this program for fun: #include #include int main(int argc, char **argv) { unsigned int foo =3D 0xabababab; unsigned int shift =3D atoi(argv[1]); printf("Shifting by %d\n", shift); foo =3D foo >> shift; printf("%x\n", foo); return 0; } Running `./a.out 32`, I get this on AMD64: Shifting by 32 abababab And I get this on PowerPC and ARM: Shifting by 32 0 >> The sequence of instructions at pthread_cond_wait, in the >> libpthread.so.0 library, contains this instruction with an immediate >> constant with a "rotate amount" of 0: >> >> e24dd044 sub sp, sp, #68 ; 0x44 >> >> Since it shifts by "32 - rotate amount", arm_analyze_prologue does a 3= 2 >> bit shift of a 32 bit type, which is caught by UBSan. >> > > Minor nit - it took me a while to figure out that the =E2=80=9Cit=E2=80= =9D in =E2=80=9CSince it=E2=80=9D > is the following arm_analyze_prologue and not something in the previous > sentence. Good point, I swapped "it" and "arm_analyze_prologue" to make sure "arm_analyze_prologue" comes before. > >> Fix it by factoring out the decoding of immediates in a new function, >> arm_expand_immediate. >> >> I added a selftest for arm_analyze_prologue that replicates the >> instruction sequence. Without the fix, it crashes GDB if it is build >> with --enable-ubsan. >> >> I initially wanted to re-use the abstract_memory_reader class already = in >> arm-tdep.c, used to make arm_process_record testable. However, >> arm_process_record and arm_analyze_prologue don't use the same kind of >> memory reading functions. arm_process_record uses a function that >> returns an error status on failure while arm_analyze_prologue uses one >> that throws an exception. Since i didn't want to introduce any other >> behavior change, I decided to just introduce a separate interface >> (arm_instruction_reader). It is derived from >> abstract_instruction_reader in aarch64-tdep.c. > > I both don=E2=80=99t like this and can=E2=80=99t see a better way of do= ing it :) > So, I=E2=80=99m ok with it Same. I really wanted not to add another class, but I also didn't want this simple fix to become a huge refactor. >> @@ -1485,6 +1511,26 @@ arm_instruction_restores_sp (unsigned int insn) >> return 0; >> } >> >> +/* Implement immediate value decoding, as described in section A5.2.4 >> + (Modified immediate constants in ARM instructions) of the ARM Arch= itecture >> + Reference Manual. */ > > > "of the ARM Architecture Reference Manual (ARMv7-A and ARMv7-R edition)= " > > Just to make sure no one is looking for that section in the v8 doc. Done. >> @@ -9618,6 +9658,7 @@ vfp - VFP co-processor."), >> >> #if GDB_SELF_TEST >> selftests::register_test ("arm-record", selftests::arm_record_test); >> + selftests::register_test ("arm_analyze_prologue", selftests::arm_an= alyze_prologue_test); >> #endif > > Given the size of the tdep files, I wonder if it=E2=80=99s worth puttin= g the tests into separate > files, eg arm-tdep-selftests.c > (Probably not for this patch though) Possibly. That requires exposing (making non-static) functions that are = not usually exposed though. >> } >> @@ -13254,6 +13295,63 @@ arm_record_test (void) >> SELF_CHECK (arm_record.arm_regs[0] =3D=3D 7); >> } >> } >> + >> +/* Instruction reader from manually cooked instruction sequences. */ >> + >> +class test_arm_instruction_reader : public arm_instruction_reader >> +{ >> +public: >> + template >> + explicit test_arm_instruction_reader (const uint32_t (&insns)[SIZE]= ) >> + : m_insns (insns), m_insns_size (SIZE) > > Why the need for a template? > Is it just so that m_insns_size can be determined automatically? I don't really know, I just copied the AArch64 one :). But yeah I think that's it. It could also use a gdb::array_view, that would be simpler I think. I would change it for: /* Instruction reader from manually cooked instruction sequences. */ class test_arm_instruction_reader : public arm_instruction_reader { public: explicit test_arm_instruction_reader (gdb::array_view= insns) : m_insns (insns) {} uint32_t read (CORE_ADDR memaddr, enum bfd_endian byte_order) const o= verride { SELF_CHECK (memaddr % 4 =3D=3D 0); SELF_CHECK (memaddr / 4 < m_insns.size ()); return m_insns[memaddr / 4]; } private: const gdb::array_view m_insns; }; Simon