Subject: Re: A lean way for getting the size of the instruction at a given address
To: Zied Guermazi, "gdb@sourceware.org"
References: <295a186e-0dd9-fb96-671a-3df0a5611dd9@trande.de>
 <442482d9-31bd-8101-38f0-fb7c7763e61c@trande.de>
 <476fcf13-8782-a69f-f43b-069497ba7e3b@linaro.org>
 <51cfbb5b-ed10-c9d8-8dc3-81b3da496022@linaro.org>
 <72e584f8-2cf6-0bfa-882d-a1ba21a43931@trande.de>
 <0f2147c8-770d-5cb2-f415-8549b7192b36@linaro.org>
Message-ID: <3ccd7866-bfe8-6f3b-6363-25bed37c1507@linaro.org>
Date: Mon, 5 Apr 2021 19:15:01 -0300
List-Id: Gdb mailing list
From: Luis Machado via Gdb
Reply-To: Luis Machado

Zied,

On 4/5/21 7:12 PM, Zied Guermazi wrote:
> Hi Luis,
>
> yes, I guess it was intended for processing the disassemble command. It
> was not intended to be used in performance-critical use cases. Once it
> was removed, the next bottleneck is the printf in
> get_all_disassembler_options (a string was used as a means of passing
> options). It consumes 20% of the time.
>
> Shall we put the changes needed to increase the performance in the "etm
> for branch tracing" patch set, or in a dedicated one (a performance
> improvement one)? Please advise.

This would be best as a separate patch. It will be easier to review that
way. You may need to submit the change to both gdb/binutils lists, if
the patch touches both projects.

>
> /Zied
>
> On 06.04.21 00:04, Luis Machado wrote:
>> Hi Zied,
>>
>> On 4/5/21 6:47 PM, Zied Guermazi wrote:
>>> hi Luis,
>>>
>>> thanks for your support. To measure the impact of removing the
>>> printing of the instruction on the overall performance, I commented
>>> out setting and using the print function pointer in print_insn
>>> (bfd_vma pc, struct disassemble_info *info, bfd_boolean little) in
>>> opcodes/arm-dis.c, and the result was very interesting: the time
>>> needed to process the traces dropped from 12 minutes to 34
>>> seconds for 64 MB of traces.
>>
>> That is quite a bottleneck! I think this code path isn't exercised often.
>>
>>>
>>> now that we have proof that the bottleneck was printing, we can
>>> think about a way to provide a clean implementation.
>>
>> I agree. A faster implementation of this particular function would be
>> nice to have. It may even improve some other code paths that use this
>> information.
>>
>>>
>>> Kind Regards
>>>
>>> Zied Guermazi
>>>
>>>
>>> On 05.04.21 18:40, Luis Machado wrote:
>>>> On 4/5/21 1:17 PM, Zied Guermazi wrote:
>>>>> hi Luis
>>>>>
>>>>> A new member function in "class gdb_disassembler" to calculate the
>>>>> instruction length only would be a good solution. In fact, a big
>>>>> overhead is added by printing the instruction disassembly, which
>>>>> is not needed at all. On aarch64, the decoder is optimized to issue
>>>>> many instructions in one trace element, and here calculating the
>>>>> size consumes more than 80% of the time. On arm, the decoder issues
>>>>> one instruction after another, and here getting the size consumes
>>>>> 50% of the time. Considering the amount of traces, this can add up
>>>>> to a dozen minutes in some cases (64 MB of traces).
>>>>
>>>> Indeed, that doesn't sound good.
>>>>
>>>>>
>>>>> Calculating the instruction size per se on arm is a "rapid"
>>>>> operation and consists of checking a few bits in the opcode. So the
>>>>> time can be drastically decreased by having a function that
>>>>> calculates the size only.
>>>>>
>>>>>
>>>>> gdb_print_insn can then be changed as follows (pseudo code):
>>>>>
>>>>> int
>>>>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
>>>>>                 struct ui_file *stream, int *branch_delay_insns)
>>>>> {
>>>>>   gdb_disassembler di (gdbarch, stream);
>>>>>
>>>>>   if (di.get_insn_size != 0)
>>>>>     return di.get_insn_size (memaddr);
>>>>>   else
>>>>>     return di.print_insn (memaddr, branch_delay_insns);
>>>>> }
>>>>>
>>>>> Is there a function in aarch64-tdep or arm-tdep doing the job of
>>>>> disassembly (the lower layer handling the opcode)? Are we relying
>>>>> on the bfd library for it? Can someone give me a hint of where to
>>>>> find those functions?
>>>>
>>>> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and
>>>> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper
>>>> functions and do some initial setup, but the code that disassembles
>>>> lies in opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c
>>>> (print_insn_aarch64).
>>>>
>>>> If you go the route of changing "class gdb_disassembler", then
>>>> you'll probably need to touch binutils/opcodes.
>>>>
>>>> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep),
>>>> then you only need to change GDB.
>>>>>
>>>>>
>>>>> Kind Regards
>>>>>
>>>>> Zied Guermazi
>>>>>
>>>>>
>>>>> On 05.04.21 15:01, Luis Machado wrote:
>>>>>> Hi Zied,
>>>>>>
>>>>>> On 4/4/21 4:59 AM, Zied Guermazi wrote:
>>>>>>> hi
>>>>>>>
>>>>>>> I need to get the size of the instruction at a given address. I
>>>>>>> am currently using gdb_insn_length (struct gdbarch *gdbarch,
>>>>>>> CORE_ADDR addr), which calls gdb_print_insn (struct gdbarch
>>>>>>> *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int
>>>>>>> *branch_delay_insns), and this consumes a huge amount of time,
>>>>>>> considering that it is used in branch tracing and gets
>>>>>>> repeated up to a few million times.
>>>>>>>
>>>>>>>
>>>>>>> Is there a lean way of getting the size of the instruction at a
>>>>>>> given address? I am using it for aarch64 and arm targets.
>>>>>>
>>>>>> At the moment I don't think there is an optimal solution for this.
>>>>>> The instruction length is calculated as part of the disassembly
>>>>>> process, and is tied to the function that prints instructions.
>>>>>>
>>>>>> One way to speed things up is to have a new member function in
>>>>>> "class gdb_disassembler" to calculate the instruction length only.
>>>>>>
>>>>>> Another way is to have a new gdbarch hook that calculates the size
>>>>>> of an instruction based on the current PC, mapping symbols etc.
>>>>>>
>>>>>>>
>>>>>>> Kind Regards
>>>>>>>
>>>>>>> Zied Guermazi
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>
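
For illustration only, the "checking a few bits in the opcode" idea
discussed in the thread could look roughly like the sketch below. It is a
minimal sketch under stated assumptions, not a proposed patch: the names
insn_set, thumb_insn_length and insn_length are hypothetical and are not
GDB or opcodes API. It also assumes the caller already knows which
instruction set applies at the address (for example from mapping symbols)
and has already read the first halfword from target memory.

```cpp
#include <cstdint>

// Hypothetical names throughout; none of this is GDB or opcodes API.
enum class insn_set { aarch64, arm, thumb };

// Thumb: the first halfword alone distinguishes a 16-bit encoding from a
// 32-bit one.  Bits [15:11] equal to 0b11101, 0b11110 or 0b11111 mark the
// first half of a 32-bit encoding; anything else is a 16-bit instruction.
constexpr int
thumb_insn_length (std::uint16_t first_halfword)
{
  return ((first_halfword & 0xe000) == 0xe000
          && (first_halfword & 0x1800) != 0) ? 4 : 2;
}

// A64 and A32 encodings are always 4 bytes long, so only Thumb needs to
// look at the opcode at all.  No text is formatted or printed here.
constexpr int
insn_length (insn_set set, std::uint16_t first_halfword)
{
  return set == insn_set::thumb ? thumb_insn_length (first_halfword) : 4;
}

// Compile-time sanity checks on two well-known Thumb encodings.
static_assert (thumb_insn_length (0x4770) == 2, "bx lr is a 16-bit encoding");
static_assert (thumb_insn_length (0xf000) == 4, "bl starts a 32-bit encoding");
```

Whether something like this would live behind a new gdbarch hook or a new
gdb_disassembler member function is exactly the design choice discussed
above; the sketch only shows that the length itself can be derived without
going through the printing path.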