From: Pedro Alves
To: Kevin Buettner, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 2/5] Restrict use of minsym names when printing addresses in disassembled code
Date: Fri, 19 Jul 2019 17:07:00 -0000
Message-ID: <97a06c38-91e0-8669-08d0-a52a2b91c2bb@redhat.com>
In-Reply-To: <20190704045503.1250-3-kevinb@redhat.com>
References: <20190704045503.1250-1-kevinb@redhat.com> <20190704045503.1250-3-kevinb@redhat.com>

Below, more of a brain dump than an objection.  FWIW.

On 7/4/19 5:55 AM, Kevin Buettner wrote:
>
> (gdb) x/5i foo_cold
>    0x401128 :  push   %rbp
>    0x401129 :  mov    %rsp,%rbp
>    0x40112c :  callq  0x401134
>    0x401131 :  nop
>    0x401132 :  pop    %rbp

I admit that it took me a while to reply because I'm still finding
it a bit hard to convince myself that that is the ideal output.
E.g., the first two instructions are obviously part of the same
prologue, so I find it a bit odd that the disassembly shows different
function names for that instruction pair.

Maybe this won't matter in practice since, IIUC, your testcase is
a bit contrived, and real cold functions are just fragments of
functions jmp/ret'ed to/from, without a prologue.

BTW, I noticed (with your series applied) this divergence:

 (gdb) x /5i foo_cold
    0x40048e :  push   %rbp
    0x40048f :  mov    %rsp,%rbp
    0x400492 :  callq  0x400487
    0x400497 :  nop
    0x400498 :  pop    %rbp

vs:

 (gdb) info symbol 0x40048f
 foo_cold + 1 in section .text
 (gdb) info symbol 0x400492
 foo_cold + 4 in section .text
 (gdb) info symbol
 Argument required (address).
 (gdb) info symbol 0x400497
 foo_cold + 9 in section .text

That's of course because "info symbol" only looks at minimal symbols.

On the disassemble side, I think I'd be less confused with:

 (gdb) disassemble foo
 Dump of assembler code for function foo:
 Address range 0x4004a1 to 0x4004bc:
    0x00000000004004a1 :  push   %rbp
    0x00000000004004a2 :  mov    %rsp,%rbp
    0x00000000004004a5 :  callq  0x40049a
    0x00000000004004aa :  mov    0x200b70(%rip),%eax        # 0x601020
    0x00000000004004b0 :  test   %eax,%eax
    0x00000000004004b2 :  je     0x4004b9
    0x00000000004004b4 :  callq  0x40048e
    0x00000000004004b9 :  nop
    0x00000000004004ba :  pop    %rbp
    0x00000000004004bb :  retq
 Address range 0x40048e to 0x40049a:
    0x000000000040048e :  push   %rbp
    0x000000000040048f :  mov    %rsp,%rbp
    0x0000000000400492 :  callq  0x400487
    0x0000000000400497 :  nop
    0x0000000000400498 :  pop    %rbp
    0x0000000000400499 :  retq
 End of assembler dump.

Instead of:

 (gdb) disassemble foo
 Dump of assembler code for function foo:
 Address range 0x4004a1 to 0x4004bc:
    0x00000000004004a1 <+0>:     push   %rbp
    0x00000000004004a2 <+1>:     mov    %rsp,%rbp
    0x00000000004004a5 <+4>:     callq  0x40049a
    0x00000000004004aa <+9>:     mov    0x200b70(%rip),%eax        # 0x601020
    0x00000000004004b0 <+15>:    test   %eax,%eax
    0x00000000004004b2 <+17>:    je     0x4004b9
    0x00000000004004b4 <+19>:    callq  0x40048e
    0x00000000004004b9 <+24>:    nop
    0x00000000004004ba <+25>:    pop    %rbp
    0x00000000004004bb <+26>:    retq
 Address range 0x40048e to 0x40049a:
    0x000000000040048e <-19>:    push   %rbp
    0x000000000040048f <-18>:    mov    %rsp,%rbp
    0x0000000000400492 <-15>:    callq  0x400487
    0x0000000000400497 <-10>:    nop
    0x0000000000400498 <-9>:     pop    %rbp
    0x0000000000400499 <-8>:     retq
 End of assembler dump.

In your examples / testcase, the cold function is adjacent to / near
the hot / main entry point of foo.  But I think that for a real cold
function, the offsets between the cold and hot entry points can
potentially be much larger (e.g., when they are placed in different
ELF sections), as the point is exactly to move cold code away from
the hot path / cache.

That would mean that we're likely to see output like:

 (gdb) disassemble foo
 Dump of assembler code for function foo:
 Address range 0x6004a1 to 0x6004bc:
    0x00000000006004a1 <+0>:          push   %rbp
    0x00000000006004a2 <+1>:          mov    %rsp,%rbp
    0x00000000006004a5 <+4>:          callq  0x40049a
    0x00000000006004aa <+9>:          mov    0x200b70(%rip),%eax        # 0x601020
    0x00000000006004b0 <+15>:         test   %eax,%eax
    0x00000000006004b2 <+17>:         je     0x6004b9
    0x00000000006004b4 <+19>:         callq  0x40048e
    0x00000000006004b9 <+24>:         nop
    0x00000000006004ba <+25>:         pop    %rbp
    0x00000000006004bb <+26>:         retq
 Address range 0x40048e to 0x40049a:
    0x000000000040048e <-2097171>:    push   %rbp
    0x000000000040048f <-2097170>:    mov    %rsp,%rbp
    0x0000000000400492 <-2097167>:    callq  0x400487
    0x0000000000400497 <-2097162>:    nop
    0x0000000000400498 <-2097161>:    pop    %rbp
    0x0000000000400499 <-2097160>:    retq
 End of assembler dump.

...
and those negative offsets kind of look a bit odd.

But maybe it's just that I'm not thinking about it right.  If I think
of the offset as an offset from foo's entry point, which is what it
really is, then I can convince myself that I can explain why that's
the right output, pedantically.  So I guess I'll grow into it.

And with that out of the way, the series looks good to me as is.

Thanks,
Pedro Alves
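
For reference, the offsets in the dumps above are consistent with a plain signed difference between each instruction's address and foo's entry point. A minimal sketch of that arithmetic follows; the helper names are hypothetical and this is not GDB's actual implementation, only a check of the numbers quoted in this message:

```python
# Hypothetical helpers, not GDB code: reproduce the <+N>/<-N> annotations
# as "instruction address minus function entry address".

def entry_offset(insn_addr: int, entry_addr: int) -> int:
    """Signed byte offset of an instruction from the function's entry point."""
    return insn_addr - entry_addr


def format_offset(insn_addr: int, entry_addr: int) -> str:
    """Render the offset with an explicit sign, in the style of the dumps above."""
    return f"<{entry_offset(insn_addr, entry_addr):+d}>"


# Addresses taken from the dumps in this message.
NEAR_ENTRY = 0x4004A1  # foo's entry point when the cold range is adjacent
FAR_ENTRY = 0x6004A1   # foo's entry point in the "far apart" example
COLD_START = 0x40048E  # first instruction of the cold range

print(format_offset(NEAR_ENTRY, NEAR_ENTRY))  # entry point itself
print(format_offset(COLD_START, NEAR_ENTRY))  # cold range, adjacent layout
print(format_offset(COLD_START, FAR_ENTRY))   # cold range, far layout
```

Any range placed at a lower address than the entry point gets negative offsets, and their magnitude grows with the distance between the two ranges.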