Re: [PATCH v2 2/5] Restrict use of minsym names when printing addresses in disassembled code

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Pedro Alves <palves@redhat.com>
To: Kevin Buettner <kevinb@redhat.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 2/5] Restrict use of minsym names when printing addresses in disassembled code
Date: Fri, 19 Jul 2019 17:07:00 -0000	[thread overview]
Message-ID: <97a06c38-91e0-8669-08d0-a52a2b91c2bb@redhat.com> (raw)
In-Reply-To: <20190704045503.1250-3-kevinb@redhat.com>

Below, more of a brain dump than an objection.  FWIW.

On 7/4/19 5:55 AM, Kevin Buettner wrote:
> 
> (gdb) x/5i foo_cold
>    0x401128 <foo_cold>:	push   %rbp
>    0x401129 <foo+35>:	mov    %rsp,%rbp
>    0x40112c <foo+38>:	callq  0x401134 <baz>
>    0x401131 <foo+43>:	nop
>    0x401132 <foo+44>:	pop    %rbp

I admit that it took me a while to reply because I'm still finding
it a bit hard to convince myself that that is the ideal output.
E.g., the first two instructions are obviously part of the same
prologue, I find it a bit odd that the disassembly shows
different function names for that instruction pair.

Maybe this won't matter in practice since IIUC your testcase is a
bit contrived, and real cold functions are just fragments of
functions jmp/ret'ed to/from, without a prologue.

BTW, I noticed (with your series applied), this divergence:

 (gdb) x /5i foo_cold
    0x40048e <foo_cold>: push   %rbp
    0x40048f <foo-18>:   mov    %rsp,%rbp
    0x400492 <foo-15>:   callq  0x400487 <baz>
    0x400497 <foo-10>:   nop
    0x400498 <foo-9>:    pop    %rbp

vs:

 (gdb) info symbol 0x40048f
 foo_cold + 1 in section .text
 (gdb) info symbol 0x400492
 foo_cold + 4 in section .text
 (gdb) info symbol 
 Argument required (address).
 (gdb) info symbol 0x400497
 foo_cold + 9 in section .text

That's of course because "info symbol" only looks at minimal symbols.

On the disassemble side, I think I'd be less confused with:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x4004a1 to 0x4004bc:
   0x00000000004004a1 <foo+0>:     push   %rbp
   0x00000000004004a2 <foo+1>:     mov    %rsp,%rbp
   0x00000000004004a5 <foo+4>:     callq  0x40049a <bar>
   0x00000000004004aa <foo+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000004004b0 <foo+15>:    test   %eax,%eax
   0x00000000004004b2 <foo+17>:    je     0x4004b9 <foo+24>
   0x00000000004004b4 <foo+19>:    callq  0x40048e <foo_cold>
   0x00000000004004b9 <foo+24>:    nop
   0x00000000004004ba <foo+25>:    pop    %rbp
   0x00000000004004bb <foo+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <foo.cold+0>:    push   %rbp
   0x000000000040048f <foo.cold+1>:    mov    %rsp,%rbp
   0x0000000000400492 <foo.cold+4>:    callq  0x400487 <baz>
   0x0000000000400497 <foo.cold+9>:    nop
   0x0000000000400498 <foo.cold+10>:   pop    %rbp
   0x0000000000400499 <foo.cold+11>:   retq   
End of assembler dump.

Instead of:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x4004a1 to 0x4004bc:
   0x00000000004004a1 <+0>:     push   %rbp
   0x00000000004004a2 <+1>:     mov    %rsp,%rbp
   0x00000000004004a5 <+4>:     callq  0x40049a <bar>
   0x00000000004004aa <+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000004004b0 <+15>:    test   %eax,%eax
   0x00000000004004b2 <+17>:    je     0x4004b9 <foo+24>
   0x00000000004004b4 <+19>:    callq  0x40048e <foo_cold>
   0x00000000004004b9 <+24>:    nop
   0x00000000004004ba <+25>:    pop    %rbp
   0x00000000004004bb <+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <-19>:    push   %rbp
   0x000000000040048f <-18>:    mov    %rsp,%rbp
   0x0000000000400492 <-15>:    callq  0x400487 <baz>
   0x0000000000400497 <-10>:    nop
   0x0000000000400498 <-9>:     pop    %rbp
   0x0000000000400499 <-8>:     retq   
End of assembler dump.

In your examples / testcase, the cold function is adjacent/near
the hot / main entry point of foo.  But I think that
on a real cold function, the offsets between the cold and
hot entry points can potentially be much larger (e.g., place in
different elf sections), as the point is exactly to
move cold code away from the hot path / cache.

That that would mean that we're likely to see output like:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x6004a1 to 0x6004bc:
   0x00000000006004a1 <+0>:     push   %rbp
   0x00000000006004a2 <+1>:     mov    %rsp,%rbp
   0x00000000006004a5 <+4>:     callq  0x40049a <bar>
   0x00000000006004aa <+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000006004b0 <+15>:    test   %eax,%eax
   0x00000000006004b2 <+17>:    je     0x6004b9 <foo+24>
   0x00000000006004b4 <+19>:    callq  0x40048e <foo_cold>
   0x00000000006004b9 <+24>:    nop
   0x00000000006004ba <+25>:    pop    %rbp
   0x00000000006004bb <+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <-2097171>:    push   %rbp
   0x000000000040048f <-2097170>:    mov    %rsp,%rbp
   0x0000000000400492 <-2097167>:    callq  0x400487 <baz>
   0x0000000000400497 <-2097162>:    nop
   0x0000000000400498 <-2097161>:    pop    %rbp
   0x0000000000400499 <-2097160>:    retq   
End of assembler dump.

... and those negative offsets kind of look a bit odd.

But maybe it's just that I'm not thinking it right.  

If I think of the offset in terms of offset from the
"foo"'s entry point, which is what it really is, then I can
convince myself that I can explain why that's the right output,
pedantically.

So I guess I'll grow into it.

And with that out of the way, the series looks good to me as is.

Thanks,
Pedro Alves

next prev parent reply	other threads:[~2019-07-19 17:07 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-04  4:55 [PATCH v2 0/5] Non-contiguous address range bug fixes / improvements Kevin Buettner
2019-07-04  4:55 ` [PATCH v2 3/5] dwarf2-frame.c: Fix FDE processing bug involving non-contiguous ranges Kevin Buettner
2019-07-04  4:55 ` [PATCH v2 2/5] Restrict use of minsym names when printing addresses in disassembled code Kevin Buettner
2019-07-19 17:07   ` Pedro Alves [this message]
2019-07-04  4:55 ` [PATCH v2 1/5] Prefer symtab symbol over minsym for function names in non-contiguous blocks Kevin Buettner
2019-07-04  5:03   ` Kevin Buettner
2019-07-04  5:05 ` [PATCH v2 5/5] Improve test gdb.dwarf2/dw2-ranges-func.exp Kevin Buettner
2019-07-04  5:08   ` Kevin Buettner
2019-07-04  5:05 ` [PATCH v2 4/5] Allow display of negative offsets in print_address_symbolic() Kevin Buettner
2019-07-30 16:47   ` Kevin Buettner
2019-07-27 20:48 ` [PATCH v2 0/5] Non-contiguous address range bug fixes / improvements Kevin Buettner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=97a06c38-91e0-8669-08d0-a52a2b91c2bb@redhat.com \
    --to=palves@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=kevinb@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox