Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback

From: Stephen Brennan via Gdb <gdb@sourceware.org>
To: "Omar Sandoval" <osandov@osandov.com>,
	"Thomas Weißschuh" <thomas@t-8ch.de>
Cc: gdb@sourceware.org, linux-debuggers@vger.kernel.org,
	Amal Raj T <tjarlama@gmail.com>
Subject: Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
Date: Mon, 27 Jan 2025 10:42:59 -0800
Message-ID: <87y0ywgkss.fsf@oracle.com>
In-Reply-To: <Z5fMtZOoYaV6aXwf@telecaster>

Omar Sandoval <osandov@osandov.com> writes:
> On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
>> Hi Stephen,
>> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
>> > I contribute to the drgn debugger [1], and work on debugging the Linux
>> > kernel a fair bit. Drgn is particularly well-suited to the Linux kernel
>> > and contains a lot of support for it. It currently supports attaching to
>> > live targets via Linux's /proc/kcore, and core dumps. We are looking
>> > into supporting remote targets via GDB's remote protocol.
>> > 
>> > One piece of information that is very useful when debugging the Linux
>> > kernel is the VMCOREINFO note[2]. This is a free-form piece of text
>> > data, typically around 3k bytes, which contains information that
>> > debuggers would find useful in interpreting a Linux kernel memory image.
>> > In particular, it contains the KASLR offset, the build ID of the kernel,
>> > and the OS release. With this information, a debugger could attach to
>> > a live GDB stub (e.g. kgdb, or QEMU) without needing to specify
>> > debuginfo file names or KASLR memory offsets.
>> > 
>> > To that end, we hope to extend the GDB remote protocol with a facility
>> > that would allow the debugger to request this information. We've written
>> > up an idea for this proposal at [3]. The summary is:
>> > 
>> >   'q linux.vmcoreinfo'
>> >      Retrieves the Linux vmcoreinfo data.
>> >      Reply:
>> >      'Q [DATA]' data is encoded as described in the Binary Data doc [4]
>> >      'E.<text>' with an informative message if the data is not available
>> > 
>> > However, with the candidate kgdb implementation taking shape [5], we're
>> > becoming concerned regarding this design. It seems that there is an
>> > implicit maximum packet size which is not described in the protocol
>> > documentation. Many stubs have small(ish) shared output buffers. It
>> > seems to me that data which would be 3k bytes before escaping is too
>> > large. We've noticed that there is a 'qXfer' query packet which allows
>> > specifying an offset and a number of bytes. Maybe it would be better for
>> > us to add a new 'special data area' for the 'qXfer' message, and reuse
>> > that command?
>> 
>> Do you need to transfer the full vmcoreinfo data?
>> Wouldn't it be sufficient to only include the address/size of the
>> vmcoreinfo note in memory and the debugger can read the data from there
>> with regular memory access commands?
>> That information is enough for QEMU's vmcoreinfo device.
>> It would simplify the design and implementation(s) significantly.

I do agree that this would simplify the protocol design, and it's a good
idea I hadn't considered. Thank you!
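
As a rough sketch of what that approach would look like on the wire,
assuming the debugger has already learned the note's address and size
by some means: the address and size below are made-up values and the
helper names are hypothetical. Only the packet framing ('$', payload,
'#', two-hex-digit modulo-256 checksum) and the 'm addr,length' syntax
come from the protocol documentation.

```python
def frame_packet(payload: str) -> bytes:
    """Frame a payload as $payload#cs (cs = sum of payload bytes mod 256)."""
    data = payload.encode("ascii")
    checksum = sum(data) % 256
    return b"$" + data + b"#" + f"{checksum:02x}".encode("ascii")


def read_memory_packet(addr: int, length: int) -> bytes:
    """Build an 'm addr,length' request; both fields are hex without 0x."""
    return frame_packet(f"m{addr:x},{length:x}")


def decode_m_reply(payload: str) -> bytes:
    """An 'm' reply carries two hex digits per byte of memory."""
    return bytes.fromhex(payload)


# Hypothetical note location, for illustration only.
NOTE_ADDR = 0xFFFFFFFF82000000
NOTE_SIZE = 0x100

packet = read_memory_packet(NOTE_ADDR, NOTE_SIZE)
```

A 3k note read this way yields about 6k of reply payload, since every
byte becomes two hex digits.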

> Oh, that's a good idea. I guess the downside is an extra command round
> trip.
>
> QEMU and KGDB also only implement the `m` command for reading
> hex-encoded memory. We'd probably want to implement the `x` command for
> both since it doesn't (usually) double the transfer size.
>
> One more caveat is that KGDB doesn't check the length passed to the `m`
> command and will happily clobber memory... QEMU's gdbstub does seem to
> check, and also advertises its packet size, so we probably want that in
> KGDB, too.
>
> Stephen, what do you think?

One of our core constraints is that the vmcoreinfo is quite a bit of
text data, around 3k bytes right now. Hex-encoding it with 'm' would
double that to 6k, which would be a real problem, so it would be very
helpful to stick to the 'x' encoding. However, I think there are a
couple of good reasons that neither QEMU nor KGDB supports the 'x'
packet:

1. You cannot easily know the required buffer size in advance (it
depends on how many bytes need escaping), so you cannot know whether an
'x' request will fit in your static buffer. This is most relevant to
kgdb. The 'x' packet documentation says:

  The reply may contain fewer addressable memory units than requested if
  the server was reading from a trace frame memory and was able to read
  only part of the region of memory. 

I'm not certain what trace frame memory is, but the vmcoreinfo is not
that. It's unclear to me whether the protocol allows fewer bytes to be
returned in other cases. The 'qXfer' packets, by contrast, explicitly
allow fewer bytes to be returned, as the documentation states:

  It is possible for data to have fewer bytes than the length in the
  request. 

2. The 'x' encoding permits NUL bytes to be transmitted in a packet.
Thus, adding support to an existing stub has a hidden pitfall that you
need to ensure you can transmit NUL bytes successfully. You could
circumvent the problem by escaping them, but since NUL bytes are
typically quite common, that eliminates the benefit of the binary
encoding. (You could do run-length encoding, but I'm not actually sure
how common it is to do run-length encoding for escaped characters -- is
it tested anywhere? Certainly, neither kgdb nor QEMU implements
run-length encoding today.)
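
To illustrate the size trade-off, here is a minimal sketch of the
binary escaping rule from the protocol documentation ('#', '$', '}',
and, in responses, '*' are sent as 0x7d followed by the byte XOR 0x20)
next to plain hex encoding. The function names are hypothetical and
the sample text uses real vmcoreinfo key names with made-up values.

```python
ESCAPE = 0x7D
# '#', '$', '}' always need escaping; '*' does too in stub responses.
MUST_ESCAPE = {0x23, 0x24, 0x7D, 0x2A}


def escape_binary(data: bytes) -> bytes:
    """Apply GDB remote protocol binary escaping to a response payload."""
    out = bytearray()
    for b in data:
        if b in MUST_ESCAPE:
            out.append(ESCAPE)
            out.append(b ^ 0x20)
        else:
            out.append(b)
    return bytes(out)


def hex_encode(data: bytes) -> bytes:
    """Hex encoding as used by 'm' replies: two digits per byte."""
    return data.hex().encode("ascii")


# Made-up values; the key names follow the vmcoreinfo text format.
sample = b"OSRELEASE=6.12.0\nKERNELOFFSET=1a000000\n"

escaped_len = len(escape_binary(sample))  # same as len(sample) here
hex_len = len(hex_encode(sample))         # exactly 2 * len(sample)
```

Note that escape_binary() passes NUL bytes through unchanged, which is
exactly the case a stub with NUL-terminated buffers cannot handle.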

Currently, kgdb implicitly assumes that the buffer is NUL-terminated,
and it would be quite a bit of work to make it track the number of
bytes in the response rather than rely on NUL-termination. QEMU seems
to be in a better state for that: it uses a GString for the response,
which, according to the docs[1], can hold arbitrary binary data.

[1]: https://docs.gtk.org/glib/struct.String.html

So while I think the cleanest design would be the one that simply
transmits the address of the vmcoreinfo note, the most practical one,
with the protocol as it is today, seems to be implementing the 'qXfer'
packet described in the other branch of this thread. For that, #1 is not
an issue (because we can just return fewer bytes than requested), and #2
is not an issue because vmcoreinfo is text data (which can't be
guaranteed for the general 'x' packet implementation).
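
A rough sketch of how a client could cope with short reads over
'qXfer', under stated assumptions: the object name "linux-vmcoreinfo"
is a placeholder (no name has been settled), fake_stub() stands in for
a stub with a small output buffer, and packet framing and binary
escaping are skipped. The request syntax and the 'm' (more data) and
'l' (last chunk) reply prefixes follow the 'qXfer' documentation.

```python
from typing import Callable


def fake_stub(note: bytes, max_payload: int) -> Callable[[str], bytes]:
    """Hypothetical stub that never returns more than max_payload bytes."""
    def handle(request: str) -> bytes:
        # Request: 'qXfer:<object>:read:<annex>:<offset>,<length>' (hex).
        offset_hex, length_hex = request.rsplit(":", 1)[1].split(",")
        offset, length = int(offset_hex, 16), int(length_hex, 16)
        chunk = note[offset:offset + min(length, max_payload)]
        more = offset + len(chunk) < len(note)
        return (b"m" if more else b"l") + chunk
    return handle


def read_object(send: Callable[[str], bytes], obj: str,
                want: int = 0x1000) -> bytes:
    """Read a whole object, re-requesting from wherever the stub stopped."""
    data = bytearray()
    while True:
        reply = send(f"qXfer:{obj}:read::{len(data):x},{want:x}")
        kind, chunk = reply[:1], reply[1:]
        if kind not in (b"m", b"l"):
            raise RuntimeError(f"unexpected reply: {reply!r}")
        data += chunk
        if kind == b"l":
            return bytes(data)


note = b"A" * 3000  # stand-in for roughly 3k of vmcoreinfo text
result = read_object(fake_stub(note, 512), "linux-vmcoreinfo")
```

The client asks for 0x1000 bytes each time, and the stub is free to
answer with whatever fits in its buffer.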

Thanks,
Stephen
