* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
[not found] <8734hmtfbr.fsf@oracle.com>
@ 2025-01-14 15:03 ` Luis Machado via Gdb
2025-01-14 17:15 ` Tom Tromey
2025-01-26 18:07 ` Thomas Weißschuh via Gdb
1 sibling, 1 reply; 16+ messages in thread
From: Luis Machado via Gdb @ 2025-01-14 15:03 UTC (permalink / raw)
To: Stephen Brennan, gdb; +Cc: linux-debuggers, Omar Sandoval, Amal Raj T
Hi Stephen,
I think you want either gdb@sourceware.org or gdb-patches@sourceware.org. This
is the binutils mailing list.
On 1/14/25 00:22, Stephen Brennan wrote:
> Hello all,
>
> I contribute to the drgn debugger [1], and work on debugging the Linux
> kernel a fair bit. Drgn is particularly well-suited to the Linux kernel
> and contains a lot of support for it. It currently supports attaching to
> live targets via Linux's /proc/kcore, and core dumps. We are looking
> into supporting remote targets via GDB's remote protocol.
>
> One piece of information that is very useful when debugging the Linux
> kernel is the VMCOREINFO note[2]. This is a free-form piece of text
> data, typically around 3k bytes, which contains information that
> debuggers would find useful in interpreting a Linux kernel memory image.
> In particular, it contains the KASLR offset, the build ID of the kernel,
> and the OS release. With this information, a debugger could attach to
> a live GDB stub (e.g. kgdb, or QEMU) without needing to specify
> debuginfo file names or KASLR memory offsets.
>
> To that end, we hope to extend the GDB remote protocol with a facility
> that would allow the debugger to request this information. We've written
> up an idea for this proposal at [3]. The summary is:
>
> 'q linux.vmcoreinfo'
> Retrieves the Linux vmcoreinfo data.
> Reply:
> 'Q [DATA]' data is encoded as described in the Binary Data doc [4]
> 'E.<text>' with an informative message if the data is not available
>
> However, with the candidate kgdb implementation taking shape [5], we're
> becoming concerned regarding this design. It seems that there is an
> implicit maximum packet size which is not described in the protocol
> documentation. Many stubs have small(ish) shared output buffers. It
> seems to me that data which would be 3k bytes before escaping is too
> large. We've noticed that there is a 'qXfer' query packet which allows
> specifying an offset and a number of bytes. Maybe it would be better for
> us to add a new 'special data area' for the 'qXfer' message, and reuse
> that command?
>
> To sum up, my specific questions are:
>
> 1. What is the maximum protocol packet size, if any?
It is hardcoded by gdb, but the remote can also specify that, but...
> 2. Would this functionality be better implemented in a single "q
> linux.vmcoreinfo" packet, or as a "qXfer" packet?
... we have packets like qXfer that can handle multi-part transfers. So the
packet size is not a critical concern anymore, and it is best to use this
newer mechanism, if the usage fits the packet structure.
> 3. Is it safe to assume that data transmitted in the qXfer packets is
> encoded via the escaped binary data format described at [4] (rather
> than the hexadecimal encoding)?
Yes, gdb uses qXfer to read the remote's auxv file, for instance.
[remote] Sending packet: $qXfer:auxv:read::0,1000#6b
[remote] Packet received: l!\000\000\000\000\000\000\000\000\000\0003\000\000\000\000\000\000\000p\022\000\000\000\000\000\000\020\000\000...
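For reference, the framing and escaping visible in the log above can be sketched in a few lines of Python. This is only a minimal illustration, not GDB's implementation: the helper names frame_packet, escape_binary and unescape_binary are made up for this example, and run-length encoding (which a stub may also apply to replies) is deliberately not handled.

```python
ESCAPE = 0x7D  # '}' introduces an escaped byte in the binary data format
# Bytes with a special meaning on the wire; '*' only matters in stub replies
# (run-length encoding), but escaping it everywhere is harmless.
MUST_ESCAPE = {0x23, 0x24, 0x2A, 0x7D}  # '#', '$', '*', '}'


def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<checksum>.

    The checksum is the sum of the payload bytes modulo 256, written as
    two lowercase hex digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return "$%s#%02x" % (payload, checksum)


def escape_binary(data: bytes) -> bytes:
    """Escape special bytes as '}' followed by (byte XOR 0x20)."""
    out = bytearray()
    for b in data:
        if b in MUST_ESCAPE:
            out.append(ESCAPE)
            out.append(b ^ 0x20)
        else:
            out.append(b)
    return bytes(out)


def unescape_binary(data: bytes) -> bytes:
    """Reverse escape_binary(); raises ValueError on a dangling '}'."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESCAPE:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("truncated escape sequence")
            out.append(nxt ^ 0x20)
        else:
            out.append(b)
    return bytes(out)


# Reproduces the checksum of the request shown in the log.
print(frame_packet("qXfer:auxv:read::0,1000"))
```

Note that the offset and length fields in the request are hexadecimal, so the 1000 in the log asks for up to 4096 bytes.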
>
> Any other feedback is welcome too.
> Thanks,
> Stephen
>
> [1]: https://github.com/osandov/drgn
> [2]: https://docs.kernel.org/admin-guide/kdump/vmcoreinfo.html
> [3]: https://github.com/osandov/drgn/wiki/GDB-Remote-Protocol-proposal:-linux.vmcoreinfo-query-packet
> [4]: https://sourceware.org/gdb/current/onlinedocs/gdb.html/Overview.html#Binary-Data
> [5]: https://lore.kernel.org/linux-debuggers/20241210133448.3684593-1-tjarlama@gmail.com/T/#mad965a732c1c5e9e2656e4be79ffcdc36d89b7d1
> [6]: https://sourceware.org/gdb/current/onlinedocs/gdb.html/General-Query-Packets.html#qXfer-read
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-14 15:03 ` GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback Luis Machado via Gdb
@ 2025-01-14 17:15 ` Tom Tromey
2025-01-14 17:39 ` Stephen Brennan via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2025-01-14 17:15 UTC (permalink / raw)
To: Luis Machado via Gdb
Cc: Stephen Brennan, Luis Machado, linux-debuggers, Omar Sandoval,
Amal Raj T
>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>> To sum up, my specific questions are:
>>
>> 1. What is the maximum protocol packet size, if any?
> It is hardcoded by gdb, but the remote can also specify that, but...
>> 2. Would this functionality be better implemented in a single "q
>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
> ... we have packets like qXfer that can handle multi-part transfers. So the
> packet size is not a critical concern anymore, and it is best to use this
> newer mechanism, if the usage fits the packet structure.
Agreed, qXfer is the way to go.
If you're adding a new object type, a patch to the manual would be good.
Tom
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-14 17:15 ` Tom Tromey
@ 2025-01-14 17:39 ` Stephen Brennan via Gdb
2025-01-16 10:37 ` Andrew Burgess via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Brennan via Gdb @ 2025-01-14 17:39 UTC (permalink / raw)
To: Tom Tromey, Luis Machado via Gdb
Cc: Luis Machado, linux-debuggers, Omar Sandoval, Amal Raj T
Tom Tromey <tom@tromey.com> writes:
>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>
>>> To sum up, my specific questions are:
>>>
>>> 1. What is the maximum protocol packet size, if any?
>
>> It is hardcoded by gdb, but the remote can also specify that, but...
>
>>> 2. Would this functionality be better implemented in a single "q
>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>
>> ... we have packets like qXfer that can handle multi-part transfers. So the
>> packet size is not a critical concern anymore, and it is best to use this
>> newer mechanism, if the usage fits the packet structure.
>
> Agreed, qXfer is the way to go.
Thank you Tom & Luis for the confirmation; qXfer seems appropriate. With
that approach the buffer size is not really a concern: the stub can simply
return the minimum of the requested read size and its own buffer size, so
long as clients issue multiple requests until the data is fully read.
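The chunking rule described above could be sketched roughly as follows. This is a toy model under stated assumptions, not code from kgdb, gdbserver or QEMU: qxfer_read_reply and read_object are hypothetical names, max_payload stands in for whatever room the stub's output buffer leaves for data, and binary escaping is ignored (in a real stub, escaping can double the size of a byte on the wire).

```python
def qxfer_read_reply(data: bytes, offset: int, length: int, max_payload: int) -> bytes:
    """Build the reply body for a qXfer read of `data` (stub side).

    Returns b"m" + chunk when more data remains after this chunk and
    b"l" + chunk when this chunk reaches the end of the object.
    """
    n = min(length, max_payload)
    if n <= 0:
        raise ValueError("request length and buffer size must be positive")
    chunk = data[offset:offset + n]
    more = offset + len(chunk) < len(data)
    return (b"m" if more else b"l") + chunk


def read_object(data: bytes, request_len: int, max_payload: int) -> bytes:
    """Client-side loop: keep requesting until an 'l' reply arrives."""
    offset = 0
    out = bytearray()
    while True:
        reply = qxfer_read_reply(data, offset, request_len, max_payload)
        out += reply[1:]
        # Advance by what was actually received, which may be less than
        # what was requested.
        offset += len(reply) - 1
        if reply[:1] == b"l":
            break
    return bytes(out)


# A client asking for 4096 bytes at a time from a stub with a 512-byte buffer
# still gets the whole object back.
sample = b"KEY=value\n" * 300
assert read_object(sample, 4096, 512) == sample
```

The important detail on the client side is advancing the offset by the number of bytes received rather than the number requested.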
While the "os" object also sounds like a good place to put this (e.g.
within a new annex), it seems like that contains XML-formatted data with
well-understood schema and semantics. The vmcoreinfo is free-form text
(generally of a "key=value" format), so it probably should be a separate
object.
So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
(But please do speak up if this sounds like a mistake)
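To illustrate why the "key=value" text is easy for a client to consume, here is a minimal parsing sketch. The function name parse_vmcoreinfo is hypothetical, the sample values are invented, and treating KERNELOFFSET as bare hexadecimal is an assumption about how the kernel prints it.

```python
def parse_vmcoreinfo(text: str) -> dict:
    """Split vmcoreinfo text into a dict, one KEY=VALUE pair per line."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            # Not a KEY=VALUE line; skip it rather than guess.
            continue
        info[key] = value
    return info


# Invented sample data, for illustration only.
sample = (
    "OSRELEASE=6.1.0-example\n"
    "BUILD-ID=0123456789abcdef\n"
    "KERNELOFFSET=1a000000\n"
)
info = parse_vmcoreinfo(sample)
kaslr_offset = int(info["KERNELOFFSET"], 16)  # assumed to be hex without "0x"
print(info["OSRELEASE"], hex(kaslr_offset))
```

With the build ID and KASLR offset in hand, a debugger could locate debuginfo and relocate symbols without the user supplying either by hand.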
> If you're adding a new object type, a patch to the manual would be good.
I'll definitely include a patch for the manual in the plan for this.
Another patch I'd like to write is to allow GDB's server to expose this
object type when the target is an ELF core dump with a VMCOREINFO note.
We're hoping for this to be useful for all debuggers, not just drgn.
Thanks again!
Stephen
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-14 17:39 ` Stephen Brennan via Gdb
@ 2025-01-16 10:37 ` Andrew Burgess via Gdb
2025-01-16 10:49 ` Luis Machado via Gdb
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Andrew Burgess via Gdb @ 2025-01-16 10:37 UTC (permalink / raw)
To: Stephen Brennan, Tom Tromey, Luis Machado via Gdb
Cc: Luis Machado, linux-debuggers, Omar Sandoval, Amal Raj T
Stephen Brennan via Gdb <gdb@sourceware.org> writes:
> Tom Tromey <tom@tromey.com> writes:
>>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>>
>>>> To sum up, my specific questions are:
>>>>
>>>> 1. What is the maximum protocol packet size, if any?
>>
>>> It is hardcoded by gdb, but the remote can also specify that, but...
>>
>>>> 2. Would this functionality be better implemented in a single "q
>>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>>
>>> ... we have packets like qXfer that can handle multi-part transfers. So the
>>> packet size is not a critical concern anymore, and it is best to use this
>>> newer mechanism, if the usage fits the packet structure.
>>
>> Agreed, qXfer is the way to go.
>
> Thank you Tom & Luis for confirmation, qXfer seems appropriate. With
> that approach the buffer size is not really a concern: we can simply use
> the minimum of the requested read size, and the stub's buffer size. So
> long as clients use multiple requests until the data is fully read.
>
> While the "os" object also sounds like a good place to put this (e.g.
> within a new annex), it seems like that contains XML-formatted data with
> well-understood schema and semantics. The vmcoreinfo is free-form text
> (generally of a "key=value" format), so it probably should be a separate
> object.
>
> So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
> (But please do speak up if this sounds like a mistake)
>
>> If you're adding a new object type, a patch to the manual would be good.
>
> I'll definitely include a patch for the manual in the plan for this.
> Another patch I'd like to write is to allow GDB's server to expose this
> object type when the target is an ELF core dump with a VMCOREINFO note.
> We're hoping for this to be useful for all debuggers, not just drgn.
Hi Stephen,
I took a look at the wiki page and it seems like initially at least,
your plan is to make the information from vmcoreinfo available via a new
'info' command.
It is possible to send remote packets through GDB's Python API[1]. And
of course, the Python API allows for new commands to be created[2].
There is a test in GDB's test suite that makes use of the packet sending
API, and it happens to send a qXfer packet[3].
I say all this not to put you off contributing a patch to core GDB, but
if what you want is a new user command which will send a packet to a
remote target and process the results, then it should be possible to
implement this as a Python extension.
The benefits of using a Python API are that you can ship the extension
with the kernel, so if/when the vmcoreinfo changes you can easily update
the extension. At the very least, it allows you to prototype your
commands before having to submit patches to GDB core.
I have included below a very simple example that implements 'info stuff'
which just displays the 'threads' information from a remote target.
Place the code into 'cmd.py', then from GDB 'source cmd.py', after which
you can 'info stuff' and 'help info stuff'.
If this is an approach that might be of interest then I'm happy to help
in any way I can, just drop questions on the mailing list.
Thanks,
Andrew
[1] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Connections-In-Python.html
[2] https://sourceware.org/gdb/current/onlinedocs/gdb.html/CLI-Commands-In-Python.html
[3] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.python/py-send-packet.py;hb=HEAD#l100
--
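import gdb


def bytes_to_string(byte_array):
    """Render printable ASCII and newlines as-is; hex-escape everything else."""
    res = ""
    for b in byte_array:
        b = int(b)
        if (32 <= b <= 126) or b == 10:
            res += "%c" % b
        else:
            res += "\\x%02x" % b
    return res


def get_stuff(conn, obj):
    assert isinstance(conn, gdb.RemoteTargetConnection)
    start_pos = 0
    # Deliberately tiny so that the multi-packet loop is exercised.  This
    # simple loop assumes every 'm' reply carries exactly chunk_len bytes.
    chunk_len = 10
    output = ""
    while True:
        # The offset and length fields of qXfer are hexadecimal.
        reply = conn.send_packet(
            "qXfer:%s:read::%x,%x" % (obj, start_pos, chunk_len))
        if not reply:
            raise gdb.GdbError("no reply to qXfer:%s:read" % obj)
        reply = bytes_to_string(reply)
        start_pos += chunk_len
        # First character: 'm' means more data follows, 'l' means last chunk.
        marker = reply[0]
        if marker not in ("m", "l"):
            raise gdb.GdbError("unexpected qXfer reply: %s" % reply)
        output += reply[1:]
        if marker == "l":
            break
    return output


class InfoStuffCommand(gdb.Command):
    """
    Usage:
      info stuff

    Shows the remote target's thread information.
    """

    def __init__(self):
        gdb.Command.__init__(self, "info stuff", gdb.COMMAND_DATA)

    def invoke(self, args, from_tty):
        inf = gdb.selected_inferior()
        conn = inf.connection
        if not isinstance(conn, gdb.RemoteTargetConnection):
            raise gdb.GdbError("not on a remote connection")

        string = get_stuff(conn, "threads")
        print("Target thread information:")
        for line in string.splitlines():
            print("  %s" % line)


InfoStuffCommand()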
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 10:37 ` Andrew Burgess via Gdb
@ 2025-01-16 10:49 ` Luis Machado via Gdb
2025-01-16 16:40 ` Andrew Burgess via Gdb
2025-01-16 17:58 ` Stephen Brennan via Gdb
2025-01-23 1:11 ` Stephen Brennan via Gdb
2 siblings, 1 reply; 16+ messages in thread
From: Luis Machado via Gdb @ 2025-01-16 10:49 UTC (permalink / raw)
To: Andrew Burgess, Stephen Brennan, Tom Tromey, Luis Machado via Gdb
Cc: linux-debuggers, Omar Sandoval, Amal Raj T
On 1/16/25 10:37, Andrew Burgess wrote:
> Stephen Brennan via Gdb <gdb@sourceware.org> writes:
>
>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>>>
>>>>> To sum up, my specific questions are:
>>>>>
>>>>> 1. What is the maximum protocol packet size, if any?
>>>
>>>> It is hardcoded by gdb, but the remote can also specify that, but...
>>>
>>>>> 2. Would this functionality be better implemented in a single "q
>>>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>>>
>>>> ... we have packets like qXfer that can handle multi-part transfers. So the
>>>> packet size is not a critical concern anymore, and it is best to use this
>>>> newer mechanism, if the usage fits the packet structure.
>>>
>>> Agreed, qXfer is the way to go.
>>
>> Thank you Tom & Luis for confirmation, qXfer seems appropriate. With
>> that approach the buffer size is not really a concern: we can simply use
>> the minimum of the requested read size, and the stub's buffer size. So
>> long as clients use multiple requests until the data is fully read.
>>
>> While the "os" object also sounds like a good place to put this (e.g.
>> within a new annex), it seems like that contains XML-formatted data with
>> well-understood schema and semantics. The vmcoreinfo is free-form text
>> (generally of a "key=value" format), so it probably should be a separate
>> object.
>>
>> So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
>> (But please do speak up if this sounds like a mistake)
>>
>>> If you're adding a new object type, a patch to the manual would be good.
>>
>> I'll definitely include a patch for the manual in the plan for this.
>> Another patch I'd like to write is to allow GDB's server to expose this
>> object type when the target is an ELF core dump with a VMCOREINFO note.
>> We're hoping for this to be useful for all debuggers, not just drgn.
>
> Hi Stephen,
>
> I took a look at the wiki page and it seems like initially at least,
> your plan is to make the information from vmcoreinfo available via a new
> 'info' command.
>
> It is possible to send remote packets through GDB's Python API[1]. And
> of course, the Python API allows for new commands to be created[2].
> There is a test in GDB's test suite that makes use of the packet sending
> API, and it happens to send a qXfer packet[3].
>
> I say all this not to put you off contributing a patch to core GDB, but
> if what you want is a new user command which will send a packet to a
> remote target and process the results, then it should be possible to
> implement this as a Python extension.
A bit off-topic, but wouldn't that have the potential to proliferate
remote packets gdb/debugging stubs have no control over or no documentation
to point at/refer to? Possibly contributing to greater confusion as to what
should be minimally supported in terms of remote packets?
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 10:49 ` Luis Machado via Gdb
@ 2025-01-16 16:40 ` Andrew Burgess via Gdb
2025-01-16 17:15 ` Luis Machado via Gdb
2025-01-17 22:01 ` Tom Tromey
0 siblings, 2 replies; 16+ messages in thread
From: Andrew Burgess via Gdb @ 2025-01-16 16:40 UTC (permalink / raw)
To: Luis Machado, Stephen Brennan, Tom Tromey, Luis Machado via Gdb
Cc: linux-debuggers, Omar Sandoval, Amal Raj T
Luis Machado <luis.machado@arm.com> writes:
> On 1/16/25 10:37, Andrew Burgess wrote:
>> Stephen Brennan via Gdb <gdb@sourceware.org> writes:
>>
>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>>>>
>>>>>> To sum up, my specific questions are:
>>>>>>
>>>>>> 1. What is the maximum protocol packet size, if any?
>>>>
>>>>> It is hardcoded by gdb, but the remote can also specify that, but...
>>>>
>>>>>> 2. Would this functionality be better implemented in a single "q
>>>>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>>>>
>>>>> ... we have packets like qXfer that can handle multi-part transfers. So the
>>>>> packet size is not a critical concern anymore, and it is best to use this
>>>>> newer mechanism, if the usage fits the packet structure.
>>>>
>>>> Agreed, qXfer is the way to go.
>>>
>>> Thank you Tom & Luis for confirmation, qXfer seems appropriate. With
>>> that approach the buffer size is not really a concern: we can simply use
>>> the minimum of the requested read size, and the stub's buffer size. So
>>> long as clients use multiple requests until the data is fully read.
>>>
>>> While the "os" object also sounds like a good place to put this (e.g.
>>> within a new annex), it seems like that contains XML-formatted data with
>>> well-understood schema and semantics. The vmcoreinfo is free-form text
>>> (generally of a "key=value" format), so it probably should be a separate
>>> object.
>>>
>>> So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
>>> (But please do speak up if this sounds like a mistake)
>>>
>>>> If you're adding a new object type, a patch to the manual would be good.
>>>
>>> I'll definitely include a patch for the manual in the plan for this.
>>> Another patch I'd like to write is to allow GDB's server to expose this
>>> object type when the target is an ELF core dump with a VMCOREINFO note.
>>> We're hoping for this to be useful for all debuggers, not just drgn.
>>
>> Hi Stephen,
>>
>> I took a look at the wiki page and it seems like initially at least,
>> your plan is to make the information from vmcoreinfo available via a new
>> 'info' command.
>>
>> It is possible to send remote packets through GDB's Python API[1]. And
>> of course, the Python API allows for new commands to be created[2].
>> There is a test in GDB's test suite that makes use of the packet sending
>> API, and it happens to send a qXfer packet[3].
>>
>> I say all this not to put you off contributing a patch to core GDB, but
>> if what you want is a new user command which will send a packet to a
>> remote target and process the results, then it should be possible to
>> implement this as a Python extension.
>
> A bit off-topic, but wouldn't that have the potential to proliferate
> remote packets gdb/debugging stubs have no control over or no documentation
> to point at/refer to? Possibly contributing to greater confusion as to what
> should be minimally supported in terms of remote packets?
I don't see a problem, but that doesn't mean using an extension is the
right choice in all cases.
Using an extension, in my mind, ties the functionality to the project.
So if we expect many remote targets to support the new packet then it
begins to make more sense to move the functionality into the GDB tree
(maybe as C++ code, or maybe as Python code). But if, really, the new
feature only applies for one project, then it makes more sense (for me)
to add GDB support via a project-specific extension. Documentation
would naturally live within the project itself.
On the topic of control, I'm not entirely sure what problems you see.
Using a project-specific extension gives (IMHO) the project more control
over the packet as they are free to change the packet over time.
Currently GDB is pretty conservative with changing packets, so if a
single project pushed their own custom packet to GDB core, and then
wanted to change the packet, there's a risk we (GDB team) might refuse
the change "just in case" someone else has since started to use the
packet. That might be seen as the project having less control.
Looking at GDB's control over packets, I'm not sure what control GDB
needs. For sure, if extensions start creating their own packets then
there is a risk that GDB might later add a conflicting packet, which
could be a problem, and is one reason we might want to move from an
extension to a core packet. I doubt, right now, there are many
"custom packets" in the wild, but maybe we should consider defining
namespaces for custom packets, to avoid future conflicts?
The minimally supported problem is usually about the minimal set of
packets that a remote target needs to support, rather than the set GDB
supports. In most cases, communication is initiated by GDB, so if GDB
isn't aware of this packet (extension not loaded), GDB will just never
ask for it.
I'm certainly not making the argument that using an extension is the
right choice in every case. Not every packet type is going to be right
for implementing as an extension. Additionally, it might be that a
project starts with an extension while prototyping a new packet, and
then ports it to C++ and contributes it to GDB later once the design is
solid.
I've had success using packets via Python before, and not everyone is
aware this feature is available, so I thought I'd mention it.
Thanks,
Andrew
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 16:40 ` Andrew Burgess via Gdb
@ 2025-01-16 17:15 ` Luis Machado via Gdb
2025-01-17 22:01 ` Tom Tromey
1 sibling, 0 replies; 16+ messages in thread
From: Luis Machado via Gdb @ 2025-01-16 17:15 UTC (permalink / raw)
To: Andrew Burgess, Stephen Brennan, Tom Tromey, Luis Machado via Gdb
Cc: linux-debuggers, Omar Sandoval, Amal Raj T
On 1/16/25 16:40, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
>
>> On 1/16/25 10:37, Andrew Burgess wrote:
>>> Stephen Brennan via Gdb <gdb@sourceware.org> writes:
>>>
>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>>>>>
>>>>>>> To sum up, my specific questions are:
>>>>>>>
>>>>>>> 1. What is the maximum protocol packet size, if any?
>>>>>
>>>>>> It is hardcoded by gdb, but the remote can also specify that, but...
>>>>>
>>>>>>> 2. Would this functionality be better implemented in a single "q
>>>>>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>>>>>
>>>>>> ... we have packets like qXfer that can handle multi-part transfers. So the
>>>>>> packet size is not a critical concern anymore, and it is best to use this
>>>>>> newer mechanism, if the usage fits the packet structure.
>>>>>
>>>>> Agreed, qXfer is the way to go.
>>>>
>>>> Thank you Tom & Luis for confirmation, qXfer seems appropriate. With
>>>> that approach the buffer size is not really a concern: we can simply use
>>>> the minimum of the requested read size, and the stub's buffer size. So
>>>> long as clients use multiple requests until the data is fully read.
>>>>
>>>> While the "os" object also sounds like a good place to put this (e.g.
>>>> within a new annex), it seems like that contains XML-formatted data with
>>>> well-understood schema and semantics. The vmcoreinfo is free-form text
>>>> (generally of a "key=value" format), so it probably should be a separate
>>>> object.
>>>>
>>>> So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
>>>> (But please do speak up if this sounds like a mistake)
>>>>
>>>>> If you're adding a new object type, a patch to the manual would be good.
>>>>
>>>> I'll definitely include a patch for the manual in the plan for this.
>>>> Another patch I'd like to write is to allow GDB's server to expose this
>>>> object type when the target is an ELF core dump with a VMCOREINFO note.
>>>> We're hoping for this to be useful for all debuggers, not just drgn.
>>>
>>> Hi Stephen,
>>>
>>> I took a look at the wiki page and it seems like initially at least,
>>> your plan is to make the information from vmcoreinfo available via a new
>>> 'info' command.
>>>
>>> It is possible to send remote packets through GDB's Python API[1]. And
>>> of course, the Python API allows for new commands to be created[2].
>>> There is a test in GDB's test suite that makes use of the packet sending
>>> API, and it happens to send a qXfer packet[3].
>>>
>>> I say all this not to put you off contributing a patch to core GDB, but
>>> if what you want is a new user command which will send a packet to a
>>> remote target and process the results, then it should be possible to
>>> implement this as a Python extension.
>>
>> A bit off-topic, but wouldn't that have the potential to proliferate
>> remote packets gdb/debugging stubs have no control over or no documentation
>> to point at/refer to? Possibly contributing to greater confusion as to what
>> should be minimally supported in terms of remote packets?
>
> I don't see a problem, but that doesn't mean using an extension is the
> right choice in all cases.
>
> Using an extension, in my mind, ties the functionality to the project.
> So if we expect many remote targets to support the new packet then it
> begins to make more sense to move the functionality into the GDB tree
> (maybe as C++ code, or maybe as Python code). But if, really, the new
> feature only applies for one project, then it makes more sense (for me)
> to add GDB support via a project-specific extension. Documentation
> would naturally live within the project itself.
>
> On the topic of control, I'm not entirely sure what problems you see.
> Using a project specific extension gives (IMHO) the project more control
> over the packet as they are free to change the packet over time.
> Currently GDB is pretty conservative with changing packets, so if a
> single project pushed their own custom packet to GDB core, and then
> wanted to change the packet, there's a risk we (GDB team) might refuse
> the change "just in case" someone else has since started to use the
> packet. That might be seen as the project having less control.
>
> Looking at GDB's control over packets, I'm not sure what control GDB
> needs. For sure, there's a risk that if extensions start creating their
> own packets then there is a risk that GDB might later add a conflicting
> packet, which could be a problem, and one reason we might want to move
> from an extension to a core packet. I doubt, right now, there are many
> "custom packets" in the wild, but maybe we should consider defining
> namespaces for custom packets, to avoid future conflicts?
>
> The minimally supported problem is usually about the minimal set of
> packets that a remote target needs to support, rather than the set GDB
> supports. In most cases, communication is initiated by GDB, so if GDB
> isn't aware of this packet (extension not loaded), GDB will just never
> ask for it.
>
> I'm certainly not making the argument that using an extension is the
> right choice in every case. Not every packet type is going to be right
> for implementing as an extension. Additionally, it might be that a
> project starts with an extension while prototyping a new packet, and
> then ports it to C++ and contributes it to GDB later once the design is
> solid.
>
> I've had success using packets via Python before, and not everyone is
> aware this feature is available, so I thought I'd mention it.
All good points. I wasn't aware this particular feature existed to be honest.
I suppose my concerns are more towards having a packet that eventually starts
getting wider usage, and then people rely on it, but gdb doesn't know about it.
But I agree that by then we should be able to make the call to internalize
such a packet in GDB.
To be fair, given the nature of the RSP packets, it would make more sense to
have that parsed/handled via Python for simplicity/efficiency, as opposed to C/C++.
>
> Thanks,
> Andrew
>
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 10:37 ` Andrew Burgess via Gdb
2025-01-16 10:49 ` Luis Machado via Gdb
@ 2025-01-16 17:58 ` Stephen Brennan via Gdb
2025-01-23 1:11 ` Stephen Brennan via Gdb
2 siblings, 0 replies; 16+ messages in thread
From: Stephen Brennan via Gdb @ 2025-01-16 17:58 UTC (permalink / raw)
To: Andrew Burgess, Tom Tromey, Luis Machado via Gdb
Cc: Luis Machado, linux-debuggers, Omar Sandoval, Amal Raj T
Andrew Burgess <aburgess@redhat.com> writes:
> Stephen Brennan via Gdb <gdb@sourceware.org> writes:
>
>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>> Luis Machado via Gdb <gdb@sourceware.org> writes:
>>>
>>>>> To sum up, my specific questions are:
>>>>>
>>>>> 1. What is the maximum protocol packet size, if any?
>>>
>>>> It is hardcoded by gdb, but the remote can also specify that, but...
>>>
>>>>> 2. Would this functionality be better implemented in a single "q
>>>>> linux.vmcoreinfo" packet, or as a "qXfer" packet?
>>>
>>>> ... we have packets like qXfer that can handle multi-part transfers. So the
>>>> packet size is not a critical concern anymore, and it is best to use this
>>>> newer mechanism, if the usage fits the packet structure.
>>>
>>> Agreed, qXfer is the way to go.
>>
>> Thank you Tom & Luis for confirmation, qXfer seems appropriate. With
>> that approach the buffer size is not really a concern: we can simply use
>> the minimum of the requested read size, and the stub's buffer size. So
>> long as clients use multiple requests until the data is fully read.
>>
>> While the "os" object also sounds like a good place to put this (e.g.
>> within a new annex), it seems like that contains XML-formatted data with
>> well-understood schema and semantics. The vmcoreinfo is free-form text
>> (generally of a "key=value" format), so it probably should be a separate
>> object.
>>
>> So I think we would prefer to add an object type, e.g. named "vmcoreinfo".
>> (But please do speak up if this sounds like a mistake)
>>
>>> If you're adding a new object type, a patch to the manual would be good.
>>
>> I'll definitely include a patch for the manual in the plan for this.
>> Another patch I'd like to write is to allow GDB's server to expose this
>> object type when the target is an ELF core dump with a VMCOREINFO note.
>> We're hoping for this to be useful for all debuggers, not just drgn.
>
> Hi Stephen,
>
> I took a look at the wiki page and it seems like initially at least,
> your plan is to make the information from vmcoreinfo available via a new
> 'info' command.
>
> It is possible to send remote packets through GDB's Python API[1]. And
> of course, the Python API allows for new commands to be created[2].
> There is a test in GDB's test suite that makes use of the packet sending
> API, and it happens to send a qXfer packet[3].
>
> I say all this not to put you off contributing a patch to core GDB, but
> if what you want is a new user command which will send a packet to a
> remote target and process the results, then it should be possible to
> implement this as a Python extension.
>
> The benefits of using a Python API are that you can ship the extension
> with the kernel, so if/when the vmcoreinfo changes you can easily update
> the extension. At the very least, it allows you to prototype your
> commands before having to submit patches to GDB core.
Thanks for this reference information & code, it will be quite useful
for prototyping, and likely it will be of use for the GDB scripts that
already ship with the kernel.
The main plan for this is to implement the stub-side wherever possible:
kgdb, GDB, and QEMU's gdbserver, being the main places I think it would
be used.
From the client/debugger side, the Python approach may make a lot of
sense for the GDB scripts which are bundled with the kernel! I don't
necessarily know whether any users outside of the kernel would care
about an "info vmcoreinfo" command, so it does seem simpler to keep the
client code with the kernel.
The other anticipated client side would of course be drgn, which will be
adding support for the GDB Remote Protocol, and will use the vmcoreinfo
as an essential part of its startup to learn about the target's kernel.
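To illustrate the kind of startup step described above, here is a minimal sketch of turning "key=value" vmcoreinfo text into a lookup table. The function name and the sample values are hypothetical, and treating KERNELOFFSET as a hex number without a prefix is an assumption about the note's contents, not something stated in this thread.

```python
def parse_vmcoreinfo(text):
    """Parse free-form "key=value" lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if sep:
            info[key] = value
    return info


# Made-up sample data, for illustration only.
sample = "OSRELEASE=6.12.0-example\nPAGESIZE=4096\nKERNELOFFSET=1a000000\n"
info = parse_vmcoreinfo(sample)

# Assumption: the KASLR offset is written in hex without a "0x" prefix.
kaslr_offset = int(info["KERNELOFFSET"], 16)
print(info["OSRELEASE"], hex(kaslr_offset))
```

Lines without an "=" are skipped rather than rejected, since the note is free-form and a debugger probably should not fail on entries it does not understand.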
> I have included below a very simple example that implements 'info stuff'
> which just displays the 'threads' information from a remote target.
> Place the code into 'cmd.py', then from GDB 'source cmd.py', after which
> you can 'info stuff' and 'help info stuff'.
>
> If this is an approach that might be of interest then I'm happy to help
> in any way I can, just drop questions on the mailing list.
That is very kind, thank you!
Stephen
> Thanks,
> Andrew
>
>
> [1] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Connections-In-Python.html
> [2] https://sourceware.org/gdb/current/onlinedocs/gdb.html/CLI-Commands-In-Python.html
> [3] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.python/py-send-packet.py;hb=HEAD#l100
>
> --
>
> import gdb
>
>
> def bytes_to_string(byte_array):
>     # Keep printable ASCII and newlines; show other bytes as \xNN.
>     res = ""
>     for b in byte_array:
>         b = int(b)
>         if (b >= 32 and b <= 126) or (b == 10):
>             res = res + ("%c" % b)
>         else:
>             res = res + ("\\x%02x" % b)
>     return res
>
>
> def get_stuff(conn, obj):
>     assert isinstance(conn, gdb.RemoteTargetConnection)
>     start_pos = 0
>     length = 10
>     output = ""
>     while True:
>         reply = conn.send_packet("qXfer:%s:read::%d,%d" % (obj, start_pos, length))
>         reply = bytes_to_string(reply)
>         start_pos += length
>         # The first character of the reply is 'm' (more data follows)
>         # or 'l' (last chunk); the rest is the data itself.
>         c = reply[0]
>         reply = reply[1:]
>         output += reply
>         if c == "l":
>             break
>     return output
>
>
> class InfoStuffCommand (gdb.Command):
>     """
>     Usage:
>       info stuff
>
>     Shows the remote target's thread information.
>     """
>
>     def __init__ (self):
>         gdb.Command.__init__ (self, "info stuff", gdb.COMMAND_DATA)
>
>     def invoke(self, args, from_tty):
>         inf = gdb.selected_inferior()
>         conn = inf.connection
>         if not isinstance(conn, gdb.RemoteTargetConnection):
>             raise gdb.GdbError("not on a remote connection")
>         string = get_stuff(conn, "threads")
>         print("Target thread information:")
>         for line in string.splitlines():
>             print("  %s" % line)
>
> InfoStuffCommand ()
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 16:40 ` Andrew Burgess via Gdb
2025-01-16 17:15 ` Luis Machado via Gdb
@ 2025-01-17 22:01 ` Tom Tromey
1 sibling, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2025-01-17 22:01 UTC (permalink / raw)
To: Andrew Burgess
Cc: Luis Machado, Stephen Brennan, Tom Tromey, Luis Machado via Gdb,
linux-debuggers, Omar Sandoval, Amal Raj T
>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
Andrew> I doubt, right now, there are many
Andrew> "custom packets" in the wild, but maybe we should consider defining
Andrew> namespaces for custom packets, to avoid future conflicts?
Right now most of the incompatible packets in the wild are from
lldbserver. They extended the protocol but didn't discuss with gdb or
send patches to the docs.
The other major incompatibility I see in the wild is custom servers that
send weird results. I've sent a workaround or two for these kinds of
things in the past.
Anyway, for a new qXfer packet, I guess I don't see much to worry about.
Hopefully people adding such things have the wisdom to pick an unusual
name.
Andrew> I've had success using packets via Python before, and not everyone is
Andrew> aware this feature is available, so I thought I'd mention it.
It didn't cross my mind but yeah this seems like a reasonable use case.
Tom
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-16 10:37 ` Andrew Burgess via Gdb
2025-01-16 10:49 ` Luis Machado via Gdb
2025-01-16 17:58 ` Stephen Brennan via Gdb
@ 2025-01-23 1:11 ` Stephen Brennan via Gdb
2 siblings, 0 replies; 16+ messages in thread
From: Stephen Brennan via Gdb @ 2025-01-23 1:11 UTC (permalink / raw)
To: Andrew Burgess, Tom Tromey, Luis Machado via Gdb
Cc: Luis Machado, linux-debuggers, Omar Sandoval, Amal Raj T
Andrew Burgess <aburgess@redhat.com> writes:
> I have included below a very simple example that implements 'info stuff'
> which just displays the 'threads' information from a remote target.
> Place the code into 'cmd.py', then from GDB 'source cmd.py', after which
> you can 'info stuff' and 'help info stuff'.
Hi Andrew (and others)
Thank you for your initial script for the qXfer command. I've started
implementation with QEMU and I used it to great effect to help me debug
and test that. I'll soon be submitting it to the QEMU mailing list (once
I address those 32-bit and other-endian concerns).
I went ahead and updated your script a bit, mostly to my own tastes. The
only major change I made was that the arguments to the qXfer command
need to be in HEX, not decimal.
https://gist.github.com/brenns10/c936c8e7ef8193e668884ed70070ccfb
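For readers who do not follow the link, the hex change can be sketched roughly as below. This is not the code from the gist; the helper name is made up for illustration. The script quoted above formats the offset and length with "%d", which only happens to work while the decimal and hex spellings stay in step.

```python
def qxfer_read_packet(obj, offset, length, annex=""):
    """Build a qXfer read request (hypothetical helper).

    The remote protocol expects offset and length as hex numbers
    without a "0x" prefix.
    """
    return "qXfer:%s:read:%s:%x,%x" % (obj, annex, offset, length)


# Offset 100 and length 10 become "64" and "a" on the wire.
print(qxfer_read_packet("threads", 100, 10))
```

With "%d" the same request would read "100,10", which a stub parses as offset 0x100 and length 0x10.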
In any case, at this point, I would like to submit a patch to GDB to
update the manual to include this new qXfer command, since I've at least
got a client & a stub both speaking the same language on it. So I know
it's a feasible design.
I surmise that I need to update the (very large) gdb/doc/gdb.texinfo
file. But when I run commands like "make doc/gdb.dvi" to build the
resulting output, I get errors related to a missing file "gdb-cfg.texi"
(see below). I confirmed this not just on Oracle Linux but also on Arch
Linux. I'm happy to try out another distro via a container too. I have
texinfo, makeinfo, etc. installed, so it's a bit confusing.
Does anybody have any advice for building the docs?
Thanks,
Stephen
Error below:
$ make doc/gdb.dvi
texi2dvi doc/gdb.texinfo
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) (preloaded format=etex)
restricted \write18 enabled.
entering extended mode
(./doc/gdb.texinfo (/usr/share/texlive/texmf-dist/tex/texinfo/texinfo.tex
Loading texinfo [version 2019-09-20.22]: pdf, fonts, markup, glyphs,
page headings, tables, conditionals, indexing, sectioning, toc, environments,
defuns, macros, cross references, insertions,
(/usr/share/texlive/texmf-dist/tex/generic/epsf/epsf.tex
This is `epsf.tex' v2.7.4 <14 February 2011>
) localization, formatting, and turning on texinfo input format.)
./doc/gdb.texinfo:10: I can't find file `gdb-cfg.texi'.
@temp ->@input gdb-cfg.texi
@includezzz ...and @input #1 }@expandafter }@temp
@popthisfilestack
l.10 @include gdb-cfg.texi
(Press Enter to retry, or Control-D to exit)
Please type another input file name
./doc/gdb.texinfo:10: Emergency stop.
@temp ->@input gdb-cfg.texi
@includezzz ...and @input #1 }@expandafter }@temp
@popthisfilestack
l.10 @include gdb-cfg.texi
No pages of output.
Transcript written on gdb.log.
/usr/bin/texi2dvi: etex exited with bad status, quitting.
make: *** [<builtin>: doc/gdb.dvi] Error 1
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
[not found] <8734hmtfbr.fsf@oracle.com>
2025-01-14 15:03 ` GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback Luis Machado via Gdb
@ 2025-01-26 18:07 ` Thomas Weißschuh via Gdb
2025-01-27 18:13 ` Omar Sandoval
1 sibling, 1 reply; 16+ messages in thread
From: Thomas Weißschuh via Gdb @ 2025-01-26 18:07 UTC (permalink / raw)
To: Stephen Brennan; +Cc: gdb, linux-debuggers, Omar Sandoval, Amal Raj T
Hi Stephen,
On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
> I contribute to the drgn debugger [1], and work on debugging the Linux
> kernel a fair bit. Drgn is particularly well-suited to the Linux kernel
> and contains a lot of support for it. It currently supports attaching to
> live targets via Linux's /proc/kcore, and core dumps. We are looking
> into supporting remote targets via GDB's remote protocol.
>
> One piece of information that is very useful when debugging the Linux
> kernel is the VMCOREINFO note[2]. This is a free-form piece of text
> data, typically around 3k bytes, which contains information that
> debuggers would find useful in interpreting a Linux kernel memory image.
> In particular, it contains the KASLR offset, the build ID of the kernel,
> and the OS release. With this information, a debugger could attach to
> a live GDB stub (e.g. kgdb, or QEMU) without needing to specify
> debuginfo file names or KASLR memory offsets.
>
> To that end, we hope to extend the GDB remote protocol with a facility
> that would allow the debugger to request this information. We've written
> up an idea for this proposal at [3]. The summary is:
>
> 'q linux.vmcoreinfo'
> Retrieves the Linux vmcoreinfo data.
> Reply:
> 'Q [DATA]' data is encoded as described in the Binary Data doc [4]
> 'E.<text>' with an informative message if the data is not available
>
> However, with the candidate kgdb implementation taking shape [5], we're
> becoming concerned regarding this design. It seems that there is an
> implicit maximum packet size which is not described in the protocol
> documentation. Many stubs have small(ish) shared output buffers. It
> seems to me that data which would be 3k bytes before escaping is too
> large. We've noticed that there is a 'qXfer' query packet which allows
> specifying an offset and a number of bytes. Maybe it would be better for
> us to add a new 'special data area' for the 'qXfer' message, and reuse
> that command?
Do you need to transfer the full vmcoreinfo data?
Wouldn't it be sufficient to only include the address/size of the
vmcoreinfo note in memory and the debugger can read the data from there
with regular memory access commands?
That information is enough for QEMU's vmcoreinfo device.
It would simplify the design and implementation(s) significantly.
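A rough sketch of what reading the note through regular memory access could look like from the debugger side, assuming the address and size of the note are already known. The helper names, the example address and the packet size are all hypothetical. The chunk size is half the packet size because an 'm' reply carries two hex characters per byte.

```python
def m_read_requests(addr, size, packet_size):
    """Yield 'm' packets covering [addr, addr + size) in chunks.

    packet_size is assumed to be the largest reply the stub can send.
    """
    chunk = packet_size // 2
    if chunk <= 0:
        raise ValueError("packet size too small for any data")
    offset = 0
    while offset < size:
        n = min(chunk, size - offset)
        # Address and length are hex, without a "0x" prefix.
        yield "m%x,%x" % (addr + offset, n)
        offset += n


def decode_m_reply(reply):
    """Turn the hex-encoded reply of an 'm' packet back into bytes."""
    return bytes.fromhex(reply)


# Hypothetical note at 0x1000, 0x30 bytes long, 0x40-byte packets.
for pkt in m_read_requests(0x1000, 0x30, 0x40):
    print(pkt)
```

A real client would also need to cope with error replies and with replies shorter than requested.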
Thomas
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-26 18:07 ` Thomas Weißschuh via Gdb
@ 2025-01-27 18:13 ` Omar Sandoval
2025-01-27 18:42 ` Stephen Brennan via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Omar Sandoval @ 2025-01-27 18:13 UTC (permalink / raw)
To: Thomas Weißschuh; +Cc: Stephen Brennan, gdb, linux-debuggers, Amal Raj T
On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
> Hi Stephen,
>
> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
> > I contribute to the drgn debugger [1], and work on debugging the Linux
> > kernel a fair bit. Drgn is particularly well-suited to the Linux kernel
> > and contains a lot of support for it. It currently supports attaching to
> > live targets via Linux's /proc/kcore, and core dumps. We are looking
> > into supporting remote targets via GDB's remote protocol.
> >
> > One piece of information that is very useful when debugging the Linux
> > kernel is the VMCOREINFO note[2]. This is a free-form piece of text
> > data, typically around 3k bytes, which contains information that
> > debuggers would find useful in interpreting a Linux kernel memory image.
> > In particular, it contains the KASLR offset, the build ID of the kernel,
> > and the OS release. With this information, a debugger could attach to
> > a live GDB stub (e.g. kgdb, or QEMU) without needing to specify
> > debuginfo file names or KASLR memory offsets.
> >
> > To that end, we hope to extend the GDB remote protocol with a facility
> > that would allow the debugger to request this information. We've written
> > up an idea for this proposal at [3]. The summary is:
> >
> > 'q linux.vmcoreinfo'
> > Retrieves the Linux vmcoreinfo data.
> > Reply:
> > 'Q [DATA]' data is encoded as described in the Binary Data doc [4]
> > 'E.<text>' with an informative message if the data is not available
> >
> > However, with the candidate kgdb implementation taking shape [5], we're
> > becoming concerned regarding this design. It seems that there is an
> > implicit maximum packet size which is not described in the protocol
> > documentation. Many stubs have small(ish) shared output buffers. It
> > seems to me that data which would be 3k bytes before escaping is too
> > large. We've noticed that there is a 'qXfer' query packet which allows
> > specifying an offset and a number of bytes. Maybe it would be better for
> > us to add a new 'special data area' for the 'qXfer' message, and reuse
> > that command?
>
> Do you need to transfer the full vmcoreinfo data?
> Wouldn't it be sufficient to only include the address/size of the
> vmcoreinfo note in memory and the debugger can read the data from there
> with regular memory access commands?
> That information is enough for QEMU's vmcoreinfo device.
> It would simplify the design and implementation(s) significantly.
Oh, that's a good idea. I guess the downside is an extra command round
trip.
QEMU and KGDB also only implement the `m` command for reading
hex-encoded memory. We'd probably want to implement the `x` command for
both since it doesn't (usually) double the transfer size.
One more caveat is that KGDB doesn't check the length passed to the `m`
command and will happily clobber memory... QEMU's gdbstub does seem to
check, and also advertises its packet size, so we probably want that in
KGDB, too.
Stephen, what do you think?
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-27 18:13 ` Omar Sandoval
@ 2025-01-27 18:42 ` Stephen Brennan via Gdb
2025-01-27 22:40 ` Thomas Weißschuh via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Brennan via Gdb @ 2025-01-27 18:42 UTC (permalink / raw)
To: Omar Sandoval, Thomas Weißschuh; +Cc: gdb, linux-debuggers, Amal Raj T
Omar Sandoval <osandov@osandov.com> writes:
> On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
>> Hi Stephen,
>> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
>> > I contribute to the drgn debugger [1], and work on debugging the Linux
>> > kernel a fair bit. Drgn is particularly well-suited to the Linux kernel
>> > and contains a lot of support for it. It currently supports attaching to
>> > live targets via Linux's /proc/kcore, and core dumps. We are looking
>> > into supporting remote targets via GDB's remote protocol.
>> >
>> > One piece of information that is very useful when debugging the Linux
>> > kernel is the VMCOREINFO note[2]. This is a free-form piece of text
>> > data, typically around 3k bytes, which contains information that
>> > debuggers would find useful in interpreting a Linux kernel memory image.
>> > In particular, it contains the KASLR offset, the build ID of the kernel,
>> > and the OS release. With this information, a debugger could attach to
>> > a live GDB stub (e.g. kgdb, or QEMU) without needing to specify
>> > debuginfo file names or KASLR memory offsets.
>> >
>> > To that end, we hope to extend the GDB remote protocol with a facility
>> > that would allow the debugger to request this information. We've written
>> > up an idea for this proposal at [3]. The summary is:
>> >
>> > 'q linux.vmcoreinfo'
>> > Retrieves the Linux vmcoreinfo data.
>> > Reply:
>> > 'Q [DATA]' data is encoded as described in the Binary Data doc [4]
>> > 'E.<text>' with an informative message if the data is not available
>> >
>> > However, with the candidate kgdb implementation taking shape [5], we're
>> > becoming concerned regarding this design. It seems that there is an
>> > implicit maximum packet size which is not described in the protocol
>> > documentation. Many stubs have small(ish) shared output buffers. It
>> > seems to me that data which would be 3k bytes before escaping is too
>> > large. We've noticed that there is a 'qXfer' query packet which allows
>> > specifying an offset and a number of bytes. Maybe it would be better for
>> > us to add a new 'special data area' for the 'qXfer' message, and reuse
>> > that command?
>>
>> Do you need to transfer the full vmcoreinfo data?
>> Wouldn't it be sufficient to only include the address/size of the
>> vmcoreinfo note in memory and the debugger can read the data from there
>> with regular memory access commands?
>> That information is enough for QEMU's vmcoreinfo device.
>> It would simplify the design and implementation(s) significantly.
I do agree that this would simplify the protocol design, and it's a good
idea I hadn't considered. Thank you!
> Oh, that's a good idea. I guess the downside is an extra command round
> trip.
>
> QEMU and KGDB also only implement the `m` command for reading
> hex-encoded memory. We'd probably want to implement the `x` command for
> both since it doesn't (usually) double the transfer size.
>
> One more caveat is that KGDB doesn't check the length passed to the `m`
> command and will happily clobber memory... QEMU's gdbstub does seem to
> check, and also advertises its packet size, so we probably want that in
> KGDB, too.
>
> Stephen, what do you think?
One of our core constraints is that the vmcoreinfo is quite a bit of
text data, around 3k bytes right now. Doubling that to 6k would be a
real problem, so it would be really helpful to keep this to using the
'x' encoding. However, I think there's a couple good reasons that
neither QEMU nor KGDB support the 'x' packet:
1. You cannot know the required buffer size easily, so you cannot know
whether you will be able to satisfy an 'x' request in your static buffer
size. This is most relevant to kgdb. The 'x' packet documentation says:
The reply may contain fewer addressable memory units than requested if
the server was reading from a trace frame memory and was able to read
only part of the region of memory.
I'm not certain what trace frame memory is, but the vmcoreinfo is not
that. I'm not confident whether this means the protocol allows fewer
bytes to be returned for other cases? Certainly, the 'qXfer' packets
explicitly allow for fewer bytes to be returned, as the documentation
states:
It is possible for data to have fewer bytes than the length in the
request.
2. The 'x' encoding permits NUL bytes to be transmitted in a packet.
Thus, adding support to an existing stub has a hidden pitfall that you
need to ensure you can transmit NUL bytes successfully. You could
circumvent the problem by escaping them, but since NUL bytes are
typically quite common, that eliminates the benefit of the binary
encoding. (You could do run-length encoding, but I'm not actually sure
how common it is to do run-length encoding for escaped characters -- is
it tested anywhere? Certainly, neither kgdb nor QEMU implement
run-length encoding today).
Currently, kgdb implicitly assumes that the buffer is NUL-terminated,
and it would be quite a bit of work for it to track the number of bytes
in the response rather than rely on NUL-termination. It seems that QEMU
is in a better state for that: it uses a GString for the response, and
according to the docs[1], it can hold arbitrary binary data.
[1]: https://docs.gtk.org/glib/struct.String.html
So while I think the cleanest design would be the one that simply
transmits the address of the vmcoreinfo note, the most practical one,
with the protocol as it is today, seems to be implementing the 'qXfer'
packet described in the other branch of this thread. For that, #1 is not
an issue (because we can just return fewer bytes than requested), and #2
is not an issue because vmcoreinfo is text data (which can't be
guaranteed for the general 'x' packet implementation).
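To make point #1 concrete, here is a minimal sketch of a client loop that tolerates short qXfer replies by advancing the offset by what was actually received. The object name "vmcoreinfo" is the one proposed earlier in the thread; `send_packet` stands in for whatever transport the client has, and `fake_stub` is a made-up stand-in for a stub with a small output buffer. Run-length encoding is not handled.

```python
def unescape(payload):
    """Undo the protocol's binary escaping: 0x7d, then the byte XOR 0x20."""
    out = bytearray()
    it = iter(payload)
    for b in it:
        if b == 0x7d:
            b = next(it) ^ 0x20
        out.append(b)
    return bytes(out)


def read_object(send_packet, obj, annex="", chunk=0x100):
    """Read a whole qXfer object, accepting fewer bytes than requested."""
    data = bytearray()
    offset = 0
    while True:
        reply = send_packet("qXfer:%s:read:%s:%x,%x" % (obj, annex, offset, chunk))
        marker, payload = reply[:1], unescape(reply[1:])
        if marker not in (b"m", b"l"):
            raise RuntimeError("unexpected reply: %r" % reply)
        if marker == b"m" and not payload:
            raise RuntimeError("stub made no progress")
        data += payload
        offset += len(payload)  # advance by what arrived, not what was asked
        if marker == b"l":      # 'l' marks the last chunk
            return bytes(data)


def fake_stub(blob, max_reply):
    """Hypothetical stub that never returns more than max_reply bytes.

    It does no escaping, so blob must not contain '#', '$', '}' or '*'.
    """
    def send_packet(pkt):
        off, length = (int(x, 16) for x in pkt.rsplit(":", 1)[1].split(","))
        part = blob[off:off + min(length, max_reply)]
        marker = b"l" if off + len(part) >= len(blob) else b"m"
        return marker + part
    return send_packet


blob = b"OSRELEASE=6.12.0-example\nKERNELOFFSET=1a000000\n"
print(read_object(fake_stub(blob, 7), "vmcoreinfo") == blob)
```

The client asks for 0x100 bytes each time, yet the stub's seven-byte limit is what ends up controlling the chunking.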
Thanks,
Stephen
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-27 18:42 ` Stephen Brennan via Gdb
@ 2025-01-27 22:40 ` Thomas Weißschuh via Gdb
2025-01-28 0:19 ` Stephen Brennan via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Thomas Weißschuh via Gdb @ 2025-01-27 22:40 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, gdb, linux-debuggers, Amal Raj T
On 2025-01-27 10:42:59-0800, Stephen Brennan wrote:
> Omar Sandoval <osandov@osandov.com> writes:
> > On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
> >> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
<snip>
> >> Do you need to transfer the full vmcoreinfo data?
> >> Wouldn't it be sufficient to only include the address/size of the
> >> vmcoreinfo note in memory and the debugger can read the data from there
> >> with regular memory access commands?
> >> That information is enough for QEMU's vmcoreinfo device.
> >> It would simplify the design and implementation(s) significantly.
>
> I do agree that this would simplify the protocol design, and it's a good
> idea I hadn't considered. Thank you!
>
> > Oh, that's a good idea. I guess the downside is an extra command round
> > trip.
> >
> > QEMU and KGDB also only implement the `m` command for reading
> > hex-encoded memory. We'd probably want to implement the `x` command for
> > both since it doesn't (usually) double the transfer size.
> >
> > One more caveat is that KGDB doesn't check the length passed to the `m`
> > command and will happily clobber memory... QEMU's gdbstub does seem to
> > check, and also advertises its packet size, so we probably want that in
> > KGDB, too.
> >
> > Stephen, what do you think?
>
> One of our core constraints is that the vmcoreinfo is quite a bit of
> text data, around 3k bytes right now. Doubling that to 6k would be a
> real problem, so it would be really helpful to keep this to using the
> 'x' encoding. However, I think there's a couple good reasons that
> neither QEMU nor KGDB support the 'x' packet:
Could you elaborate why it "would be a real problem"?
If it is a problem with static buffer sizes, wouldn't this be avoided by
configuring a max packet size? The debugger would have to request the
data in chunks.
This is also what the drgn pull request currently does for all memory reads.
Without knowing much about drgn, I would have expected the regular
memory accesses to be much more data than the 3KB vmcore notes.
> 1. You cannot know the required buffer size easily, so you cannot know
> whether you will be able to satisfy an 'x' request in your static buffer
> size. This is most relevant to kgdb. The 'x' packet documentation says:
>
> The reply may contain fewer addressable memory units than requested if
> the server was reading from a trace frame memory and was able to read
> only part of the region of memory.
>
> I'm not certain what trace frame memory is, but the vmcoreinfo is not
> that. I'm not confident whether this means the protocol allows fewer
> bytes to be returned for other cases? Certainly, the 'qXfer' packets
> explicitly allow for fewer bytes to be returned, as the documentation
> states:
>
> It is possible for data to have fewer bytes than the length in the
> request.
If I understand correctly, "trace frame" refers to some sort of cache.
It's not a property of the target memory but of how it is managed by gdb.
For m reads, gdb itself seems to explicitly allow short reads, also
referring (not exclusively) to trace frames.
Server:
/* Read trace frame or inferior memory. Returns the number of bytes
actually read, zero when no further transfer is possible, and -1 on
error. Return of a positive value smaller than LEN does not
indicate there's no more to be read, only the end of the transfer.
E.g., when GDB reads memory from a traceframe, a first request may
be served from a memory block that does not cover the whole request
length. A following request gets the rest served from either
another block (of the same traceframe) or from the read-only
regions. */
static int
gdb_read_memory (CORE_ADDR memaddr, unsigned char *myaddr, int len)
...
Client:
target_xfer_status
remote_target::remote_read_bytes_1 (CORE_ADDR memaddr, gdb_byte *myaddr,
ULONGEST len_units,
int unit_size, ULONGEST *xfered_len_units)
{
...
decoded_bytes = hex2bin (p, myaddr, todo_units * unit_size);
/* Return what we have. Let higher layers handle partial reads. */
*xfered_len_units = (ULONGEST) (decoded_bytes / unit_size);
return (*xfered_len_units != 0) ? TARGET_XFER_OK : TARGET_XFER_EOF;
}
<snip>
Thomas
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-27 22:40 ` Thomas Weißschuh via Gdb
@ 2025-01-28 0:19 ` Stephen Brennan via Gdb
2025-01-29 21:16 ` Thomas Weißschuh via Gdb
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Brennan via Gdb @ 2025-01-28 0:19 UTC (permalink / raw)
To: Thomas Weißschuh; +Cc: Omar Sandoval, gdb, linux-debuggers, Amal Raj T
Thomas Weißschuh <thomas@t-8ch.de> writes:
> On 2025-01-27 10:42:59-0800, Stephen Brennan wrote:
>> Omar Sandoval <osandov@osandov.com> writes:
>> > On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
>> >> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
>
> <snip>
>
>> >> Do you need to transfer the full vmcoreinfo data?
>> >> Wouldn't it be sufficient to only include the address/size of the
>> >> vmcoreinfo note in memory and the debugger can read the data from there
>> >> with regular memory access commands?
>> >> That information is enough for QEMU's vmcoreinfo device.
>> >> It would simplify the design and implementation(s) significantly.
>>
>> I do agree that this would simplify the protocol design, and it's a good
>> idea I hadn't considered. Thank you!
>>
>> > Oh, that's a good idea. I guess the downside is an extra command round
>> > trip.
>> >
>> > QEMU and KGDB also only implement the `m` command for reading
>> > hex-encoded memory. We'd probably want to implement the `x` command for
>> > both since it doesn't (usually) double the transfer size.
>> >
>> > One more caveat is that KGDB doesn't check the length passed to the `m`
>> > command and will happily clobber memory... QEMU's gdbstub does seem to
>> > check, and also advertises its packet size, so we probably want that in
>> > KGDB, too.
>> >
>> > Stephen, what do you think?
>>
>> One of our core constraints is that the vmcoreinfo is quite a bit of
>> text data, around 3k bytes right now. Doubling that to 6k would be a
>> real problem, so it would be really helpful to keep this to using the
>> 'x' encoding. However, I think there's a couple good reasons that
>> neither QEMU nor KGDB support the 'x' packet:
>
> Could you elaborate why it "would be a real problem"?
> If it is a problem with static buffer sizes, wouldn't this be avoided by
> configuring a max packet size?
> The debugger would have to request the
> data in chunks.
>
> This is also what the drgn pull request currently does for all memory reads.
> Without knowing much about drgn, I would have expected the regular
> memory accesses to be much more data than the 3KB vmcore notes.
I'm sorry, I should have said what the problem was!
The problem is more to do with transmission time than buffer size. For
the worst case 9600 baud (960 bytes/second), the vmcoreinfo alone takes
~3.2s to transmit. Then you need to add in any overhead for the several
packets used reading chunk-wise. Doubling that transmission time by
encoding it with hex is something we really wanted to avoid.
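The arithmetic behind those figures can be sketched as follows. The 10 bits per byte assumes 8N1 serial framing (start bit, eight data bits, stop bit), which matches the 960 bytes/second above, and 3072 bytes stands in for "around 3k"; per-packet overhead is ignored.

```python
def transfer_seconds(payload_bytes, baud=9600, bits_per_byte=10):
    """Seconds needed to send payload_bytes over a serial line.

    bits_per_byte=10 assumes 8N1 framing; packet framing, checksums and
    acknowledgements are not counted.
    """
    return payload_bytes * bits_per_byte / baud


VMCOREINFO_BYTES = 3072  # "around 3k bytes"; the real size varies

raw = transfer_seconds(VMCOREINFO_BYTES)        # binary, nothing escaped
hexed = transfer_seconds(2 * VMCOREINFO_BYTES)  # two hex chars per byte
print("binary: %.1fs, hex: %.1fs" % (raw, hexed))
```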
But it is fair to say that drgn may request other large blocks of
memory, and without 'x' support in the stubs, we will be suffering with
that same large transmission time as well. So implementing the "x"
command in our targets would probably be helpful, regardless.
>> 1. You cannot know the required buffer size easily, so you cannot know
>> whether you will be able to satisfy an 'x' request in your static buffer
>> size. This is most relevant to kgdb. The 'x' packet documentation says:
>>
>> The reply may contain fewer addressable memory units than requested if
>> the server was reading from a trace frame memory and was able to read
>> only part of the region of memory.
>>
>> I'm not certain what trace frame memory is, but the vmcoreinfo is not
>> that. I'm not confident whether this means the protocol allows fewer
>> bytes to be returned for other cases? Certainly, the 'qXfer' packets
>> explicitly allow for fewer bytes to be returned, as the documentation
>> states:
>>
>> It is possible for data to have fewer bytes than the length in the
>> request.
>
> If I understand correctly, "trace frame" refers to some sort of cache.
> It's not a property of the target memory but of how it is managed by gdb.
>
> For m reads, gdb itself seems to explicitly allow short reads, also
> referring (not exclusively) to trace frames.
>
> Server:
>
> /* Read trace frame or inferior memory. Returns the number of bytes
> actually read, zero when no further transfer is possible, and -1 on
> error. Return of a positive value smaller than LEN does not
> indicate there's no more to be read, only the end of the transfer.
> E.g., when GDB reads memory from a traceframe, a first request may
> be served from a memory block that does not cover the whole request
> length. A following request gets the rest served from either
> another block (of the same traceframe) or from the read-only
> regions. */
>
> static int
> gdb_read_memory (CORE_ADDR memaddr, unsigned char *myaddr, int len)
> ...
>
> Client:
>
> target_xfer_status
> remote_target::remote_read_bytes_1 (CORE_ADDR memaddr, gdb_byte *myaddr,
> ULONGEST len_units,
> int unit_size, ULONGEST *xfered_len_units)
> {
> ...
>
> decoded_bytes = hex2bin (p, myaddr, todo_units * unit_size);
> /* Return what we have. Let higher layers handle partial reads. */
> *xfered_len_units = (ULONGEST) (decoded_bytes / unit_size);
> return (*xfered_len_units != 0) ? TARGET_XFER_OK : TARGET_XFER_EOF;
Ok, I suppose I'm putting too much stock in the wording of the docs.
Maybe I can update it to indicate more clearly that the number of bytes
returned may be smaller than the requested amount. The way I read it, it
sounded like it was *only* legal to return less than the requested
number of bytes in certain circumstances.
Something that confuses/bothers me about remote_read_bytes_1, is that it
uses the configured PacketSize, divided by 2, as the maximum amount it
requests from the stub - even if it will use the 'x' type packet. That
means that for 'x' packets, the buffer size would likely be
under-utilized, since most bytes are probably not escaped.
If using the 'x' type packet, and partial reads are allowed, then why
not request as much as possible, and let the stub return however much it
can fit into its buffer?
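As a rough illustration of why halving looks pessimistic for binary replies, the sketch below compares the encoded size of the same data under hex encoding and under the protocol's binary escaping. The helper names and the sample text are made up. The set of escaped bytes ('#', '$', '}', plus '*' in stub replies) is taken from the Binary Data section of the manual.

```python
# Bytes a stub must escape in a binary reply: '#', '$', '}' and '*'.
ESCAPED = {0x23, 0x24, 0x7d, 0x2a}


def escape_binary(data):
    """Apply binary escaping: 0x7d followed by the byte XOR 0x20."""
    out = bytearray()
    for b in data:
        if b in ESCAPED:
            out += bytes((0x7d, b ^ 0x20))
        else:
            out.append(b)
    return bytes(out)


def hex_encode(data):
    """Encode data as an 'm' reply would: two hex characters per byte."""
    return data.hex().encode("ascii")


sample = b"KERNELOFFSET=1a000000\n"  # made-up sample line
print(len(sample), len(escape_binary(sample)), len(hex_encode(sample)))
```

Only data made up entirely of the four escaped bytes doubles in size under binary escaping, so a reply buffer sized for N characters can usually carry close to N bytes rather than N/2.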
I'm also slightly confused at this use of PacketSize. From the
documentation:
‘PacketSize=bytes’
The remote stub can accept packets up to at least bytes in length.
GDB will send packets up to this size for bulk transfers, and will
never send larger packets. This is a limit on the data characters in
the packet, not including the frame and checksum. There is no
trailing NUL byte in a remote protocol packet; if the stub stores
packets in a NUL-terminated format, it should allow an extra byte in
its buffer for the NUL. If this stub feature is not supported, GDB
guesses based on the size of the ‘g’ packet response.
It reads like something that advertises the buffer size for the stub's
*input buffer*, so that GDB will never send a packet too big for the
stub.
But here GDB is using it to limit the number of bytes it requests from
the stub. I understand that many stubs may share an input & output
buffer, so that makes a certain sense. But if partial reads are allowed,
again, I would expect GDB to request whatever it wants, and let the stub
limit its output by what fits into its output buffer, and that could
control the chunking rather than the negotiation process here. That way
we could reduce round-trips. Would that approach be legal in the
protocol?
----
I've now realized what may be the most important reason to use the qXfer
command rather than simply returning the memory address of the note.
QEMU only knows the physical address of the note. It offers a
non-standard extension to the protocol which allows switching from
virtual addresses into physical address mode.
KGDB knows the physical and virtual address of the note, obviously.
However, it has no support for reading physical addresses.
We can't have QEMU transmit the virtual address, since it does not know
it. And it's pointless for KGDB to transmit the physical address, since
it offers no way to read it. I suppose we could have them transmit
whichever they support, with some sort of flag to differentiate, and
then rely on the debugger to use the QEMU extensions to enter physical
memory mode to get it. But that seems like the most complex option of
all.
Thanks,
Stephen
* Re: GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback
2025-01-28 0:19 ` Stephen Brennan via Gdb
@ 2025-01-29 21:16 ` Thomas Weißschuh via Gdb
0 siblings, 0 replies; 16+ messages in thread
From: Thomas Weißschuh via Gdb @ 2025-01-29 21:16 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, gdb, linux-debuggers, Amal Raj T
On 2025-01-27 16:19:41-0800, Stephen Brennan wrote:
> Thomas Weißschuh <thomas@t-8ch.de> writes:
> > On 2025-01-27 10:42:59-0800, Stephen Brennan wrote:
> >> Omar Sandoval <osandov@osandov.com> writes:
> >> > On Sun, Jan 26, 2025 at 07:07:47PM +0100, Thomas Weißschuh wrote:
> >> >> On 2025-01-13 16:22:00-0800, Stephen Brennan wrote:
> >
> > <snip>
> >
> >> >> Do you need to transfer the full vmcoreinfo data?
> >> >> Wouldn't it be sufficient to only include the address/size of the
> >> >> vmcoreinfo note in memory and the debugger can read the data from there
> >> >> with regular memory access commands?
> >> >> That information is enough for QEMU's vmcoreinfo device.
> >> >> It would simplify the design and implementation(s) significantly.
> >>
> >> I do agree that this would simplify the protocol design, and it's a good
> >> idea I hadn't considered. Thank you!
> >>
> >> > Oh, that's a good idea. I guess the downside is an extra command round
> >> > trip.
> >> >
> >> > QEMU and KGDB also only implement the `m` command for reading
> >> > hex-encoded memory. We'd probably want to implement the `x` command for
> >> > both since it doesn't (usually) double the transfer size.
> >> >
> >> > One more caveat is that KGDB doesn't check the length passed to the `m`
> >> > command and will happily clobber memory... QEMU's gdbstub does seem to
> >> > check, and also advertises its packet size, so we probably want that in
> >> > KGDB, too.
> >> >
> >> > Stephen, what do you think?
> >>
> >> One of our core constraints is that the vmcoreinfo is quite a bit of
> >> text data, around 3k bytes right now. Doubling that to 6k would be a
> >> real problem, so it would be really helpful to keep this to using the
> >> 'x' encoding. However, I think there's a couple good reasons that
> >> neither QEMU nor KGDB support the 'x' packet:
> >
> > Could you elaborate why it "would be a real problem"?
> > If it is a problem with static buffer sizes, wouldn't this be avoided by
> > configuring a max packet size?
>
> > The debugger would have to request the
> > data in chunks.
> >
> > This is also what the drgn pull request currently does for all memory reads.
> > Without knowing much about drgn, I would have expected the regular
> > memory accesses to be much more data than the 3KB vmcore notes.
>
> I'm sorry, I should have said what the problem was!
> The problem is more to do with transmission time than buffer size. For
> the worst case 9600 baud (960 bytes/second), the vmcoreinfo alone takes
> ~3.2s to transmit. Then you need to add in any overhead for the several
> packets used reading chunk-wise. Doubling that transmission time by
> encoding it with hex is something we really wanted to avoid.
Agreed.
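
For reference, the arithmetic quoted above can be reproduced with a small sketch. It assumes 10 bits on the wire per byte (8N1 framing), which is what gives 960 bytes/second at 9600 baud, and it takes "around 3k bytes" to mean 3072 bytes. Packet framing and per-chunk round trips are ignored, and the names are illustrative.

```python
def transfer_seconds(payload_bytes, baud=9600, bits_per_byte=10, expansion=1.0):
    """Estimate the time to push a payload over a serial line.

    `expansion` is the encoding overhead: 1.0 for (unescaped) binary,
    2.0 for hex encoding. `bits_per_byte=10` assumes 8N1 framing.
    """
    bytes_per_second = baud / bits_per_byte
    return payload_bytes * expansion / bytes_per_second


VMCOREINFO_SIZE = 3 * 1024  # assumed size, "around 3k bytes"

binary_time = transfer_seconds(VMCOREINFO_SIZE)
hex_time = transfer_seconds(VMCOREINFO_SIZE, expansion=2.0)

print(f"binary: {binary_time:.1f}s, hex: {hex_time:.1f}s")
# prints "binary: 3.2s, hex: 6.4s"
```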
How many of the vmcoreinfo fields are needed on the normal fastpath?
There is also 'qSearch' which could be used to directly find the string
KERNELOFFSET= and only read that location.
Note: qSearch is currently unimplemented in qemu/kgdb/gdbstub, but
implementing it may be easier than the custom encoding scheme.
Also it could be used by a wider range of users.
Or for now it could be left out, letting users fall back to full reads.
Then at some point it can be backfilled as a performance optimization,
without breaking the protocol.
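
To illustrate the fast-path idea, here is a minimal sketch of pulling a single field out of vmcoreinfo text. It assumes the note body is newline-separated KEY=VALUE lines and that KERNELOFFSET is hexadecimal without a 0x prefix; the sample data is invented and the helper name is hypothetical. The `find` call only stands in for what a memory search on the stub side would locate.

```python
def parse_vmcoreinfo(text: str) -> dict:
    """Split vmcoreinfo text into a dict, skipping lines without '='."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key] = value
    return fields


# Invented sample, only shaped like real vmcoreinfo.
sample = (
    "OSRELEASE=6.12.0-example\n"
    "BUILD-ID=0123456789abcdef\n"
    "PAGESIZE=4096\n"
    "KERNELOFFSET=1a000000\n"
)

# Full read: fetch everything, then parse.
fields = parse_vmcoreinfo(sample)
kaslr_offset = int(fields["KERNELOFFSET"], 16)  # assumed hex, no prefix

# Search-style read: locate the key and read only that one line.
needle = "KERNELOFFSET="
start = sample.find(needle) + len(needle)
end = sample.index("\n", start)
searched_offset = int(sample[start:end], 16)

print(hex(kaslr_offset), hex(searched_offset))
```

Both paths give the same offset. The second one only needs the few bytes after the match, which is the point of searching first.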
> But it is fair to say that drgn may request other large blocks of
> memory, and without 'x' support in the stubs, we will be suffering with
> that same large transmission time as well. So implementing the "x"
> command in our targets would probably be helpful, regardless.
I am curious if the current drgn PR would be usable with 9600 baud.
> >> 1. You cannot know the required buffer size easily, so you cannot know
> >> whether you will be able to satisfy an 'x' request in your static buffer
> >> size. This is most relevant to kgdb. The 'x' packet documentation says:
> >>
> >> The reply may contain fewer addressable memory units than requested if
> >> the server was reading from a trace frame memory and was able to read
> >> only part of the region of memory.
> >>
> >> I'm not certain what trace frame memory is, but the vmcoreinfo is not
> >> that. I'm not confident whether this means the protocol allows fewer
> >> bytes to be returned for other cases? Certainly, the 'qXfer' packets
> >> explicitly allow for fewer bytes to be returned, as the documentation
> >> states:
> >>
> >> It is possible for data to have fewer bytes than the length in the
> >> request.
> >
> > If I understand correctly "trace frame" refers to some sort of cache.
> > It's not a property of the target memory but how it is managed by gdb.
> >
> > For m reads, gdb itself explicitly seems to allow short reads, also
> > referring (not exclusively) to trace frames.
> >
> > Server:
> >
> > /* Read trace frame or inferior memory. Returns the number of bytes
> > actually read, zero when no further transfer is possible, and -1 on
> > error. Return of a positive value smaller than LEN does not
> > indicate there's no more to be read, only the end of the transfer.
> > E.g., when GDB reads memory from a traceframe, a first request may
> > be served from a memory block that does not cover the whole request
> > length. A following request gets the rest served from either
> > another block (of the same traceframe) or from the read-only
> > regions. */
> >
> > static int
> > gdb_read_memory (CORE_ADDR memaddr, unsigned char *myaddr, int len)
> > ...
> >
> > Client:
> >
> > target_xfer_status
> > remote_target::remote_read_bytes_1 (CORE_ADDR memaddr, gdb_byte *myaddr,
> > ULONGEST len_units,
> > int unit_size, ULONGEST *xfered_len_units)
> > {
> > ...
> >
> > decoded_bytes = hex2bin (p, myaddr, todo_units * unit_size);
> > /* Return what we have. Let higher layers handle partial reads. */
> > *xfered_len_units = (ULONGEST) (decoded_bytes / unit_size);
> > return (*xfered_len_units != 0) ? TARGET_XFER_OK : TARGET_XFER_EOF;
>
> Ok, I suppose I'm putting too much stock in the wording of the docs.
>
> Maybe I can update it to indicate more clearly that the number of bytes
> returned may be smaller than the requested amount. The way I read it, it
> sounded like it was *only* legal to return less than the requested
> number of bytes in certain circumstances.
>
> Something that confuses/bothers me about remote_read_bytes_1, is that it
> uses the configured PacketSize, divided by 2, as the maximum amount it
> requests from the stub - even if it will use the 'x' type packet. That
> means that for 'x' packets, the buffer size would likely be
> under-utilized, since most bytes are probably not escaped.
>
> If using the 'x' type packet, and partial reads are allowed, then why
> not request as much as possible, and let the stub return however much it
> can fit into its buffer?
>
> I'm also slightly confused at this use of PacketSize. From the
> documentation:
>
> ‘PacketSize=bytes’
>
> The remote stub can accept packets up to at least bytes in length.
> GDB will send packets up to this size for bulk transfers, and will
> never send larger packets. This is a limit on the data characters in
> the packet, not including the frame and checksum. There is no
> trailing NUL byte in a remote protocol packet; if the stub stores
> packets in a NUL-terminated format, it should allow an extra byte in
> its buffer for the NUL. If this stub feature is not supported, GDB
> guesses based on the size of the ‘g’ packet response.
>
> It reads like something that advertises the buffer size for the stub's
> *input buffer*, so that GDB will never send a packet too big for the
> stub.
I agree with your confusion but can't give any insights.
Some clearer docs do sound nice.
> But here GDB is using it to limit the number of bytes it requests from
> the stub. I understand that many stubs may share an input & output
> buffer, so that makes a certain sense. But if partial reads are allowed,
> again, I would expect GDB to request whatever it wants, and let the stub
> limit its output by what fits into its output buffer, and that could
> control the chunking rather than the negotiation process here. That way
> we could reduce round-trips. Would that approach be legal in the
> protocol?
Sounds like valid reasoning. I can't give advice on the legal parts,
though :-)
> ----
>
> I've now realized what may be the most important reason to use the qXfer
> command rather than simply returning the memory address of the note.
>
> QEMU only knows the physical address of the note. It offers a
> non-standard extension to the protocol which allows switching from
> virtual addresses into physical address mode.
>
> KGDB knows the physical and virtual address of the note, obviously.
> However, it has no support for reading physical addresses.
>
> We can't have QEMU transmit the virtual address, since it does not know
> it.
flatview_translate()/address_space_map() could do this I think.
This is also what cpu_physical_memory_read() uses under the hood.
> And it's pointless for KGDB to transmit the physical address, since
> it offers no way to read it. I suppose we could have them transmit
> whichever they support, with some sort of flag to differentiate, and
> then rely on the debugger to use the QEMU extensions to enter physical
> memory mode to get it. But that seems like the most complex option of
> all.
The GDB core protocol does not know about different kinds of addresses.
So the query should return whatever kind of address can be passed to
'm'/'x'/'qSearch' again.
But again, I'm just a bystander, thinking out loud.
Thomas
Thread overview: 16+ messages
[not found] <8734hmtfbr.fsf@oracle.com>
2025-01-14 15:03 ` GDB Remote Protocol Extension - Linux VMCOREINFO - Request for Feedback Luis Machado via Gdb
2025-01-14 17:15 ` Tom Tromey
2025-01-14 17:39 ` Stephen Brennan via Gdb
2025-01-16 10:37 ` Andrew Burgess via Gdb
2025-01-16 10:49 ` Luis Machado via Gdb
2025-01-16 16:40 ` Andrew Burgess via Gdb
2025-01-16 17:15 ` Luis Machado via Gdb
2025-01-17 22:01 ` Tom Tromey
2025-01-16 17:58 ` Stephen Brennan via Gdb
2025-01-23 1:11 ` Stephen Brennan via Gdb
2025-01-26 18:07 ` Thomas Weißschuh via Gdb
2025-01-27 18:13 ` Omar Sandoval
2025-01-27 18:42 ` Stephen Brennan via Gdb
2025-01-27 22:40 ` Thomas Weißschuh via Gdb
2025-01-28 0:19 ` Stephen Brennan via Gdb
2025-01-29 21:16 ` Thomas Weißschuh via Gdb