From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-31952-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 12962 invoked by alias); 27 May 2008 21:33:49 -0000
Received: (qmail 12948 invoked by uid 22791); 27 May 2008 21:33:47 -0000
X-Spam-Check-By: sourceware.org
Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.4)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 27 May 2008 21:33:30 +0000
Received: (qmail 21810 invoked from network); 27 May 2008 21:33:27 -0000
Received: from unknown (HELO orlando.local) (pedro@127.0.0.2)   by mail.codesourcery.com with ESMTPA; 27 May 2008 21:33:27 -0000
From: Pedro Alves <pedro@codesourcery.com>
To: gdb@sourceware.org
Subject: multi-process remote protocol extensions
Date: Thu, 29 May 2008 17:18:00 -0000
User-Agent: KMail/1.9.9
MIME-Version: 1.0
Content-Type: text/plain;   charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200805272233.24078.pedro@codesourcery.com>
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2008-05/txt/msg00170.txt.bz2

Hi,

We'd like to add to GDB the possibility to debug multiple processes
simultaneously.  The first target we'll be working with is the remote
target.  The OS we're working with happens to have the convenient
property that, although each process has its own address space, all
processes see all the loaded code and data at the same addresses.
This allows us to implement multi-process support incrementally,
before taking care of multiple symbol tables.  After this, we'll be
adding support for non-stop debugging to the remote protocol as well.

Unfortunately, the remote protocol, as currently defined, can't cope
with multi-process support properly, because it leaves the process id
of the debuggee implicit.

What follows are proposals and questions on a direction the protocol
could take to support multi-process debugging.  The intention is not
to design a new protocol, but to extend the current protocol.  This is
not a finished design, and we'd very much like to have early community
input on this.

If you think this is entirely the wrong way to extend the protocol say
so; if you think this is the right direction, say so; if you have any
comment whatsoever, please speak up.  If you think you know the answer
to any of the questions below, please speak up.

Thank you!

I've more or less split the proposal into these parts:

 1 - General notes
 2 - Reporting multi-process support
 3 - Thread IDs
 4 - Stop Reply Packets 
 5 - Interrupting/Stopping threads and processes.
 6 - Detaching and killing processes
 7 - Breakpoints
 8 - Resuming and stepping
 9 - What's left out

----------------------------------------------------------------------

Part 1 - General notes

 Multi-process support makes sense when the stub can "attach" or "run"
 processes on demand.  Attaching and running are only supported on the
 extended-remote target, so multi-process extensions apply only to an
 extended-remote capable stub.

----------------------------------------------------------------------

Part 2 - Reporting multi-process support

 Multi-process debugging support should be reported both by GDB and
 the stub using the qSupported mechanism.  The feature name reported
 should be "multiprocess+".
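
 As a rough illustration of the check above, here is a minimal sketch
 of how either side could decide whether multi-process mode is on.
 The function names are made up for this example, and it assumes the
 usual qSupported item syntax (`name+', `name-', `name?',
 `name=value', separated by `;').

```python
def parse_qsupported(reply):
    """Split a qSupported body into a {feature: value} dict.

    'name+' -> True, 'name-' -> False, 'name?' -> None,
    'name=value' -> 'value'.
    """
    features = {}
    for item in reply.split(";"):
        if not item:
            continue
        if "=" in item:
            name, value = item.split("=", 1)
            features[name] = value
        elif item[-1] in "+-?":
            features[item[:-1]] = {"+": True, "-": False, "?": None}[item[-1]]
    return features


def multiprocess_enabled(gdb_features, stub_reply):
    """Multi-process mode is on only if both sides reported multiprocess+."""
    stub = parse_qsupported(stub_reply)
    return "multiprocess+" in gdb_features and stub.get("multiprocess") is True
```

 For example, multiprocess_enabled(["multiprocess+"],
 "PacketSize=3fff;multiprocess+") is true, while a stub reply that
 lacks the feature leaves GDB using plain thread ids.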

----------------------------------------------------------------------

Part 3 - Thread IDs

 GDB currently identifies threads using 'ptid_t' values.  A ptid_t is
 a three-member structure containing a process ID, lightweight process
 ID (for the kernel-level thread within the process), and a thread ID
 (for the thread library's identifier).  This means that GDB is
 already carrying the information needed to find the process to which
 a given thread belongs.

 Unfortunately, the remote protocol as currently defined only allows
 passing a thread id number.  This means that when the stub reports an
 event in a thread, it doesn't tell GDB which process the thread
 belongs to; when GDB wants to resume a thread in a given process,
 there's no way to specify which process.  The same goes for reading
 registers, etc.

 As an extension for multi-process support in the remote protocol, it
 is proposed that wherever the protocol specifies a thread or
 thread-id field, in multi-process mode, either the stub or GDB can
 specify a pid as well.  The format will be pid.tid, where it was just
 tid before.  The general format for specifying a thread is then
 extended

 from:

  thread-id := tid
  tid := -1 | hex-digits

 to:
 
  thread-id := (pid.)?tid
  pid := -1 | hex-digits
  tid := -1 | hex-digits

 Process ids are passed as hex strings, in the same format as thread
 ids are passed currently.

 E.g.:"123a.2"

 To specify the whole process, the form below is used:

 "pid.-1"

 Where the protocol already specifies that a given packet takes a pid
 parameter to identify a process, not a thread id, the dot `.' form is
 not used.  E.g. this packet is *not* changed:

  `vAttach;pid' Attach to a new process with the specified process ID

 To specify all processes, and all threads, the `-1.-1' form is used.
 Specifying all processes, and a specific thread is an error (e.g.,
 -1.14).
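
 To make the grammar and the error case concrete, here is a small
 sketch of a parser and formatter for the proposed (pid.)?tid form.
 The names (ALL, parse_thread_id, format_thread_id) are invented for
 the example; returning None for an omitted pid is just one possible
 convention.

```python
ALL = -1  # the "-1" wildcard from the grammar


def _parse_id(text):
    """Parse one grammar element: '-1' or a run of hex digits."""
    if text == "-1":
        return ALL
    if not text or any(c not in "0123456789abcdefABCDEF" for c in text):
        raise ValueError("bad id: %r" % text)
    return int(text, 16)


def parse_thread_id(text):
    """Parse '(pid.)?tid' into (pid, tid); pid is None when omitted."""
    if "." in text:
        pid_text, tid_text = text.split(".", 1)
        pid, tid = _parse_id(pid_text), _parse_id(tid_text)
        if pid == ALL and tid != ALL:
            # e.g. "-1.14": all processes, but one specific thread
            raise ValueError("specific thread in all processes: %r" % text)
        return pid, tid
    return None, _parse_id(text)


def format_thread_id(pid, tid):
    """Inverse of parse_thread_id."""
    def one(value):
        return "-1" if value == ALL else "%x" % value
    return one(tid) if pid is None else one(pid) + "." + one(tid)
```

 With this, "123a.2" parses to process 0x123a, thread 2; "12.-1" names
 all threads of process 0x12; and "-1.14" is rejected.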

 Looking at the description of the H packet:

  `H c t'
  
  Set thread for subsequent operations (`m', `M', `g', `G', et al.). c
  depends on the operation to be performed: it should be `c' for step
  and continue operations, `g' for other operations. The thread
  designator t may be `-1', meaning all the threads, a thread number,
  or `0' which means pick any thread.

 With this extension, the thread specified with Hg once again becomes
 the stub-side equivalent of inferior_ptid on the GDB side.

 The extension allows r/w memory, and r/w register packets to continue
 working unaltered, e.g.:

  `g' Read general registers. 

  `G XX...' Write general registers.

  `m addr,length'
    Read length bytes of memory starting at address addr

  `M addr,length:XX...'
    Write length bytes of memory starting at address addr.

  `p n' Read the value of register n;

  `P n...=r...'
    Write register n... with value r....

In addition, and as examples, these packets below also accept or
return the new pid.tid format, because they already handled a thread
id:

 `qGetTLSAddr:thread-id,offset,lm'

 `qfThreadInfo', `qsThreadInfo' and its reply:
       `m id,id...' - a comma-separated list of thread ids 

 `qC'
    Return the current thread id.

 `qThreadExtraInfo,id'

    Obtain a printable string description of a thread's attributes
    from the target OS. id is a thread-id in big-endian hex. This
    string may contain anything that the target OS thinks is
    interesting for GDB to tell the user about the thread. The string
    is displayed in GDB's info threads display. Some examples of
    possible thread extra info strings are `Runnable', or `Blocked on
    Mutex'.

 `T XX'
    Find out if the thread XX is alive. 

 `vCont[;action[:tid]]...'
   Resume the inferior, specifying different actions for each thread.

----------------------------------------------------------------------

Part 4 - Stop Reply Packets 

 The reply packets must also be extended to report the process ID of
 the thread that got the event.

 The T stop reply packet is defined currently like so:

  `T AA n1:r1;n2:r2;...'
      The program received signal number AA (a two-digit hexadecimal
   number).  This is equivalent to an `S' response, except that the
   `n:r' pairs can carry values of important registers and other
   information directly in the stop reply packet, reducing round-trip
   latency. Single-step and breakpoint traps are reported this
   way. Each `n:r' pair is interpreted as follows:

    * If n is a hexadecimal number, it is a register number, and the
      corresponding r gives that register's value. r is a series of
      bytes in target byte order, with each byte given by a two-digit
      hex number.

    * If n is `thread', then r is the thread process ID, in hex.

  The multi-process extension allows r to take the (pid.)?tid form
  when n is `thread'.

  E.g.:
   `T 05 thread:12.2'
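
  As a sketch of what the receiving side would do with such a reply,
  the function below pulls the signal, the optional pid, the tid and
  any register values out of a T packet.  The function name and the
  returned tuple are made up for the example, and the spaces in the
  packet (used here for readability only) are simply dropped.

```python
def parse_stop_reply(packet):
    """Parse a 'T AA n1:r1;n2:r2;...' stop reply.

    Returns (signal, pid, tid, registers).  pid is None if the stub
    sent a bare tid; registers maps register number to raw bytes in
    target byte order.
    """
    body = packet.replace(" ", "")
    if not body.startswith("T") or len(body) < 3:
        raise ValueError("not a T stop reply: %r" % packet)
    signal = int(body[1:3], 16)
    pid = tid = None
    registers = {}
    for pair in body[3:].split(";"):
        if not pair:
            continue
        name, _, value = pair.partition(":")
        if name == "thread":
            pid_text, dot, tid_text = value.rpartition(".")
            pid = int(pid_text, 16) if dot else None
            tid = int(tid_text, 16)
            continue
        try:
            regnum = int(name, 16)
        except ValueError:
            continue  # unknown, non-register key: skip it
        registers[regnum] = bytes.fromhex(value)
    return signal, pid, tid, registers
```

  A stub that does not know about the extension keeps sending a bare
  tid, which this sketch reports with pid set to None.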


 The W and X reply packets have a different problem: they don't
 currently specify any thread or pid, so they can't be extended as
 easily.  These are the current descriptions:

  `W AA'
    The process exited, and AA is the exit status. This is only
    applicable to certain targets.

  `X AA'
    The process terminated with signal AA.

 In this case, I believe it is better to add a new stop reply packet:

  `vExited;why:id(:value)?'

     The why's currently defined would be the same as the W and X stop
     replies, with the extension that the value field isn't limited to
     two nibbles.

     `vExited;W:pid:exit_code'

       W - process pid exited normally with exit status stored in
       `exit_code'

     `vExited;X:pid:signal'

       X - process pid was terminated by a signal; the signal number
       is recorded in `signal'
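
     A possible decoder for the proposed packet is sketched below.  It
     assumes that both the pid and the value are sent as hex digits
     (the pid as elsewhere in this proposal; the value by analogy with
     W and X), which is not settled above.  The function name is
     invented for the example.

```python
def parse_vexited(packet):
    """Parse 'vExited;why:id(:value)?' into (why, pid, value).

    why is 'W' (normal exit) or 'X' (killed by a signal); value is
    None when the optional field is absent.
    """
    prefix, sep, rest = packet.partition(";")
    if prefix != "vExited" or not sep:
        raise ValueError("not a vExited packet: %r" % packet)
    fields = rest.split(":")
    if len(fields) not in (2, 3) or fields[0] not in ("W", "X"):
        raise ValueError("malformed vExited packet: %r" % packet)
    why = fields[0]
    pid = int(fields[1], 16)  # assumption: hex, like other pids
    value = int(fields[2], 16) if len(fields) == 3 else None
    return why, pid, value
```

     Since the value is no longer limited to two nibbles, something
     like `vExited;W:12:100' would report exit status 0x100 for
     process 0x12.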

----------------------------------------------------------------------

Part 5 - Interrupting/Stopping threads and processes.

In both non-stop and multi-process modes, interrupting the remote with
^C or BREAK is not sufficiently accurate.  We need to be able to
specify a process or thread to interrupt.

What would be the preferred way to do this?  Assuming the (pid.)?tid
form is accepted, should we extend vCont with a new action, or
implement a new packet entirely?

----------------------------------------------------------------------

Part 6 - Detaching and killing processes

The detach and kill packets are described in the manual as follows:

 `D' Detach GDB from the remote system.

   Sent to the remote target before GDB disconnects via the
   detach command.

 `k' Kill request.

    FIXME: There is no description of how to operate when a specific
    thread context has been selected (i.e. does 'k' kill only that
    thread?).

 Both suffer from the problem that they don't specify which
 process should be killed or detached.

 What is the preferred way to fix this?

 I see three options:

   a) add new arguments to these packets, that are allowed by
      a multi-process aware stub.  If GDB is talking to a
      non multi-process aware stub, it doesn't send the extra
      arguments.

   b) Do not add arguments, but instead specify that the process
      to detach from is specified with the extended Hg packet.
   
   c) Specify new `vDetach pid' and `vKill pid' packets.

 Given the FIXME on the description of the `k' packet, I'm somewhat
 inclined to specify new packets.

----------------------------------------------------------------------

Part 7 - Breakpoints

 Currently, the breakpoint packets are documented like so:

  `z type,addr,length'
  `Z type,addr,length'
  
    Insert (`Z') or remove (`z') a type breakpoint or watchpoint
    starting at address addr and covering the next length bytes.

  The types currently defined are:

  0 - memory breakpoint
  1 - hardware breakpoint
  2 - write watchpoint
  3 - read watchpoint
  4 - access watchpoint

  Even today, support for thread specific breakpoints is missing on
  these packets.  There are stubs/OSs/debug APIs that can handle
  thread specific breakpoints internally, so GDB doesn't have to
  thread hop over those breakpoints.

  The other issue is that, with multi-process support, there's no way
  to specify to which process a breakpoint applies, or whether the
  breakpoint applies to all processes.  We're working with an OS where
  all processes see the full address space, so it makes sense to have
  global breakpoints.

  Since current stubs totally disregard the current thread (set by
  Hg) when setting a breakpoint with the `Z' packets, it may not be
  safe to reuse Hg for per-process and per-thread breakpoints.

  OTOH, extending the packet format as,

  `z type,addr,length,tid'

  .. should be somewhat safe, in the following manner:

  - If the stub misparses it by ignoring the new parameter, and sets
    a global breakpoint, GDB/infrun.c will still handle thread hops,
    so it's as if nothing changed.

  - If the stub replies with an empty response, to signal that it
    doesn't support the packet, GDB retries without the `,tid' field,
    and makes sure not to send the new packet format to the stub
    again.  Everything goes back to what it was.

  - If the stub replies with an error, to signal that it understood
    the packet but saw a format error, GDB retries without the `,tid'
    field, and makes sure not to send the new packet format to the
    stub again.  Everything goes back to what it was.

  In multi-process, tid would take the (pid.)?tid form, so we can
  specify the process and thread to which the breakpoint applies:
  pid.-1 for the whole process, pid.tid for a single thread, -1.-1 for
  all attached processes and threads, and empty for all processes,
  even the ones not attached.
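
  The retry behaviour on the GDB side could look roughly like the
  sketch below.  Everything here is hypothetical: the class name, the
  send callback (packet string in, reply string out), and the choice
  to represent "no tid field" as None.  It treats any reply other than
  OK to the extended form as a reason to fall back, as outlined in the
  list of cases.

```python
class BreakpointSender:
    """Sends `Z' packets, dropping the `,tid' field once a stub rejects it."""

    def __init__(self, send):
        self.send = send           # hypothetical transport callback
        self.tid_supported = True  # optimistic until the stub says otherwise

    def insert(self, bp_type, addr, length, scope=None):
        """scope is a '(pid.)?tid' string, or None to omit the field."""
        base = "Z%x,%x,%x" % (bp_type, addr, length)
        if self.tid_supported and scope is not None:
            reply = self.send(base + "," + scope)
            if reply == "OK":
                return reply
            # Empty reply (unsupported) or Enn (format error):
            # never send the extended form to this stub again.
            self.tid_supported = False
        return self.send(base)
```

  Against a stub that only understands the three-field form, the first
  insert costs one extra round trip, and later inserts go straight to
  the old format.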

  If you think that extending the breakpoint packets is still not
  safe, then I propose a new similar packet:

  vBreak type,addr,length,tid

  ... with the same semantics I just described.


----------------------------------------------------------------------

Part 8 - Resuming and stepping

  vCont was devised to overcome limitations in the Hc/s/S/c/C packets,
  but it is an optional packet.

  Applying the same pid.tid extension to Hc would allow s, S, c and C
  to work in multi-process mode, but *requiring* support for vCont in
  multi-process mode seems a better solution than perpetuating some
  broken packets.  For this reason, we propose requiring vCont support
  in a multi-process aware stub.
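
  For illustration, here is a small sketch of building a vCont packet
  whose thread fields use the pid.tid form from Part 3.  The helper
  name is made up, and the thread ids are passed in as already
  formatted strings.

```python
def build_vcont(actions):
    """Build 'vCont[;action[:tid]]...' from (action, thread_id) pairs.

    thread_id is an already formatted '(pid.)?tid' string, or None for
    an action that carries no thread field.
    """
    if not actions:
        raise ValueError("vCont needs at least one action")
    parts = ["vCont"]
    for action, thread_id in actions:
        parts.append(action if thread_id is None else action + ":" + thread_id)
    return ";".join(parts)
```

  For instance, build_vcont([("s", "12.2"), ("c", "12.-1")]) yields
  `vCont;s:12.2;c:12.-1', that is, step thread 2 of process 0x12 and
  continue the remaining threads of that process.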

----------------------------------------------------------------------

Part 9 - What's left out

 Left out of this proposal are Tracepoint and File-IO packets.

-- 
Pedro Alves