Towards multiprocess GDB

Mirror of the gdb mailing list
 help / color / mirror / Atom feed

* Towards multiprocess GDB
@ 2008-07-18 20:50 Stan Shebs
  2008-07-18 21:21 ` Paul Koning
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-18 20:50 UTC (permalink / raw)
  To: gdb

CodeSourcery has a project to add "multiprocess" capability to GDB,
and with this message I'd like to kick off some discussion of what
that means and how to make it happen.

To put it simply, the goal of the project is to make this command work
in some useful way:

  gdb prog1 prog2 pid2 prog3 prog4

As the command suggests, we're talking about multiple programs or
executables being controlled by a single GDB, in contrast to a single
program with multiple processes or forks, a la Michael's machinery for
Linux forks. So although we often use the term "multiprocess", it's
perhaps more precise to call it "multiprogram" or "multiexec" GDB.

The first thing is to figure out is how this should all work for the GDB
user. The command above seems like a pretty obvious extension;
programmers debugging client/server pairs have long wanted to be able
to do just that, and we've always had to tell them they have to start
up two GDBs and juggle. Since both core files and process ids are
distinguishable from executable names, we can allow intermixing of all
these on the command line, so that multiple core file and pid
arguments apply to the preceding executables.

Once the debugger is started, but before any of the programs have run,
it seems obvious to have a notion of "current program", so that commands
like "list main" and "break main" will work as usual. The user should
have a way to list the programs ("info programs") and a way to set the
current one. It might also make sense to have a menu option a la C++,
so that "list client_only_fn" works irrespective of the current program,
while "list main" might ask the user which main() is wanted. Another
possibility is to introduce a "program apply <names>|all <cmd>" on the
analogy of the existing thread apply command.

Notice that we don't really want to use the term "process" for any of
this so far, because nothing is running yet and there are no
processes/inferiors; this part is all about the symbol side.

Commands like "file" should get an alternate form or behavior that
adds rather than replaces. Conversely, the user will also want a way
to take programs out of the debugging session. This is not quite the
same as detaching, because the user may want, say, the server to
continue doing its serving thing, but also to have the list etc
commands only work on the client code, no longer be ambiguous.

When it's time to run, the user will want the ability to run anywhere
from one to all of the programs, each with its own argument list. It
should be possible to do this with a single command, so that the user
isn't scrambling to put in all the run commands quickly enough.

Once programs are running, execution control should work in a fashion
generally analogous to what we have for threads now. When something is
stopped, it needs to report program, process, and thread; step and
continue will need a way to specify the program or process being
stepped or continued. User-friendliness suggests that program name
should be accepted as a synonym for process id if there is only one
process for the executable.

Data display will need to have some way to identify the process from
which the data is being taken. It may be useful to have a "process
apply" for print commands, so that the output includes the value of
the expression in each process, especially useful for values that are
expected to be the same in each.

Implementationwise, we will need to replace the single exec target
with a list of execs, and modify symbol machinery to support a
many-many relationship between programs and symbol tables. Although
my inclination is to create a new symbol table for each process'
image of each shared library, that may be excessively expensive.

In addition to thoughts on desired user interface, I would welcome
suggestions on how to add this feature incrementally; the abovementioned
bits are a lot to add all at once!

Stan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
@ 2008-07-18 21:21 ` Paul Koning
  2008-07-18 21:41   ` Stan Shebs
  2008-07-21  0:34   ` Michael Snyder
  2008-07-18 22:13 ` Mark Kettenis
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Paul Koning @ 2008-07-18 21:21 UTC (permalink / raw)
  To: stanshebs; +Cc: gdb

>>>>> "Stan" == Stan Shebs <stanshebs@earthlink.net> writes:

 Stan> When it's time to run, the user will want the ability to run
 Stan> anywhere from one to all of the programs, each with its own
 Stan> argument list. It should be possible to do this with a single
 Stan> command, so that the user isn't scrambling to put in all the
 Stan> run commands quickly enough.

Nice.

Do we need a way to specify a different environment for each program?

 Stan> Implementationwise, we will need to replace the single exec
 Stan> target with a list of execs, and modify symbol machinery to
 Stan> support a many-many relationship between programs and symbol
 Stan> tables. Although my inclination is to create a new symbol table
 Stan> for each process' image of each shared library, that may be
 Stan> excessively expensive.

Indeed it might be excessive.  Avoiding it means adding the notion of
per-program offsets, so program A can map library L at offset X, while
program B maps that same library at offset Y.

 Stan> In addition to thoughts on desired user interface, I would
 Stan> welcome suggestions on how to add this feature incrementally;
 Stan> the abovementioned bits are a lot to add all at once!

You mentioned threads in passing.  Those are still needed, so you need
to allow multiple programs each of which may have multiple threads.

   paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 21:21 ` Paul Koning
@ 2008-07-18 21:41   ` Stan Shebs
  2008-07-21  0:34   ` Michael Snyder
  1 sibling, 0 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-18 21:41 UTC (permalink / raw)
  To: Paul Koning; +Cc: gdb

Paul Koning wrote:
>>>>>> "Stan" == Stan Shebs <stanshebs@earthlink.net> writes:
>>>>>>             
>
>  Stan> When it's time to run, the user will want the ability to run
>  Stan> anywhere from one to all of the programs, each with its own
>  Stan> argument list. It should be possible to do this with a single
>  Stan> command, so that the user isn't scrambling to put in all the
>  Stan> run commands quickly enough.
>
> Nice.
>
> Do we need a way to specify a different environment for each program?
>   
Yep, "path" and "set environment" and friends will all need per-program 
controls.
>  Stan> Implementationwise, we will need to replace the single exec
>  Stan> target with a list of execs, and modify symbol machinery to
>  Stan> support a many-many relationship between programs and symbol
>  Stan> tables. Although my inclination is to create a new symbol table
>  Stan> for each process' image of each shared library, that may be
>  Stan> excessively expensive.
>
> Indeed it might be excessive.  Avoiding it means adding the notion of
> per-program offsets, so program A can map library L at offset X, while
> program B maps that same library at offset Y.
>   
Yeah, I'm not yet sure whether we can do it without having to modify all 
the symbol lookup sites. It might make sense to have new symbol table 
objects that consist of process, offsets, and link to old-style symbol 
tables.
>  Stan> In addition to thoughts on desired user interface, I would
>  Stan> welcome suggestions on how to add this feature incrementally;
>  Stan> the abovementioned bits are a lot to add all at once!
>
> You mentioned threads in passing.  Those are still needed, so you need
> to allow multiple programs each of which may have multiple threads.
>   
That's right. In a way execution control is in better shape for this 
than other areas, because it's already had some hacking to facilitate 
listening to multiple issuers of events and such.

Stan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
  2008-07-18 21:21 ` Paul Koning
@ 2008-07-18 22:13 ` Mark Kettenis
  2008-07-18 22:20   ` David Daney
                     ` (4 more replies)
  2008-07-18 23:13 ` Tom Tromey
                   ` (3 subsequent siblings)
  5 siblings, 5 replies; 21+ messages in thread
From: Mark Kettenis @ 2008-07-18 22:13 UTC (permalink / raw)
  To: stanshebs; +Cc: gdb

> Date: Fri, 18 Jul 2008 13:40:08 -0700
> From: Stan Shebs <stanshebs@earthlink.net>
> 
> CodeSourcery has a project to add "multiprocess" capability to GDB,
> and with this message I'd like to kick off some discussion of what
> that means and how to make it happen.

Please remind me, why was this a desirable capability again?

Personally, I can't imagine someone would really want to do this using
the traditional gdb CLI, at least not within a single gdb instance.
I'd simply fire up two (or more) xterms and debug the processes
seperately.  One thing that could be useful though for that scenario
is the ability to hand of processes between gdb instances upon
fork/exec.

I suppose multiprocess debugging makes somewhat more sense in a
GUI-based IDE environment.  But running multiple instances of gdb
should work just as well in that case.

Adding a "multiprocess" capability to GDB is almost certainly going to
add significant complexity.  So there should be a good motivation for
it.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 22:13 ` Mark Kettenis
@ 2008-07-18 22:20   ` David Daney
  2008-07-18 22:28   ` Thiago Jung Bauermann
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: David Daney @ 2008-07-18 22:20 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: stanshebs, gdb

Mark Kettenis wrote:
>> Date: Fri, 18 Jul 2008 13:40:08 -0700
>> From: Stan Shebs <stanshebs@earthlink.net>
>>
>> CodeSourcery has a project to add "multiprocess" capability to GDB,
>> and with this message I'd like to kick off some discussion of what
>> that means and how to make it happen.
> 
> Please remind me, why was this a desirable capability again?
> 
> Personally, I can't imagine someone would really want to do this using
> the traditional gdb CLI, at least not within a single gdb instance.
> I'd simply fire up two (or more) xterms and debug the processes
> seperately.  One thing that could be useful though for that scenario
> is the ability to hand of processes between gdb instances upon
> fork/exec.
> 
> I suppose multiprocess debugging makes somewhat more sense in a
> GUI-based IDE environment.  But running multiple instances of gdb
> should work just as well in that case.
> 
> Adding a "multiprocess" capability to GDB is almost certainly going to
> add significant complexity.  So there should be a good motivation for
> it.

Here's my take on it (not being at all involved in the effort):

Suppose that you are debugging both a client and a server process.  One scenario is that you want to debug the results of some command sent between them.

Set a break point in the command handler and disable it.

In the other process set a breakpoint just before sending the command.  This breakpoint's commands could enable the break point in the other process and continue.

Doing this with a multiprocess gdb would be trivial.  If you have to use multiple gdb instances, it is either a manual process or some complicated scripting setup.

Just my $0.02,
David Daney


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 22:13 ` Mark Kettenis
  2008-07-18 22:20   ` David Daney
@ 2008-07-18 22:28   ` Thiago Jung Bauermann
  2008-07-18 23:09   ` Stan Shebs
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Thiago Jung Bauermann @ 2008-07-18 22:28 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: stanshebs, gdb

On Fri, 2008-07-18 at 23:41 +0200, Mark Kettenis wrote:
> > Date: Fri, 18 Jul 2008 13:40:08 -0700
> > From: Stan Shebs <stanshebs@earthlink.net>
> > 
> > CodeSourcery has a project to add "multiprocess" capability to GDB,
> > and with this message I'd like to kick off some discussion of what
> > that means and how to make it happen.
> 
> Please remind me, why was this a desirable capability again?

One interesting use case I've heard (and which other debuggers support)
is debugging both processes involved in a shell pipe command, like:

$ producer | consumer

So you can say something like "break when variable foo in producer is 2
and variable bar in consumer is 10".

I even saw people on the #gdb IRC channel asking for that functionality.

> Personally, I can't imagine someone would really want to do this using
> the traditional gdb CLI, at least not within a single gdb instance.

I think it wouldn't be much different a user experience than debugging a
multithreaded application in the CLI.
-- 
[]'s
Thiago Jung Bauermann
Software Engineer
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 22:13 ` Mark Kettenis
  2008-07-18 22:20   ` David Daney
  2008-07-18 22:28   ` Thiago Jung Bauermann
@ 2008-07-18 23:09   ` Stan Shebs
  2008-07-19  3:53     ` Pedro Alves
  2008-07-21  1:53   ` Michael Snyder
  2008-07-21 17:57   ` Joel Brobecker
  4 siblings, 1 reply; 21+ messages in thread
From: Stan Shebs @ 2008-07-18 23:09 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb

Mark Kettenis wrote:
>> Date: Fri, 18 Jul 2008 13:40:08 -0700
>> From: Stan Shebs <stanshebs@earthlink.net>
>>
>> CodeSourcery has a project to add "multiprocess" capability to GDB,
>> and with this message I'd like to kick off some discussion of what
>> that means and how to make it happen.
>>     
>
> Please remind me, why was this a desirable capability again?
>   
I thought everybody knew that from when I wrote it into the original GDB 
5 wishlist back in 1995 or so. :-)
> Personally, I can't imagine someone would really want to do this using
> the traditional gdb CLI, at least not within a single gdb instance.
> I'd simply fire up two (or more) xterms and debug the processes
> seperately.  One thing that could be useful though for that scenario
> is the ability to hand of processes between gdb instances upon
> fork/exec.
>   
You hit on one of the reasons right there. Even the two-xterm case 
starts to involve some juggling, where you're trying to get to the other 
xterm and continue or whatever, before the first program times out. (At 
least I always seem to fumble the mouse clicking at that key point.) And 
if you've set the first program not to time out, then maybe you've 
changed its behavior in such a way as to suppress the bug. So the single 
GDB instance gives you tighter control over the several programs.

I think there also advantages in terms of sharing history, symbols, and 
so forth; "break foo" can be a very interesting way to discover which 
program's foo() gets hits first, for foo() in a shared library.
> Adding a "multiprocess" capability to GDB is almost certainly going to
> add significant complexity.  So there should be a good motivation for
> it.
>   
I'm not sure it's going to be that big of a change actually. Once 
behavior has been directed to the right objects internally, they will do 
what they're doing now. Most symbol handling is objfile-relative for 
instance, adding a new set of objfiles for a different exec isn't going 
to affect that code.

Stan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
  2008-07-18 21:21 ` Paul Koning
  2008-07-18 22:13 ` Mark Kettenis
@ 2008-07-18 23:13 ` Tom Tromey
  2008-07-21 17:11   ` Stan Shebs
  2008-07-21  0:30 ` Michael Snyder
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Tom Tromey @ 2008-07-18 23:13 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gdb

>>>>> "Stan" == Stan Shebs <stanshebs@earthlink.net> writes:

Stan> CodeSourcery has a project to add "multiprocess" capability to GDB,
Stan> and with this message I'd like to kick off some discussion of what
Stan> that means and how to make it happen.

Awesome.

Stan> Once the debugger is started, but before any of the programs
Stan> have run, it seems obvious to have a notion of "current
Stan> program", so that commands like "list main" and "break main"
Stan> will work as usual.

Stan> Notice that we don't really want to use the term "process" for any of
Stan> this so far, because nothing is running yet and there are no
Stan> processes/inferiors; this part is all about the symbol side.

It is worth thinking about how things like "break main" will work
after processes have been made, as well.  For instance, consider
something like "set follow-fork-mode both"; after a fork, where does
'break main' apply?

This is particularly interesting if you are watching a very large tree
of processes.  It would be nice if gdb did not have eagerly to read
symbols for every program.

The HPD spec (which I know you worked on :) has some "focus" commands
for this kind of thing.  It may be worth looking at that, or looking
to see what the competition (totalview cough cough) does.  I suppose
the user could focus on a thread, a process, or a file (or sets
thereof); and have a generalized "focus-group apply" thing.

Another idea, related to something in HPD, would be to come up with a
general syntax for naming things -- like "#PID#file.c:function" (or
what have you) -- so the user can easily embed a context into a name.
(I dunno if this is useful, I'm just brainstorming a bit.)

Finally, HPD also has "barrier"s, which are like multi-process
breakpoints.  I don't know if these are interesting enough to write
into the core of gdb; it seems to me that new porcelain like this can
most easily be put over the plumbing using new commands written in
python.

Stan> Conversely, the user will also want a way to take programs out
Stan> of the debugging session. This is not quite the same as
Stan> detaching, because the user may want, say, the server to
Stan> continue doing its serving thing

I don't understand the distinction here.  With 'detach' the server
would continue serving... what am I missing?

Stan> Although my inclination is to create a new symbol table for each
Stan> process' image of each shared library, that may be excessively
Stan> expensive.

To me this sounds important to solve, though of course I have no
numbers.  I am just thinking that every program is going to be mapping
in libc, and it would be great to scale to a large number of inferiors.
FWIW the particular scenario I am interested in is running "make check"
in the gcc tree, under gdb, and having it stop only when some
sub-process gets a SEGV.  So, the question is, how does gdb react to
thousands of process creations and deletions, how eagerly does it read
and discard symbol table info, etc.

Tom


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 23:09   ` Stan Shebs
@ 2008-07-19  3:53     ` Pedro Alves
  2008-07-21 17:39       ` Stan Shebs
  0 siblings, 1 reply; 21+ messages in thread
From: Pedro Alves @ 2008-07-19  3:53 UTC (permalink / raw)
  To: gdb; +Cc: Stan Shebs, Mark Kettenis

A Friday 18 July 2008 23:27:58, Stan Shebs wrote:

> I'm not sure it's going to be that big of a change actually. Once
> behavior has been directed to the right objects internally, they will do
> what they're doing now. Most symbol handling is objfile-relative for
> instance, adding a new set of objfiles for a different exec isn't going
> to affect that code.

You seem to be thinking of the symbol handling code only.

The thing I see requiring architecting, is the target stack.

Currently, it's only layed out for the single executable + single
process case.

For starters, I'll try to give a simple example.

Imagine you're debugging simultaneously these cases below:

1) Linux native non-threaded app.  The current target is a squashed
view of:

 linux-nat    process_stratum
 exec         file_stratum

2) Linux native multi-threaded app, loads libthread_db for the thread
support.  The current target is a squashed view of:

 linux-thread-db    thread_stratum
 linux-nat          process_stratum
 exec               file_stratum

3) Linux native threaded app, but loads another thread support lib.

 linux-other-thread-db    thread_stratum
 linux-nat                process_stratum
 exec                     file_stratum

4) Linux native non-threaded app, doing reverse debugging.

 record                   record_stratum
 linux-nat                process_stratum
 exec                     file_stratum

 (There are OSs where this is more common.)

It seems to me that each process should have its own target stack
instance.

We could split target_ops in two and share the ops part
between instances, but still the target_ops rw data needs to be
per-process.

Note that I'm not even considering multi-process,multi-exec,single
remote target connection, where things get even dirtier, or even
native process + remote process debugging simultaneously.

There are other issues around the target stack that need
resolution, but this issue above seems crucial to have in
mind first.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
                   ` (2 preceding siblings ...)
  2008-07-18 23:13 ` Tom Tromey
@ 2008-07-21  0:30 ` Michael Snyder
  2008-07-21 18:21   ` Paul Pluzhnikov
  2008-07-21 18:32   ` Stan Shebs
  2008-07-21 16:38 ` Michael Eager
  2008-07-31 22:04 ` Ian Lance Taylor
  5 siblings, 2 replies; 21+ messages in thread
From: Michael Snyder @ 2008-07-21  0:30 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gdb

On Fri, 2008-07-18 at 13:40 -0700, Stan Shebs wrote:
> CodeSourcery has a project to add "multiprocess" capability to GDB,
> and with this message I'd like to kick off some discussion of what
> that means and how to make it happen.
> 
> To put it simply, the goal of the project is to make this command work
> in some useful way:
> 
>   gdb prog1 prog2 pid2 prog3 prog4
> 
> As the command suggests, we're talking about multiple programs or
> executables being controlled by a single GDB, in contrast to a single
> program with multiple processes or forks, a la Michael's machinery for
> Linux forks.

Daniel's, really, I just extended it.

This is great, it's been a long time coming.  I've had lots of
inquiries about it, and even people asking me to implement it.

Mostly I've referred them to Eclipse, which as you may know
can martial together several gdb sessions under one eclipse
session.

>  So although we often use the term "multiprocess", it's
> perhaps more precise to call it "multiprogram" or "multiexec" GDB.
> 
> The first thing is to figure out is how this should all work for the GDB
> user. The command above seems like a pretty obvious extension;
> programmers debugging client/server pairs have long wanted to be able
> to do just that, and we've always had to tell them they have to start
> up two GDBs and juggle. Since both core files and process ids are
> distinguishable from executable names, we can allow intermixing of all
> these on the command line, so that multiple core file and pid
> arguments apply to the preceding executables.

Yarg.  I can imagine wanting to debug several core files at once
(failure of a multi-process application, maybe), but mixed core
files and live processes?  I think maybe we should disallow that...
What would happen when you said "continue"?

> Once the debugger is started, but before any of the programs have run,
> it seems obvious to have a notion of "current program", so that commands
> like "list main" and "break main" will work as usual. The user should
> have a way to list the programs ("info programs") and a way to set the
> current one. It might also make sense to have a menu option a la C++,
> so that "list client_only_fn" works irrespective of the current program,
> while "list main" might ask the user which main() is wanted. Another
> possibility is to introduce a "program apply <names>|all <cmd>" on the
> analogy of the existing thread apply command.

Yep, I think we'd do well to consider the thread debugging paradigm
as a model, with this maybe as a second layer on top of that
(although users might want to be able to give commands that cross
both layers, eg. "give me the registers for process A, thread B").

In the case of threads, we have already taught GDB to maintain a
separate set of program state (and debugger state) info for each
thread.  We could do the same for processes, with the state info
set extended to include (eventually) a symbol file, environment, 
argv list etc.

> Notice that we don't really want to use the term "process" for any of
> this so far, because nothing is running yet and there are no
> processes/inferiors; this part is all about the symbol side.
> 
> Commands like "file" should get an alternate form or behavior that
> adds rather than replaces.

While preserving the present behavior, of course...   ;-)

>  Conversely, the user will also want a way
> to take programs out of the debugging session. This is not quite the
> same as detaching, because the user may want, say, the server to
> continue doing its serving thing, but also to have the list etc
> commands only work on the client code, no longer be ambiguous.

You would want to be able to kill, detach etc. on a per-process
basis of course (here I assume the process is running so I say
process explicitly).

But I echo somebody else's question -- how do you see what you
describe as being different from detach (other than by applying 
to only one of several child processes)?

> When it's time to run, the user will want the ability to run anywhere
> from one to all of the programs, each with its own argument list. It
> should be possible to do this with a single command, so that the user
> isn't scrambling to put in all the run commands quickly enough.

Yep -- while it might be a challenge to come up with a command
syntax that would allow the user to specify the argv args for all
the child processes at once, we could for instance have "set args"
commands for each of several processes we wanted to start at once,
and then a single "run all" sort of thing.

Of course, this sort of implies giving gdb some knowledge about
each of several child processes BEFORE they are actually created.

Then we could also apply some attribute to each of the (running
or potential) processes that specified something like a "run set",
if you will (and if you remember that concept from the HPDF) --
ie. I want these to run and those to not-run.

> Once programs are running, execution control should work in a fashion
> generally analogous to what we have for threads now. When something is
> stopped, it needs to report program, process, and thread; step and
> continue will need a way to specify the program or process being
> stepped or continued. User-friendliness suggests that program name
> should be accepted as a synonym for process id if there is only one
> process for the executable.

And again -- appologies that I have not closely followed the current
discussions about threads and running/not-running, but you would
probably
want to be able to specify some subset of processes-and-threads that
were considered runnable, and some that would be held in the stopped
state.

> Data display will need to have some way to identify the process from
> which the data is being taken. It may be useful to have a "process
> apply" for print commands, so that the output includes the value of
> the expression in each process, especially useful for values that are
> expected to be the same in each.

Absolutely -- although in the absence of a gui, this could rapidly
become cumbersome.  I think most of what we currently can do with
threads can extend more-or-less naturally to processes.  

I see the first big challenge as being the handling of multiple
symbol tables.

> Implementationwise, we will need to replace the single exec target
> with a list of execs, and modify symbol machinery to support a
> many-many relationship between programs and symbol tables. Although
> my inclination is to create a new symbol table for each process'
> image of each shared library, that may be excessively expensive.
> 
> In addition to thoughts on desired user interface, I would welcome
> suggestions on how to add this feature incrementally; the abovementioned
> bits are a lot to add all at once!

I am a big lover of the incremental approach.

How about this as a first increment?  Could we extend
what we now have with forks, so that GDB could individually
control the separate process ids (including threads, if they
have them), while forestalling dealing with separate symbol
files because forks can all share the same one?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 21:21 ` Paul Koning
  2008-07-18 21:41   ` Stan Shebs
@ 2008-07-21  0:34   ` Michael Snyder
  1 sibling, 0 replies; 21+ messages in thread
From: Michael Snyder @ 2008-07-21  0:34 UTC (permalink / raw)
  To: Paul Koning; +Cc: stanshebs, gdb

On Fri, 2008-07-18 at 16:50 -0400, Paul Koning wrote:

> Do we need a way to specify a different environment for each program?

Yep - environment, argv, symbols, and address space (mem caching).
Maybe auxv and TLS.  Plus debugger state (running, stepping, 
all the egs and tss stuff).



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 22:13 ` Mark Kettenis
                     ` (2 preceding siblings ...)
  2008-07-18 23:09   ` Stan Shebs
@ 2008-07-21  1:53   ` Michael Snyder
  2008-07-21 16:08     ` Russell Shaw
  2008-07-21 17:57   ` Joel Brobecker
  4 siblings, 1 reply; 21+ messages in thread
From: Michael Snyder @ 2008-07-21  1:53 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: stanshebs, gdb

On Fri, 2008-07-18 at 23:41 +0200, Mark Kettenis wrote:
> > Date: Fri, 18 Jul 2008 13:40:08 -0700
> > From: Stan Shebs <stanshebs@earthlink.net>
> > 
> > CodeSourcery has a project to add "multiprocess" capability to GDB,
> > and with this message I'd like to kick off some discussion of what
> > that means and how to make it happen.
> 
> Please remind me, why was this a desirable capability again?

Not answering for Stan, but from my experience, because
people are asking for it.  Parallel programs eventually 
go beyond threads.

For instance I've had inquiries from several router mfgrs, 
who want to be able to debug their more advanced router systems
that are running full-scale OS's (such as linux, bsd, and qnx)
on the remote side.  Also from makers of set top boxes of one
sort or another, and multi-core cpu mfgrs (though there we get
into multiple cores as well as multiple processes).

But yes, I'd be interested in hearing what kinds of use cases
CodeSourcery's customers are asking about as well...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-21  1:53   ` Michael Snyder
@ 2008-07-21 16:08     ` Russell Shaw
  0 siblings, 0 replies; 21+ messages in thread
From: Russell Shaw @ 2008-07-21 16:08 UTC (permalink / raw)
  Cc: gdb

Michael Snyder wrote:
> On Fri, 2008-07-18 at 23:41 +0200, Mark Kettenis wrote:
>>> Date: Fri, 18 Jul 2008 13:40:08 -0700
>>> From: Stan Shebs <stanshebs@earthlink.net>
>>>
>>> CodeSourcery has a project to add "multiprocess" capability to GDB,
>>> and with this message I'd like to kick off some discussion of what
>>> that means and how to make it happen.
>> Please remind me, why was this a desirable capability again?
> 
> Not answering for Stan, but from my experience, because
> people are asking for it.  Parallel programs eventually 
> go beyond threads.
> 
> For instance I've had inquiries from several router mfgrs, 
> who want to be able to debug their more advanced router systems
> that are running full-scale OS's (such as linux, bsd, and qnx)
> on the remote side.  Also from makers of set top boxes of one
> sort or another, and multi-core cpu mfgrs (though there we get
> into multiple cores as well as multiple processes).
> 
> But yes, I'd be interested in hearing what kinds of use cases
> CodeSourcery's customers are asking about as well...

Trying to contain all the dynamics of multiple processes in one gdb
doesn't suit the user interface of gdb well. Things would get pokey
and cluttered for the user.

One way would be to make multiple instances of gdb coordinate
via an IPC protocol.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
                   ` (3 preceding siblings ...)
  2008-07-21  0:30 ` Michael Snyder
@ 2008-07-21 16:38 ` Michael Eager
  2008-07-21 21:54   ` Stan Shebs
  2008-07-31 22:04 ` Ian Lance Taylor
  5 siblings, 1 reply; 21+ messages in thread
From: Michael Eager @ 2008-07-21 16:38 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gdb

Stan Shebs wrote:
> CodeSourcery has a project to add "multiprocess" capability to GDB,
> and with this message I'd like to kick off some discussion of what
> that means and how to make it happen.
> 
> To put it simply, the goal of the project is to make this command work
> in some useful way:
> 
>  gdb prog1 prog2 pid2 prog3 prog4
> 
> As the command suggests, we're talking about multiple programs or
> executables being controlled by a single GDB, in contrast to a single
> program with multiple processes or forks, a la Michael's machinery for
> Linux forks. So although we often use the term "multiprocess", it's
> perhaps more precise to call it "multiprogram" or "multiexec" GDB.

I think that this is a subset of what is actually needed.  A good
starting point, but it doesn't address present and future architectures.

I would like to see GDB become a multi-process, multi-target, multi-
architecture debugger.  There are many multi-processor systems, where
several processing units make up the target.  One example is the Cell:
a PowerPC processor connected to multiple special-purpose processors.
There are many RISC-DSP processors; there are efforts to re-purpose
Nvidia's massively parallel GPUs to more conventional programming;
and the future of processor design is multicore, not all of will be
the same architecture or connected symmetrically.  An SMP architecture
running a single Linux OS is a subset.

The target abstraction to support this needs to permit multiple
threads on multiple targets, with different architectures, different
executables, and different communication protocols.

Clearly, if this were the target abstraction, supporting multiple
processes running on the same processor under a single OS would
be easy.  A design which only focuses on this subset will not be
easy to extend to multiple dissimilar execution environments.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 23:13 ` Tom Tromey
@ 2008-07-21 17:11   ` Stan Shebs
  0 siblings, 0 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-21 17:11 UTC (permalink / raw)
  To: tromey; +Cc: gdb

Tom Tromey wrote:
> This is particularly interesting if you are watching a very large tree
> of processes.  It would be nice if gdb did not have eagerly to read
> symbols for every program.
>   
Yes, that would be good. I think that if things like breakpoint 
locations include an exec name, then inferior control need do nothing 
more than catch the fork/exec, and let the programs with nonmatching 
names continue on without further ado.
> Another idea, related to something in HPD, would be to come up with a
> general syntax for naming things -- like "#PID#file.c:function" (or
> what have you) -- so the user can easily embed a context into a name.
> (I dunno if this is useful, I'm just brainstorming a bit.)
>   
We're going to want some sort of syntax, don't want to commit to any 
particular character(s) yet. :-)
> Stan> Conversely, the user will also want a way to take programs out
> Stan> of the debugging session. This is not quite the same as
> Stan> detaching, because the user may want, say, the server to
> Stan> continue doing its serving thing
>
> I don't understand the distinction here.  With 'detach' the server
> would continue serving... what am I missing?
>   
I'm thinking one may want to flush symbol tables, so as to reduce 
ambiguity. Otherwise you could end up with more and more versions of 
main() over time, and very long selection menus in response to "break 
main", but with many or most of the entries being for programs not 
actually being debugged at the moment.
> Stan> Although my inclination is to create a new symbol table for each
> Stan> process' image of each shared library, that may be excessively
> Stan> expensive.
>
> To me this sounds important to solve, though of course I have no
> numbers.  I am just thinking that every program is going to be mapping
> in libc, and it would be great to scale to a large number of inferiors.
> FWIW the particular scenario I am interested in is running "make check"
> in the gcc tree, under gdb, and having it stop only when some
> sub-process gets a SEGV.  So, the question is, how does gdb react to
> thousands of process creations and deletions, how eagerly does it read
> and discard symbol table info, etc.
>   
That's a really great scenario to keep in mind, and totally doable I 
think, though maybe not on the first iteration. :-)

Stan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-19  3:53     ` Pedro Alves
@ 2008-07-21 17:39       ` Stan Shebs
  0 siblings, 0 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-21 17:39 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb, Stan Shebs, Mark Kettenis

Pedro Alves wrote:
> A Friday 18 July 2008 23:27:58, Stan Shebs wrote:
>
>   
>> I'm not sure it's going to be that big of a change actually. Once
>> behavior has been directed to the right objects internally, they will do
>> what they're doing now. Most symbol handling is objfile-relative for
>> instance, adding a new set of objfiles for a different exec isn't going
>> to affect that code.
>>     
>
> You seem to be thinking of the symbol handling code only.
>
> The thing I see requiring architecting, is the target stack.
>   
You make very good points about the target stack. As I understand it, 
the immediate project is merely to support multiple programs all using 
the same target, so I've been concentrating on the symbol side. But 
you're right, I think we do want the target objects to be per-program.

This connects to another longstanding wishlist item, which is to change 
target stacking to be more procedural, and less literally stacklike. 
With multiple programs, and inferiors scattered around the net, the 
target "stack" starts looking more forest-like, and it seems like it 
might be simpler to construct target objects as compositions of target 
descriptions, with the stack being more like the composition procedure 
than a literal stack.

Thinking about it from the user point of view, I wonder if we need to 
merge the "target", "run", and "attach" commands into some kind of 
super-command that amounts to "get all the inferiors ready to debug", so 
that the old commands are all special cases.

Stan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 22:13 ` Mark Kettenis
                     ` (3 preceding siblings ...)
  2008-07-21  1:53   ` Michael Snyder
@ 2008-07-21 17:57   ` Joel Brobecker
  4 siblings, 0 replies; 21+ messages in thread
From: Joel Brobecker @ 2008-07-21 17:57 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: stanshebs, gdb

> Please remind me, why was this a desirable capability again?

One case that Pedro and I discussed at the GCC Summit when he mentioned
this topic was VxWorks systems.

AdaCore recently re-implemented support for ppc/vxworks targets in GDB,
and I can post the diff if anyone is interested. I haven't done so yet,
because there are a couple of major hacks that we had to introduced and
I want to re-implement then appropriately before considering a submission.

VxWorks introduces the notion of partitions, but has no notion of
"process" - only "tasks" (aka threads). All tasks within a partition
have access to everything within that partition. However, partitions
can almost be viewed as different computers, so symbols in one partition
are completely irrelevant to other partitions.

One way to debug some of the code is to debug one task, and treat it
as a process. But the reality is that programs are often multi-threaded
and so debugging one task would only be debugging part the program.
This is why we introduced the option of debugging the entire partition
as one process.

But most of our users want to debug the entire system rather than each
individual task separately, because of inter-partition scheduling or
communication issues. This is when the fun starts, because we now have
symbols for one partition, and symbols for another partition, and we
need to remember which partition they are for.  I believe that adding
support for multi-process debugging will solve the issue above, since
each process should have its own list of objfiles and associated
symbols.

VxWorks also has the concept of shared-library partitions that any given
partition can link to, effectively giving access to the symbols there.
I think this is very comparable to shared libraries in Unix systems or
DLLs on Windows. It would be really nice if we eventually taught GDB to
recognize that two processes were using the same shared library and
thus internally loaded the symbol once.

-- 
Joel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-21  0:30 ` Michael Snyder
@ 2008-07-21 18:21   ` Paul Pluzhnikov
  2008-07-21 18:32   ` Stan Shebs
  1 sibling, 0 replies; 21+ messages in thread
From: Paul Pluzhnikov @ 2008-07-21 18:21 UTC (permalink / raw)
  To: Michael Snyder; +Cc: Stan Shebs, gdb

On Sun, Jul 20, 2008 at 5:26 PM, Michael Snyder <msnyder@specifix.com> wrote:

> I am a big lover of the incremental approach.
>
> How about this as a first increment?  Could we extend
> what we now have with forks, so that GDB could individually
> control the separate process ids (including threads, if they
> have them), while forestalling dealing with separate symbol
> files because forks can all share the same one?

I would like to add another twist (which I don't see mentioned
elsewhere in this thread yet): cluster computing.

Google (and I am sure many others) has great interest in being
able to debug potentially 1000s of instances of execution of the
same program on multiple hosts. [Naturally, each host will have at
least one gdbserver running.]

This is almost, but not quite the same as what Michael suggests
above.

-- 
Paul Pluzhnikov

[Not officially speaking for Google.]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-21  0:30 ` Michael Snyder
  2008-07-21 18:21   ` Paul Pluzhnikov
@ 2008-07-21 18:32   ` Stan Shebs
  1 sibling, 0 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-21 18:32 UTC (permalink / raw)
  To: Michael Snyder; +Cc: gdb

Michael Snyder wrote:
> Yarg.  I can imagine wanting to debug several core files at once
> (failure of a multi-process application, maybe), but mixed core
> files and live processes?  I think maybe we should disallow that...
> What would happen when you said "continue"?
>   
In practice, I think there are going to be nonsensical combinations, but 
we can just exclude them as they come up. For a first version it seems 
reasonable to only work if all the arguments use the same target, and 
then allow more variation once Pedro implements the multi-target 
capability he described. :-) If targets are per-program, then "all 
continue" could be quite a circus - no effect on cores, ptrace calls to 
local inferiors, and a spray of packets to remote targets, which are 
fortunately async, so they don't mind getting an extra continue packet 
or two. :-)
> How about this as a first increment?  Could we extend
> what we now have with forks, so that GDB could individually
> control the separate process ids (including threads, if they
> have them), while forestalling dealing with separate symbol
> files because forks can all share the same one?
>   
I want to jump on symbol files quickly, because I think it may be the 
long pole - I hope not, but we don't know for sure yet, lots of old code 
with old assumptions. Our clientele is mainly interested in embedded 
(more protocol and gdbserver hackery, whee), so native fork support 
seems most likely as either testing support or a volunteer contribution.

Stan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-21 16:38 ` Michael Eager
@ 2008-07-21 21:54   ` Stan Shebs
  0 siblings, 0 replies; 21+ messages in thread
From: Stan Shebs @ 2008-07-21 21:54 UTC (permalink / raw)
  To: Michael Eager; +Cc: gdb

Michael Eager wrote:
> I would like to see GDB become a multi-process, multi-target, multi-
> architecture debugger.
I agree that this should be the ultimate outcome, and I keep it in mind 
when thinking about specific subsystems. After all, that was one of the 
main reasons for the long and painful transition from random 
preprocessor macros to gdbarch objects. Poking around the web pages and 
wiki, I don't see anything actually saying that anywhere, although most 
GDBers I've talked to in person are in agreement. Time for a new 
long-term roadmap?

Stan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Towards multiprocess GDB
  2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
                   ` (4 preceding siblings ...)
  2008-07-21 16:38 ` Michael Eager
@ 2008-07-31 22:04 ` Ian Lance Taylor
  5 siblings, 0 replies; 21+ messages in thread
From: Ian Lance Taylor @ 2008-07-31 22:04 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gdb

Stan Shebs <stanshebs@earthlink.net> writes:

> CodeSourcery has a project to add "multiprocess" capability to GDB,
> and with this message I'd like to kick off some discussion of what
> that means and how to make it happen.
>
> To put it simply, the goal of the project is to make this command work
> in some useful way:
>
>  gdb prog1 prog2 pid2 prog3 prog4

I am certainly interested in having this happen.  I just want to add
that in the cases I most care about the different programs are
actually running on different machines.  That should fall out pretty
naturally, using gdbserver, from this work if done right; I'm just
chiming in now to make sure that that capability is considered as a
goal from the start.  Thanks.

Ian

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2008-07-31 20:03 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-18 20:50 Towards multiprocess GDB Stan Shebs
2008-07-18 21:21 ` Paul Koning
2008-07-18 21:41   ` Stan Shebs
2008-07-21  0:34   ` Michael Snyder
2008-07-18 22:13 ` Mark Kettenis
2008-07-18 22:20   ` David Daney
2008-07-18 22:28   ` Thiago Jung Bauermann
2008-07-18 23:09   ` Stan Shebs
2008-07-19  3:53     ` Pedro Alves
2008-07-21 17:39       ` Stan Shebs
2008-07-21  1:53   ` Michael Snyder
2008-07-21 16:08     ` Russell Shaw
2008-07-21 17:57   ` Joel Brobecker
2008-07-18 23:13 ` Tom Tromey
2008-07-21 17:11   ` Stan Shebs
2008-07-21  0:30 ` Michael Snyder
2008-07-21 18:21   ` Paul Pluzhnikov
2008-07-21 18:32   ` Stan Shebs
2008-07-21 16:38 ` Michael Eager
2008-07-21 21:54   ` Stan Shebs
2008-07-31 22:04 ` Ian Lance Taylor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox