* RE: Checkpoint-restart with different code
@ 2005-12-13 19:58 Prateek Saxena
0 siblings, 0 replies; 6+ messages in thread
From: Prateek Saxena @ 2005-12-13 19:58 UTC (permalink / raw)
To: greg; +Cc: gdb, drow
Hi,
I have a stable kernel patch for Linux 2.6.14.3 and Linux 2.4.31 that
allows a process to mmap any region of the tracee's process address
space, through the /proc/[pid]/mem file.
It allows the debugger/tracer to read and modify all parts of the
program memory of the traced process. You could use this interface to
modify the code memory being read by the traced process, by simply
mmapping the code/stack/data page into the checkpointer's memory and
writing to it.
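For anyone who wants to experiment before the patch is available, here
is a rough sketch. It is not the patched mmap interface described
above. It assumes a recent mainline Linux kernel with procfs mounted,
where /proc/[pid]/mem can usually be accessed with pread/pwrite by the
process itself (or by a tracer that is allowed to ptrace the target),
and it only pokes at its own address space through /proc/self/mem. The
names MEM_PATH, peek, poke and demo are made up for illustration.

```python
import ctypes
import os

MEM_PATH = "/proc/self/mem"  # assumption: Linux procfs is mounted


def peek(fd, address, length):
    """Read `length` bytes at virtual address `address` through the mem file."""
    return os.pread(fd, length, address)


def poke(fd, address, data):
    """Write `data` at virtual address `address` through the mem file."""
    return os.pwrite(fd, data, address)


def demo():
    """Patch a buffer in this process via /proc/self/mem.

    Returns (before, after), or None if the file cannot be used here.
    """
    # An 11-byte buffer (10 characters plus a trailing NUL) that we own.
    buf = ctypes.create_string_buffer(b"checkpoint")
    address = ctypes.addressof(buf)
    try:
        fd = os.open(MEM_PATH, os.O_RDWR)
    except OSError:
        return None  # no procfs, or access denied by the sandbox
    try:
        before = peek(fd, address, 10)
        poke(fd, address, b"CHECK")  # overwrite the first five bytes
        after = buf.raw[:10]         # re-read through the normal pointer
    except OSError:
        return None
    finally:
        os.close(fd)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The file offset passed to pread/pwrite is simply the virtual address in
the target, so a tracer would use /proc/<pid>/mem with addresses taken
from /proc/<pid>/maps in the same way.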
I have tested this with some large tracers and with the fsx test suite.
The patches should be available for download in the next week or so. I
can send the link as soon as I upload them.
-- Prateek Saxena
Department of Computer Science,
Stony Brook University.
Greg Bronevetsky wrote:
> I see that there's been some discussion on this list on checkpointing
> techniques that may be included in gdb. My research group at Cornell is
> working on a number of such checkpointers for both sequential and
> parallel programs and we recently decided to try a more challenging
> variant of checkpointing where the user can take a checkpoint of their
> program, modify their source code a bit (add remove stack variables,
> move function calls around a bit and a few other things) and then resume
> computation using the modified code. This seems to be very useful for
> debugging long-running applications since the user would be able to work
> around the bug without losing a week's or month's worth of results. (can
> happen in high-performance computing) Similarly, its useful for
> situations where your execution is in some particularly buggy corner
> case and you want to keep making modifications and trying them out
> without having to guide the program's execution back into that corner
> case after every code change.
>
> My question is, has anybody heard of anything that can do this?
> Obviously, this kind of checkpointing would require compiler support, so
> gdb wouldn't have done this, but have you heard of any systems/research
> that has addressed this question? Thanks.
>
> --
> Greg Bronevetsky
> 490 Rhodes Hall
* Checkpoint-restart with different code
@ 2005-12-09 21:58 Greg Bronevetsky
2005-12-09 22:08 ` Daniel Jacobowitz
0 siblings, 1 reply; 6+ messages in thread
From: Greg Bronevetsky @ 2005-12-09 21:58 UTC (permalink / raw)
To: gdb
I see that there's been some discussion on this list on checkpointing
techniques that may be included in gdb. My research group at Cornell is
working on a number of such checkpointers for both sequential and
parallel programs and we recently decided to try a more challenging
variant of checkpointing where the user can take a checkpoint of their
program, modify their source code a bit (add or remove stack variables,
move function calls around a bit and a few other things) and then resume
computation using the modified code. This seems to be very useful for
debugging long-running applications since the user would be able to work
around the bug without losing a week's or month's worth of results (which
can happen in high-performance computing). Similarly, it's useful for
situations where your execution is in some particularly buggy corner
case and you want to keep making modifications and trying them out
without having to guide the program's execution back into that corner
case after every code change.
My question is, has anybody heard of anything that can do this?
Obviously, this kind of checkpointing would require compiler support, so
gdb wouldn't have done this, but have you heard of any systems/research
that has addressed this question? Thanks.
--
Greg Bronevetsky
490 Rhodes Hall
Cornell University
* Re: Checkpoint-restart with different code
2005-12-09 21:58 Greg Bronevetsky
@ 2005-12-09 22:08 ` Daniel Jacobowitz
2005-12-10 1:36 ` Randolph Chung
2005-12-10 20:34 ` Greg Bronevetsky
0 siblings, 2 replies; 6+ messages in thread
From: Daniel Jacobowitz @ 2005-12-09 22:08 UTC (permalink / raw)
To: Greg Bronevetsky; +Cc: gdb
On Fri, Dec 09, 2005 at 04:57:57PM -0500, Greg Bronevetsky wrote:
> I see that there's been some discussion on this list on checkpointing
> techniques that may be included in gdb. My research group at Cornell is
> working on a number of such checkpointers for both sequential and
> parallel programs and we recently decided to try a more challenging
> variant of checkpointing where the user can take a checkpoint of their
> program, modify their source code a bit (add remove stack variables,
> move function calls around a bit and a few other things) and then resume
> computation using the modified code. This seems to be very useful for
> debugging long-running applications since the user would be able to work
> around the bug without losing a week's or month's worth of results. (can
> happen in high-performance computing) Similarly, its useful for
> situations where your execution is in some particularly buggy corner
> case and you want to keep making modifications and trying them out
> without having to guide the program's execution back into that corner
> case after every code change.
>
> My question is, has anybody heard of anything that can do this?
> Obviously, this kind of checkpointing would require compiler support, so
> gdb wouldn't have done this, but have you heard of any systems/research
> that has addressed this question? Thanks.
Better: I know at least one production debug environment which supports
this - Apple's Xcode. The option is called fix-and-continue. I don't
think they combine it with checkpointing, though, only as an action on
a running process. It's partly compiler-based and partly in their
debug environment.
Merging that with Michael's fork-based code would be fairly
straightforward, I expect.
--
Daniel Jacobowitz
CodeSourcery, LLC
* Re: Checkpoint-restart with different code
2005-12-09 22:08 ` Daniel Jacobowitz
@ 2005-12-10 1:36 ` Randolph Chung
2005-12-10 20:34 ` Greg Bronevetsky
1 sibling, 0 replies; 6+ messages in thread
From: Randolph Chung @ 2005-12-10 1:36 UTC (permalink / raw)
To: gdb
>> My question is, has anybody heard of anything that can do this?
>> Obviously, this kind of checkpointing would require compiler support, so
>> gdb wouldn't have done this, but have you heard of any systems/research
>> that has addressed this question? Thanks.
>
> Better: I know at least one production debug environment which supports
> this - Apple's Xcode. The option is called fix-and-continue. I don't
> think they combine it with checkpointing, though, only as an action on
> a running process. It's partly compiler-based and partly in their
> debug environment.
>
> Merging that with Michael's fork-based code would be fairly
> straightforward, I expect.
HP's wdb (which is derived from gdb) claims to support this with HP
compilers. I have not tried it though.
http://www.hp.com/go/wdb
randolph
* Re: Checkpoint-restart with different code
2005-12-09 22:08 ` Daniel Jacobowitz
2005-12-10 1:36 ` Randolph Chung
@ 2005-12-10 20:34 ` Greg Bronevetsky
2005-12-11 3:53 ` Daniel Jacobowitz
1 sibling, 1 reply; 6+ messages in thread
From: Greg Bronevetsky @ 2005-12-10 20:34 UTC (permalink / raw)
Cc: gdb
Thanks for your help! I've looked around some more and it looks like a
number of debuggers provide some form of this functionality. However,
they all seem to be binary rewriters. Does anybody know of work at the
compiler level? I can imagine that it would be possible to allow for
more flexible editing of the code if it could be recompiled from the source.
--
Greg Bronevetsky
Daniel Jacobowitz wrote:
>On Fri, Dec 09, 2005 at 04:57:57PM -0500, Greg Bronevetsky wrote:
>
>
>>I see that there's been some discussion on this list on checkpointing
>>techniques that may be included in gdb. My research group at Cornell is
>>working on a number of such checkpointers for both sequential and
>>parallel programs and we recently decided to try a more challenging
>>variant of checkpointing where the user can take a checkpoint of their
>>program, modify their source code a bit (add remove stack variables,
>>move function calls around a bit and a few other things) and then resume
>>computation using the modified code. This seems to be very useful for
>>debugging long-running applications since the user would be able to work
>>around the bug without losing a week's or month's worth of results. (can
>>happen in high-performance computing) Similarly, its useful for
>>situations where your execution is in some particularly buggy corner
>>case and you want to keep making modifications and trying them out
>>without having to guide the program's execution back into that corner
>>case after every code change.
>>
>>My question is, has anybody heard of anything that can do this?
>>Obviously, this kind of checkpointing would require compiler support, so
>>gdb wouldn't have done this, but have you heard of any systems/research
>>that has addressed this question? Thanks.
>>
>>
>
>Better: I know at least one production debug environment which supports
>this - Apple's Xcode. The option is called fix-and-continue. I don't
>think they combine it with checkpointing, though, only as an action on
>a running process. It's partly compiler-based and partly in their
>debug environment.
>
>Merging that with Michael's fork-based code would be fairly
>straightforward, I expect.
* Re: Checkpoint-restart with different code
2005-12-10 20:34 ` Greg Bronevetsky
@ 2005-12-11 3:53 ` Daniel Jacobowitz
0 siblings, 0 replies; 6+ messages in thread
From: Daniel Jacobowitz @ 2005-12-11 3:53 UTC (permalink / raw)
To: Greg Bronevetsky; +Cc: gdb
On Sat, Dec 10, 2005 at 03:34:47PM -0500, Greg Bronevetsky wrote:
> Thanks for your help! I've looked around some more and it looks like a
> number of debuggers provide some form of this functionality. However,
> they all seem to be binary rewriters. Does anybody know of work at the
> compiler level? I can imagine that it would be possible to allow for
> more flexible editing of the code if it could be recompiled from the source.
That's what fix-and-continue is. The debugger inserts trampolines in
the old function which redirect to the new function.
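To make the trampoline idea concrete, here is a toy model. It is not
how Xcode or gdb implement fix-and-continue; it only illustrates the
redirection. It assumes a made-up instruction set (add, mul, jmp, ret)
and a Python list standing in for code memory. The names run and
install_trampoline are hypothetical.

```python
def run(memory, entry, arg):
    """Interpret toy instructions starting at index `entry`.

    `arg` is loaded into the accumulator; the accumulator is returned
    when a "ret" instruction is reached.
    """
    pc, acc = entry, arg
    while True:
        op, operand = memory[pc]
        if op == "add":
            acc += operand
            pc += 1
        elif op == "mul":
            acc *= operand
            pc += 1
        elif op == "jmp":
            pc = operand  # unconditional jump to another address
        elif op == "ret":
            return acc
        else:
            raise ValueError(f"unknown instruction {op!r} at {pc}")


def install_trampoline(memory, old_entry, new_entry):
    """Overwrite the first instruction of the old function with a jump
    to the new function. Returns the displaced instruction so that the
    patch could be undone later."""
    saved = memory[old_entry]
    memory[old_entry] = ("jmp", new_entry)
    return saved


if __name__ == "__main__":
    memory = [
        ("add", 1), ("ret", None),  # old function at address 0: x + 1
        ("add", 2), ("ret", None),  # recompiled function at address 2: x + 2
    ]
    print(run(memory, 0, 10))       # caller uses the old entry point
    install_trampoline(memory, 0, 2)
    print(run(memory, 0, 10))       # same entry point, now redirected
```

Callers keep using the old entry address; only the first slot of the
old function changes, which is why code that already holds a pointer
to the old function picks up the recompiled version.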
--
Daniel Jacobowitz
CodeSourcery, LLC