Mirror of the gdb mailing list
* Checkpoint-restart with different code
@ 2005-12-09 21:58 Greg Bronevetsky
  2005-12-09 22:08 ` Daniel Jacobowitz
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Bronevetsky @ 2005-12-09 21:58 UTC
  To: gdb

I see that there's been some discussion on this list on checkpointing 
techniques that may be included in gdb. My research group at Cornell is 
working on a number of such checkpointers for both sequential and 
parallel programs and we recently decided to try a more challenging 
variant of checkpointing where the user can take a checkpoint of their 
program, modify their source code a bit (add or remove stack variables, 
move function calls around and a few other things) and then resume 
computation using the modified code. This seems very useful for 
debugging long-running applications, since the user would be able to work 
around the bug without losing a week's or month's worth of results (this 
can happen in high-performance computing). Similarly, it's useful for 
situations where your execution is in some particularly buggy corner 
case and you want to keep making modifications and trying them out 
without having to guide the program's execution back into that corner 
case after every code change.

My question is, has anybody heard of anything that can do this? 
Obviously, this kind of checkpointing would require compiler support, so 
gdb wouldn't have done this, but have you heard of any systems/research 
that has addressed this question? Thanks.

-- 
Greg Bronevetsky
490 Rhodes Hall
Cornell University


* RE: Checkpoint-restart with different code
@ 2005-12-13 19:58 Prateek Saxena
  0 siblings, 0 replies; 6+ messages in thread
From: Prateek Saxena @ 2005-12-13 19:58 UTC
  To: greg; +Cc: gdb, drow

Hi,

I have a stable kernel patch for Linux 2.6.14.3 and Linux 2.4.31 that
allows a process to mmap any region of the tracee's process address
space, through the /proc/[pid]/mem file.

It allows the debugger/tracer to read and write all parts of the
traced process's memory. You could use this interface to
modify the code memory being read by the traced process, by simply
mmapping the code/stack/data page into the checkpointer's memory and
writing to it.
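
As a rough illustration of the /proc/[pid]/mem interface described above
(this is not the patch itself): as far as I know, stock kernels do not
allow mmap on this file, which is what the patch adds, so the sketch
below falls back to pread/pwrite at the target virtual address. It
assumes a reasonably recent Linux kernel and Python 3, and it targets
the current process so that no ptrace attach is needed. The helper names
peek and poke are hypothetical.

```python
import ctypes
import os


def peek(pid, addr, length):
    """Read `length` bytes at virtual address `addr` of process `pid`."""
    fd = os.open(f"/proc/{pid}/mem", os.O_RDONLY)
    try:
        # The file offset is interpreted as a virtual address in the target.
        return os.pread(fd, length, addr)
    finally:
        os.close(fd)


def poke(pid, addr, data):
    """Write `data` at virtual address `addr` of process `pid`.

    Returns the number of bytes written.
    """
    fd = os.open(f"/proc/{pid}/mem", os.O_WRONLY)
    try:
        return os.pwrite(fd, data, addr)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Assumption: we operate on our own process. For another process the
    # caller would normally have to be ptrace-attached to it first.
    buf = ctypes.create_string_buffer(b"checkpoint", 16)
    addr = ctypes.addressof(buf)
    me = os.getpid()

    print(peek(me, addr, 10))       # bytes read through /proc/<pid>/mem
    poke(me, addr, b"restarted!")   # overwrite the same bytes in place
    print(buf.value)                # the change is visible to the process
```

With the patch, the same pages would presumably be mapped once with
mmap and then edited directly, avoiding a system call per access.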

I have tested this with some large tracers and with the fsx test suite.
The patches should be available for download in the next week or so. I
can send the link as soon as I upload them.

-- Prateek Saxena
   Department of Computer Science,
   Stony Brook University.


