To: Michael Snyder
cc: gdb@sources.redhat.com
Subject: Re: [RFC] New gdb command 'gcore'
Reply-To: James Cownie
Date: Thu, 13 Dec 2001 05:56:00 -0000
From: James Cownie
Message-ID: <16EWL3-0eH-00@etnus.com>

> The holy grail, of course, would be to then give gdb the ability to
> restart the process from the core file state.  That would give us a
> checkpoint-and-restart capability that very few debuggers have ever
> had.  But that's down the line...

Unfortunately, in general a normal core file does not contain enough
information to allow a process to be restarted, since it leaves out
much of the information held in the kernel which forms part of the
process's state.

There are many "fun" issues which arise when trying to implement
checkpointing, such as:

1) Open files. What fds are open? What's the seek position of each?
   What about pipes?
2) Process id. Does it change between the original process and its
   reincarnation?
3) Parent process id (same question).
4) Relationship with child processes (if any). Do you checkpoint the
   whole process group?
5) Network connections. Can you reconstruct them? What about the
   state of the other end?
6) Time. When the process is reincarnated, does it see that time
   passed while it existed only as a checkpoint?
7) Signal handling state. What signal handlers are set up?
   What signals are blocked?
8) State of any timers. Suppose a thread was in a sleep(): when
   should the sleep complete?
9) State of other potentially long system calls. An accept(), for
   instance, or a read from something which isn't ready.
10) All the other things which didn't come to mind in the three
    minutes it's taken to type this.

Of course it's possible to restrict the states a process must be in
before it can be checkpointed. Unfortunately, if you want to do the
checkpoint from gdb, it's going to be hard to know whether those
restrictions hold, since you can invoke gcore between any two
machine instructions.

It's a nice idea, but I think it's hard :-( (and to do it portably is
_very_ hard).

-- Jim

James Cownie
Etnus, LLC.
+44 117 9071438
http://www.etnus.com