Message-ID: <53764622.7020202@codesourcery.com>
Date: Fri, 16 May 2014 17:08:00 -0000
From: "Breazeal, Don" <donb@codesourcery.com>
To: Pedro Alves, "gdb@sourceware.org"
Subject: Re: Follow-fork-mode / detach-on-fork expected behavior?
In-Reply-To: <537604E5.7050804@redhat.com>

On 5/16/2014 5:30 AM, Pedro Alves wrote:
> On 04/24/2014 11:05 PM, Breazeal, Don wrote:
>> Hi
>>
>> I'm working on an implementation of follow-fork in gdbserver.  My
>> intent is to make it work just as it does in native GDB.  However, I
>> am confused by what looks like inconsistent behavior in native GDB.
>> I'm hoping to get some feedback on my observations so that I know how
>> to proceed.
>> I want to make sure things are working in native GDB before going any
>> further with gdbserver.
>>
>> Apologies for the length of this email.  The only way I can think of
>> to explain my questions is by describing what I see in a test case.
>> I'm using a test case that uses 'fork' (not 'vfork') in all-stop mode
>> (gdb.base/foll-fork).  Aside from the fork mode settings, the
>> commands are:
>> (gdb) set verbose     # to see the fork msgs
>> (gdb) break main
>> (gdb) run
>> (gdb) next 2          # this executes past the fork call
>>
>> The behavior when following the child is inconsistent, depending on
>> the setting of detach-on-fork.  Below is the behavior I see in the
>> four possible combinations of fork settings after the 'next 2'
>> command is entered, along with my specific questions:
>>
>> 1) follow parent / detach child (default settings)
>> - prints msg about detaching after fork
>> - stops after the next command in the parent
>> - one inferior left
>>
>> 2) follow parent / don't detach child
>> - prints [New process] msg
>> - prints symbol reading/loading msgs
>> - stops in parent after next
>> - two inferiors left, info inferiors shows pids of both
>>
>> So far, so good; this is what I expect.
>>
>> 3) follow child / detach parent
>> - prints msg about attach to child after fork
>> - prints [New process] msg
>> - prints [Switching to process ] msg
>> - stops in child after 'next' command
>> - two inferiors left, info inferiors shows parent 'null'
>>
>> This looks like there might be a problem:
>> Q1: shouldn't there only be one inferior?
>
> Yeah.  With detach-on-fork enabled, I agree that that's
> the expected behavior.  And I think the new process should

Oh...ugh.  I thought I had figured this out, but had reached the
opposite conclusion.
This was based on a couple of things:

* Text in the GDB manual (Yao also pointed this out to me):
  https://sourceware.org/gdb/onlinedocs/gdb/Inferiors-and-Programs.html
  mentions that "Inferiors may be created before a process runs, and
  may be retained after a process exits." and "After the successful
  completion of a command such as detach, detach inferiors, kill or
  kill inferiors, or after a normal process exit, the inferior is
  still valid and listed with info inferiors, ready to be restarted."

  It seemed to me that detach-on-fork should work like the detach
  command and keep the inferior.  I guess the argument against this
  would be that the parent and child inferiors would just be identical
  to each other; however, if there were an exec in the followed
  process, this would no longer be the case.

* PR gdb/14808, a problem where the inferior of a detached parent is
  modified when it shouldn't be.

I was proceeding under the assumptions that (1) a detached parent
should show up in the inferiors list so that it could be re-run later,
and (2) a detached child should not, since it had never really been
attached from the user's perspective.

My plan has been to implement the remote case, then see if I could
consolidate some of the code into a common follow-fork layer.  Do I
need to change course?  Would it make sense for me to proceed under
the assumptions above and deal with the 'detached inferior' issue
later, or should that be resolved before implementing remote
follow-fork?

Thanks
--Don

> reuse the old inferior id.  The reason I hadn't done things
> that way to begin with is actually the vfork case.
> For vfork, we need to hold on to the parent until the child
> execs/exits.  In the old days, linux-nat.c itself would hold on to
> the lwp, behind the core's back.  The core had no clue about
> vfork-done.  Nowadays, we have that modelled with
> TARGET_WAITKIND_VFORK_DONE.
> We need that to be able to correctly remove all child breakpoints
> from the parent at child exec/exit time.
>
> We could fix that by getting rid of the parent inferior from the
> core, going back to having linux-nat.c itself keep track of the
> parent behind the core's back, knowing that it needs to remove
> breakpoints from the parent when a vfork_done arrives (like it used
> to be done before TARGET_WAITKIND_VFORK_DONE).  But how would we do
> that in the remote/gdbserver case?  gdbserver has no clue about
> software breakpoints planted in memory directly by GDB.
>
> One way I've thought of before to fix/handle this would be to still
> leave the parent inferior in the inferior list, so most of the core
> wouldn't need to change, but hide the parent inferior from the user.
> We could, for example, make the parent inferior's number negative
> (like internal breakpoints are negative).
>
> Sounds doable, but I've never tried it.  I recall at least one
> complication: if the parent is hidden, and the user decides "oops, I
> want to keep debugging the parent" and does "attach PID", with PID
> being the parent's PID, ptrace would of course complain, because GDB
> is already attached to the process.  My immediate reaction is that
> the core could look at PID and check whether it's already attached
> before passing the PID down to the target.  But, given that PID is
> just a string that is interpreted by the target, we can't have the
> core interpret it as a number.  So we need something else.  Maybe let
> the target return "I'm already attached, and this is the PID in
> number form".  Easy for native, but it requires extensions for
> remote.  Well, given that we need remote multi-process extensions to
> see this happen, perhaps just ignore the issue and do the pid
> comparison as a number anyway (if not fake).
>
>> Q2: should the child have stopped?
>
> Yes.  That's what "follow" means.  That's how GDB always
> behaved, even before it learned about "info inferiors".
> GDB 6.8 could follow forks, and had follow-fork/detach-on-fork
> settings already, but couldn't run two forked processes at once.
> You had to choose which one to debug with "info forks" / "fork".
>
>> The manual doesn't make this completely clear.
>>
>> 4) follow child / don't detach parent
>> - prints msg about attach to child after fork
>> - prints [New process] msg
>> - prints symbol reading/loading msgs
>> - child runs to completion
>> - two inferiors left, info inferiors shows child 'null'
>>
>> Something seems wrong here.
>> Q3: to be consistent, shouldn't the child process either have
>> stopped after the 'next' command in both (3) and (4)
>
> Yes.
>
>> or run to completion in both cases?
>
> No.
>
>> I'd appreciate it if someone could clarify the expected behavior
>> for me, or, if what I'm seeing is expected, explain the rationale.
>> If something needs to be fixed in the native implementation, I'll
>> want to look at that before continuing with the remote case.
>> Thanks
>