Message-ID: <537604E5.7050804@redhat.com>
Date: Fri, 16 May 2014 12:51:00 -0000
From: Pedro Alves
To: "Breazeal, Don", "gdb@sourceware.org"
Subject: Re: Follow-fork-mode / detach-on-fork expected behavior?

On 04/24/2014 11:05 PM, Breazeal, Don wrote:
> Hi
>
> I'm working on implementation of follow-fork in gdbserver. My intent is to make it work just like it works in native GDB. However, I am confused by what looks like inconsistent behavior in native GDB.
> I'm hoping to get some feedback on my observations so that I know how to proceed. I want to make sure things are working in native GDB before going any further with gdbserver.
>
> Apologies for the length of this email. The only way I can think of to explain my questions is by describing what I see in a test case. I'm using a test case that uses 'fork' (not 'vfork') in all-stop mode (gdb.base/foll-fork). Aside from the fork mode settings, the commands are:
> (gdb) set verbose     # to see the fork msgs
> (gdb) break main
> (gdb) run
> (gdb) next 2          # this executes past the fork call
>
> The behavior is inconsistent when following the child, depending on the setting for detach-on-fork. Below is the behavior I see in the four possible combinations of fork settings after the 'next 2' command is entered, along with my specific questions:
>
> 1) follow parent / detach child (default settings)
> - prints msg about detaching after fork
> - stops after the next command in the parent
> - one inferior left
>
> 2) follow parent / don't detach child
> - prints [New process] msg
> - prints symbol reading/loading msgs
> - stops in parent after next
> - two inferiors left, info inferiors shows pids of both
>
> So far, so good, this is what I expect.
>
> 3) follow child / detach parent
> - prints msg about attach to child after fork
> - prints [New process] msg
> - prints [Switching to process ] msg
> - stops in child after 'next' command
> - two inferiors left, info inferiors shows parent 'null'
>
> This looks like there might be a problem:
> Q1: shouldn't there only be one inferior?

Yeah. With detach-on-fork enabled, I agree that that's the expected behavior. And I think the new process should reuse the old inferior id.

The reason I hadn't done things that way to begin with, is actually the vfork case. For vfork, we need to hold on to the parent until the child execs/exits. In the old days, linux-nat.c itself would hold on to the lwp, behind the core's back.
The core had no clue about vfork-done. Nowadays, we have that modelled with TARGET_WAITKIND_VFORK_DONE. We need that to be able to correctly remove all child breakpoints from the parent at child exec/exit time.

We could fix that by getting rid of the parent inferior from the core, going back to having linux-nat.c itself keep track of the parent behind the core's back, knowing that it needs to remove breakpoints from the parent when a vfork_done arrives (like it used to be done before TARGET_WAITKIND_VFORK_DONE). But how to do that in the remote/gdbserver case? gdbserver will have no clue about software breakpoints planted in memory directly by GDB.

One way I've thought of before to fix/handle this would be to still leave the parent inferior in the inferior list, so most of the core wouldn't need to change, but hide the parent inferior from the user. We could, for example, make the parent inferior's number negative (like internal breakpoint numbers are negative). Sounds doable, but I've never tried it.

I recall at least one complication: if the parent is hidden, and the user decides "oops, I want to keep debugging the parent", and does "attach PID", with PID being the parent's PID, ptrace would of course complain, because GDB is already attached to the process. My immediate reaction is that the core could look at PID and check if it's already attached before passing the PID down to the target. But, given that PID is just a string that is interpreted by the target, we can't have the core interpret it as a number. So we need something else. Maybe let the target return "I'm already attached, and this is the PID in number form". Easy for native, requires extensions for remote. Well, given we need remote multi-process extensions to see this happen, perhaps we could just ignore the issue and do the pid comparison as a number anyway (if not fake).

> Q2: should the child have stopped?

Yes. That's what "follow" means.
That's how GDB always behaved, even before it learned about "info inferiors". GDB 6.8 could follow forks, and had follow-fork/detach-on-fork settings already, but couldn't run two fork processes at once. You'd have to choose which to debug with "info forks" / "fork".

> The manual doesn't make this completely clear.
>
> 4) follow child / don't detach parent
> - prints msg about attach to child after fork
> - prints [New process] msg
> - prints symbol reading/loading msgs
> - child runs to completion
> - two inferiors left, info inferiors shows child 'null'
>
> Something seems wrong here.
> Q3: to be consistent, shouldn't the child process either have stopped after the 'next' command
> in both (3) and (4)

Yes.

> or run to completion in both cases?

No.

>
> I'd appreciate it if someone could clarify the expected behavior for me, or if what I'm seeing is expected, explain the rationale. If something needs to be fixed in the native implementation, I'll want to look at that before continuing with the remote case.
> Thanks

--
Pedro Alves
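
For illustration only, here is a rough sketch of the "hidden parent" idea from the Q1 answer above, written as a small Python model rather than GDB's C. Nothing here is GDB code: InferiorList, follow_child_and_detach, vfork_done and attach are hypothetical names, and un-hiding the parent on "attach PID" is an assumption about what the core might do once it notices the PID is already attached. The sketch takes the "just compare the pid as a number" shortcut mentioned above.

```python
from typing import List, Optional


class Inferior:
    """Hypothetical stand-in for an inferior; num < 0 means hidden from the user."""

    def __init__(self, num: int, pid: int) -> None:
        self.num = num
        self.pid = pid


class InferiorList:
    """Toy model of the core's inferior list (not GDB's real data structure)."""

    def __init__(self) -> None:
        self.inferiors: List[Inferior] = []
        self._next_num = 1      # next user-visible number
        self._next_hidden = -1  # next hidden (negative) number

    def add(self, pid: int) -> Inferior:
        inf = Inferior(self._next_num, pid)
        self._next_num += 1
        self.inferiors.append(inf)
        return inf

    def visible(self) -> List[Inferior]:
        # What an "info inferiors"-style listing would show in this model.
        return [inf for inf in self.inferiors if inf.num > 0]

    def find_pid(self, pid: int) -> Optional[Inferior]:
        for inf in self.inferiors:
            if inf.pid == pid:
                return inf
        return None

    def follow_child_and_detach(self, parent_pid: int, child_pid: int) -> Inferior:
        # The child reuses the parent's user-visible number; the parent stays
        # in the list under a negative number so the core can still reach it.
        parent = self.find_pid(parent_pid)
        if parent is None:
            raise ValueError("unknown parent pid")
        child = Inferior(parent.num, child_pid)
        parent.num = self._next_hidden
        self._next_hidden -= 1
        self.inferiors.append(child)
        return child

    def vfork_done(self, parent_pid: int) -> None:
        # Once breakpoints have been removed from the parent, drop it for good.
        self.inferiors = [
            inf for inf in self.inferiors
            if not (inf.pid == parent_pid and inf.num < 0)
        ]

    def attach(self, pid_arg: str) -> Inferior:
        # Shortcut: interpret the argument as a number and check whether we
        # are already attached before handing it to the target.
        pid = int(pid_arg)
        existing = self.find_pid(pid)
        if existing is not None:
            if existing.num < 0:
                # Assumption: un-hide instead of letting ptrace fail.
                existing.num = self._next_num
                self._next_num += 1
            return existing
        return self.add(pid)
```

With this model, following the child with detach-on-fork enabled leaves exactly one user-visible inferior (case 3 above as it should behave), while the parent remains reachable internally until vfork_done, or until the user attaches to it again.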