From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 95469 invoked by alias); 24 Jul 2015 18:43:58 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 95459 invoked by uid 89); 24 Jul 2015 18:43:58 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 24 Jul 2015 18:43:56 +0000 Received: from svr-orw-fem-04.mgc.mentorg.com ([147.34.97.41]) by relay1.mentorg.com with esmtp id 1ZIhwm-0005u1-FE from Don_Breazeal@mentor.com ; Fri, 24 Jul 2015 11:43:52 -0700 Received: from [172.30.11.14] (147.34.91.1) by SVR-ORW-FEM-04.mgc.mentorg.com (147.34.97.41) with Microsoft SMTP Server (TLS) id 14.3.224.2; Fri, 24 Jul 2015 11:43:51 -0700 Message-ID: <55B2875E.1070504@codesourcery.com> Date: Fri, 24 Jul 2015 18:43:00 -0000 From: Don Breazeal User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Pedro Alves , "Breazeal, Don" , Simon Marchi CC: GDB Patches Subject: Re: [PATCH 3/N] remote follow fork and spurious child stops in non-stop mode References: <1437672294-29351-1-git-send-email-palves@redhat.com> <55B1308E.4020700@redhat.com> In-Reply-To: <55B1308E.4020700@redhat.com> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2015-07/txt/msg00714.txt.bz2 On 7/23/2015 11:21 AM, Pedro Alves wrote: > So I managed to extract out this smaller patch from the > gdbserver fixes I mentioned. I think this one looks safe enough > for 7.10. WDYT? > > ----------- > From 98d41152bff2a21f7fda864d87ee5dd0cffa2d17 Mon Sep 17 00:00:00 2001 > From: Pedro Alves > Date: Thu, 23 Jul 2015 18:49:51 +0100 > Subject: [PATCH] remote follow fork and spurious child stops in non-stop mode > > Running gdb.threads/fork-plus-threads.exp against gdbserver in > extended-remote mode, even though the test passes, we still see broken > behavior: > > Running gdb.threads/fork-plus-threads.exp against gdbserver in > extended-remote mode, even though the test passes, we still see broken > behavior: > > (gdb) PASS: gdb.threads/fork-plus-threads.exp: set detach-on-fork off > continue & > Continuing. > (gdb) PASS: gdb.threads/fork-plus-threads.exp: continue & > [New Thread 28092.28092] > > [Thread 28092.28092] #2 stopped. > [New Thread 28094.28094] > [Inferior 2 (process 28092) exited normally] > [New Thread 28094.28105] > [New Thread 28094.28109] > > ... > > [Thread 28174.28174] #18 stopped. > [New Thread 28185.28185] > [Inferior 10 (process 28174) exited normally] > [New Thread 28185.28196] > > [Thread 28185.28185] #20 stopped. > Cannot remove breakpoints because program is no longer writable. > Further execution is probably impossible. > [Inferior 11 (process 28185) exited normally] > [Inferior 1 (process 28091) exited normally] > PASS: gdb.threads/fork-plus-threads.exp: reached breakpoint > info threads > No threads. > (gdb) PASS: gdb.threads/fork-plus-threads.exp: no threads left > info inferiors > Num Description Executable > * 1 /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/fork-plus-threads > (gdb) PASS: gdb.threads/fork-plus-threads.exp: only inferior 1 left > > All the "[Thread FOO] #NN stopped." above are bogus, as well as the > "Cannot remove breakpoints because program is no longer writable.", > which is a consequence. > > The problem is that when we intercept a fork event, we should report > the event for the parent, only, and leave the child stopped, but not > report its stop event. GDB later decides whether to follow the parent > or the child. But because handle_extended_wait does not set the > child's last_status.kind to TARGET_WAITKIND_STOPPED, a > stop_all_threads/unstop_all_lwps sequence (e.g., from trying to access > memory) by mistake ends up queueing a SIGSTOP on the child, resuming > it, and then when that SIGSTOP is intercepted, because the LWP has > last_resume_kind set to resume_stop, gdbserver reports the stop to > GDB, as GDB_SIGNAL_0: > > ... > >>>> entering unstop_all_lwps > unstopping all lwps > proceed_one_lwp: lwp 1600 > client wants LWP to remain 1600 stopped > proceed_one_lwp: lwp 1828 > Client wants LWP 1828 to stop. Making sure it has a SIGSTOP pending > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Sending sigstop to lwp 1828 > pc is 0x3615ebc7cc > Resuming lwp 1828 (continue, signal 0, stop expected) > continue from pc 0x3615ebc7cc > unstop_all_lwps done > sigchld_handler > <<<< exiting unstop_all_lwps > handling possible target event > >>>> entering linux_wait_1 > linux_wait_1: [] > my_waitpid (-1, 0x40000001) > my_waitpid (-1, 0x1): status(137f), 1828 > LWFE: waitpid(-1, ...) returned 1828, ERRNO-OK > LLW: waitpid 1828 received Stopped (signal) (stopped) > pc is 0x3615ebc7cc > Expected stop. > LLW: resume_stop SIGSTOP caught for LWP 1828.1828. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > ... > linux_wait_1 ret = LWP 1828.1828, 1, 0 > <<<< exiting linux_wait_1 > Writing resume reply for LWP 1828.1828:1 > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > By inspection, I also noticed that we miss leaving the child with the > suspend count incremented if stopping threads, like we do for clone > threads. > > Tested on x86_64 Fedora 20, extended-remote. > > gdb/gdbserver/ChangeLog: > 2015-07-23 Pedro Alves > > * linux-low.c (handle_extended_wait): Set the child's last > reported status to TARGET_WAITKIND_STOPPED. > --- > gdb/gdbserver/linux-low.c | 7 ++++++ > gdb/testsuite/gdb.threads/fork-plus-threads.exp | 30 +++++++++++++++++++++++++ > 2 files changed, 37 insertions(+) > > diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c > index 17b2a51..56a33ff 100644 > --- a/gdb/gdbserver/linux-low.c > +++ b/gdb/gdbserver/linux-low.c > @@ -488,6 +488,13 @@ handle_extended_wait (struct lwp_info *event_lwp, int wstat) > child_lwp->status_pending_p = 0; > child_thr = get_lwp_thread (child_lwp); > child_thr->last_resume_kind = resume_stop; > + child_thr->last_status.kind = TARGET_WAITKIND_STOPPED; This makes perfect sense to me. > + > + /* If we're suspending all threads, leave this one suspended > + too. */ > + if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS) > + child_lwp->suspended = 1; I have a question about this. In the definition of struct lwp_info in linux-low.h, it has this comment: /* When this is true, we shall not try to resume this thread, even if last_resume_kind isn't resume_stop. */ int suspended; Since we are setting last_resume_kind to resume_stop here, is this unnecessary? Thanks, --Don > + > parent_proc = get_thread_process (event_thr); > child_proc->attached = parent_proc->attached; > clone_all_breakpoints (&child_proc->breakpoints, > diff --git a/gdb/testsuite/gdb.threads/fork-plus-threads.exp b/gdb/testsuite/gdb.threads/fork-plus-threads.exp > index f44dd76..80d2464 100644 > --- a/gdb/testsuite/gdb.threads/fork-plus-threads.exp > +++ b/gdb/testsuite/gdb.threads/fork-plus-threads.exp > @@ -48,13 +48,43 @@ gdb_test_multiple $test $test { > } > } > > +# gdbserver had a bug that resulted in reporting the fork child's > +# initial stop to gdb, which gdb does not expect, in turn resulting in > +# a broken session, like: > +# > +# [Thread 31536.31536] #16 stopped. <== BAD > +# [New Thread 31547.31547] > +# [Inferior 10 (process 31536) exited normally] > +# [New Thread 31547.31560] > +# > +# [Thread 31547.31547] #18 stopped. <== BAD > +# Cannot remove breakpoints because program is no longer writable. <== BAD > +# Further execution is probably impossible. <== BAD > +# [Inferior 11 (process 31547) exited normally] > +# [Inferior 1 (process 31454) exited normally] > +# > +# These variables track whether we see such broken behavior. > +set saw_cannot_remove_breakpoints 0 > +set saw_thread_stopped 0 > + > set test "reached breakpoint" > gdb_test_multiple "" $test { > + -re "Cannot remove breakpoints" { > + set saw_cannot_remove_breakpoints 1 > + exp_continue > + } > + -re "Thread \[^\r\n\]+ stopped\\." { > + set saw_thread_stopped 1 > + exp_continue > + } > -re "Inferior 1 \(\[^\r\n\]+\) exited normally" { > pass $test > } > } > > +gdb_assert !$saw_cannot_remove_breakpoints "no failure to remove breakpoints" > +gdb_assert !$saw_thread_stopped "no spurious thread stop" > + > gdb_test "info threads" "No threads\." \ > "no threads left" > >