From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
From: Yao Qi
To: Pedro Alves
Subject: Re: [PATCH] Honour SIGILL and SIGSEGV in cancel breakpoint
References: <1410696393-29327-1-git-send-email-yao@codesourcery.com> <54182945.7090300@redhat.com> <87mw9xzmlr.fsf@codesourcery.com> <541C6208.3080805@redhat.com>
Date: Tue, 23 Sep 2014 08:47:00 -0000
In-Reply-To: <541C6208.3080805@redhat.com> (Pedro Alves's message of "Fri, 19 Sep 2014 18:04:08 +0100")
Message-ID: <871tr2ybg0.fsf@codesourcery.com>
X-SW-Source: 2014-09/txt/msg00673.txt.bz2

Pedro Alves writes:

>> count_events_callback and select_event_lwp_callback in GDBServer need to
>> honour SIGILL and SIGSEGV too.
>> I write a patch to call
>> lp_status_is_sigtrap_like_event in them, but regression test result
>> isn't changed, which is a surprise to me.  I thought some fails should
>> be fixed.  I'll look into it deeply.
>
> Maybe you're getting lucky with scheduling.
> pthreads.exp and schedlock.exp I think are the most sensitive to this.

I ran them ten times, and the results didn't change.

>
> See:
> https://www.sourceware.org/ml/gdb-patches/2001-06/msg00250.html

Random selection of the event lwp was added in the message at the URL
you gave, to prevent starvation of threads.  However, in my
configuration (arm-linux with SIGILL), event lwp selection does
nothing, yet no test fails are caused.  GDBserver processes events
like this:

 1. GDBserver gets a breakpoint event from waitpid (-1, ).
 2. GDBserver calls stop_all_lwps, in which wait_for_sigstop drains
    all pending reports from the kernel.
 3. GDBserver selects one lwp and cancels the breakpoint on the rest.
    If event lwp selection does nothing, the selected lwp is the one
    GDBserver got in step 1.
 4. GDBserver steps over the breakpoint and resumes all the threads.
    Go back to step 1 and wait until any thread hits a breakpoint.

As we can see, if waitpid (-1, ) (in step 1) returns the event lwp
randomly, we don't have to randomly select the event lwp again in
step 3.  IMO, it is naturally random which thread hits the breakpoint
first in a multi-threaded program.  That is the reason why no test
fails are caused without event lwp selection in my experiments.

IOW, on a platform where waitpid (-1, ) returns the event lwp randomly,
we don't need such random lwp selection at all.  However, if the
kernel's waitpid implementation always iterates over the list of
children in a fixed order, it is possible that the event of the lwp
at the front of the list is always reported and the rest of the lwps
are starved.  In this case, we still have to rely on random selection
inside GDB/GDBserver to avoid starvation.
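To make step 3 concrete, here is a minimal standalone sketch of random
event lwp selection.  It is not the gdbserver code: struct toy_lwp and
the toy_* functions are hypothetical names made up for illustration, and
the pending event is reduced to a plain signal number.  It only shows
the idea of counting the candidate lwps and then picking one of them
uniformly, so that no lwp depends on its position in the list.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for gdbserver's struct lwp_info.  */
struct toy_lwp
{
  int id;
  int pending_sig;	/* 0 if no stop event is pending.  */
};

/* Return nonzero if the pending signal may be caused by a breakpoint.
   SIGILL and SIGSEGV are counted too, as discussed in this thread.  */
static int
toy_maybe_breakpoint (const struct toy_lwp *lp)
{
  return (lp->pending_sig == SIGTRAP
	  || lp->pending_sig == SIGILL
	  || lp->pending_sig == SIGSEGV);
}

/* Pick one candidate lwp uniformly at random.  Return NULL if no lwp
   has a breakpoint-like event pending.  */
static struct toy_lwp *
toy_select_event_lwp (struct toy_lwp *lwps, int n)
{
  int count = 0;
  int selector;
  int i;

  assert (lwps != NULL || n == 0);

  /* First pass: count the candidates.  */
  for (i = 0; i < n; i++)
    if (toy_maybe_breakpoint (&lwps[i]))
      count++;

  if (count == 0)
    return NULL;

  /* Choose an index in [0, count).  */
  selector = (int) ((count * (double) rand ()) / (RAND_MAX + 1.0));

  /* Second pass: return the SELECTOR-th candidate.  */
  for (i = 0; i < n; i++)
    if (toy_maybe_breakpoint (&lwps[i]) && selector-- == 0)
      return &lwps[i];

  return NULL;
}
```

With lwps holding pending SIGILL, nothing, SIGTRAP and SIGUSR1, only the
first and third can ever be returned.  If the candidate test accepted
SIGTRAP alone, a target that reports SIGILL for breakpoints would have
zero candidates and the selection would silently do nothing.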
The patch below is updated to call lp_status_maybe_breakpoint in both
breakpoint cancellation and event lwp selection.

-- 
Yao (齐尧)

Subject: [PATCH] Honour SIGILL and SIGSEGV in cancel breakpoint and event lwp selection

I see the following fail on arm-none-linux-gnueabi testing,

(gdb) continue^M
Continuing.^M
^M
Program received signal SIGILL, Illegal instruction.^M
[Switching to Thread 1003]^M
handler (signo=10) at /scratch/yqi/arm-none-linux-gnueabi/src/gdb-trunk/gdb/testsuite/gdb.threads/sigstep-threads.c:33^M
33        tgkill (getpid (), gettid (), SIGUSR1); /* step-2 */^M
(gdb) FAIL: gdb.threads/sigstep-threads.exp: continue

the cause is that GDBserver doesn't cancel the breakpoint if the stop
signal is SIGILL.  The kernel used here is a little old, 2.6.x, and
doesn't translate SIGILL to SIGTRAP when the program hits a breakpoint
instruction (which is actually an illegal instruction).  GDB and
GDBserver can translate SIGILL to SIGTRAP under certain circumstances,
so it is not a problem here.  See gdbserver/linux-low.c:linux_wait_1

  /* If this event was not handled before, and is not a SIGTRAP, we
     report it.  SIGILL and SIGSEGV are also treated as traps in case
     a breakpoint is inserted at the current PC.  If this target does
     not support internal breakpoints at all, we also report the
     SIGTRAP without further processing; it's of no concern to us.  */
  maybe_internal_trap
    = (supports_breakpoints ()
       && (WSTOPSIG (w) == SIGTRAP
           || ((WSTOPSIG (w) == SIGILL
                || WSTOPSIG (w) == SIGSEGV)
               && (*the_low_target.breakpoint_at) (event_child->stop_pc))));

However, SIGILL and SIGSEGV are not considered when cancelling
breakpoints, which causes the fail above.  That is, when GDB is doing
a software single step on address ADDR, both thread A and thread B hit
the software single step breakpoint and get SIGILL.  GDB selects the
event from thread A, removes the software single step breakpoint, and
resumes the program.
The event (SIGILL) from thread B is then reported to GDB, but GDB
doesn't regard this SIGILL as SIGTRAP, because the breakpoint at
address ADDR was removed, so GDB reports "Program received signal
SIGILL".

The patch allows calling cancel_breakpoint if the signal is SIGILL or
SIGSEGV.  This patch fixes the fail above.  Likewise, event lwp
selection should honour SIGILL and SIGSEGV too.

gdb/gdbserver:

2014-09-23  Yao Qi

        * linux-low.c (lp_status_maybe_breakpoint): New function.
        (linux_low_filter_event): Call lp_status_maybe_breakpoint.
        (count_events_callback): Likewise.
        (select_event_lwp_callback): Likewise.
        (cancel_breakpoints_callback): Likewise.

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 705edde..2f860e9 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1739,6 +1739,20 @@ cancel_breakpoint (struct lwp_info *lwp)
   return 0;
 }
 
+/* Return true if the event in LP may be caused by breakpoint.  */
+
+static int
+lp_status_maybe_breakpoint (struct lwp_info *lp)
+{
+  return (lp->status_pending_p
+          && WIFSTOPPED (lp->status_pending)
+          && (WSTOPSIG (lp->status_pending) == SIGTRAP
+              /* SIGILL and SIGSEGV are also treated as traps in case a
+                 breakpoint is inserted at the current PC.  */
+              || WSTOPSIG (lp->status_pending) == SIGILL
+              || WSTOPSIG (lp->status_pending) == SIGSEGV));
+}
+
 /* Do low-level handling of the event, and check if we should go on
    and pass it to caller code.  Return the affected lwp if we are, or
    NULL otherwise.  */
@@ -1936,7 +1950,7 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int wstat)
          the core before this one is handled.  All-stop always
          cancels breakpoint hits in all threads.  */
       if (non_stop
-          && WSTOPSIG (wstat) == SIGTRAP
+          && lp_status_maybe_breakpoint (child)
           && cancel_breakpoint (child))
         {
           /* Throw away the SIGTRAP.  */
@@ -2197,9 +2211,7 @@ count_events_callback (struct inferior_list_entry *entry, void *data)
      should be reported to GDB.  */
   if (thread->last_status.kind == TARGET_WAITKIND_IGNORE
       && thread->last_resume_kind != resume_stop
-      && lp->status_pending_p
-      && WIFSTOPPED (lp->status_pending)
-      && WSTOPSIG (lp->status_pending) == SIGTRAP
+      && lp_status_maybe_breakpoint (lp)
       && !breakpoint_inserted_here (lp->stop_pc))
     (*count)++;
 
@@ -2237,9 +2249,7 @@ select_event_lwp_callback (struct inferior_list_entry *entry, void *data)
   /* Select only resumed LWPs that have a SIGTRAP event pending. */
   if (thread->last_resume_kind != resume_stop
       && thread->last_status.kind == TARGET_WAITKIND_IGNORE
-      && lp->status_pending_p
-      && WIFSTOPPED (lp->status_pending)
-      && WSTOPSIG (lp->status_pending) == SIGTRAP
+      && lp_status_maybe_breakpoint (lp)
       && !breakpoint_inserted_here (lp->stop_pc))
     if ((*selector)-- == 0)
       return 1;
@@ -2271,9 +2281,7 @@ cancel_breakpoints_callback (struct inferior_list_entry *entry, void *data)
 
   if (thread->last_resume_kind != resume_stop
       && thread->last_status.kind == TARGET_WAITKIND_IGNORE
-      && lp->status_pending_p
-      && WIFSTOPPED (lp->status_pending)
-      && WSTOPSIG (lp->status_pending) == SIGTRAP
+      && lp_status_maybe_breakpoint (lp)
       && !lp->stepping
       && !lp->stopped_by_watchpoint
       && cancel_breakpoint (lp))
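
For anyone who wants to poke at the signal test outside gdbserver, the
following is a small standalone sketch of the same check on raw wait
statuses.  The toy_* names are hypothetical and not part of the patch.
toy_stop_status builds a "stopped by SIG" status by hand, which assumes
the Linux encoding (SIG << 8) | 0x7f; real statuses come from waitpid.

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/* Build a wait status meaning "stopped by SIG".
   Assumption: Linux encodes this as (SIG << 8) | 0x7f.  */
static int
toy_stop_status (int sig)
{
  assert (sig > 0 && sig < 128);
  return (sig << 8) | 0x7f;
}

/* Standalone analogue of the check in the patch: the status must be
   pending, must describe a stop, and the stop signal must be one that
   a breakpoint hit can produce.  */
static int
toy_status_maybe_breakpoint (int pending_p, int status)
{
  return (pending_p
	  && WIFSTOPPED (status)
	  && (WSTOPSIG (status) == SIGTRAP
	      || WSTOPSIG (status) == SIGILL
	      || WSTOPSIG (status) == SIGSEGV));
}
```

A stop with SIGILL passes the check, a stop with SIGUSR1 does not, and
neither does a normal exit status of 0 or any status that is not marked
as pending.  Note that this check alone only says "maybe"; whether a
breakpoint is really at the stop PC is still decided elsewhere.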