From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 4478 invoked by alias); 23 Oct 2002 04:25:56 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 4469 invoked from network); 23 Oct 2002 04:25:55 -0000
Received: from unknown (HELO crack.them.org) (65.125.64.184)
  by sources.redhat.com with SMTP; 23 Oct 2002 04:25:55 -0000
Received: from nevyn.them.org ([66.93.61.169] ident=mail)
  by crack.them.org with asmtp (Exim 3.12 #1 (Debian))
  id 184E1K-0005xp-00; Wed, 23 Oct 2002 00:25:30 -0500
Received: from drow by nevyn.them.org with local (Exim 3.36 #1 (Debian))
  id 184D5z-0003NC-00; Wed, 23 Oct 2002 00:26:15 -0400
Date: Tue, 22 Oct 2002 21:25:00 -0000
From: Daniel Jacobowitz
To: gdb-patches@sources.redhat.com
Cc: msnyder@redhat.com, kettenis@gnu.org
Subject: RFA: lin-lwp bug with software-single-step or schedlock
Message-ID: <20021023042615.GA6358@nevyn.them.org>
Mail-Followup-To: gdb-patches@sources.redhat.com, msnyder@redhat.com, kettenis@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.1i
X-SW-Source: 2002-10/txt/msg00455.txt.bz2

This bug was noticed on MIPS, because MIPS GNU/Linux is
SOFTWARE_SINGLE_STEP_P.  There's a comment in lin_lwp_resume:

  /* Apparently the interpretation of PID is dependent on STEP: If
     STEP is non-zero, a specific PID means `step only this process
     id'.  But if STEP is zero, then PID means `continue *all*
     processes, but give the signal only to this one'.  */
  resume_all = (PIDGET (ptid) == -1) || !step;

Now, I did some digging, and I believe this comment is completely incorrect.
Saying "signal SIGWINCH" causes PIDGET (ptid) == -1, and it is assumed the
signal will be delivered to inferior_ptid.
There's some other problem there - I think I've discovered that we will
neglect to single-step over a breakpoint if we are told to continue with a
signal, which is a bit dubious of a decision - but by and large it works as
expected.

So if STEP is 0, we always resume all processes.  STEP at this point _only_
refers to whether we want a PTRACE_SINGLESTEP or equivalent;
SOFTWARE_SINGLE_STEP has already been handled.  We can't make policy
decisions based on STEP any more.

I tried removing the || !step.  It's pretty hard to tell, since there are
still a few non-deterministic failures on my test systems (which is what I
was actually hunting when I found this!) but I believe testsuite results are
improved on i386.  One run of just the thread tests (after the patch in my
last message, which I've committed), shows that these all got fixed:

FAIL: gdb.threads/schedlock.exp: other thread 0 didn't run (ran)
FAIL: gdb.threads/schedlock.exp: other thread 2 didn't run (ran)
FAIL: gdb.threads/schedlock.exp: other thread 3 didn't run (ran)
FAIL: gdb.threads/schedlock.exp: other thread 4 didn't run (ran)
FAIL: gdb.threads/schedlock.exp: other thread 5 didn't run (ran)

with my change:
# of expected passes            188
# of unexpected failures        3
without:
# of expected passes            183
# of unexpected failures        7

There's still two that look like test issues rather than bugs, one that is
fixed by a patch Kevin posted a long time ago, and one in print-threads.exp
that I'm not sure of the cause for yet.

On MIPS it's more pronounced, as I'd expect.

with my change:
# of expected passes            170
# of unexpected failures        12
without:
# of expected passes            114
# of unexpected failures        41

At least one of the MIPS issues remaining appears to be hitting the software
single step breakpoint in the wrong thread.  It doesn't seem to be marked as
thread-specific, and it should be.  I'll follow up on that later.

That was the long version.  Here's the short version.  OK to apply?

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

2002-10-23  Daniel Jacobowitz

	* lin-lwp.c (lin_lwp_resume): Remove resume_all test for !step.

Index: lin-lwp.c
===================================================================
RCS file: /cvs/src/src/gdb/lin-lwp.c,v
retrieving revision 1.35
diff -u -p -r1.35 lin-lwp.c
--- lin-lwp.c	27 Aug 2002 22:37:06 -0000	1.35
+++ lin-lwp.c	23 Oct 2002 04:23:13 -0000
@@ -579,11 +579,8 @@ lin_lwp_resume (ptid_t ptid, int step, e
   struct lwp_info *lp;
   int resume_all;
 
-  /* Apparently the interpretation of PID is dependent on STEP: If
-     STEP is non-zero, a specific PID means `step only this process
-     id'.  But if STEP is zero, then PID means `continue *all*
-     processes, but give the signal only to this one'.  */
-  resume_all = (PIDGET (ptid) == -1) || !step;
+  /* A specific PTID means `step only this process id'.  */
+  resume_all = (PIDGET (ptid) == -1);
 
   if (resume_all)
     iterate_over_lwps (resume_set_callback, NULL);