From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 18994 invoked by alias); 28 Sep 2009 23:27:04 -0000
Received: (qmail 18984 invoked by uid 22791); 28 Sep 2009 23:27:03 -0000
X-SWARE-Spam-Status: No, hits=-2.3 required=5.0 tests=AWL,BAYES_00,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.4) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 28 Sep 2009 23:26:59 +0000
Received: (qmail 31094 invoked from network); 28 Sep 2009 23:26:56 -0000
Received: from unknown (HELO orlando) (pedro@127.0.0.2) by mail.codesourcery.com with ESMTPA; 28 Sep 2009 23:26:56 -0000
From: Pedro Alves
To: gdb-patches@sourceware.org
Subject: Re: [RFC] mask off is-syscall bit for TRAP_IS_SYSCALL
Date: Mon, 28 Sep 2009 23:27:00 -0000
User-Agent: KMail/1.9.10
Cc: Doug Evans, sergiodj@linux.vnet.ibm.com
References: <20090928193905.32592843AC@ruffy.mtv.corp.google.com> <200909282058.49161.pedro@codesourcery.com>
In-Reply-To: <200909282058.49161.pedro@codesourcery.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200909290027.12792.pedro@codesourcery.com>
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2009-09/txt/msg00890.txt.bz2

On Monday 28 September 2009 20:58:48, Pedro Alves wrote:
> On Monday 28 September 2009 20:39:05, Doug Evans wrote:
> > Hi.
> >
> > On one system I use (bi-arch ubuntu-hardy clone),
> > i386-disp-step.exp is failing because the wait status
> > value linux_nat_wait_1 gets when hitting a system call is 0x857f
> > which gets passed to the upper layers which then get confused
> > by a signal of 0x85 (== 0x80 | SIGTRAP).
> >
> > This patch fixes things by masking off the 0x80 bit before
> > passing the signal number up the call chain.
> >
> > Ok to check in?
>
> This seems OK-ish to me, although I see one extra case that
> isn't handled correctly:
>
> stop_wait_callback:
>
>   status = wait_lwp (lp);
>
> ...
>
>   if (WSTOPSIG (status) != SIGSTOP)
>     {
>       if (WSTOPSIG (status) == SIGTRAP)
>         {
>         ...
>         }
>       else
>         {
>           /* If the lp->status field is still empty, use it to
>              hold this event.  If not, then this event must be
>              returned to the event queue of the LWP.  */
>           if (lp->status)
>             {
>               if (debug_linux_nat)
>                 {
>                   fprintf_unfiltered (gdb_stdlog,
>                                       "SWC: kill %s, %s\n",
>                                       target_pid_to_str (lp->ptid),
>                                       status_to_str ((int) status));
>                 }
>               kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status));   <<<<<<<<
>             }
>
> It seems we can reach that <<< marked code with a TRAP_IS_SYSCALL, but,
> I doubt that we want to requeue that signal in the kernel (?).
>

In case I wasn't clear, I was alluding to the fact that I think
the filtering is done too late.  In fact, things are worse than I imagined.

See here for a simple failing example:

#include <pthread.h>
#include <unistd.h>

void *
thread_function (void *arg)
{
  while (1)
    {
      usleep (1);
    }
}

int
main ()
{
  int res;
  pthread_t thread;
  long i = 0;

  for (i = 0; i < 10; i++)
    {
      res = pthread_create(&thread, NULL, thread_function, NULL);
    }

  thread_function (NULL);
}

(gdb) r
Starting program: /home/pedro/gdb/tests/trap_is_syscall
[New Thread 0x7fe7dee066e0 (LWP 9531)]
[Thread debugging using libthread_db enabled]
[New Thread 0x40800950 (LWP 9538)]
[New Thread 0x41001950 (LWP 9539)]
[New Thread 0x41802950 (LWP 9540)]
[New Thread 0x42003950 (LWP 9541)]
[New Thread 0x42804950 (LWP 9542)]
[New Thread 0x43005950 (LWP 9545)]
[New Thread 0x43806950 (LWP 9547)]
[New Thread 0x44007950 (LWP 9548)]
[New Thread 0x44808950 (LWP 9549)]
[New Thread 0x45009950 (LWP 9550)]

Program received signal SIGINT, Interrupt.
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb) catch syscall
warning: Could not open "syscalls/amd64-linux.xml"
warning: Could not load the syscall XML file `syscalls/amd64-linux.xml'.
GDB will not be able to display syscall names.
Catchpoint 1 (any syscall)
(gdb) c
Continuing.
[Switching to Thread 0x45009950 (LWP 9550)]

Catchpoint 1 (call to syscall 35), 0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb) c
Continuing.

Program received signal ?, Unknown signal.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Switching to Thread 0x44808950 (LWP 9549)]
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb)

`?' is due to TRAP_IS_SYSCALL.  Same with "set debug lin-lwp 1":

linux_nat_wait: [process -1]
LLW: Using pending wait status Trace/breakpoint trap (stopped at syscall) for Thread 0x44808950 (LWP 10484).
LLW: Candidate event Trace/breakpoint trap (stopped at syscall) in Thread 0x44808950 (LWP 10484).

Program received signal ?, Unknown signal.
[Switching to Thread 0x44808950 (LWP 10484)]
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6

I wouldn't be surprised if syscall events were inverted from
here on (entry/exit).

Playing with a patch like the one below, it's easier to trigger the
case I was mentioning before:

---
 gdb/linux-nat.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c 2009-09-29 00:09:13.000000000 +0100
+++ src/gdb/linux-nat.c 2009-09-29 00:13:59.000000000 +0100
@@ -2400,6 +2400,13 @@ stop_wait_callback (struct lwp_info *lp,
                             target_pid_to_str (lp->ptid),
                             errno ? safe_strerror (errno) : "OK");
 
+          if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
+            {
+              /* Simulate the case of a signal < SIGTRAP and
+                 SIGSTOP being delivered before the SIGSTOP.  */
+              // kill_lwp (GET_LWP (lp->ptid), SIGINT);
+            }
+
           /* Hold this event/waitstatus while we check to see if
              there are any more (we still want to get that SIGSTOP). */
           stop_wait_callback (lp, NULL);
@@ -2416,7 +2423,9 @@ stop_wait_callback (struct lwp_info *lp,
                                       target_pid_to_str (lp->ptid),
                                       status_to_str ((int) status));
                 }
-              kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status));
+
+              // gdb_assert (WSTOPSIG (status) != TRAP_IS_SYSCALL);
+              gdb_assert (kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status)) == 0);
             }
           else
             lp->status = status;

Uncommenting the first commented-out kill_lwp that the patchlet adds
makes the last gdb_assert trigger (can't kill with signal 0x85),
meaning a syscall event gets lost.

-- 
Pedro Alves