From: Pedro Alves
To: Doug Evans
Cc: sergiodj@linux.vnet.ibm.com, gdb-patches@sourceware.org
Subject: Re: [RFC] mask off is-syscall bit for TRAP_IS_SYSCALL
Date: Tue, 29 Sep 2009 14:32:00 -0000
Message-Id: <200909291532.29075.pedro@codesourcery.com>
In-Reply-To: <20090929124349.47188843A9@ruffy.mtv.corp.google.com>

(For those not reading the whole thread: catch syscall is
unfortunately broken with trivial multi-threading.)

On Tuesday 29 September 2009 13:43:48, Doug Evans wrote:

> I'm not sure the filtering is done too late though.
> [One *could* do it earlier, but it seems like it could be done
> later too.]

I don't see this 0x8x signal as much different from the extended
waitstatuses, and I don't see anywhere other than
linux_handle_extended_wait (or somewhere around it) that would need
to distinguish a syscall SIGTRAP from a regular SIGTRAP, and yet
treat it differently from other ptrace SIGTRAPs.

Take cancel_breakpoint, for example. It seems to me that it should
already be ignoring any LWP that is stopped with a SIGTRAP due to a
ptrace event, like PTRACE_EVENT_VFORK|FORK|EXEC.

cancel_breakpoint:

-  if (lp->status != 0
+  if (lp->waitstatus.kind != TARGET_WAITKIND_IGNORE
+      && lp->status != 0
       && WIFSTOPPED (lp->status) && WSTOPSIG (lp->status) == SIGTRAP

stop_wait_callback only needs to care that there's a SIGTRAP. It
doesn't need to know whether that SIGTRAP is a pending
PTRACE_EVENT_FORK event, just like it doesn't look like it needs to
handle TRAP_IS_SYSCALL specially.

I do see other broken calls into the core with an unfiltered 0x85,
e.g.:

get_pending_status:

    signo = target_signal_from_host (WSTOPSIG (lp->status));

or cases of passing a 0x85 to ptrace/kernel, like

detach_callback:

    if (ptrace (PTRACE_DETACH, GET_LWP (lp->ptid), 0,
                WSTOPSIG (status)) < 0)

This one's quite concerning:

 Catchpoint 1 (returned from syscall 202), 0x00007ffff7bd113e in __lll_lock_wait_private () from /lib/libpthread.so.0
 (gdb) detach
 Can't detach Thread 0x43806950 (LWP 20939): Input/output error
 (gdb)
 (gdb) c
 Continuing.
 ../../src/gdb/linux-nat.c:1782: internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n)

Irk! It seems we'd want to handle syscall SIGTRAPs exactly like
other extended wait statuses. They're all ptrace SIGTRAPs, and we
should never pass a ptrace SIGTRAP to the inferior.
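
To make that last point concrete, here is a minimal standalone sketch
of choosing the signal to hand back to the kernel on resume or detach.
This is not a patch against linux-nat.c: the helper names are made up
for illustration, and it assumes the 0x80 bit is the one the kernel
sets on syscall stops when PTRACE_O_TRACESYSGOOD is enabled, and that
ptrace event codes live in bits 16 and up of the wait status.

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/* Assumption: bit ORed into the stop signal for syscall stops when
   PTRACE_O_TRACESYSGOOD is in effect (SIGTRAP | 0x80 == 0x85).  */
#define SYSCALL_TRAP_BIT 0x80

/* Hypothetical helper: nonzero if STATUS is a syscall-stop SIGTRAP.  */
static int
is_syscall_sigtrap (int status)
{
  return (WIFSTOPPED (status)
          && WSTOPSIG (status) == (SIGTRAP | SYSCALL_TRAP_BIT));
}

/* Hypothetical helper: nonzero if STATUS is a SIGTRAP carrying a
   PTRACE_EVENT_* code in the bits above the usual 16-bit status.  */
static int
is_ptrace_event_sigtrap (int status)
{
  return (WIFSTOPPED (status)
          && WSTOPSIG (status) == SIGTRAP
          && (status >> 16) != 0);
}

/* Hypothetical helper: signal to pass to PTRACE_CONT/PTRACE_DETACH
   for an LWP that stopped with STATUS.  Syscall stops and ptrace
   event stops are artifacts of tracing, so they map to 0 and are
   never delivered.  Everything else is returned unchanged; deciding
   what to do with a plain breakpoint SIGTRAP is left to the caller.  */
static int
signal_to_pass (int status)
{
  assert (WIFSTOPPED (status));

  if (is_syscall_sigtrap (status) || is_ptrace_event_sigtrap (status))
    return 0;

  return WSTOPSIG (status);
}
```

With something along these lines, the detach_callback case above
would end up passing 0 rather than 0x85 as the last ptrace argument.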

Here's another somewhat related breakage. Take the same simple
multi-threaded app example as before, and do this:

 (gdb) r
 Starting program: /home/pedro/gdb/tests/trap_is_syscall
 [Thread debugging using libthread_db enabled]
 [New Thread 0x40800950 (LWP 18877)]
 [New Thread 0x41001950 (LWP 18878)]
 [New Thread 0x41802950 (LWP 18879)]
 [New Thread 0x42003950 (LWP 18880)]
 [New Thread 0x42804950 (LWP 18881)]
 [New Thread 0x43005950 (LWP 18882)]
 [New Thread 0x43806950 (LWP 18883)]
 [New Thread 0x44007950 (LWP 18884)]
 [New Thread 0x44808950 (LWP 18885)]
 [New Thread 0x45009950 (LWP 18886)]

 Program received signal SIGINT, Interrupt.
 0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
 (gdb) catch syscall
 warning: Could not open "syscalls/amd64-linux.xml"
 warning: Could not load the syscall XML file `syscalls/amd64-linux.xml'.
 GDB will not be able to display syscall names.
 Catchpoint 1 (any syscall)
 (gdb) c
 Continuing.
 [Switching to Thread 0x45009950 (LWP 18886)]

 Catchpoint 1 (call to syscall 35), 0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6

Now, delete the catchpoint while some LWPs still have a pending
syscall event to report:

 (gdb) del 1
 (gdb) c
 Continuing.

 Program received signal SIGTRAP, Trace/breakpoint trap.
 [Switching to Thread 0x44808950 (LWP 18885)]
 0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
 (gdb)

This case does need to be filtered later, it seems to me.

> btw, I'm seeing lots of "syscall 0" (presumably restarted system calls).
> I hacked another patch to save the syscall number so that the user
> wouldn't see syscall 0, but maybe the thing to do is record the
> fact that the syscall got restarted and report that to the user?
> [I'm assuming the 0's I see are indeed restarted syscalls.]

Hmmm, haven't seen this one yet.

-- 
Pedro Alves