From: Marc Khouzam
To: "'Pedro Alves'", "gdb@sourceware.org"
Date: Wed, 21 Sep 2011 20:46:00 -0000
Subject: RE: Displaced stepping not always working as expected
In-Reply-To: <201109211120.26531.pedro@codesourcery.com>

> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com]
> Sent: Wednesday, September 21, 2011 6:20 AM
> To: gdb@sourceware.org
> Cc: Marc Khouzam
> Subject: Re: Displaced stepping not always working as expected
>
> On Tuesday 20 September 2011 20:54:24, Marc Khouzam wrote:
> > Hi,
> >
> > I just need a hint on where next to look...
>
> > I've been asked to look into problems with non-stop on
> > a user-mode-linux virtual machine
> > (http://user-mode-linux.sourceforge.net/)
>
> So does this only happen with UML?
> UML uses ptrace internally for
> its own business, I wouldn't be surprised if there's something
> wonky going on at that level.

Yes, only on UML. In fact, only on one particular installation of UML:
I am not able to reproduce the problem on my own installation of UML.
So I also believe UML is the cause, but I'm hoping it might be
something that can be fixed, which is why I'm trying to pinpoint it.

> > On that AMD 64bit machine, I cannot step or resume past a breakpoint
> > when using non-stop with a multi-threaded program _if_ any of the
> > threads is still running. If I interrupt all threads, then displaced
> > stepping works.
>
> I wouldn't be surprised if the UM kernel is reporting a spurious
> SIGTRAP to gdb. Try "set debug lin-lwp 1" as well, but I don't
> think it'll tell you much. Maybe peeking at eflags or the siginfo
> of that SIGTRAP reveals something.

Thanks. No luck with "set debug lin-lwp 1", but I will try to open up
the siginfo. I've spent the better part of the day getting familiar
with the relevant code.

> Try "set debug lin-lwp 1", and see if the resume was preempted and
> for some bizarre reason the core is getting a cached wait status
> instead of really resuming the thread.
>
> Otherwise, this smells like a UML problem.

One bizarre thing I noticed when trying "set debug target 1" is that
more threads get started than there should be!
Normally, I get (in summary):

set non-stop on
set target-async on
b 8
r&
Breakpoint 1, thread_exec1 (ptr=0x400888) at multithread.c:8
8         i++;
info thr
  Id   Target Id         Frame
  2    Thread 0x40804940 (LWP 946) "multi3" thread_exec1 (ptr=0x400888) at multithread.c:8
* 1    Thread 0x40003800 (LWP 943) "multi3" (running)

If I add "set debug target 1" before 'r&', I get:

info thr
  Id   Target Id         Frame
  4    Thread 0x41806940 (LWP 940) "multi3" thread_exec1 (ptr=0x400888) at multithread.c:8
target_core_of_thread (935) = 0
  3    Thread 0x41005940 (LWP 939) "multi3" (running)
target_core_of_thread (935) = 0
  2    Thread 0x40804940 (LWP 938) "multi3" (running)
target_core_of_thread (935) = 0
* 1    Thread 0x40003800 (LWP 935) "multi3" (running)

What the??? The program (copied below) only starts two threads.
I wonder if libthread is causing some problem when using UML?

When using gdbserver, I often get a gdbserver warning when attaching:
"PID mismatch! Expected 789, got 791"
where 791 is the LWP of a thread other than the main one.

Anyway, I'll keep digging. Thanks for the pointers.

Marc

My test program
===============
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *thread_exec1(void *ptr) {
    int i;
    for (i=0;i<500;i++) {
        i++;
        i--;
        printf("in the second thread %d\n", i);
        sleep(1);
    }
    return NULL;
}

int main() {
    pthread_t thread2;
    int iret2 = pthread_create( &thread2, NULL, thread_exec1, (void*) "Thread 2");
    printf("in the first thread\n");

    int i;
    for (i=0;i<30;i++) {
        sleep(2);  // while here, non-stop can't step over breakpoints
    }
    printf("ABOUT TO CALL JOIN\n");
    pthread_join(thread2, NULL);  // but while here, it can!

    return 0;
}
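
One way to cross-check the thread count independently of gdb's thread
list might be to ask the kernel directly. On Linux, /proc/<pid>/task
holds one numeric subdirectory per thread (LWP), so counting those
entries shows how many LWPs really exist. The following is only a
minimal sketch under that assumption; count_lwps is a hypothetical
helper, not part of gdb or of the test program, and it has not been
tried on the UML installation in question.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (an assumption for illustration, not a gdb API):
   count the kernel tasks (LWPs) of process `pid` by listing
   /proc/<pid>/task.  Returns -1 if the directory cannot be opened. */
int count_lwps(pid_t pid)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/%ld/task", (long) pid);
    assert(n > 0 && (size_t) n < sizeof path);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Each thread shows up as a directory named after its numeric
           TID; "." and ".." are skipped by the digit check. */
        if (isdigit((unsigned char) entry->d_name[0]))
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling count_lwps(getpid()) from main() of the test program after
pthread_create() should report 2 if only the two expected threads
exist. A larger number would suggest the extra LWPs are real rather
than an artifact of gdb's thread list.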