Subject: Re: Move threads out of jumppad without single step
From: Antoine Tremblay
To: Yao Qi, Pedro Alves
Date: Wed, 27 Jan 2016 16:47:00 -0000
Message-ID: <56A8F4AE.5040305@ericsson.com>
In-Reply-To: <864mg2v1s5.fsf@gmail.com>
References: <86zixzvhj1.fsf@gmail.com> <565C6043.4040106@redhat.com> <864mg2v1s5.fsf@gmail.com>

On 12/01/2015 06:36 AM, Yao Qi wrote:
> Pedro Alves writes:
>
>> You may be able to handle this by retrieving state from the saved registers
>> buffer in the jump pad, similar to how gdb_collect cooks up a
>> regcache, though
>> unlike gdb_collect, you'll have to handle the case of the thread stopping
>> midway through that register saving too (some registers already saved, some not
>> yet).
>
> Compute the next PCs on the basis of cooked up regcache from stack is
> what I intended, but I didn't consider the case thread is stopped in the
> middle way of register saving.
>
>>
>> So I assume it's much simpler to just run to [1] as well, and then issue
>> a normal software single-step when you get there.
>
> Then, looks we have to use software single step.
>

Hi,

I'm testing the use of software single-stepping to move threads out of
the jump pad, and I've run into a problem that I'm really unsure how to
fix. Since the run control code is quite hard to follow, any help is
appreciated.

Some context: I'm using the following program to test on ARM:

#include "trace-common.h"
#include

static void
begin (void)
{}

static void
end (void)
{}

int
main ()
{
  begin ();
  FAST_TRACEPOINT_LABEL(set_point);
  FAST_TRACEPOINT_LABEL(other_point);
  end ();
  return 0;
}

To compile:

gcc -marm -Wl,--no-as-needed move-out.c libinproctrace.so -g -Wl,-rpath,$ORIGIN -lm -o move-out

My gdb version is a soon-to-be-posted fast tracepoint branch for ARM:
https://github.com/hexa00/binutils-gdb/commit/14912518f37abc2eb4594b42ca161c912ef6b6cd

I'm running gdbserver under gdb like this:

gdb gdbserver --debug -ex "break linux_stabilize_threads" -ex "run --once :7777 ./move-out"

and gdb with the following commands:

set pagination off
set non-stop on
set remotetimeout unlimited
tar rem :7777
break main
break end
c
#break on last jump pad instruction
break *gdb_agent_gdb_jump_pad_buffer + 43*4
ftrace set_point
tstart
c
#delete the breakpoint to fool gdbserver a bit.
delete 3
ftrace other_point

In the logs I can see:

stop_all_lwps done, setting stopping_threads back to !stopping
<<<< exiting stop_all_lwps
Checking whether LWP 5148 needs to move out of the jump pad.
fast_tracepoint_collecting
in jump pad of tpoint (4, 85d8); jump_pad(33000, 330b0); adj_insn(330ac, 9393939393939393)
fast_tracepoint_collecting, returning need-single-step (330ac-9393939393939393)
Checking whether LWP 5148 needs to move out of the jump pad...it does
LWP 5148 needs stabilizing (in jump pad)
Resuming lwp 5148 (continue, signal 0, stop not expected)
lwp 5148 wants to get out of fast tracepoint jump pad single-stepping
stop pc is 0x330ac
pc is 0x330ac
Writing f001f0e7 to 0x000085dc in process 5148
stop pc is 0x330ac
continue from pc 0x330ac
Checking whether LWP 5650 needs to move out of the jump pad.
sigchld_handler
fast_tracepoint_collecting
fast_tracepoint_collecting: not collecting (and nobody is).
Checking whether LWP 5650 needs to move out of the jump pad...no
>>>> entering linux_wait_1
linux_wait_1: []
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(57f), 5148
LWFE: waitpid(-1, ...) returned 5148, ERRNO-OK
LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(ffffffff), 0
LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
leader_pid=5148, leader_lp!=NULL=1, num_lwps=2, zombie=0
LLW: exit (no unwaited-for LWP)
linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
<<<< exiting linux_wait_1
../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver has been detected.
unsuspend LWP 5148, suspended=-1

The main problem seems to be that, as we enter linux_wait_1 in
stabilize_threads, gdbserver gets the stopped event of the single-step
breakpoint:

LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint

but this event is filtered out by:

linux-low.c:2683

  /* ... and find an LWP with a status to report to the core, if any.
  */
  event_thread = (struct thread_info *)
    find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);

Here status_pending_p_callback finds that lwp_resumed is true and
returns 0, thus filtering out the event.

We then go back to linux_stabilize_threads, do nothing, and end the loop
since the lwp is now stopped. After that we try to unsuspend a thread
that was never suspended, and hit the assert.

Any ideas on how this should be fixed?

Thanks,
Antoine
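
To make the sequence above concrete, here is a minimal standalone model
of it. This is not gdbserver code: toy_lwp, toy_wait and toy_stabilize
are hypothetical names made up for illustration, and the model assumes
that the suspend count is only incremented when the wait actually
reports a stop, while the unsuspend at the end is unconditional.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT gdbserver code.  */
struct toy_lwp
{
  int stopped;    /* nonzero once the LWP has stopped */
  int suspended;  /* suspend count; must never go negative */
};

/* Model of the wait step.  The LWP always stops, but the stop is only
   reported to the caller when the filter lets it through.
   Returns 1 if the event was reported, 0 if it was filtered out.  */
static int
toy_wait (struct toy_lwp *lwp, int filtered)
{
  assert (lwp != NULL);
  lwp->stopped = 1;
  return !filtered;
}

/* Model of the stabilize step.  Assumption: the suspend count is
   bumped only for a reported stop, and the LWP is then unsuspended
   once at the end.  Returns the final suspend count.  */
static int
toy_stabilize (struct toy_lwp *lwp, int filtered)
{
  if (toy_wait (lwp, filtered))
    lwp->suspended++;   /* lock the thread on a reported stop */

  /* The loop ends here because the LWP is stopped either way.  */
  lwp->suspended--;     /* unconditional unsuspend */
  return lwp->suspended;
}
```

Calling toy_stabilize with filtered == 0 leaves the count at 0. Calling
it with filtered == 1 leaves the count at -1, which is the same shape as
the "suspended=-1" in the assertion message above.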