From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: Move threads out of jumppad without single step
To: Yao Qi <qiyaoltc@gmail.com>, Pedro Alves <palves@redhat.com>
References: <86zixzvhj1.fsf@gmail.com> <565C6043.4040106@redhat.com> <864mg2v1s5.fsf@gmail.com>
CC: <gdb-patches@sourceware.org>, <simon.marchi@ericsson.com>
From: Antoine Tremblay <antoine.tremblay@ericsson.com>
Message-ID: <56A8F4AE.5040305@ericsson.com>
Date: Wed, 27 Jan 2016 16:47:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: <864mg2v1s5.fsf@gmail.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2016-01/txt/msg00673.txt.bz2


On 12/01/2015 06:36 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> You may be able to handle this by retrieving state from the saved registers
>> buffer in the jump pad, similar to how gdb_collect cooks up a regcache, though
>> unlike gdb_collect, you'll have to handle the case of the thread stopping
>> midway through that register saving too (some registers already saved, some not
>> yet).
>
> Compute the next PCs on the basis of cooked up regcache from stack is
> what I intended, but I didn't consider the case thread is stopped in the
> middle way of register saving.
>
>>
>> So I assume it's much simpler to just run to [1] as well, and then issue
>> a normal software single-step when you get there.
>
> Then, looks we have to use software single step.
>

Hi,
   I'm testing software single-stepping to move threads out of the
jump pad, and I've run into a problem that I'm really unsure how to
fix. Since the run control code is quite hard to follow, any help is
appreciated.

Some context:

I'm using the following program to test on ARM:

#include "trace-common.h"
#include <stdio.h>

static void
begin (void)
{}
static void
end (void)
{}

int
main ()
{
   begin ();
   FAST_TRACEPOINT_LABEL(set_point);
   FAST_TRACEPOINT_LABEL(other_point);
   end ();
   return 0;
}

To compile:
gcc -marm -Wl,--no-as-needed move-out.c libinproctrace.so -g -Wl,-rpath,$ORIGIN -lm -o move-out

My gdb version is a soon-to-be-posted fast tracepoint branch for ARM:
https://github.com/hexa00/binutils-gdb/commit/14912518f37abc2eb4594b42ca161c912ef6b6cd

I'm running gdbserver under gdb like this:

gdb gdbserver  --debug -ex "break linux_stabilize_threads" -ex "run --once :7777 ./move-out"

and gdb with the following commands:
set pagination off
set non-stop on
set remotetimeout unlimited
tar rem :7777
break main
break end
c
#break on last jump pad instruction
break *gdb_agent_gdb_jump_pad_buffer + 43*4
ftrace set_point
tstart
c
#delete the breakpoint to fool gdbserver a bit.
delete 3
ftrace other_point

In the logs I can see :

stop_all_lwps done, setting stopping_threads back to !stopping
<<<< exiting stop_all_lwps
Checking whether LWP 5148 needs to move out of the jump pad.
fast_tracepoint_collecting
in jump pad of tpoint (4, 85d8); jump_pad(33000, 330b0); adj_insn(330ac, 
9393939393939393)
fast_tracepoint_collecting, returning need-single-step 
(330ac-9393939393939393)
Checking whether LWP 5148 needs to move out of the jump pad...it does
LWP 5148 needs stabilizing (in jump pad)
Resuming lwp 5148 (continue, signal 0, stop not expected)
lwp 5148 wants to get out of fast tracepoint jump pad single-stepping
stop pc is 0x330ac
pc is 0x330ac
Writing f001f0e7 to 0x000085dc in process 5148
stop pc is 0x330ac
   continue from pc 0x330ac
Checking whether LWP 5650 needs to move out of the jump pad.
sigchld_handler
fast_tracepoint_collecting
fast_tracepoint_collecting: not collecting (and nobody is).
Checking whether LWP 5650 needs to move out of the jump pad...no
 >>>> entering linux_wait_1
linux_wait_1: [<all threads>]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(57f), 5148
LWFE: waitpid(-1, ...) returned 5148, ERRNO-OK
LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(ffffffff), 0
LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
leader_pid=5148, leader_lp!=NULL=1, num_lwps=2, zombie=0
LLW: exit (no unwaited-for LWP)
linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
<<<< exiting linux_wait_1
../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver 
has been detected.
unsuspend LWP 5148, suspended=-1


The main problem seems to be that when we enter linux_wait_1 from
linux_stabilize_threads, gdbserver gets the stop event for the
single-step breakpoint:

LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint

but this event is filtered out at linux-low.c:2683:

  /* ... and find an LWP with a status to report to the core, if
     any.  */
  event_thread = (struct thread_info *)
    find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);

Here status_pending_p_callback finds that lwp_resumed is true and
returns 0, thus filtering out the event.

We then go back to linux_stabilize_threads, do nothing, and end the
loop since the LWP is now stopped.

After that, gdbserver tries to unsuspend a thread that was never
suspended and hits the assertion.
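
To make the sequence concrete, here is a toy model of what I think is
happening. It is only a sketch, not gdbserver code: toy_lwp, toy_wait and
toy_stabilize are hypothetical names, and the only thing taken from the
logs is that the suspend count ends up at -1 when the unsuspend runs
without a matching suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an LWP; NOT gdbserver's struct lwp_info.  */
struct toy_lwp
{
  int suspended;   /* Suspend count, expected to stay >= 0.  */
  bool stopped;    /* The kernel reported a stop for this LWP.  */
};

/* Hypothetical wait step: the LWP stops either way, but the stop is
   only handed back to the caller when the filter accepts it.  */
static bool
toy_wait (struct toy_lwp *lwp, bool filter_accepts)
{
  lwp->stopped = true;
  return filter_accepts;
}

/* Hypothetical stabilization loop.  Returns the suspend count after
   the final unsuspend; a negative value corresponds to the failed
   assertion in the logs.  */
static int
toy_stabilize (struct toy_lwp *lwp, bool filter_accepts)
{
  assert (lwp != NULL);

  while (!lwp->stopped)
    {
      /* Lock (suspend) the thread only if its stop was reported.  */
      if (toy_wait (lwp, filter_accepts))
        lwp->suspended++;
    }

  /* Unsuspend unconditionally once nothing is running anymore.  */
  lwp->suspended--;
  return lwp->suspended;
}
```

With filter_accepts set to true the count goes 0, 1, 0. With it set to
false the loop still ends, because the LWP is stopped, but the increment
is skipped and the count ends at -1, which matches "suspended=-1" above.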

Any ideas on how this should be fixed?

Thanks,
Antoine