Move threads out of jumppad without single step

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* Move threads out of jumppad without single step
@ 2015-11-27 10:55 Yao Qi
  2015-11-30 14:42 ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Yao Qi @ 2015-11-27 10:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, simon.marchi

Hi Pedro,
Can we move threads out of jumppad without using single step?  I mean,
supposing GDBserver can compute the next PCs of the instruction on which
fast tracepoint is set, GDBserver can insert breakpoints at the next PCs
of the instruction fast tracepoint is set on, and resume the thread
which is in jumppad.

In the recent discussion about ARM fast tracepoint support, I pointed
out that missing hardware single step may be a showstopper to ARM fast
tracepoint support, but Simon wants to convince me that we can have a
non-perfect fast tracepoint support on ARM with software single step.
After I think about the possible issues of using software single step, I
don't have any outcomes, but I wonder why do we need to single step
thread out of jumppad?  Here is my understanding, in the jumppad, the
instructions are like,

  saved registers
  spin lock
  call gdb_collect
  spin unlock
  restore registers
  <----- [1]
  relocated instructions
  jump back

If PC is within the range above, GDBserver needs to move the thread
out.  GDBserver can single step instructions one by one until PC is out
of this range, but we have an optimization here that GDBserver can
insert breakpoint and resume rather than single step one by one if PC
doesn't point to the relocated instructions yet [1].  Afterwards,
GDBserver will single step, but there aren't many instructions.  Unless
I miss something, the code comments explain why do we move threads out
of jumppad, but they do *not* explain why do we move threads out of
jumppad in this way.  IMO, the reason is that instructions before [1]
are sequentially executed (except call gdb_collect), but instructions
after [1] may not.  We can safely insert breakpoint on [1], and thread
must hit the breakpoint.  However, the original instruction on which the
fast tracepoint is set may be a branch instruction, so the relocated
instructions may jump back to the target address.  At the moment we add
fast tracepoint support (for x86), we rely on hardware single step so
that we don't have to compute the next PCs (that is also the reason why
qRelocInsn packet was added).

Now, the situation in GDBserver changed a little bit, as GDBserver will
be able to accurately compute the next PCs.  Under this assumption,
GDBserver can compute the next PCs of the original instruction, and set
breakpoints there when threads are in jumppad.  In this way, we don't
need to single step thread out jumppad at all for targets which support
software single step in GDBserver.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2015-11-27 10:55 Move threads out of jumppad without single step Yao Qi
@ 2015-11-30 14:42 ` Pedro Alves
  2015-11-30 19:04   ` Simon Marchi
  2015-12-01 11:36   ` Yao Qi
  0 siblings, 2 replies; 9+ messages in thread
From: Pedro Alves @ 2015-11-30 14:42 UTC (permalink / raw)
  To: Yao Qi, gdb-patches; +Cc: simon.marchi

On 11/27/2015 10:54 AM, Yao Qi wrote:
> 
> Hi Pedro,
> Can we move threads out of jumppad without using single step?  I mean,
> supposing GDBserver can compute the next PCs of the instruction on which
> fast tracepoint is set, GDBserver can insert breakpoints at the next PCs
> of the instruction fast tracepoint is set on, and resume the thread
> which is in jumppad.
> 
> In the recent discussion about ARM fast tracepoint support, I pointed
> out that missing hardware single step may be a showstopper to ARM fast
> tracepoint support, but Simon wants to convince me that we can have a
> non-perfect fast tracepoint support on ARM with software single step.
> After I think about the possible issues of using software single step, I
> don't have any outcomes, but I wonder why do we need to single step
> thread out of jumppad?  Here is my understanding, in the jumppad, the
> instructions are like,
> 
>   saved registers
>   spin lock
>   call gdb_collect
>   spin unlock
>   restore registers
>   <----- [1]
>   relocated instructions
>   jump back
> 
> If PC is within the range above, GDBserver needs to move the thread
> out.  GDBserver can single step instructions one by one until PC is out
> of this range, but we have an optimization here that GDBserver can
> insert breakpoint and resume rather than single step one by one if PC
> doesn't point to the relocated instructions yet [1].  Afterwards,
> GDBserver will single step, but there aren't many instructions.  Unless
> I miss something, the code comments explain why do we move threads out
> of jumppad, but they do *not* explain why do we move threads out of
> jumppad in this way.  IMO, the reason is that instructions before [1]
> are sequentially executed (except call gdb_collect), but instructions
> after [1] may not.  We can safely insert breakpoint on [1], and thread
> must hit the breakpoint.  However, the original instruction on which the
> fast tracepoint is set may be a branch instruction, so the relocated
> instructions may jump back to the target address.

Correct.  But still, even if the relocated instruction does not jump back,
we need to execute the last "jump back" instruction then.

> At the moment we add
> fast tracepoint support (for x86), we rely on hardware single step so
> that we don't have to compute the next PCs (that is also the reason why
> qRelocInsn packet was added).
> 
> Now, the situation in GDBserver changed a little bit, as GDBserver will
> be able to accurately compute the next PCs.  Under this assumption,
> GDBserver can compute the next PCs of the original instruction, and set
> breakpoints there when threads are in jumppad.  In this way, we don't
> need to single step thread out jumppad at all for targets which support
> software single step in GDBserver.

I don't think we can determine the "next PCs" of the original instruction that
easily even with the new software single-step work.

E.g., if the instruction is an indirect branch, like "B r1", you need to
know the contents of register r1 to tell where the branch will take you.
But you can't compute offline what r1 will contain.

That's why the code that computes the next PCs (the gdbarch_software_single_step
implementations) assumes that the thread is currently stopped at the
instruction that we want to compute the next PCs for.

In the current arm_software_single_step implementation, you'll see the register
context also used to e.g., handle the conditions in conditional execution instructions,
based on state of registers and flags.

You may be able to handle this by retrieving state from the saved registers
buffer in the jump pad, similar to how gdb_collect cooks up a regcache, though
unlike gdb_collect, you'll have to handle the case of the thread stopping
midway through that register saving too (some registers already saved, some not
yet).

So I assume it's much simpler to just run to [1] as well, and then issue
a normal software single-step when you get there.

Also, not sure, but it's possible the stabilize threads machinery may
need work to handle the "wrong threads" hitting that "out of jump pad"
single-step breakpoint for another thread, and not have them start
a new start over, but instead have them be locked immediately.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2015-11-30 14:42 ` Pedro Alves
@ 2015-11-30 19:04   ` Simon Marchi
  2015-11-30 19:39     ` Pedro Alves
  2015-12-01 11:36   ` Yao Qi
  1 sibling, 1 reply; 9+ messages in thread
From: Simon Marchi @ 2015-11-30 19:04 UTC (permalink / raw)
  To: Pedro Alves, Yao Qi, gdb-patches

On 15-11-30 09:42 AM, Pedro Alves wrote:
> So I assume it's much simpler to just run to [1] as well, and then issue
> a normal software single-step when you get there.
>
> Also, not sure, but it's possible the stabilize threads machinery may
> need work to handle the "wrong threads" hitting that "out of jump pad"
> single-step breakpoint for another thread, and not have them start
> a new start over, but instead have them be locked immediately.

To be clear, do you mean, when single stepping the last jump pad instruction,
the jump that goes back to the regular code?  When putting a breakpoint on the
next pc of that instruction, it means putting a breakpoint in the regular code.

However, when doing a single step in gdbserver, aren't all other threads stopped?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2015-11-30 19:04   ` Simon Marchi
@ 2015-11-30 19:39     ` Pedro Alves
  0 siblings, 0 replies; 9+ messages in thread
From: Pedro Alves @ 2015-11-30 19:39 UTC (permalink / raw)
  To: Simon Marchi, Yao Qi, gdb-patches

On 11/30/2015 07:04 PM, Simon Marchi wrote:
> On 15-11-30 09:42 AM, Pedro Alves wrote:
>> So I assume it's much simpler to just run to [1] as well, and then issue
>> a normal software single-step when you get there.
>>
>> Also, not sure, but it's possible the stabilize threads machinery may
>> need work to handle the "wrong threads" hitting that "out of jump pad"
>> single-step breakpoint for another thread, and not have them start
>> a new start over, but instead have them be locked immediately.
> 
> To be clear, do you mean, when single stepping the last jump pad instruction,
> the jump that goes back to the regular code?  When putting a breakpoint on the
> next pc of that instruction, it means putting a breakpoint in the regular code.

Right.

> 
> However, when doing a single step in gdbserver, aren't all other threads stopped?
> 

That's really a function of whether we're stepping over a breakpoint; in that case
we must stop all threads, but that's true when you step past a breakpoint with
either software single-step or hardware single-step.   IOW, it's an orthogonal,
though tangent issue.   E.g., when software single stepping for a tracepoint's
while-stepping action, I don't see why you'd stop all threads.

In this case, if the thread reached the breakpoint set at the last instruction of
the jump pad, we don't need that breakpoint anymore.  So for the last single-step
we can compute the next PC, install a breakpoint there, and continue, no
need to start a step-over process (pause all threads, remove bp, step, reinsert bp,
unpause all).

On the other hand, always leaving that breakpoint inserted while the stabilization
is in progress and then making that final single-step be a step-over operation
could make things simpler.  Or not, not sure.  We currently track the breakpoint's
lifetime by putting it in the lwp structure (lwp->exit_jump_pad_bkpt), because
it's per-lwp.  If we keep it inserted, then we need to track its lifetime
differently.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2015-11-30 14:42 ` Pedro Alves
  2015-11-30 19:04   ` Simon Marchi
@ 2015-12-01 11:36   ` Yao Qi
  2016-01-27 16:47     ` Antoine Tremblay
  1 sibling, 1 reply; 9+ messages in thread
From: Yao Qi @ 2015-12-01 11:36 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Yao Qi, gdb-patches, simon.marchi

Pedro Alves <palves@redhat.com> writes:

> You may be able to handle this by retrieving state from the saved registers
> buffer in the jump pad, similar to how gdb_collect cooks up a regcache, though
> unlike gdb_collect, you'll have to handle the case of the thread stopping
> midway through that register saving too (some registers already saved, some not
> yet).

Compute the next PCs on the basis of cooked up regcache from stack is
what I intended, but I didn't consider the case thread is stopped in the
middle way of register saving.

>
> So I assume it's much simpler to just run to [1] as well, and then issue
> a normal software single-step when you get there.

Then, looks we have to use software single step.

-- 
Yao (齐尧)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2015-12-01 11:36   ` Yao Qi
@ 2016-01-27 16:47     ` Antoine Tremblay
  2016-01-29 20:43       ` Antoine Tremblay
  0 siblings, 1 reply; 9+ messages in thread
From: Antoine Tremblay @ 2016-01-27 16:47 UTC (permalink / raw)
  To: Yao Qi, Pedro Alves; +Cc: gdb-patches, simon.marchi



On 12/01/2015 06:36 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> You may be able to handle this by retrieving state from the saved registers
>> buffer in the jump pad, similar to how gdb_collect cooks up a regcache, though
>> unlike gdb_collect, you'll have to handle the case of the thread stopping
>> midway through that register saving too (some registers already saved, some not
>> yet).
>
> Compute the next PCs on the basis of cooked up regcache from stack is
> what I intended, but I didn't consider the case thread is stopped in the
> middle way of register saving.
>
>>
>> So I assume it's much simpler to just run to [1] as well, and then issue
>> a normal software single-step when you get there.
>
> Then, looks we have to use software single step.
>

Hi,
   I'm testing using software single stepping to move threads out of the 
the jump pad and I've ran into a problem which I'm really unsure how to 
fix and since the run control stuff is quite hard to follow help is 
appreciated.

Some context :

I'm using the following program to test on ARM :

#include "trace-common.h"
#include <stdio.h>

static void
begin (void)
{}
static void
end (void)
{}

int
main ()
{
   begin ();
   FAST_TRACEPOINT_LABEL(set_point);
   FAST_TRACEPOINT_LABEL(other_point);
   end ();
   return 0;
}

To compile :
gcc -marm -Wl,--no-as-needed move-out.c libinproctrace.so -g 
-Wl,-rpath,$ORIGIN -lm -o move-out

My gdb version is a soon to be posted fast tracepoint branch for ARM:
https://github.com/hexa00/binutils-gdb/commit/14912518f37abc2eb4594b42ca161c912ef6b6cd

Running gdbserver under gdb as such :

gdb gdbserver  --debug -ex "break linux_stabilize_threads" -ex "run 
--once :7777 ./move-out

and gdb with the following commands:
set pagination off
set non-stop on
set remotetimeout unlimited
tar rem :7777
break main
break end
c
#break on last jump pad instruction
break *gdb_agent_gdb_jump_pad_buffer + 43*4
ftrace set_point
tstart
c
#delete the breakpoint to fool gdbserver a bit.
delete 3
ftrace other_point

In the logs I can see :

stop_all_lwps done, setting stopping_threads back to !stopping
<<<< exiting stop_all_lwps
Checking whether LWP 5148 needs to move out of the jump pad.
fast_tracepoint_collecting
in jump pad of tpoint (4, 85d8); jump_pad(33000, 330b0); adj_insn(330ac, 
9393939393939393)
fast_tracepoint_collecting, returning need-single-step 
(330ac-9393939393939393)
Checking whether LWP 5148 needs to move out of the jump pad...it does
LWP 5148 needs stabilizing (in jump pad)
Resuming lwp 5148 (continue, signal 0, stop not expected)
lwp 5148 wants to get out of fast tracepoint jump pad single-stepping
stop pc is 0x330ac
pc is 0x330ac
Writing f001f0e7 to 0x000085dc in process 5148
stop pc is 0x330ac
   continue from pc 0x330ac
Checking whether LWP 5650 needs to move out of the jump pad.
sigchld_handler
fast_tracepoint_collecting
fast_tracepoint_collecting: not collecting (and nobody is).
Checking whether LWP 5650 needs to move out of the jump pad...no
 >>>> entering linux_wait_1
linux_wait_1: [<all threads>]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(57f), 5148
LWFE: waitpid(-1, ...) returned 5148, ERRNO-OK
LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x40000001): status(ffffffff), 0
LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
leader_pid=5148, leader_lp!=NULL=1, num_lwps=2, zombie=0
LLW: exit (no unwaited-for LWP)
linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
<<<< exiting linux_wait_1
../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver 
has been detected.
unsuspend LWP 5148, suspended=-1


The main problem seems to be that as we enter linux_wait_1 in 
stabilize_threads gdbserver gets the stopped event of the single step 
breakpoint :

LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
stop pc is 0x85dc
pc is 0x85dc
CSBB: LWP 5148.5148 stopped by software breakpoint

but this event is filtered out by: linux-low.c:2683

  /* ... and find an LWP with a status to report to the core, if
	 any.  */
       event_thread = (struct thread_info *)
	find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);

Here status_pending_p_callback find that lwp_resumed is true and returns 
0 thus filtering out the event.

Thus we go back to linux_stabilize_threads do nothing, end the loop 
since lwp is now stopped.

And then try to unsuspend a thread that was not suspended and hit the 
assert.

Ideas on how this should be fixed ?

Thanks,
Antoine



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2016-01-27 16:47     ` Antoine Tremblay
@ 2016-01-29 20:43       ` Antoine Tremblay
  2016-02-04 16:58         ` Yao Qi
  0 siblings, 1 reply; 9+ messages in thread
From: Antoine Tremblay @ 2016-01-29 20:43 UTC (permalink / raw)
  To: Yao Qi, Pedro Alves; +Cc: gdb-patches, simon.marchi



On 01/27/2016 11:47 AM, Antoine Tremblay wrote:
>
>
> On 12/01/2015 06:36 AM, Yao Qi wrote:
>> Pedro Alves <palves@redhat.com> writes:
>>
>>> You may be able to handle this by retrieving state from the saved
>>> registers
>>> buffer in the jump pad, similar to how gdb_collect cooks up a
>>> regcache, though
>>> unlike gdb_collect, you'll have to handle the case of the thread
>>> stopping
>>> midway through that register saving too (some registers already
>>> saved, some not
>>> yet).
>>
>> Compute the next PCs on the basis of cooked up regcache from stack is
>> what I intended, but I didn't consider the case thread is stopped in the
>> middle way of register saving.
>>
>>>
>>> So I assume it's much simpler to just run to [1] as well, and then issue
>>> a normal software single-step when you get there.
>>
>> Then, looks we have to use software single step.
>>
>
> Hi,
>    I'm testing using software single stepping to move threads out of the
> the jump pad and I've ran into a problem which I'm really unsure how to
> fix and since the run control stuff is quite hard to follow help is
> appreciated.
>
> Some context :
>
> I'm using the following program to test on ARM :
>
> #include "trace-common.h"
> #include <stdio.h>
>
> static void
> begin (void)
> {}
> static void
> end (void)
> {}
>
> int
> main ()
> {
>    begin ();
>    FAST_TRACEPOINT_LABEL(set_point);
>    FAST_TRACEPOINT_LABEL(other_point);
>    end ();
>    return 0;
> }
>
> To compile :
> gcc -marm -Wl,--no-as-needed move-out.c libinproctrace.so -g
> -Wl,-rpath,$ORIGIN -lm -o move-out
>
> My gdb version is a soon to be posted fast tracepoint branch for ARM:
> https://github.com/hexa00/binutils-gdb/commit/14912518f37abc2eb4594b42ca161c912ef6b6cd
>
>
> Running gdbserver under gdb as such :
>
> gdb gdbserver  --debug -ex "break linux_stabilize_threads" -ex "run
> --once :7777 ./move-out
>
> and gdb with the following commands:
> set pagination off
> set non-stop on
> set remotetimeout unlimited
> tar rem :7777
> break main
> break end
> c
> #break on last jump pad instruction
> break *gdb_agent_gdb_jump_pad_buffer + 43*4
> ftrace set_point
> tstart
> c
> #delete the breakpoint to fool gdbserver a bit.
> delete 3
> ftrace other_point
>
> In the logs I can see :
>
> stop_all_lwps done, setting stopping_threads back to !stopping
> <<<< exiting stop_all_lwps
> Checking whether LWP 5148 needs to move out of the jump pad.
> fast_tracepoint_collecting
> in jump pad of tpoint (4, 85d8); jump_pad(33000, 330b0); adj_insn(330ac,
> 9393939393939393)
> fast_tracepoint_collecting, returning need-single-step
> (330ac-9393939393939393)
> Checking whether LWP 5148 needs to move out of the jump pad...it does
> LWP 5148 needs stabilizing (in jump pad)
> Resuming lwp 5148 (continue, signal 0, stop not expected)
> lwp 5148 wants to get out of fast tracepoint jump pad single-stepping
> stop pc is 0x330ac
> pc is 0x330ac
> Writing f001f0e7 to 0x000085dc in process 5148
> stop pc is 0x330ac
>    continue from pc 0x330ac
> Checking whether LWP 5650 needs to move out of the jump pad.
> sigchld_handler
> fast_tracepoint_collecting
> fast_tracepoint_collecting: not collecting (and nobody is).
> Checking whether LWP 5650 needs to move out of the jump pad...no
>  >>>> entering linux_wait_1
> linux_wait_1: [<all threads>]
> my_waitpid (-1, 0x40000001)
> my_waitpid (-1, 0x40000001): status(57f), 5148
> LWFE: waitpid(-1, ...) returned 5148, ERRNO-OK
> LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
> stop pc is 0x85dc
> pc is 0x85dc
> CSBB: LWP 5148.5148 stopped by software breakpoint
> my_waitpid (-1, 0x40000001)
> my_waitpid (-1, 0x40000001): status(ffffffff), 0
> LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
> leader_pid=5148, leader_lp!=NULL=1, num_lwps=2, zombie=0
> LLW: exit (no unwaited-for LWP)
> linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
> <<<< exiting linux_wait_1
> ../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver
> has been detected.
> unsuspend LWP 5148, suspended=-1
>
>
> The main problem seems to be that as we enter linux_wait_1 in
> stabilize_threads gdbserver gets the stopped event of the single step
> breakpoint :
>
> LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
> stop pc is 0x85dc
> pc is 0x85dc
> CSBB: LWP 5148.5148 stopped by software breakpoint
>
> but this event is filtered out by: linux-low.c:2683
>
>   /* ... and find an LWP with a status to report to the core, if
>       any.  */
>        event_thread = (struct thread_info *)
>      find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);
>
> Here status_pending_p_callback find that lwp_resumed is true and returns
> 0 thus filtering out the event.
>
> Thus we go back to linux_stabilize_threads do nothing, end the loop
> since lwp is now stopped.
>
> And then try to unsuspend a thread that was not suspended and hit the
> assert.
>
> Ideas on how this should be fixed ?
>
> Thanks,
> Antoine
>
>

I managed to fix it like so :
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1693,7 +1693,10 @@ status_pending_p_callback (struct 
inferior_list_entry *entry, void *arg)
    if (!ptid_match (ptid_of (thread), ptid))
      return 0;

-  if (!lwp_resumed (lp))
+  /* If we are stabilizing threads, threads have been stopped except the
+     ones that are moving out of the jump pad. The events of those threads
+     need to be reported whatever the last_resume_kind is.  */
+  if (!lwp_resumed (lp) && !stabilizing_threads)
      return 0;

    if (lp->status_pending_p

I've tested this in all stop/non stop and it works properly.

Basically what happens is that if stabilize_threads is not called in the 
context of linux_resume and that gdbserver needs to report an event, it 
won't since last_resume_kind can be resume_stop.

In the current case gdbserver is in cmd_qtdp, the last command was 
continue (vCont;c) in all stop mode so last_resume_kind is resume_stop.

So when going in linux_wait, the event is filtered out by :
  event_thread = (struct thread_info *)
	find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);

Since status_pending_p_callback returns false.

Note that this fix may not the best one... but it may be some progress...

Any ideas are welcome, otherwise I will add it to my patch set and there 
can be more discussion at review.

Thanks,
Antoine








^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2016-01-29 20:43       ` Antoine Tremblay
@ 2016-02-04 16:58         ` Yao Qi
  2016-02-04 18:24           ` Antoine Tremblay
  0 siblings, 1 reply; 9+ messages in thread
From: Yao Qi @ 2016-02-04 16:58 UTC (permalink / raw)
  To: Antoine Tremblay; +Cc: Yao Qi, Pedro Alves, gdb-patches, simon.marchi

Antoine Tremblay <antoine.tremblay@ericsson.com> writes:

> I've tested this in all stop/non stop and it works properly.
>
> Basically what happens is that if stabilize_threads is not called in
> the context of linux_resume and that gdbserver needs to report an
> event, it won't since last_resume_kind can be resume_stop.
>
> In the current case gdbserver is in cmd_qtdp, the last command was
> continue (vCont;c) in all stop mode so last_resume_kind is
> resume_stop.
>
> So when going in linux_wait, the event is filtered out by :
>  event_thread = (struct thread_info *)
> 	find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);
>
> Since status_pending_p_callback returns false.
>
> Note that this fix may not the best one... but it may be some progress...
>
> Any ideas are welcome, otherwise I will add it to my patch set and
> there can be more discussion at review.

Hi Antoine,
I don't have an idea to your problem and your fix, but I don't
understand why don't we see this problem before.  I may miss some
details of your problem, so please include these details in your
patches.

What I want to say here is that we still need more tests and works to
get software single step properly/fully engaged with the rest part of
GDBserver before we introduce (fast) tracepoint for ARM into GDBserver.
The software single step in GDBserver isn't fully exercised yet, for
example, GDB still doesn't emit vCont;s and vCont;S to ask GDBserver to
do single step on arm-linux.  If we force GDB to emit vCont;s on
arm-linux, as the patch below does, there are still some problems,

 1. PTRACE_SINGLESTEP is used in ptrace call in GDBserver, which is
 obviously wrong,

 2. PC increment is still incorrect if the single step is requested by
 GDB.  We've had a fix https://sourceware.org/ml/gdb-patches/2015-11/msg00470.html
 to fix the PC increment when if single step is requested by GDBserver
 itself.

I did some fixes but there are still some work to do software single
step in GDBserver requested by GDB.  I don't want to block your fast
tracepoint work, just raise these issues, and let you know my thoughts.

-- 
Yao (齐尧)

diff --git a/gdb/arm-linux-tdep.c b/gdb/arm-linux-tdep.c
index 3421f3b..dead2b9 100644
--- a/gdb/arm-linux-tdep.c
+++ b/gdb/arm-linux-tdep.c
@@ -935,11 +935,6 @@ arm_linux_software_single_step (struct frame_info *frame)
   VEC (CORE_ADDR) *next_pcs = NULL;
   struct cleanup *old_chain = make_cleanup (VEC_cleanup (CORE_ADDR), &next_pcs);
 
-  /* If the target does have hardware single step, GDB doesn't have
-     to bother software single step.  */
-  if (target_can_do_single_step () == 1)
-    return 0;
-
   arm_get_next_pcs_ctor (&next_pcs_ctx,
 			 &arm_linux_get_next_pcs_ops,
 			 gdbarch_byte_order (gdbarch),
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index ef715e7..0d51dec 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -2958,12 +2958,15 @@ handle_v_requests (char *own_buf, int packet_len, int *new_packet_len)
 	{
 	  strcpy (own_buf, "vCont;c;C;t");
 
-	  if (target_supports_hardware_single_step () || !vCont_supported)
+	  if (target_supports_hardware_single_step ()
+	      || target_supports_software_single_step ()
+	      || !vCont_supported)
 	    {
-	      /* If target supports hardware single step, add actions s
-		 and S to the list of supported actions.  On the other
-		 hand, if GDB doesn't request the supported vCont actions
-		 in qSupported packet, add s and S to the list too.  */
+	      /* If target supports single step either by hardware or by
+		 software, add actions s and S to the list of supported
+		 actions.  On the other hand, if GDB doesn't request the
+		 supported vCont actions in qSupported packet, add s and
+		 S to the list too.  */
 	      own_buf = own_buf + strlen (own_buf);
 	      strcpy (own_buf, ";s;S");
 	    }
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 15210c9..3e2d76e 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2246,6 +2246,11 @@ maybe_software_singlestep (struct gdbarch *gdbarch, CORE_ADDR pc)
 {
   int hw_step = 1;
 
+  /* If the target does have hardware single step, GDB doesn't have
+     to bother software single step.  */
+  if (target_can_do_single_step () == 1)
+    return 1;
+
   if (execution_direction == EXEC_FORWARD
       && gdbarch_software_single_step_p (gdbarch)
       && gdbarch_software_single_step (gdbarch, get_current_frame ()))


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Move threads out of jumppad without single step
  2016-02-04 16:58         ` Yao Qi
@ 2016-02-04 18:24           ` Antoine Tremblay
  0 siblings, 0 replies; 9+ messages in thread
From: Antoine Tremblay @ 2016-02-04 18:24 UTC (permalink / raw)
  To: Yao Qi; +Cc: Pedro Alves, gdb-patches, simon.marchi

On 02/04/2016 11:58 AM, Yao Qi wrote:
> Antoine Tremblay <antoine.tremblay@ericsson.com> writes:
>
>> I've tested this in all stop/non stop and it works properly.
>>
>> Basically what happens is that if stabilize_threads is not called in
>> the context of linux_resume and that gdbserver needs to report an
>> event, it won't since last_resume_kind can be resume_stop.
>>
>> In the current case gdbserver is in cmd_qtdp, the last command was
>> continue (vCont;c) in all stop mode so last_resume_kind is
>> resume_stop.
>>
>> So when going in linux_wait, the event is filtered out by :
>>   event_thread = (struct thread_info *)
>> 	find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);
>>
>> Since status_pending_p_callback returns false.
>>
>> Note that this fix may not the best one... but it may be some progress...
>>
>> Any ideas are welcome, otherwise I will add it to my patch set and
>> there can be more discussion at review.
>
> Hi Antoine,
> I don't have an idea to your problem and your fix, but I don't
> understand why don't we see this problem before.  I may miss some
> details of your problem, so please include these details in your
> patches.
>

Indeed, in fact I just tried on x86 on master and I got the same problem 
ending with:

<<<< exiting linux_wait_1
../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver 
has been detected.
unsuspend LWP 24815, suspended=-1

So maybe we should have seen this before ?

Or my test setup is messing up the normal flow.

At least it makes it easier to reproduce, you can run the same thing on 
x86 with master and get the same situation (See my previous mail on the 
exact steps).

If I understand what you mean by details as a proper problem 
description. It's hard to provide all the details in a coherent way for 
me at this time since I'm not sure I understand the problem correctly, 
mainly due to my lack of experience with the control flow and all the 
states that compose this flow.

That's why I asked here, see if you or Pedro could shed some light based 
on a reproducible program and logs. If it's non-trivial for you guys 
also I'll take more time to dig.

> What I want to say here is that we still need more tests and works to
> get software single step properly/fully engaged with the rest part of
> GDBserver before we introduce (fast) tracepoint for ARM into GDBserver.

That's fine with me.

> The software single step in GDBserver isn't fully exercised yet, for
> example, GDB still doesn't emit vCont;s and vCont;S to ask GDBserver to
> do single step on arm-linux.  If we force GDB to emit vCont;s on
> arm-linux, as the patch below does, there are still some problems,
>
>   1. PTRACE_SINGLESTEP is used in ptrace call in GDBserver, which is
>   obviously wrong,
>
>   2. PC increment is still incorrect if the single step is requested by
>   GDB.  We've had a fix https://sourceware.org/ml/gdb-patches/2015-11/msg00470.html
>   to fix the PC increment when if single step is requested by GDBserver
>   itself.
>

I had not tested that at all thanks for working on it.

If there's anything I can do to help please let me know.

> I did some fixes but there are still some work to do software single
> step in GDBserver requested by GDB.  I don't want to block your fast
> tracepoint work, just raise these issues, and let you know my thoughts.
>

Thanks for letting me know that's always appreciated.

On my end my priority is to get the normal tracepoints in first, still a 
few patches waiting for review there.

In the meantime even if the basic fast tracepoint patch set is ready we 
still need to work on the JIT condition evaluation and relocating PC 
relative instructions which should take some time.

Regards,
Antoine

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-02-04 18:24 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-27 10:55 Move threads out of jumppad without single step Yao Qi
2015-11-30 14:42 ` Pedro Alves
2015-11-30 19:04   ` Simon Marchi
2015-11-30 19:39     ` Pedro Alves
2015-12-01 11:36   ` Yao Qi
2016-01-27 16:47     ` Antoine Tremblay
2016-01-29 20:43       ` Antoine Tremblay
2016-02-04 16:58         ` Yao Qi
2016-02-04 18:24           ` Antoine Tremblay

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox