Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [PATCH] Relax ARM prologue unwinder assumption
@ 2015-02-05 19:07 Luis Machado
  2015-02-06 15:25 ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Luis Machado @ 2015-02-05 19:07 UTC (permalink / raw)
  To: 'gdb-patches@sourceware.org'

[-- Attachment #1: Type: text/plain, Size: 2304 bytes --]

Just recently i ran into a problem with a bare metal ARM target where 
GDB would not allow some registers to be changed after the PC was 
manually set to some other value.

In reality the target had just started and one of its cores, out of 
four, was running the program, but the other ones were in some random state.

The PC of one of the other 3 cores was then adjusted to point to a known 
function address.

GDB's reaction to this change is to invalidate the regcache and frame 
and build a brand new chain and cache, while trying to retain the 
previously selected frame - #0 in this case.

What i noticed is that GDB was selecting frame #1 instead of #0 due to 
unfortunate coincidences with both frames' SP's being 0. And we can't 
modify some registers on non-innermost frames for obvious reasons.

Here's a brief log of what happens:


[Switching to thread 2 (Thread 2)]
#0  0x0093ff10 in ?? ()
(gdb) set $pc=functioncore2
(gdb) bt
#0  functioncore2 () at test.c:32
#1  0x0000fc44 in ?? ()
(gdb) frame
#1  0x0000fc44 in ?? ()
(gdb) set $sp=0x2030
Attempt to assign to an unmodifiable value.

I tracked this problem down to this old 
(https://sourceware.org/ml/gdb-patches/2003-08/msg00526.html) piece of 
code in arm-tdep.c:arm_prologue_this_id:

   /* If we've hit a wall, stop.  */
   if (cache->prev_sp == 0)
     return;

Due to the SP being 0 for this specific core, GDB returns early and does 
not set the frame's PC to the new value. That means we have a frame with 
PC==0x0 and SP==0x0, which ends up matching frame #1 in our case. But 
this is obviously wrong.

I looked up the patch that introduced this chunk of code and did not 
find any reasonable explanation for this check. Though it may make sense 
for non-bare metal targets, a bare-metal board attached to a probe can 
be stopped at any random place, so we should be able to set its 
registers freely without worrying about unwinding assumptions. This is 
generic code after all. I understand a valid workaround is to make sure 
the proper frame is selected, but GDB should be smart enough not to 
confuse things. Therefore this "wall" described in the comment seems too 
strict to be in generic code.

The attached patch removes this restriction and does not cause any 
regressions for ARM bare metal, but i'd like feedback.

[-- Attachment #2: arm_prologue.diff --]
[-- Type: text/x-patch, Size: 687 bytes --]

2015-02-05  Luis Machado  <lgustavo@codesourcery.com>

	* arm-tdep.c (arm_prologue_this_id): Remove check for the stack
	pointer being 0.

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index 8e9552a..771cbeb 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -2042,10 +2042,6 @@ arm_prologue_this_id (struct frame_info *this_frame,
   if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc)
     return;
 
-  /* If we've hit a wall, stop.  */
-  if (cache->prev_sp == 0)
-    return;
-
   /* Use function start address as part of the frame ID.  If we cannot
      identify the start address (due to missing symbol information),
      fall back to just using the current PC.  */

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Relax ARM prologue unwinder assumption
  2015-02-05 19:07 [PATCH] Relax ARM prologue unwinder assumption Luis Machado
@ 2015-02-06 15:25 ` Pedro Alves
  2015-02-09 14:51   ` Luis Machado
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2015-02-06 15:25 UTC (permalink / raw)
  To: lgustavo, 'gdb-patches@sourceware.org'

On 02/05/2015 07:06 PM, Luis Machado wrote:
> Just recently i ran into a problem with a bare metal ARM target where 
> GDB would not allow some registers to be changed after the PC was 
> manually set to some other value.
> 
> In reality the target had just started and one of its cores, out of 
> four, was running the program, but the other ones were in some random state.
> 
> The PC of one of the other 3 cores was then adjusted to point to a known 
> function address.
> 
> GDB's reaction to this change is to invalidate the regcache and frame 
> and build a brand new chain and cache, while trying to retain the 
> previously selected frame - #0 in this case.
> 
> What i noticed is that GDB was selecting frame #1 instead of #0 due to 
> unfortunate coincidences with both frames' SP's being 0. And we can't 
> modify some registers on non-innermost frames for obvious reasons.
> 
> Here's a brief log of what happens:
> 
> 
> [Switching to thread 2 (Thread 2)]
> #0  0x0093ff10 in ?? ()
> (gdb) set $pc=functioncore2
> (gdb) bt
> #0  functioncore2 () at test.c:32
> #1  0x0000fc44 in ?? ()

So in this case, frame #0's unwinder must now be the dwarf
unwinder, right?

> (gdb) frame
> #1  0x0000fc44 in ?? ()
> (gdb) set $sp=0x2030
> Attempt to assign to an unmodifiable value.
> 
> I tracked this problem down to this old 
> (https://sourceware.org/ml/gdb-patches/2003-08/msg00526.html) piece of 
> code in arm-tdep.c:arm_prologue_this_id:
> 
>    /* If we've hit a wall, stop.  */
>    if (cache->prev_sp == 0)
>      return;
> 
> Due to the SP being 0 for this specific core, GDB returns early and does 
> not set the frame's PC to the new value. That means we have a frame with 
> PC==0x0 and SP==0x0, which ends up matching frame #1 in our case. But 
> this is obviously wrong.
> 
> I looked up the patch that introduced this chunk of code and did not 
> find any reasonable explanation for this check. Though it may make sense 
> for non-bare metal targets, a bare-metal board attached to a probe can 
> be stopped at any random place, so we should be able to set its 
> registers freely without worrying about unwinding assumptions. 

A debugger is supposedly useful for badly behaved programs too.  :-)
So I don't see how GNU/Linux would be any different.  The user may
attach to a program that can be stopped at any random place too.


> This is
> generic code after all. I understand a valid workaround is to make sure 
> the proper frame is selected, but GDB should be smart enough not to 
> confuse things. Therefore this "wall" described in the comment seems too 
> strict to be in generic code.

The "wall" refers to not being possible to unwind past this frame if
there's no stack to unwind.  So it sounds like that after your patch,
"bt" will start trying to unwind past that, while before, we'd stop,
because arm_prologue_this_id returning early means that the frame's
ID defaults to output_frame_id (see compute_frame_id).  Stop on SP==0
may be an ABI thing, I haven't checked.

> The attached patch removes this restriction and does not cause any 
> regressions for ARM bare metal, but i'd like feedback.

Nowadays we have the frame_unwind_stop_reason hook to make it possible
to have different outermost frame ids, but still have the frame claim
that its the outermost, instead of forcing all outermost frame ids
use the outer_frame_id id.  See i386_frame_unwind_stop_reason.

Sounds like ARM could add an arm_frame_unwind_stop_reason that
returns UNWIND_OUTERMOST when prev_sp is 0.

And it looks like this bit here:

  /* This is meant to halt the backtrace at "_start".  */
  pc = get_frame_pc (this_frame);
  if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc)
    return;

can well cause a fail in the same way.

(It may also make sense to think through the cases where we are
trying to restore the selected frame, and change that code.  It may be
that it never makes sense to go from frame #0 to any other frame
number, for instance.)

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Relax ARM prologue unwinder assumption
  2015-02-06 15:25 ` Pedro Alves
@ 2015-02-09 14:51   ` Luis Machado
  2015-02-10  9:38     ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Luis Machado @ 2015-02-09 14:51 UTC (permalink / raw)
  To: Pedro Alves, 'gdb-patches@sourceware.org'

[-- Attachment #1: Type: text/plain, Size: 2722 bytes --]

On 02/06/2015 01:25 PM, Pedro Alves wrote:
> On 02/05/2015 07:06 PM, Luis Machado wrote:
>> Just recently i ran into a problem with a bare metal ARM target where
>> GDB would not allow some registers to be changed after the PC was
>> manually set to some other value.
>>
>> In reality the target had just started and one of its cores, out of
>> four, was running the program, but the other ones were in some random state.
>>
>> The PC of one of the other 3 cores was then adjusted to point to a known
>> function address.
>>
>> GDB's reaction to this change is to invalidate the regcache and frame
>> and build a brand new chain and cache, while trying to retain the
>> previously selected frame - #0 in this case.
>>
>> What i noticed is that GDB was selecting frame #1 instead of #0 due to
>> unfortunate coincidences with both frames' SP's being 0. And we can't
>> modify some registers on non-innermost frames for obvious reasons.
>>
>> Here's a brief log of what happens:
>>
>>
>> [Switching to thread 2 (Thread 2)]
>> #0  0x0093ff10 in ?? ()
>> (gdb) set $pc=functioncore2
>> (gdb) bt
>> #0  functioncore2 () at test.c:32
>> #1  0x0000fc44 in ?? ()
>
> So in this case, frame #0's unwinder must now be the dwarf
> unwinder, right?
>

Correct.

>> The attached patch removes this restriction and does not cause any
>> regressions for ARM bare metal, but i'd like feedback.
>
> Nowadays we have the frame_unwind_stop_reason hook to make it possible
> to have different outermost frame ids, but still have the frame claim
> that its the outermost, instead of forcing all outermost frame ids
> use the outer_frame_id id.  See i386_frame_unwind_stop_reason.
>
> Sounds like ARM could add an arm_frame_unwind_stop_reason that
> returns UNWIND_OUTERMOST when prev_sp is 0.
>
> And it looks like this bit here:
>
>    /* This is meant to halt the backtrace at "_start".  */
>    pc = get_frame_pc (this_frame);
>    if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc)
>      return;
>
> can well cause a fail in the same way.
>

Thanks. That mechanism does sound a bit more flexible and it indeed 
solves the problem i'm seeing.

How does the updated patch look? I have arm_prologue_unwind_stop_reason 
implemented now.

> (It may also make sense to think through the cases where we are
> trying to restore the selected frame, and change that code.  It may be
> that it never makes sense to go from frame #0 to any other frame
> number, for instance.)

Trying to come up with the best match for a certain frame that 
potentially does not exist anymore can be troublesome, but it's been 
working well so far. I'd go for always picking up frame 0 as well. It is 
easier, cleaner and less prone to failures.

Luis

[-- Attachment #2: arm_unwind.diff --]
[-- Type: text/x-patch, Size: 2318 bytes --]

2015-02-09  Luis Machado  <lgustavo@codesourcery.com>

	* arm-tdep.c (arm_prologue_unwind_stop_reason): New function.
	(arm_prologue_this_id): Move PC and SP limit checks to
	arm_prologue_unwind_stop_reason.
	(arm_prologue_unwind) <stop_reason> : Set to
	arm_prologue_unwind_stop_reason.

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index 8e9552a..f3a6325 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -2021,6 +2021,31 @@ arm_make_prologue_cache (struct frame_info *this_frame)
   return cache;
 }
 
+/* Implementation of the stop_reason hook for arm_prologue frames.  */
+
+static enum unwind_stop_reason
+arm_prologue_unwind_stop_reason (struct frame_info *this_frame,
+				 void **this_cache)
+{
+  struct arm_prologue_cache *cache;
+  CORE_ADDR pc;
+
+  if (*this_cache == NULL)
+    *this_cache = arm_make_prologue_cache (this_frame);
+  cache = *this_cache;
+
+  /* This is meant to halt the backtrace at "_start".  */
+  pc = get_frame_pc (this_frame);
+  if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc)
+    return UNWIND_OUTERMOST;
+
+  /* If we've hit a wall, stop.  */
+  if (cache->prev_sp == 0)
+    return UNWIND_OUTERMOST;
+
+  return UNWIND_NO_REASON;
+}
+
 /* Our frame ID for a normal frame is the current function's starting PC
    and the caller's SP when we were called.  */
 
@@ -2037,18 +2062,10 @@ arm_prologue_this_id (struct frame_info *this_frame,
     *this_cache = arm_make_prologue_cache (this_frame);
   cache = *this_cache;
 
-  /* This is meant to halt the backtrace at "_start".  */
-  pc = get_frame_pc (this_frame);
-  if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc)
-    return;
-
-  /* If we've hit a wall, stop.  */
-  if (cache->prev_sp == 0)
-    return;
-
   /* Use function start address as part of the frame ID.  If we cannot
      identify the start address (due to missing symbol information),
      fall back to just using the current PC.  */
+  pc = get_frame_pc (this_frame);
   func = get_frame_func (this_frame);
   if (!func)
     func = pc;
@@ -2117,7 +2134,7 @@ arm_prologue_prev_register (struct frame_info *this_frame,
 
 struct frame_unwind arm_prologue_unwind = {
   NORMAL_FRAME,
-  default_frame_unwind_stop_reason,
+  arm_prologue_unwind_stop_reason,
   arm_prologue_this_id,
   arm_prologue_prev_register,
   NULL,

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Relax ARM prologue unwinder assumption
  2015-02-09 14:51   ` Luis Machado
@ 2015-02-10  9:38     ` Pedro Alves
  2015-02-10 11:52       ` Luis Machado
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2015-02-10  9:38 UTC (permalink / raw)
  To: lgustavo, 'gdb-patches@sourceware.org'

On 02/09/2015 02:51 PM, Luis Machado wrote:

> Thanks. That mechanism does sound a bit more flexible and it indeed 
> solves the problem i'm seeing.
> 
> How does the updated patch look? I have arm_prologue_unwind_stop_reason 
> implemented now.

That looks good to me.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Relax ARM prologue unwinder assumption
  2015-02-10  9:38     ` Pedro Alves
@ 2015-02-10 11:52       ` Luis Machado
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Machado @ 2015-02-10 11:52 UTC (permalink / raw)
  To: Pedro Alves, 'gdb-patches@sourceware.org'

On 02/10/2015 07:37 AM, Pedro Alves wrote:
> On 02/09/2015 02:51 PM, Luis Machado wrote:
>
>> Thanks. That mechanism does sound a bit more flexible and it indeed
>> solves the problem i'm seeing.
>>
>> How does the updated patch look? I have arm_prologue_unwind_stop_reason
>> implemented now.
>
> That looks good to me.
>
> Thanks,
> Pedro Alves
>

I've checked this in now.

Thanks,
Luis


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-02-10 11:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-05 19:07 [PATCH] Relax ARM prologue unwinder assumption Luis Machado
2015-02-06 15:25 ` Pedro Alves
2015-02-09 14:51   ` Luis Machado
2015-02-10  9:38     ` Pedro Alves
2015-02-10 11:52       ` Luis Machado

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox