Mirror of the gdb mailing list
* Infinite Stack Unwinding ARM
@ 2017-04-04 12:54 Johannes Stoelp
  2017-04-05 11:09 ` Yao Qi
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-04 12:54 UTC (permalink / raw)
  To: gdb; +Cc: Andreas Ropers, Marc Mones, Kai Schuetz, Johannes Stoelp

Hi folks,

While debugging Linux kernel code on an ARMv8 target we ended up in an "infinite" loop when printing
the backtrace. We tracked this down: it happens whenever we break in a function reached from
el[01]_irq (these are just the entry points we ran into).
Note: These functions (el[01]_irq) are handwritten assembler functions, so gdb uses the
prologue analyzer (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) to unwind to the previous frame.
We analyzed the stack unwinding further and found the root cause of this infinite loop.

Let's look at el1_irq for example to see what is happening in detail. While unwinding the stack,
gdb comes to the 'el1_irq' function and then starts analyzing the function prologue to further
unwind the stack.

gdb calculates the start and end of the function prologue as follows:
   start_prologue = 0xffffffc000083c80
   end_prologue   = 0xffffffc000083cd4
When the analyzer hits the first MRS instruction at 0xffffffc000083cc4 it stops because MRS is not supported in the analyzer.

/* el1_irq disassembly start */
Dump of assembler code for function el1_irq (arch/arm64/kernel/entry.S):
start_prologue --> 0xffffffc000083c80 <+0>:     sub     sp, sp, #0x30
                    0xffffffc000083c84 <+4>:     stp     x28, x29, [sp,#-16]!
                    0xffffffc000083c88 <+8>:     stp     x26, x27, [sp,#-16]!
                    0xffffffc000083c8c <+12>:    stp     x24, x25, [sp,#-16]!
                    0xffffffc000083c90 <+16>:    stp     x22, x23, [sp,#-16]!
                    0xffffffc000083c94 <+20>:    stp     x20, x21, [sp,#-16]!
                    0xffffffc000083c98 <+24>:    stp     x18, x19, [sp,#-16]!
                    0xffffffc000083c9c <+28>:    stp     x16, x17, [sp,#-16]!
                    0xffffffc000083ca0 <+32>:    stp     x14, x15, [sp,#-16]!
                    0xffffffc000083ca4 <+36>:    stp     x12, x13, [sp,#-16]!
                    0xffffffc000083ca8 <+40>:    stp     x10, x11, [sp,#-16]!
                    0xffffffc000083cac <+44>:    stp     x8, x9, [sp,#-16]!
                    0xffffffc000083cb0 <+48>:    stp     x6, x7, [sp,#-16]!
                    0xffffffc000083cb4 <+52>:    stp     x4, x5, [sp,#-16]!
                    0xffffffc000083cb8 <+56>:    stp     x2, x3, [sp,#-16]!
                    0xffffffc000083cbc <+60>:    stp     x0, x1, [sp,#-16]!
                    0xffffffc000083cc0 <+64>:    add     x21, sp, #0x120
   analyze_stop --> 0xffffffc000083cc4 <+68>:    mrs     x22, elr_el1
                    0xffffffc000083cc8 <+72>:    mrs     x23, spsr_el1
                    0xffffffc000083ccc <+76>:    stp     x30, x21, [sp,#240]
                    0xffffffc000083cd0 <+80>:    stp     x22, x23, [sp,#256]
   end_prologue --> 0xffffffc000083cd4 <+84>:    msr     daifclr, #0x8
                    0xffffffc000083cd8 <+88>:    ldr     x1, 0xffffffc000084210 <handle_arch_irq>
                    0xffffffc000083cdc <+92>:    mov     x0, sp
                    0xffffffc000083ce0 <+96>:    blr     x1
                    0xffffffc000083ce4 <+100>:   mov     x28, sp
                    ...
/* el1_irq disassembly end */

The analysis doesn't see the store of the link register x30 (0xffffffc000083ccc) and therefore
assumes the previous value of the link register is still valid.
The link register actually has the value of
   x30 = 0xffffffc000083ce4 (pointing into el1_irq)
and the unwinder then uses this value to determine the caller of el1_irq, which, unsurprisingly, is again el1_irq.

The other problem is that gdb doesn't recognize this situation as "Backtrace stopped: previous frame
identical to this frame (corrupt stack?)". This in turn happens because the symbolic value
of the stack pointer register looks as follows after the prologue analyzer stops:
   (gdb) p/x regs[AARCH64_SP_REGNUM]
   $3 = {
      kind = 0x2,               // pv_register
      reg = 0x1f,               // 31
      k = 0xfffffffffffffee0    // #-0x120
   }
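
For reference, the offset k = -0x120 follows directly from the instructions the analyzer
processed before it stopped at the first MRS. A small sketch of that arithmetic in C (the
function name is made up for illustration and is not part of gdb):

```c
#include <stdint.h>

/* Hypothetical helper, not gdb code: replays the SP adjustments of the
   el1_irq prologue shown above, up to (not including) the first MRS.  */
int64_t
sketch_el1_irq_sp_offset_at_mrs (void)
{
  int64_t k = 0;

  k -= 0x30;                      /* sub sp, sp, #0x30 */
  for (int i = 0; i < 15; i++)    /* 15 x stp ..., [sp,#-16]! (x28/x29 down to x0/x1) */
    k -= 16;

  /* "add x21, sp, #0x120" does not modify SP, so it does not change k.  */
  return k;                       /* SP at the MRS == SP at entry + k */
}
```

The result is -0x120, which is 0xfffffffffffffee0 when printed as an unsigned 64-bit value.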

gdb then tries to use this stack pointer as frame pointer.

/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) start */
   ...
   if (pv_is_register (regs[AARCH64_FP_REGNUM], AARCH64_SP_REGNUM)) {
      /* Frame pointer is fp.  Frame size is constant.  */
      cache->framereg = AARCH64_FP_REGNUM;
      cache->framesize = -regs[AARCH64_FP_REGNUM].k;
   }
   else if (pv_is_register (regs[AARCH64_SP_REGNUM], AARCH64_SP_REGNUM)) {
      /* Try the stack pointer.  */
      cache->framesize = -regs[AARCH64_SP_REGNUM].k;
      cache->framereg = AARCH64_SP_REGNUM;
   }
   else {
      /* We're just out of luck.  We don't know where the frame is.  */
      cache->framereg = -1;
      cache->framesize = 0;
   }
   ...
/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) end */

This in turn is used to calculate the previous stack pointer.

/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) start */
   ...
   aarch64_scan_prologue (this_frame, cache);

   if (cache->framereg == -1)
      return;

   unwound_fp = get_frame_register_unsigned (this_frame, cache->framereg);
   if (unwound_fp == 0)
      return;
   cache->prev_sp = unwound_fp + cache->framesize;
   ...
/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) end */

The result of this is that gdb interprets the situation as a recursion.
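
To illustrate why the "previous frame identical to this frame" check never triggers, here is a
simplified model in C. It assumes a frame is identified by a (stack address, function address)
pair; the struct and function names are invented for this sketch and are not gdb's real types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a frame identity (assumption, not gdb's struct).  */
struct sketch_frame_id
{
  uint64_t stack_addr;   /* unwound stack pointer of the frame */
  uint64_t func_addr;    /* start address of the function */
};

/* One bogus unwind step as described above: the caller is again the same
   function (stale x30), and the previous SP is SP + framesize.  */
struct sketch_frame_id
sketch_unwind_step (struct sketch_frame_id this_id, uint64_t framesize)
{
  struct sketch_frame_id prev = { this_id.stack_addr + framesize,
                                  this_id.func_addr };
  return prev;
}

/* Returns the number of steps taken before two consecutive frame ids are
   identical, or max_steps if that never happens.  */
unsigned
sketch_steps_until_identical (struct sketch_frame_id start,
                              uint64_t framesize, unsigned max_steps)
{
  struct sketch_frame_id cur = start;

  for (unsigned n = 0; n < max_steps; n++)
    {
      struct sketch_frame_id prev = sketch_unwind_step (cur, framesize);

      if (prev.stack_addr == cur.stack_addr
          && prev.func_addr == cur.func_addr)
        return n;
      cur = prev;
    }
  return max_steps;
}
```

With framesize = 0x120 the stack address moves on every step, so no two consecutive ids are
ever equal, even though the function is always el1_irq.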


* Has anyone run into similar situations with the arm prologue analyzer?
* Has anyone worked on an extension for the prologue analyzer to support SYSRegs and therefore
  instructions like MRS?
* A defensive workaround I'm currently using is to stop the stack unwinding when the analyzer
  hits an unsupported instruction.
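
The idea behind that workaround can be sketched as follows. This is not the actual change and
not gdb code: the instruction classes, struct names, and function name are all hypothetical and
heavily simplified. The point is only that an unrecognized instruction inside the prologue range
marks the frame register as unknown (framereg = -1), which the aarch64_make_prologue_cache_1
excerpt above already treats as "give up".

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SP_REGNUM 31

/* Hypothetical, simplified instruction classes.  */
enum sketch_kind
{
  SK_SUB_SP,        /* sub sp, sp, #imm */
  SK_STP_PREINDEX,  /* stp xA, xB, [sp,#imm]!  (imm is negative) */
  SK_STP_OFFSET,    /* stp xA, xB, [sp,#imm]   (SP unchanged) */
  SK_UNSUPPORTED    /* anything the analyzer does not model, e.g. MRS */
};

struct sketch_insn
{
  enum sketch_kind kind;
  long imm;
};

struct sketch_cache
{
  int framereg;
  long framesize;
};

/* Returns true if every instruction in the prologue range was understood.
   On an unsupported instruction the frame is marked as unknown instead of
   continuing with a partial result.  */
bool
sketch_analyze (const struct sketch_insn *insns, size_t n,
                struct sketch_cache *cache)
{
  long sp_offset = 0;

  for (size_t i = 0; i < n; i++)
    {
      switch (insns[i].kind)
        {
        case SK_SUB_SP:
          sp_offset -= insns[i].imm;
          break;
        case SK_STP_PREINDEX:
          sp_offset += insns[i].imm;
          break;
        case SK_STP_OFFSET:
          break;
        default:
          cache->framereg = -1;   /* we don't know where the frame is */
          cache->framesize = 0;
          return false;
        }
    }

  cache->framereg = SKETCH_SP_REGNUM;
  cache->framesize = -sp_offset;
  return true;
}
```

The trade-off is that this gives up on frames that a partial analysis might still have unwound
correctly.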


-Johannes


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Infinite Stack Unwinding ARM
  2017-04-04 12:54 Infinite Stack Unwinding ARM Johannes Stoelp
@ 2017-04-05 11:09 ` Yao Qi
  2017-04-07  7:15   ` Johannes Stoelp
  0 siblings, 1 reply; 6+ messages in thread
From: Yao Qi @ 2017-04-05 11:09 UTC (permalink / raw)
  To: Johannes Stoelp; +Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz

Johannes Stoelp <Johannes.Stoelp@synopsys.com> writes:

> * Anyone worked on an extension for the prologue analyzer to support SYSRegs and therefore
>   instructions like MRS?

I don't expect the prologue analyzer to support SYSRegs and the MRS
instruction.  All the prologue analyzers in GDB are written to
understand instructions according to the ABI/calling convention of
each architecture and the compiler's behavior, so they should be able to parse
the instructions in prologues that comply with the ABI.  The GDB prologue
analyzer may not understand what handwritten assembly does.

If you want GDB to unwind from there, add .cfi directives in
arch/arm64/kernel/entry.S

-- 
Yao (齐尧)



* RE: Infinite Stack Unwinding ARM
  2017-04-05 11:09 ` Yao Qi
@ 2017-04-07  7:15   ` Johannes Stoelp
  2017-04-08  5:00     ` Duane Ellis
  2017-04-10 12:01     ` Yao Qi
  0 siblings, 2 replies; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-07  7:15 UTC (permalink / raw)
  To: Yao Qi, Johannes Stoelp
  Cc: gdb, Andreas Ropers, Marc Mones, Kai  Schuetz, Johannes Stoelp


Yao Qi <qiyaoltc@gmail.com> writes:

> I don't expect prologue analyzer supporting SYSRegs and instruction MRS.  All the prologue analyzers in GDB are written in a way that understanding instructions according to the ABI/calling convention of each architecture and  compiler's behavior, so it should be able to parse the instruction in prologues complying to the ABI.  GDB prologue analyzer may not understand what does handwritten assembly do.


Hi Yao,

I see what you are saying about the prologue analyzer and the ABI/calling conventions. 

I understand that gdb does not have to understand every handwritten assembler routine, but I would like to emphasize that in this particular case gdb ends up in an "infinite" loop printing the backtrace line by line (I put infinite in quotes because the loop is limited by the lower boundary of an integer).
I would expect gdb to be more defensive in this case and either try other unwinding techniques like backward unwinding (from the bottom up) or just stop unwinding because there is too little information.
As I understand it, situations like this can also occur when the stack gets corrupted. There, too, I would expect gdb not to end up in an infinite loop, since gdb is intended for analyzing unexpected situations.

One other question that came up by comparing the arm and the aarch64 analyzer:
    * Is there a special reason/trick why the arm analyzer (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it doesn't recognize, while the aarch64 analyzer (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) stops at the first unrecognized instruction?

Best,
Johannes 
 


* Re: Infinite Stack Unwinding ARM
  2017-04-07  7:15   ` Johannes Stoelp
@ 2017-04-08  5:00     ` Duane Ellis
  2017-04-10 12:01     ` Yao Qi
  1 sibling, 0 replies; 6+ messages in thread
From: Duane Ellis @ 2017-04-08  5:00 UTC (permalink / raw)
  To: Johannes Stoelp; +Cc: Yao Qi, gdb, Andreas Ropers, Marc Mones, Kai Schuetz


> On Apr 7, 2017, at 12:15 AM, Johannes Stoelp <Johannes.Stoelp@synopsys.com> wrote:
> 
> Yao Qi <qiyaoltc@gmail.com> writes:
> 
>> I don't expect prologue analyzer supporting SYSRegs and instruction MRS.  All the prologue analyzers in GDB are written in a way that understanding instructions according to the ABI/calling convention of each architecture and  compiler's behavior, so it should be able to parse the instruction in prologues complying to the ABI.  GDB prologue analyzer may not understand what does handwritten assembly do.
> 
> 
> Hi Yao,
> 
> I see what you are saying about the prologue analyzer and the ABI/calling conventions. 
> 
> I understand that gdb does not have to understand every hand written assembler routine, but I would like to emphasize that gdb in this particular case ends in an "infinite" loop printing the backtrace line by line (I put infinite in quotes because the loop is limited by the lower boundary of an integer). 
> I would expect gdb to be more defensive in this case and either try other unwinding techniques like backward unwinding (from bottom up) or just stop unwinding because of to less information.
> In my understanding situations like this can also occur when the stack gets corrupted. There I would also expect gdb to not end in an infinite loop since gdb is intended to analyze the non-expected situation.
> 
> One other question that came up by comparing the arm and the aarch64 analyzer:
>    * Is there a special reason/trick why the arm analyzer (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it doesn't recognize while the aarch64 analyzer (gdb/aarch64-tdep.c:aarch64_prologue_analyzer(...)) stops when the first unrecognized instruction is hit?
> 
> Best,
> Johannes 
> 

All of the below are techniques I have used in a custom port of GDB to a custom CPU.

Some might already exist in GDB, some might not. It is a balancing act between the software being debugged (helping the debugger) and the debugger doing the best it can.

It’s not uncommon for a bare-metal RTOS to push zeros onto the stack or, on ARM cores for example, to set the LR register to zero in the assembly handler. Yes, this costs an extra opcode or two on every interrupt. In some cases that is trivial and not a problem; if the extra opcodes are painful, it can be made a compile-time conditional. In other cases the top 16 bytes or so of the interrupt stack are zeroed.

Another thing unwinders do is make sure the stack is still going in the same direction and that the next frame is not a very large distance away from the previous one. It’s a heuristic: for example, 90% of code does not have huge arrays (e.g. 10K bytes) on the stack. So as you walk up the stack you make a decision: if distance(THIS_FRAME, NEXT_FRAME) is greater than MAXVALUE, abandon the backtrace.
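
A minimal sketch of the zeroed-LR and direction/distance checks in C. It assumes a
downward-growing stack (callers live at higher addresses); the names and the 64 KiB limit are
arbitrary choices for illustration, not values taken from GDB or any RTOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary example limit for the distance between two adjacent frames.  */
#define SKETCH_MAX_FRAME_DISTANCE (64u * 1024u)

/* Returns false if unwinding from this_frame to next_frame looks bogus.
   this_frame / next_frame are frame (stack) addresses, return_addr is the
   address the unwinder would continue at.  */
bool
sketch_frame_plausible (uint64_t this_frame, uint64_t next_frame,
                        uint64_t return_addr)
{
  if (return_addr == 0)
    return false;       /* handler zeroed LR: treat as outermost frame */

  if (next_frame <= this_frame)
    return false;       /* stack is not moving in the expected direction */

  if (next_frame - this_frame > SKETCH_MAX_FRAME_DISTANCE)
    return false;       /* implausibly large frame */

  return true;
}
```

Note that a per-step distance check alone would not stop a loop in which every bogus step is
small; it would need to be combined with a bound on the total stack range.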

Another thing the unwinder can do is inspect a few bytes before the return address, and make sure (1) the call site is within the application being debugged and not in outer space - and (2) the opcodes at that point are actually a subroutine call.
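
For AArch64 this check could look roughly like the sketch below. The function names are
invented, the code range is supplied by the caller, and only the plain BL and BLR encodings are
recognized (other call forms, such as the pointer-authentication variants, are ignored here).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if insn is an AArch64 "BL <imm26>" or "BLR <Xn>".  */
bool
sketch_is_aarch64_call (uint32_t insn)
{
  if ((insn & 0xfc000000u) == 0x94000000u)   /* BL imm26 */
    return true;
  if ((insn & 0xfffffc1fu) == 0xd63f0000u)   /* BLR Xn */
    return true;
  return false;
}

/* return_addr: candidate return address.
   code_start/code_end: known code region [code_start, code_end).
   insn_before: the 32-bit instruction word read from return_addr - 4.  */
bool
sketch_return_addr_plausible (uint64_t return_addr, uint64_t code_start,
                              uint64_t code_end, uint32_t insn_before)
{
  if (return_addr & 3)
    return false;                 /* A64 instructions are 4-byte aligned */
  if (return_addr < code_start + 4 || return_addr >= code_end)
    return false;                 /* call site is "in outer space" */
  return sketch_is_aarch64_call (insn_before);
}
```

In the el1_irq case from this thread the stale x30 (0xffffffc000083ce4) does follow a
"blr x1", so this particular check would pass there; it helps against random garbage rather
than against a stale but valid link register.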

If the unwinder has access to current thread information and RTOS information - it might know the range of the current threads stack.

Again, along the lines of the target memory map: in an embedded system, RAM is in a fixed region, as is code (flash memory). If you know this (or it is settable), the unwinder can make other educated decisions.

-Duane.






* Re: Infinite Stack Unwinding ARM
  2017-04-07  7:15   ` Johannes Stoelp
  2017-04-08  5:00     ` Duane Ellis
@ 2017-04-10 12:01     ` Yao Qi
  2017-04-11 16:20       ` Johannes Stoelp
  1 sibling, 1 reply; 6+ messages in thread
From: Yao Qi @ 2017-04-10 12:01 UTC (permalink / raw)
  To: Johannes Stoelp; +Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz

Johannes Stoelp <Johannes.Stoelp@synopsys.com> writes:

> I understand that gdb does not have to understand every hand written
> assembler routine, but I would like to emphasize that gdb in this
> particular case ends in an "infinite" loop printing the backtrace line
> by line (I put infinite in quotes because the loop is limited by the
> lower boundary of an integer).
> I would expect gdb to be more defensive in this case and either try
> other unwinding techniques like backward unwinding (from bottom up) or
> just stop unwinding because of to less information.

I don't understand what you mean by "backward unwinding".  IIUC,
el1_irq is a vector entry point, so I would expect GDB's unwinding to stop at
el1_irq, but I want to check with kernel people to see what they
expect.  Can you post the backtrace as sample output?
We may end up needing some Linux kernel specific support, and this
should fall under "Linux kernel debugging for aarch64".  Note that there are
patches for general Linux kernel debugging at
https://sourceware.org/ml/gdb-patches/2017-03/msg00268.html (well, I
need to review them soon).

> In my understanding situations like this can also occur when the stack
> gets corrupted. There I would also expect gdb to not end in an
> infinite loop since gdb is intended to analyze the non-expected
> situation.
>
> One other question that came up by comparing the arm and the aarch64 analyzer:
>     * Is there a special reason/trick why the arm analyzer
> (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it
> doesn't recognize while the aarch64 analyzer
> (gdb/aarch64-tdep.c:aarch64_prologue_analyzer(...)) stops when the
> first unrecognized instruction is hit?

I don't see any particular reason here; they are simply two different
architectures.

-- 
Yao (齐尧)



* RE: Infinite Stack Unwinding ARM
  2017-04-10 12:01     ` Yao Qi
@ 2017-04-11 16:20       ` Johannes Stoelp
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-11 16:20 UTC (permalink / raw)
  To: Yao Qi, Johannes Stoelp
  Cc: gdb, Andreas Ropers, Marc Mones, Kai  Schuetz, Johannes Stoelp

Hi Yao,

Yao Qi <qiyaoltc@gmail.com> writes:
> We may end up with having some Linux kernel specific stuff, and this should fall in "Linux kernel debugging for aarch64".  

I see, thanks for the hint.

> Can you post me the backtrace, as a sample output?

With unlimited backtrace enabled (the default), I get backtraces like the following:
#0  gic_handle_irq (regs=0xffffffc07d9c7e30) at drivers/irqchip/irq-gic.c:263
#1  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#2  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#3  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#4  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#5  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#6  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#7  0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
...
#20009 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20010 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20011 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20012 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20013 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20014 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
...

Best,
Johannes

