* Infinite Stack Unwinding ARM
@ 2017-04-04 12:54 Johannes Stoelp
2017-04-05 11:09 ` Yao Qi
0 siblings, 1 reply; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-04 12:54 UTC (permalink / raw)
To: gdb; +Cc: Andreas Ropers, Marc Mones, Kai Schuetz, Johannes Stoelp
Hi folks,
While debugging Linux kernel code on an ARMv8 target, we ended up in an "infinite" loop when printing
the backtrace. We tracked it down: this happens when we break in any function that originates in
el[01]_irq (these are just the entry points we ran into).
Note: These functions (el[01]_irq) are handwritten assembler functions and therefore gdb uses the
prologue analyzer (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) to unwind the previous frame.
We further analyzed the stack unwinding and found the root cause of this infinite loop.
Let's look at el1_irq for example to see what is happening in detail. While unwinding the stack,
gdb comes to the 'el1_irq' function and then starts analyzing the function prologue to further
unwind the stack.
gdb calculates the start and end of the function prologue as follows:
start_prologue = 0xffffffc000083c80
end_prologue = 0xffffffc000083cd4
When the analyzer hits the first MRS instruction at 0xffffffc000083cc4, it stops, because the analyzer does not support MRS.
/* el1_irq disassembly start */
Dump of assembler code for function el1_irq (arch/arm64/kernel/entry.S):
start_prologue --> 0xffffffc000083c80 <+0>:   sub sp, sp, #0x30
                   0xffffffc000083c84 <+4>:   stp x28, x29, [sp,#-16]!
                   0xffffffc000083c88 <+8>:   stp x26, x27, [sp,#-16]!
                   0xffffffc000083c8c <+12>:  stp x24, x25, [sp,#-16]!
                   0xffffffc000083c90 <+16>:  stp x22, x23, [sp,#-16]!
                   0xffffffc000083c94 <+20>:  stp x20, x21, [sp,#-16]!
                   0xffffffc000083c98 <+24>:  stp x18, x19, [sp,#-16]!
                   0xffffffc000083c9c <+28>:  stp x16, x17, [sp,#-16]!
                   0xffffffc000083ca0 <+32>:  stp x14, x15, [sp,#-16]!
                   0xffffffc000083ca4 <+36>:  stp x12, x13, [sp,#-16]!
                   0xffffffc000083ca8 <+40>:  stp x10, x11, [sp,#-16]!
                   0xffffffc000083cac <+44>:  stp x8, x9, [sp,#-16]!
                   0xffffffc000083cb0 <+48>:  stp x6, x7, [sp,#-16]!
                   0xffffffc000083cb4 <+52>:  stp x4, x5, [sp,#-16]!
                   0xffffffc000083cb8 <+56>:  stp x2, x3, [sp,#-16]!
                   0xffffffc000083cbc <+60>:  stp x0, x1, [sp,#-16]!
                   0xffffffc000083cc0 <+64>:  add x21, sp, #0x120
analyze_stop   --> 0xffffffc000083cc4 <+68>:  mrs x22, elr_el1
                   0xffffffc000083cc8 <+72>:  mrs x23, spsr_el1
                   0xffffffc000083ccc <+76>:  stp x30, x21, [sp,#240]
                   0xffffffc000083cd0 <+80>:  stp x22, x23, [sp,#256]
end_prologue   --> 0xffffffc000083cd4 <+84>:  msr daifclr, #0x8
                   0xffffffc000083cd8 <+88>:  ldr x1, 0xffffffc000084210 <handle_arch_irq>
                   0xffffffc000083cdc <+92>:  mov x0, sp
                   0xffffffc000083ce0 <+96>:  blr x1
                   0xffffffc000083ce4 <+100>: mov x28, sp
                   ...
/* el1_irq disassembly end */
The analysis therefore never sees the store of the link register x30 (at 0xffffffc000083ccc) and
assumes that the current value of the link register is still the return address of el1_irq.
The link register actually holds the return address set by the blr at <+96>:
x30 = 0xffffffc000083ce4 (pointing into el1_irq)
The unwinder then uses this value to determine the caller of el1_irq, which, surprise, is el1_irq again.
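To illustrate the effect described above, here is a minimal sketch in C of an analyzer that stops at the first instruction it does not understand. It is not gdb's code: the `insn_kind` classes, `struct insn`, and `scan_prologue` are hypothetical names invented for this example, and the real analyzer decodes raw AArch64 opcodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction classes.  */
enum insn_kind {
  INSN_SUB_SP,  /* sub sp, sp, #imm */
  INSN_STP,     /* stp rt1, rt2, [sp, ...] */
  INSN_ADD,     /* add xd, sp, #imm */
  INSN_MRS      /* mrs xd, sysreg: NOT understood in this model */
};

struct insn {
  enum insn_kind kind;
  int rt1, rt2;  /* registers stored by STP, -1 otherwise */
};

struct scan_result {
  size_t insns_analyzed;  /* instructions understood before stopping */
  bool lr_saved;          /* was a store of x30 seen? */
};

/* Walk the prologue and stop at the first unsupported instruction.
   Anything after that point, including a later store of x30, is
   invisible to the caller.  */
struct scan_result scan_prologue(const struct insn *code, size_t n)
{
  struct scan_result r = { 0, false };

  assert(code != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (code[i].kind == INSN_MRS)
      break;  /* analysis ends here */
    if (code[i].kind == INSN_STP
        && (code[i].rt1 == 30 || code[i].rt2 == 30))
      r.lr_saved = true;
    r.insns_analyzed++;
  }
  return r;
}
```

Fed with a model of the el1_irq prologue above (one sub, fifteen stp, one add, two mrs, two stp), this sketch stops after 17 instructions and reports that x30 was never saved, which matches the behaviour described in this mail.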
The other issue is that gdb doesn't recognize this situation as "Backtrace stopped: previous frame
identical to this frame (corrupt stack?)". This happens because the symbolic value
of the stack pointer register looks as follows after the prologue analyzer stops:
(gdb) p/x regs[AARCH64_SP_REGNUM]
$3 = {
  kind = 0x2,               // pvk_register
  reg = 0x1f,               // 31
  k = 0xfffffffffffffee0    // #-0x120
}
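The offset of -0x120 follows directly from the instructions the analyzer did understand: one `sub sp, sp, #0x30` plus fifteen pre-indexed `stp ..., [sp,#-16]!`. The following C sketch just redoes that arithmetic; the function names are made up for illustration and are not part of gdb.

```c
#include <assert.h>
#include <stdint.h>

/* Net SP adjustment in bytes (negative: the stack grows down) for a
   prologue made of one "sub sp, sp, #sub_imm" followed by n_stp
   pre-indexed "stp ..., [sp,#-16]!" instructions.  */
int64_t sp_offset_after_prologue(int64_t sub_imm, int n_stp)
{
  assert(sub_imm >= 0 && n_stp >= 0);
  return -(sub_imm + 16 * (int64_t) n_stp);
}

/* The same offset as p/x prints it: a 64-bit two's complement value.  */
uint64_t as_unsigned_hex(int64_t k)
{
  return (uint64_t) k;
}
```

With `sub_imm = 0x30` and `n_stp = 15` this gives 0x30 + 15 * 16 = 0x120 bytes, and -0x120 viewed as an unsigned 64-bit value is 0xfffffffffffffee0, the `k` shown above.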
gdb then tries to use this stack pointer as the frame pointer.
/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) start */
...
  if (pv_is_register (regs[AARCH64_FP_REGNUM], AARCH64_SP_REGNUM)) {
    /* Frame pointer is fp. Frame size is constant. */
    cache->framereg = AARCH64_FP_REGNUM;
    cache->framesize = -regs[AARCH64_FP_REGNUM].k;
  }
  else if (pv_is_register (regs[AARCH64_SP_REGNUM], AARCH64_SP_REGNUM)) {
    /* Try the stack pointer. */
    cache->framesize = -regs[AARCH64_SP_REGNUM].k;
    cache->framereg = AARCH64_SP_REGNUM;
  }
  else {
    /* We're just out of luck. We don't know where the frame is. */
    cache->framereg = -1;
    cache->framesize = 0;
  }
...
/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) end */
The frame register and frame size are in turn used to calculate the previous stack pointer:
/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) start */
...
  aarch64_scan_prologue (this_frame, cache);
  if (cache->framereg == -1)
    return;
  unwound_fp = get_frame_register_unsigned (this_frame, cache->framereg);
  if (unwound_fp == 0)
    return;
  cache->prev_sp = unwound_fp + cache->framesize;
...
/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) end */
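As a result, every unwound frame gets a stack pointer that is 0x120 higher than the one before, so
consecutive frames are never identical and gdb interprets the situation as a recursion.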
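The following C sketch models why the loop never terminates on its own. It assumes that the identical-frame check compares a frame ID made of the unwound stack address and the function start address; that is my reading of the unwinder, not something shown in this thread. `struct frame_id` and `simulate_backtrace` are hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified frame ID: unwound stack address plus function start.  */
struct frame_id {
  uint64_t stack_addr;
  uint64_t func;
};

static bool frame_id_equal(struct frame_id a, struct frame_id b)
{
  return a.stack_addr == b.stack_addr && a.func == b.func;
}

/* Count the frames printed before the backtrace stops, either because
   two consecutive frame IDs are identical or because max_frames is
   reached.  The caller is always believed to be the same function
   (LR is assumed unchanged), and each step adds framesize to the
   stack address, mirroring prev_sp = unwound_fp + framesize.  */
unsigned simulate_backtrace(uint64_t sp, uint64_t func,
                            uint64_t framesize, unsigned max_frames)
{
  assert(max_frames > 0);

  struct frame_id this_id = { sp + framesize, func };
  unsigned n = 1;

  while (n < max_frames) {
    struct frame_id prev_id = { this_id.stack_addr + framesize, func };
    if (frame_id_equal(prev_id, this_id))
      break;  /* "previous frame identical to this frame" */
    this_id = prev_id;
    n++;
  }
  return n;
}
```

With a frame size of 0x120 the IDs differ on every step, so the sketch only stops at `max_frames`. With a frame size of 0 it stops after the first frame, which is the case the "identical frame" check is able to catch.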
* Has anyone run into similar situations with the ARM prologue analyzer?
* Has anyone worked on an extension of the prologue analyzer to support SYSRegs and therefore
instructions like MRS?
* A defensive workaround I'm currently using is to stop the stack unwinding when the analyzer hits an
unsupported instruction.
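The actual patch for this workaround is not shown in this mail. As a rough idea of what "stop unwinding on an unsupported instruction" could look like, here is a hedged C sketch. `struct prologue_cache`, `SP_REGNUM`, and `scan_defensively` are hypothetical stand-ins that only model the `framereg`/`framesize` fields quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SP_REGNUM 31  /* stands in for AARCH64_SP_REGNUM */

/* Models only the two cache fields used in the excerpts above.  */
struct prologue_cache {
  int framereg;    /* -1 means "we don't know where the frame is" */
  long framesize;
};

/* recognised[i] says whether instruction i is understood;
   frame_bytes[i] is the stack space instruction i allocates.
   If any instruction is not understood, refuse to guess a frame
   instead of using the partial result.  */
void scan_defensively(const bool *recognised, const long *frame_bytes,
                      size_t n, struct prologue_cache *cache)
{
  long size = 0;

  assert(cache != NULL);
  for (size_t i = 0; i < n; i++) {
    if (!recognised[i]) {
      cache->framereg = -1;  /* caller returns early, unwinding stops */
      cache->framesize = 0;
      return;
    }
    size += frame_bytes[i];
  }
  cache->framereg = SP_REGNUM;
  cache->framesize = size;
}
```

Setting `framereg` to -1 reuses the existing "out of luck" path, so a caller like the `aarch64_make_prologue_cache_1` excerpt above would return before computing `prev_sp`. The trade-off is that a prologue with a harmless unknown instruction is no longer unwound either.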
-Johannes
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Infinite Stack Unwinding ARM
2017-04-04 12:54 Infinite Stack Unwinding ARM Johannes Stoelp
@ 2017-04-05 11:09 ` Yao Qi
2017-04-07 7:15 ` Johannes Stoelp
0 siblings, 1 reply; 6+ messages in thread
From: Yao Qi @ 2017-04-05 11:09 UTC (permalink / raw)
To: Johannes Stoelp; +Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz
Johannes Stoelp <Johannes.Stoelp@synopsys.com> writes:
> * Has anyone worked on an extension of the prologue analyzer to support SYSRegs and therefore
> instructions like MRS?
I don't expect the prologue analyzer to support SYSRegs and the MRS
instruction. All the prologue analyzers in GDB are written to
understand instructions according to the ABI/calling convention of
each architecture and the compiler's behavior, so they should be able to
parse the instructions in prologues that comply with the ABI. A GDB
prologue analyzer may not understand what handwritten assembly does.
If you want GDB to unwind from there, add .cfi directives in
arch/arm64/kernel/entry.S
--
Yao (齐尧)
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: Infinite Stack Unwinding ARM
2017-04-05 11:09 ` Yao Qi
@ 2017-04-07 7:15 ` Johannes Stoelp
2017-04-08 5:00 ` Duane Ellis
2017-04-10 12:01 ` Yao Qi
0 siblings, 2 replies; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-07 7:15 UTC (permalink / raw)
To: Yao Qi, Johannes Stoelp
Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz, Johannes Stoelp
Yao Qi <qiyaoltc@gmail.com> writes:
> I don't expect the prologue analyzer to support SYSRegs and the MRS instruction. All the prologue analyzers in GDB are written to understand instructions according to the ABI/calling convention of each architecture and the compiler's behavior, so they should be able to parse the instructions in prologues that comply with the ABI. A GDB prologue analyzer may not understand what handwritten assembly does.
Hi Yao,
I see what you are saying about the prologue analyzer and the ABI/calling conventions.
I understand that gdb does not have to understand every handwritten assembler routine, but I would like to emphasize that gdb in this particular case ends up in an "infinite" loop printing the backtrace line by line (I put infinite in quotes because the loop is limited by the lower boundary of an integer).
I would expect gdb to be more defensive in this case and either try other unwinding techniques like backward unwinding (from bottom up) or just stop unwinding because there is too little information.
As I understand it, situations like this can also occur when the stack gets corrupted. There, too, I would expect gdb not to end up in an infinite loop, since gdb is intended for analyzing unexpected situations.
One other question that came up when comparing the arm and the aarch64 analyzers:
* Is there a special reason/trick why the arm analyzer (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it doesn't recognize, while the aarch64 analyzer (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) stops when the first unrecognized instruction is hit?
Best,
Johannes
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Infinite Stack Unwinding ARM
2017-04-07 7:15 ` Johannes Stoelp
@ 2017-04-08 5:00 ` Duane Ellis
2017-04-10 12:01 ` Yao Qi
1 sibling, 0 replies; 6+ messages in thread
From: Duane Ellis @ 2017-04-08 5:00 UTC (permalink / raw)
To: Johannes Stoelp; +Cc: Yao Qi, gdb, Andreas Ropers, Marc Mones, Kai Schuetz
> On Apr 7, 2017, at 12:15 AM, Johannes Stoelp <Johannes.Stoelp@synopsys.com> wrote:
>
> Yao Qi <qiyaoltc@gmail.com> writes:
>
>> I don't expect the prologue analyzer to support SYSRegs and the MRS instruction. All the prologue analyzers in GDB are written to understand instructions according to the ABI/calling convention of each architecture and the compiler's behavior, so they should be able to parse the instructions in prologues that comply with the ABI. A GDB prologue analyzer may not understand what handwritten assembly does.
>
>
> Hi Yao,
>
> I see what you are saying about the prologue analyzer and the ABI/calling conventions.
>
> I understand that gdb does not have to understand every handwritten assembler routine, but I would like to emphasize that gdb in this particular case ends up in an "infinite" loop printing the backtrace line by line (I put infinite in quotes because the loop is limited by the lower boundary of an integer).
> I would expect gdb to be more defensive in this case and either try other unwinding techniques like backward unwinding (from bottom up) or just stop unwinding because there is too little information.
> As I understand it, situations like this can also occur when the stack gets corrupted. There, too, I would expect gdb not to end up in an infinite loop, since gdb is intended for analyzing unexpected situations.
>
> One other question that came up when comparing the arm and the aarch64 analyzers:
> * Is there a special reason/trick why the arm analyzer (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it doesn't recognize, while the aarch64 analyzer (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) stops when the first unrecognized instruction is hit?
>
> Best,
> Johannes
>
All of the below are techniques I have used in a custom port of GDB to a custom CPU.
Some might already exist, some might not. It is a balancing act between the software being debugged (helping the debugger) and the debugger doing the best it can.
It's not uncommon for a bare-metal RTOS to push zeros onto the stack or, for example on ARM cores, to set the LR register to zero in the assembly handler. Yes, this is an extra opcode or two on every interrupt. In some cases this is trivial and not a problem; if the extra opcodes are painful, it can be done under a compile-time conditional. In other cases the top 16 bytes or so of the interrupt stack are zeroed.
Another thing unwinders do is make sure the stack is still going in the same direction and that the next frame is not a very large distance away from the previous one. It's a heuristic: for example, 90% of code does not have huge arrays (i.e. 10K bytes) on the stack. So, as you walk up the stack, you make a decision: if distance(THIS_FRAME vs. NEXT_FRAME) is greater than MAXVALUE, abandon the backtrace.
Another thing the unwinder can do is inspect a few bytes before the return address and make sure that (1) the call site is within the application being debugged and not in outer space, and (2) the opcodes at that point are actually a subroutine call.
If the unwinder has access to current thread information and RTOS information, it might know the range of the current thread's stack.
Again, along the lines of the target memory map: in an embedded system, RAM is in a fixed region, as is code (flash memory). If you know this (or it is settable), the unwinder can make other educated decisions.
-Duane.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Infinite Stack Unwinding ARM
2017-04-07 7:15 ` Johannes Stoelp
2017-04-08 5:00 ` Duane Ellis
@ 2017-04-10 12:01 ` Yao Qi
2017-04-11 16:20 ` Johannes Stoelp
1 sibling, 1 reply; 6+ messages in thread
From: Yao Qi @ 2017-04-10 12:01 UTC (permalink / raw)
To: Johannes Stoelp; +Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz
Johannes Stoelp <Johannes.Stoelp@synopsys.com> writes:
> I understand that gdb does not have to understand every handwritten
> assembler routine, but I would like to emphasize that gdb in this
> particular case ends up in an "infinite" loop printing the backtrace line
> by line (I put infinite in quotes because the loop is limited by the
> lower boundary of an integer).
> I would expect gdb to be more defensive in this case and either try
> other unwinding techniques like backward unwinding (from bottom up) or
> just stop unwinding because there is too little information.
I don't understand what you mean by "backward unwinding". IIUC,
el1_irq is a vector entry point, so I expect GDB unwinding to stop at
el1_irq, but I want to check with the kernel people to see what they
expect. Can you post the backtrace as sample output?
We may end up having some Linux kernel specific stuff, and this
should fall under "Linux kernel debugging for aarch64". Note that we had
patches for general Linux kernel debugging:
https://sourceware.org/ml/gdb-patches/2017-03/msg00268.html (well, I
need to review them soon).
> As I understand it, situations like this can also occur when the stack
> gets corrupted. There, too, I would expect gdb not to end up in an
> infinite loop, since gdb is intended for analyzing unexpected
> situations.
>
> One other question that came up when comparing the arm and the aarch64 analyzers:
> * Is there a special reason/trick why the arm analyzer
> (gdb/arm-tdep.c:arm_analyze_prologue(...)) skips instructions that it
> doesn't recognize, while the aarch64 analyzer
> (gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) stops when the
> first unrecognized instruction is hit?
I don't see any particular reason; they are just two different
architectures.
--
Yao (齐尧)
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: Infinite Stack Unwinding ARM
2017-04-10 12:01 ` Yao Qi
@ 2017-04-11 16:20 ` Johannes Stoelp
0 siblings, 0 replies; 6+ messages in thread
From: Johannes Stoelp @ 2017-04-11 16:20 UTC (permalink / raw)
To: Yao Qi, Johannes Stoelp
Cc: gdb, Andreas Ropers, Marc Mones, Kai Schuetz, Johannes Stoelp
Hi Yao,
Yao Qi <qiyaoltc@gmail.com> writes:
> We may end up having some Linux kernel specific stuff, and this should fall under "Linux kernel debugging for aarch64".
I see, thanks for the hint.
> Can you post the backtrace as sample output?
With unlimited backtrace enabled (the default), I get backtraces like the following:
#0 gic_handle_irq (regs=0xffffffc07d9c7e30) at drivers/irqchip/irq-gic.c:263
#1 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#2 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#3 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#4 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#5 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#6 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#7 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
...
#20009 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20010 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20011 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20012 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20013 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
#20014 0xffffffc000083ce4 in el1_irq () at arch/arm64/kernel/entry.S:346
...
Best,
Johannes
^ permalink raw reply [flat|nested] 6+ messages in thread