From: Johannes Stoelp
To: "gdb@sourceware.org"
CC: Andreas Ropers, Marc Mones, Kai Schuetz, Johannes Stoelp
Subject: Infinite Stack Unwinding ARM
Date: Tue, 04 Apr 2017 12:54:00 -0000
Message-ID:
<6ECCE8A0904A1643BE093611EF2098CE0147553E@DE02WEMBXB.internal.synopsys.com>

Hi folks,

While debugging Linux kernel code on an ARMv8 target we ended up in an
"infinite" loop when printing the backtrace.
We tracked it down: this happens when we break in any function whose call
chain originates in el[01]_irq (at least in the ones we ran into).

Note: These functions (el[01]_irq) are handwritten assembler functions and
therefore gdb uses the prologue analyzer
(gdb/aarch64-tdep.c:aarch64_analyze_prologue(...)) to unwind to the previous
frame.

We further analyzed the stack unwinding and found the root cause of this
infinite loop.
Let's look at el1_irq as an example to see what is happening in detail. While
unwinding the stack, gdb comes to the 'el1_irq' function and then starts
analyzing the function prologue to further unwind the stack.
gdb calculates the start and end of the function prologue as follows:

  start_prologue = 0xffffffc000083c80
  end_prologue   = 0xffffffc000083cd4

When the analyzer hits the first MRS instruction at 0xffffffc000083cc4 it
stops, because MRS is not supported in the analyzer.

/* el1_irq disassembly start */
Dump of assembler code for function el1_irq (arch/arm64/kernel/entry.S):
start_prologue --> 0xffffffc000083c80 <+0>:    sub  sp, sp, #0x30
                   0xffffffc000083c84 <+4>:    stp  x28, x29, [sp,#-16]!
                   0xffffffc000083c88 <+8>:    stp  x26, x27, [sp,#-16]!
                   0xffffffc000083c8c <+12>:   stp  x24, x25, [sp,#-16]!
                   0xffffffc000083c90 <+16>:   stp  x22, x23, [sp,#-16]!
                   0xffffffc000083c94 <+20>:   stp  x20, x21, [sp,#-16]!
                   0xffffffc000083c98 <+24>:   stp  x18, x19, [sp,#-16]!
                   0xffffffc000083c9c <+28>:   stp  x16, x17, [sp,#-16]!
                   0xffffffc000083ca0 <+32>:   stp  x14, x15, [sp,#-16]!
                   0xffffffc000083ca4 <+36>:   stp  x12, x13, [sp,#-16]!
                   0xffffffc000083ca8 <+40>:   stp  x10, x11, [sp,#-16]!
                   0xffffffc000083cac <+44>:   stp  x8, x9, [sp,#-16]!
                   0xffffffc000083cb0 <+48>:   stp  x6, x7, [sp,#-16]!
                   0xffffffc000083cb4 <+52>:   stp  x4, x5, [sp,#-16]!
                   0xffffffc000083cb8 <+56>:   stp  x2, x3, [sp,#-16]!
                   0xffffffc000083cbc <+60>:   stp  x0, x1, [sp,#-16]!
                   0xffffffc000083cc0 <+64>:   add  x21, sp, #0x120
analyze_stop   --> 0xffffffc000083cc4 <+68>:   mrs  x22, elr_el1
                   0xffffffc000083cc8 <+72>:   mrs  x23, spsr_el1
                   0xffffffc000083ccc <+76>:   stp  x30, x21, [sp,#240]
                   0xffffffc000083cd0 <+80>:   stp  x22, x23, [sp,#256]
end_prologue   --> 0xffffffc000083cd4 <+84>:   msr  daifclr, #0x8
                   0xffffffc000083cd8 <+88>:   ldr  x1, 0xffffffc000084210
                   0xffffffc000083cdc <+92>:   mov  x0, sp
                   0xffffffc000083ce0 <+96>:   blr  x1
                   0xffffffc000083ce4 <+100>:  mov  x28, sp
                   ...
/* el1_irq disassembly end */

The analysis doesn't see the store of the link register x30 (at
0xffffffc000083ccc) and therefore assumes the current value of the link
register is still the return address of el1_irq.
The link register actually holds x30 = 0xffffffc000083ce4 (pointing into
el1_irq), and the unwinder then uses this value to determine the caller of
el1_irq, which, no wonder, is again el1_irq.

The other thing is that gdb doesn't recognize this situation as "Backtrace
stopped: previous frame identical to this frame (corrupt stack?)".
This in turn happens because the symbolic value of the stack pointer register
looks as follows after the prologue analyzer stops:

(gdb) p/x regs[AARCH64_SP_REGNUM]
$3 = {
  kind = 0x2,               // pv_register
  reg = 0x1f,               // 31
  k = 0xfffffffffffffee0    // #-0x120
}

gdb then tries to use this stack pointer as the frame pointer.
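
To make the arithmetic behind k = -0x120 concrete, here is a small Python sketch (not gdb code). It models only the effect of each prologue instruction on SP, stops at the first instruction the scan cannot interpret, and then applies "previous SP = SP + frame size" a few times. The table layout, the names analyze and unwind, and the starting SP value are made up for illustration; the x30 value is the one reported above.

```python
# Simplified model of the el1_irq prologue, NOT gdb code.
# Each entry: (offset from el1_irq, instruction text, change to SP in bytes).
# A change of None marks an instruction the scan cannot interpret.
PROLOGUE = (
    [(0, "sub sp, sp, #0x30", -0x30)]
    + [(4 + 4 * i, "stp xN, xM, [sp,#-16]!", -16) for i in range(15)]
    + [
        (64, "add x21, sp, #0x120", 0),
        (68, "mrs x22, elr_el1", None),
        (72, "mrs x23, spsr_el1", None),
        (76, "stp x30, x21, [sp,#240]", 0),  # the store of the link register
        (80, "stp x22, x23, [sp,#256]", 0),
    ]
)


def analyze(prologue):
    """Scan until the first uninterpretable instruction.

    Returns (sp_offset, lr_saved, stop_offset).
    """
    sp_offset = 0
    lr_saved = False
    for offset, text, sp_delta in prologue:
        if sp_delta is None:
            return sp_offset, lr_saved, offset
        sp_offset += sp_delta
        if text.startswith("stp x30"):
            lr_saved = True
    return sp_offset, lr_saved, None


def unwind(sp, lr, framesize, steps):
    """Apply prev_sp = sp + framesize repeatedly; LR is never reloaded."""
    frames = []
    for _ in range(steps):
        sp += framesize
        frames.append((lr, sp))
    return frames


sp_offset, lr_saved, stop_offset = analyze(PROLOGUE)
k = sp_offset & 0xFFFFFFFFFFFFFFFF  # view the offset as an unsigned 64-bit value
framesize = -sp_offset

LR = 0xFFFFFFC000083CE4        # value of x30 reported above
START_SP = 0xFFFFFFC000200000  # hypothetical SP, for illustration only
frames = unwind(START_SP, LR, framesize, steps=4)

print(hex(k), hex(framesize), lr_saved, stop_offset)
for pc, sp in frames:
    print(hex(pc), hex(sp))
```

In this model the scan stops at offset 68 with an SP offset of -0x120 and without ever reaching the x30 store. Every unwound frame then has the same PC inside el1_irq but an SP that is 0x120 higher than the one before, so no two frames compare as identical.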

/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) start */
...
  if (pv_is_register (regs[AARCH64_FP_REGNUM], AARCH64_SP_REGNUM))
    {
      /* Frame pointer is fp.  Frame size is constant.  */
      cache->framereg = AARCH64_FP_REGNUM;
      cache->framesize = -regs[AARCH64_FP_REGNUM].k;
    }
  else if (pv_is_register (regs[AARCH64_SP_REGNUM], AARCH64_SP_REGNUM))
    {
      /* Try the stack pointer.  */
      cache->framesize = -regs[AARCH64_SP_REGNUM].k;
      cache->framereg = AARCH64_SP_REGNUM;
    }
  else
    {
      /* We're just out of luck.  We don't know where the frame is.  */
      cache->framereg = -1;
      cache->framesize = 0;
    }
...
/* gdb/aarch64-tdep.c:aarch64_analyze_prologue(...) end */

This in turn is used to calculate the previous stack pointer.

/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) start */
...
  aarch64_scan_prologue (this_frame, cache);

  if (cache->framereg == -1)
    return;

  unwound_fp = get_frame_register_unsigned (this_frame, cache->framereg);
  if (unwound_fp == 0)
    return;

  cache->prev_sp = unwound_fp + cache->framesize;
...
/* gdb/aarch64-tdep.c:aarch64_make_prologue_cache_1 (...) end */

Because each unwound frame ends up with a stack pointer 0x120 higher than the
one before, the frames never look identical, and the result is that gdb
interprets the situation as a recursion.

* Has anyone run into similar situations with the ARM prologue analyzer?
* Has anyone worked on an extension for the prologue analyzer to support
  SYSRegs and therefore instructions like MRS?
* A defensive workaround I'm currently using is to stop the stack unwinding
  when hitting an unsupported instruction.

-Johannes