From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 26256 invoked by alias); 19 Nov 2010 17:20:29 -0000
Received: (qmail 26129 invoked by uid 22791); 19 Nov 2010 17:20:24 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0 tests=AWL,BAYES_00,TW_OV
X-Spam-Check-By: sourceware.org
Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 19 Nov 2010 17:20:17 +0000
Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id 9B5BF2BAB55; Fri, 19 Nov 2010 12:20:15 -0500 (EST)
Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id P1oxKGsmsTl6; Fri, 19 Nov 2010 12:20:15 -0500 (EST)
Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id 3983D2BAB21; Fri, 19 Nov 2010 12:20:15 -0500 (EST)
Received: by joel.gnat.com (Postfix, from userid 1000) id E03EE1457E0; Fri, 19 Nov 2010 09:20:11 -0800 (PST)
Date: Fri, 19 Nov 2010 17:20:00 -0000
From: Joel Brobecker
To: Pierre Muller, kettenis@gnu.org
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC] Improve amd64 prologue analysis
Message-ID: <20101119172011.GI2634@adacore.com>
References: <001701cb84ea$6883c170$398b4450$@muller@ics-cnrs.unistra.fr> <20101118172209.GE2634@adacore.com> <004201cb87c1$dab95cd0$902c1670$@muller@ics-cnrs.unistra.fr>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="jI8keyz6grp/JLjh"
Content-Disposition: inline
In-Reply-To: <004201cb87c1$dab95cd0$902c1670$@muller@ics-cnrs.unistra.fr>
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2010-11/txt/msg00262.txt.bz2

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 1955

> Does this means that we should only use that code if no dwarf debug
> info is available?

If the DWARF *frame* information is available, then it will
automatically be used over prologue-parsing.

> The problem currently on Windows-64bit generated code is that dwarf
> debug information is more deeply broken than stabs, so that I am
> still mainly using stabs (especially to debug dwarf problems...).

I completely understand the predicament you are in. I would do the
same. I don't know the compiler switches all that well, but isn't
there a way for us to force it to generate the DWARF frame info alone?
This info is used for unwinding during exceptions (unless one uses
an exception mechanism based on setjmp/longjmp). If you had that,
then you would still be using stabs for symbolic debugging, and
the DWARF unwinder for backtracing.

> In any case it, this code would still be useful for frames that have
> no debug information at all, as it allows for frame that do not uses
> RBP to figure out the correct offset for RSP, and thus the correct
> caller frame position.

I'd have no problem with that either. In fact, I can contribute some
code as well that seems to deal with most of the prologues found in
system code (patch attached). There is one bit where I was lazy and
couldn't figure out how to disassemble some instructions, but that
can be fixed as well.

However, with that being said, the reason why I never contributed
this code is that eventually we will probably have to use the
pdata/xdata stuff. This is the equivalent of the DWARF frame info.
In retrospect, I think it was a little silly of me to hold on to that
code and not make things better while waiting for pdata/xdata support
(Tristan expressed interest in that (so did I), but got stuck on VMS).

This part of the code is maintained by MarkK, so he has the final word.
But I'm OK with improving the prologue analyzer in the way that you
are suggesting.
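To make the RSP bookkeeping discussed above a bit more concrete, here
is a minimal standalone sketch. It is not GDB code: the function name
prologue_sp_offset and the sample byte sequences are made up for
illustration, and it assumes a prologue that consists only of "push REG"
instructions followed by at most one "sub $imm,%rsp".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GDB): walk a prologue made of
   "push REG" instructions followed by an optional "sub $imm,%rsp",
   and return how many bytes %rsp has moved below its value at
   function entry.  Scanning stops at the first unrecognized byte.  */
static long
prologue_sp_offset (const uint8_t *code, size_t len)
{
  size_t i = 0;
  long sp_offset = 0;

  assert (code != NULL);

  while (i < len)
    {
      if (code[i] >= 0x50 && code[i] <= 0x57)
        {
          /* push %rax .. %rdi (one byte).  */
          sp_offset += 8;
          i += 1;
        }
      else if (i + 1 < len && code[i] == 0x41
               && code[i + 1] >= 0x50 && code[i + 1] <= 0x57)
        {
          /* push %r8 .. %r15 (REX.B prefix, then the push opcode).  */
          sp_offset += 8;
          i += 2;
        }
      else if (i + 3 < len
               && code[i] == 0x48 && code[i + 1] == 0x83
               && code[i + 2] == 0xec)
        {
          /* sub $imm8,%rsp; the immediate is sign-extended.  */
          sp_offset += (int8_t) code[i + 3];
          break;
        }
      else if (i + 6 < len
               && code[i] == 0x48 && code[i + 1] == 0x81
               && code[i + 2] == 0xec)
        {
          /* sub $imm32,%rsp; little-endian 32-bit immediate.  */
          uint32_t imm = (uint32_t) code[i + 3]
                         | ((uint32_t) code[i + 4] << 8)
                         | ((uint32_t) code[i + 5] << 16)
                         | ((uint32_t) code[i + 6] << 24);
          sp_offset += (int32_t) imm;
          break;
        }
      else
        break;
    }

  return sp_offset;
}
```

For the bytes 53 41 54 48 83 ec 28 (push %rbx; push %r12;
sub $0x28,%rsp) the sketch returns 56. Under those assumptions the
return address would sit at %rsp + 56, and the caller's %rsp would be
%rsp + 64, without any reference to %rbp.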
-- 
Joel

--jI8keyz6grp/JLjh
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="amd64-windows-prologue.diff"
Content-length: 9816

diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 5472db1..397d101 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -41,6 +41,8 @@
 #include "amd64-tdep.h"
 #include "i387-tdep.h"
 
+#include "ui-file.h"
+#include "disasm.h"
 #include "features/i386/amd64.c"
 #include "features/i386/amd64-avx.c"
 
@@ -1824,36 +1826,152 @@ amd64_analyze_stack_align (CORE_ADDR pc, CORE_ADDR current_pc,
   return min (pc + offset + 2, current_pc);
 }
 
-/* Do a limited analysis of the prologue at PC and update CACHE
-   accordingly.  Bail out early if CURRENT_PC is reached.  Return the
-   address where the analysis stopped.
+/* Return non-zero if the instruction at PC saves a register at
+   a location relative to the RSP register.  If it is, then set:
+     - INSN_LEN: The length of the instruction (in bytes);
+     - REGNUM: The register being saved;
+     - OFFSET: The offset from the stack pointer.  */
 
-   We will handle only functions beginning with:
+static int
+amd64_register_save (struct gdbarch *gdbarch, CORE_ADDR pc,
+                     int *insn_len, int *regnum, int *offset)
+{
+  gdb_byte buf[5];
+  int is_save = 0;
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
 
-      pushq %rbp        0x55
-      movq %rsp, %rbp   0x48 0x89 0xe5
+  read_memory (pc, buf, sizeof (buf));
+  if (buf[0] == 0x48
+      && buf[1] == 0x89
+      && (buf[2] & 0xc0) == 0x40 /* ModRM mod == 01 in binary  */
+      && (buf[2] & 0x07) == 0x04 /* Base register is %rsp  */
+      && buf[3] == 0x24)
+    {
+      int mov_regnum = (buf[2] & 0x38) >> 3;
 
-   Any function that doesn't start with this sequence will be assumed
-   to have no prologue and thus no valid frame pointer in %rbp.  */
+      *insn_len = 5;
+      switch (mov_regnum)
+        {
+        case 0: *regnum = AMD64_RAX_REGNUM; break;
+        case 1: *regnum = AMD64_RCX_REGNUM; break;
+        case 2: *regnum = AMD64_RDX_REGNUM; break;
+        case 3: *regnum = AMD64_RBX_REGNUM; break;
+        case 4: *regnum = AMD64_RSP_REGNUM; break;
+        case 5: *regnum = AMD64_RBP_REGNUM; break;
+        case 6: *regnum = AMD64_RSI_REGNUM; break;
+        case 7: *regnum = AMD64_RDI_REGNUM; break;
+        }
+      *offset = extract_signed_integer (buf + 4, 1, byte_order);
+      is_save = 1;
+    }
+  else if (buf[0] == 0x4c
+           && buf[1] == 0x89
+           && (buf[2] & 0xc0) == 0x40 /* ModRM mod == 01 in binary  */
+           && (buf[2] & 0x07) == 0x04 /* Base register is %rsp  */
+           && buf[3] == 0x24)
+    {
+      int mov_regnum = (buf[2] & 0x38) >> 3;
+
+      *insn_len = 5;
+      *regnum = AMD64_R8_REGNUM + mov_regnum;
+      *offset = extract_signed_integer (buf + 4, 1, byte_order);
+      is_save = 1;
+    }
+
+  return is_save;
+}
+
+/* Return a string representation of the instruction at PC.
+   Set LEN to the number of bytes used by this instruction.
+
+   Disassembling code on x86_64 is so complicated compared to most
+   architectures, that it can be easier to just parse a string
+   rather than disassembling by hand.  It's definitely overkill,
+   but definitely simpler.  */
+
+static char *
+amd64_disass_insn (struct gdbarch *gdbarch, CORE_ADDR pc, int *len)
+{
+  struct ui_file *tmp_stream;
+  struct cleanup *cleanups;
+  char *insn;
+  long dummy;
+
+  tmp_stream = mem_fileopen ();
+  cleanups = make_cleanup_ui_file_delete (tmp_stream);
+  *len = gdb_print_insn (gdbarch, pc, tmp_stream, NULL);
+  insn = ui_file_xstrdup (tmp_stream, &dummy);
+  do_cleanups (cleanups);
+
+  return insn;
+}
+
+/* Some prologues sometimes start with a few instructions that
+   are not part of a typical prologue.  Observed in the Windows
+   kernel32.dll, we had a couple of "mov" instructions before
+   the real prologue started.  We have also seen GCC emit some
+   "mov %rN,OFFSET(%rsp)" instructions instead of "push" instructions
+   in order to save a register on the stack.  For lack of a better
+   word, let's call this section of the function the "pre-prologue".
+
+   This function analyzes the contents of the function pre-prologue,
+   updates the cache accordingly, and returns the address of the first
+   instruction past this pre-prologue.  */
 
 static CORE_ADDR
-amd64_analyze_prologue (struct gdbarch *gdbarch,
-                        CORE_ADDR pc, CORE_ADDR current_pc,
-                        struct amd64_frame_cache *cache)
+amd64_analyze_pre_prologue (struct gdbarch *gdbarch,
+                            CORE_ADDR pc, CORE_ADDR limit_pc,
+                            struct amd64_frame_cache *cache)
+{
+  while (pc < limit_pc)
+    {
+      int len = 0;
+      char *insn;
+      int regnum = 0;
+      int offset = 0;
+      int dummy;
+      int skip = 0;
+
+      if (amd64_register_save (gdbarch, pc, &len, &regnum, &offset))
+        {
+          cache->saved_regs[regnum] = offset + 8;
+          pc += len;
+          continue;
+        }
+
+      insn = amd64_disass_insn (gdbarch, pc, &len);
+      if (pc + len < limit_pc && strncmp (insn, "mov", 3) == 0)
+        skip = 1;
+      xfree (insn);
+      if (skip)
+        pc += len;
+      else
+        break;
+    }
+
+  return pc;
+}
+
+/* If the instruction at PC establishes a frame pointer, then update
+   CACHE accordingly and return the address of the instruction
+   immediately following it.  Otherwise just return the same PC.  */
+
+static CORE_ADDR
+amd64_analyze_frame_pointer_setup (struct gdbarch *gdbarch,
+                                   CORE_ADDR pc, CORE_ADDR limit_pc,
+                                   struct amd64_frame_cache *cache)
 {
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
   static gdb_byte proto[3] = { 0x48, 0x89, 0xe5 }; /* movq %rsp, %rbp */
   gdb_byte buf[3];
   gdb_byte op;
 
-  if (current_pc <= pc)
-    return current_pc;
-
-  pc = amd64_analyze_stack_align (pc, current_pc, cache);
+  if (limit_pc <= pc)
+    return limit_pc;
 
   op = read_memory_unsigned_integer (pc, 1, byte_order);
 
-  if (op == 0x55)                /* pushq %rbp */
+  if (op == 0x55)  /* pushq %rbp */
     {
       /* Take into account that we've executed the `pushq %rbp' that
          starts this instruction sequence.  */
@@ -1861,8 +1979,8 @@ amd64_analyze_prologue (struct gdbarch *gdbarch,
       cache->sp_offset += 8;
 
       /* If that's all, return now.  */
-      if (current_pc <= pc + 1)
-        return current_pc;
+      if (limit_pc <= pc + 1)
+        return limit_pc;
 
       /* Check for `movq %rsp, %rbp'.  */
       read_memory (pc + 1, buf, 3);
@@ -1877,6 +1995,113 @@ amd64_analyze_prologue (struct gdbarch *gdbarch,
   return pc;
 }
 
+/* Analyzes the instructions in the part of the function prologue
+   where registers are being saved on stack.  Update CACHE accordingly.
+   Return the address of the first instruction past the last register
+   save.  */
+
+static CORE_ADDR
+amd64_analyze_register_saves (struct gdbarch *gdbarch,
+                              CORE_ADDR pc, CORE_ADDR limit_pc,
+                              struct amd64_frame_cache *cache)
+{
+  gdb_byte op;
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+
+  while (pc < limit_pc)
+    {
+      int insn_len;
+      int regnum = -1;
+      int offset;
+
+      op = read_memory_unsigned_integer (pc, 1, byte_order);
+      if (op >= 0x50 && op <= 0x57)
+        {
+          /* This is a "push REG" insn where REG is %rax .. %rdi.  */
+          regnum = amd64_arch_reg_to_regnum (op - 0x50);
+          pc += 1;
+        }
+      else if (op == 0x41)
+        {
+          op = read_memory_unsigned_integer (pc + 1, 1, byte_order);
+          if (op >= 0x50 && op <= 0x57)
+            {
+              /* This is a "push REG" insn where REG is %r8 .. %r15.  */
+              regnum = amd64_arch_reg_to_regnum (op - 0x50 + AMD64_R8_REGNUM);
+              pc += 2;
+            }
+        }
+
+      if (regnum >= 0)
+        {
+          cache->sp_offset += 8;
+          cache->saved_regs[regnum] = cache->sp_offset;
+        }
+      else
+        return pc;  /* This instruction is not a register save.  Return.  */
+    }
+
+  return pc;
+}
+
+/* If PC points to an instruction that decrements the rsp register,
+   (thus finishing the frame setup), analyze it, adjust the CACHE
+   accordingly, and return the address of the instruction immediately
+   following it.  */
+
+static CORE_ADDR
+amd64_analyze_stack_pointer_adjustment (struct gdbarch *gdbarch,
+                                        CORE_ADDR pc, CORE_ADDR limit_pc,
+                                        struct amd64_frame_cache *cache)
+{
+  gdb_byte buf[3];
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+
+  /* There are several ways to encode the sub instruction.  Either way,
+     it starts with 3 bytes, so read these bytes first.  */
+  if (pc + sizeof (buf) >= limit_pc)
+    return pc;
+  read_memory (pc, buf, sizeof (buf));
+
+  if (pc + 4 < limit_pc && buf[0] == 0x48 && buf[1] == 0x83 && buf[2] == 0xec)
+    {
+      /* 4-byte version of the sub instruction.  The offset is encoded
+         in the 4th byte.  */
+      cache->sp_offset +=
+        read_memory_unsigned_integer (pc + sizeof (buf), 1, byte_order);
+      pc += sizeof (buf) + 1;
+    }
+  else if (pc + 7 < limit_pc
+           && buf [0] == 0x48 && buf[1] == 0x81 && buf[2] == 0xec)
+    {
+      /* 7-byte version of the sub instruction.  The offset is encoded
+         as a word at the end of the instruction.  */
+      cache->sp_offset +=
+        read_memory_unsigned_integer (pc + sizeof (buf), 4, byte_order);
+      pc += sizeof (buf) + 4;
+    }
+
+  return pc;
+}
+
+/* Do an analysis of the prologue at PC and update CACHE
+   accordingly.  Bail out early if CURRENT_PC is reached.  Return the
+   address where the analysis stopped.  */
+
+static CORE_ADDR
+amd64_analyze_prologue (struct gdbarch *gdbarch,
+                        CORE_ADDR pc, CORE_ADDR current_pc,
+                        struct amd64_frame_cache *cache)
+{
+  pc = amd64_analyze_pre_prologue (gdbarch, pc, current_pc, cache);
+  pc = amd64_analyze_stack_align (pc, current_pc, cache);
+  pc = amd64_analyze_frame_pointer_setup (gdbarch, pc, current_pc, cache);
+  pc = amd64_analyze_register_saves (gdbarch, pc, current_pc, cache);
+  pc = amd64_analyze_stack_pointer_adjustment (gdbarch, pc, current_pc, cache);
+
+  return pc;
+}
+
 /* Return PC of first real instruction.  */
 
 static CORE_ADDR

--jI8keyz6grp/JLjh--
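
The byte pattern matched by amd64_register_save in the patch can be
tried outside GDB with a small standalone sketch. Everything named here
(struct rsp_save, decode_rsp_save) is hypothetical and exists only for
illustration. Unlike the patch, which maps the ModRM register field to
GDB's own register numbers through a switch, this sketch reports the
plain hardware register number (0 = rax, 3 = rbx, 12 = r12, and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not part of GDB.  */
struct rsp_save
{
  int reg;     /* Hardware register number, 0 .. 15.  */
  int offset;  /* Signed 8-bit displacement from %rsp.  */
};

/* Return 1 and fill *OUT if the bytes at INSN encode
   "mov %reg, disp8(%rsp)"; return 0 otherwise.  */
static int
decode_rsp_save (const uint8_t *insn, size_t len, struct rsp_save *out)
{
  assert (insn != NULL && out != NULL);

  if (len < 5)
    return 0;

  /* REX prefix: 0x48 is REX.W, 0x4c is REX.W plus REX.R, which
     extends the ModRM reg field to %r8 .. %r15.  */
  if (insn[0] != 0x48 && insn[0] != 0x4c)
    return 0;

  /* Opcode 0x89: mov from a 64-bit register to r/m64.  */
  if (insn[1] != 0x89)
    return 0;

  if ((insn[2] & 0xc0) != 0x40      /* ModRM mod == 01: disp8 follows.  */
      || (insn[2] & 0x07) != 0x04   /* ModRM rm == 100: SIB byte follows.  */
      || insn[3] != 0x24)           /* SIB: base is %rsp, no index.  */
    return 0;

  out->reg = ((insn[2] >> 3) & 0x07) + (insn[0] == 0x4c ? 8 : 0);
  out->offset = (int8_t) insn[4];
  return 1;
}
```

With this sketch, the bytes 48 89 5c 24 08 decode as register 3 (%rbx)
saved at offset 8, and 4c 89 64 24 10 as register 12 (%r12) saved at
offset 16. A register-to-register move such as 48 89 e5
(movq %rsp, %rbp) is rejected because its ModRM mod field is 11.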