From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-130851-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 85911 invoked by alias); 11 Apr 2016 20:20:37 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 85789 invoked by uid 89); 11 Apr 2016 20:20:35 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: relay1.mentorg.com
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 11 Apr 2016 20:20:24 +0000
Received: from svr-orw-fem-06.mgc.mentorg.com ([147.34.97.120])	by relay1.mentorg.com with esmtp 	id 1apiJp-0005TP-80 from Luis_Gustavo@mentor.com ; Mon, 11 Apr 2016 13:20:21 -0700
Received: from [134.86.127.233] (147.34.91.1) by SVR-ORW-FEM-06.mgc.mentorg.com (147.34.97.120) with Microsoft SMTP Server id 14.3.224.2; Mon, 11 Apr 2016 13:20:20 -0700
Subject: Re: [PATCH v4] Fix alignment of disassemble /r
References: <1460384185-21574-1-git-send-email-leonardo.boquillon@tallertechnologies.com>
To: Leonardo Boquillon <leonardo.boquillon@tallertechnologies.com>,	<gdb-patches@sourceware.org>, <daniel.gutson@tallertechnologies.com>,	<markus.t.metzger@intel.com>
From: Luis Machado <lgustavo@codesourcery.com>
Reply-To: Luis Machado <lgustavo@codesourcery.com>
Message-ID: <570C0703.9060100@codesourcery.com>
Date: Mon, 11 Apr 2016 20:20:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <1460384185-21574-1-git-send-email-leonardo.boquillon@tallertechnologies.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2016-04/txt/msg00223.txt.bz2

On 04/11/2016 09:16 AM, Leonardo Boquillon wrote:
> When disassembling in raw mode (/r) in a variable-length insn architecture (i.e. x86),
> the output can be messed since no alignment takes place.
>
> The first version of this was sent on this mail thread
> https://sourceware.org/ml/gdb-patches/2014-04/msg00226.html and this patch is
> the same placed previously here https://sourceware.org/bugzilla/show_bug.cgi?id=19768
>
> This patch performs the two passes: the first is at get_insn_set_longest_opcode
> at line 136 in the patch, and the second is the while loop that follows like was
> agreed here https://sourceware.org/ml/gdb-patches/2014-04/msg00427.html
>
> In this version have been added the capability to compute longest opcode for
> btrace.
>
> If this is ok for commit I have a company-wide copyright access, and a coworker
> of mine has write access.
>
> gdb/ChangeLog
> 2016-04-11  Leonardo Boquillon  <leonardo.boquillon@tallertechnologies.com>
>
> 	* disasm.c (gdb_pretty_print_insn): Refactored to use longest opcode.
> 	(dump_insns): Add calls to calculate longest opcode, then pass it to the
> 	print function.
> 	(dis_get_longest_opcode): New function.
> 	* disasm.h (gdb_pretty_print_insn): Refactored to use longest opcode.
> 	* record-btrace.c (btrace_get_longest_opcode): New function.
> 	(btrace_insn_history): Add calls to calculate longest opcode,
> 	then pass it to the print function.
>
> gdb/testsuite/ChangeLog
> 2016-04-11  Leonardo Boquillon  <leonardo.boquillon@tallertechnologies.com>
>
> 	* disas_raw.exp: New file.
> 	* instruction_history.exp: Added test for disass /r.
> 	* main.S: New file.
>
> ---
>   gdb/disasm.c                                     | 32 +++++++++++++++++--
>   gdb/disasm.h                                     |  2 +-
>   gdb/record-btrace.c                              | 27 +++++++++++++++-
>   gdb/testsuite/gdb.btrace/instruction_history.exp | 12 ++++++++
>   gdb/testsuite/gdb.disasm/disas_raw.exp           | 39 ++++++++++++++++++++++++
>   gdb/testsuite/gdb.disasm/main.S                  | 10 ++++++
>   6 files changed, 117 insertions(+), 5 deletions(-)
>   create mode 100644 gdb/testsuite/gdb.disasm/disas_raw.exp
>   create mode 100644 gdb/testsuite/gdb.disasm/main.S
>
> diff --git a/gdb/disasm.c b/gdb/disasm.c
> index 1cf0901..32324fc 100644
> --- a/gdb/disasm.c
> +++ b/gdb/disasm.c
> @@ -169,19 +169,38 @@ compare_lines (const void *mle1p, const void *mle2p)
>     return val;
>   }
>
> +static int
> +dis_get_longest_opcode (struct gdbarch *gdbarch, CORE_ADDR addr, CORE_ADDR high,
> +			int how_many)

The function should be documented.

> +{
> +  int longest_opcode = 0;
> +  unsigned int insn_checked = 0;

Shouldn't longest_opcode be unsigned int since some of the other 
variables are also being forced to unsigned?

> +
> +  while (addr < high && (how_many < 0 || insn_checked++ < how_many))

Do we need to worry about wrong input here that would cause things to 
fail? Is addr always going to be less than high?

I'd prefer the insn_checked increment to be in the loop for readability.

> +    {
> +      const int current_length = gdb_insn_length(gdbarch, addr);

space before (. Multiple ocurrences of this.

Also, i don't quite see the need to use const here and in other places. 
It is such a short piece of code.

> +      longest_opcode =
> +	(current_length > longest_opcode) ? current_length : longest_opcode;
> +      addr += current_length;
> +    }
> +
> +  return longest_opcode;
> +}
> +
>   /* See disasm.h.  */
>
>   int
>   gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
>   		       struct disassemble_info * di,
>   		       const struct disasm_insn *insn, int flags,
> -		       struct ui_file *stb)
> +		       struct ui_file *stb, int longest_opcode)

unsigned int longest_opcode? Multiple ocurrences of this.

>   {
>     /* parts of the symbolic representation of the address */
>     int unmapped;
>     int offset;
>     int line;
>     int size;
> +  unsigned int max_print_space;
>     struct cleanup *ui_out_chain;
>     char *filename = NULL;
>     char *name = NULL;
> @@ -245,6 +264,7 @@ gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
>         CORE_ADDR end_pc;
>         bfd_byte data;
>         int err;
> +      unsigned int i;
>         const char *spacer = "";
>
>         /* Build the opcodes using a temporary stream so we can
> @@ -267,7 +287,10 @@ gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
>   	}
>
>         ui_out_field_stream (uiout, "opcodes", opcode_stream);
> -      ui_out_text (uiout, "\t");
> +      gdb_assert(longest_opcode >= size);

This doesn't look like a fatal failure. I think we should just fallback 
to setting longest_opcode = size if this happens.

> +      max_print_space = 3u * (1u + longest_opcode - size);
> +      for (i = 0; i < max_print_space; i++)
> +	ui_out_text(uiout, " ");
>
>         do_cleanups (cleanups);
>       }
> @@ -291,15 +314,18 @@ dump_insns (struct gdbarch *gdbarch, struct ui_out *uiout,
>   {
>     struct disasm_insn insn;
>     int num_displayed = 0;
> +  int longest_opcode;
>
>     memset (&insn, 0, sizeof (insn));
>     insn.addr = low;
>
> +  longest_opcode = dis_get_longest_opcode (gdbarch, low, high, how_many);
>     while (insn.addr < high && (how_many < 0 || num_displayed < how_many))
>       {
>         int size;
>
> -      size = gdb_pretty_print_insn (gdbarch, uiout, di, &insn, flags, stb);
> +      size = gdb_pretty_print_insn (gdbarch, uiout, di, &insn, flags, stb,
> +				    longest_opcode);
>         if (size <= 0)
>   	break;
>
> diff --git a/gdb/disasm.h b/gdb/disasm.h
> index a2b72b9..21cdd17 100644
> --- a/gdb/disasm.h
> +++ b/gdb/disasm.h
> @@ -53,7 +53,7 @@ struct disasm_insn
>   extern int gdb_pretty_print_insn (struct gdbarch *gdbarch, struct ui_out *uiout,
>   				  struct disassemble_info * di,
>   				  const struct disasm_insn *insn, int flags,
> -				  struct ui_file *stb);
> +				  struct ui_file *stb, int longest_opcode);
>
>   /* Return a filled in disassemble_info object for use by gdb.  */
>
> diff --git a/gdb/record-btrace.c b/gdb/record-btrace.c
> index 77b5180..cb87aac 100644
> --- a/gdb/record-btrace.c
> +++ b/gdb/record-btrace.c
> @@ -660,6 +660,28 @@ btrace_print_lines (struct btrace_line_range lines, struct ui_out *uiout,
>       }
>   }
>
> +static int
> +btrace_get_longest_opcode (struct gdbarch *gdbarch,
> +                           const struct btrace_insn_iterator *begin,
> +                           const struct btrace_insn_iterator *end)
> +{
> +  int longest_opcode = 0;
> +  struct btrace_insn_iterator it;
> +
> +  for (it = *begin; btrace_insn_cmp (&it, end) != 0; btrace_insn_next (&it, 1))
> +    {
> +      const struct btrace_insn *insn = btrace_insn_get (&it);
> +
> +      if (insn != NULL)
> +	{
> +	  const int current_length = gdb_insn_length (gdbarch, insn->pc);
> +	  longest_opcode =
> +	    (current_length > longest_opcode) ? current_length : longest_opcode;
> +	}
> +    }
> +
> +  return longest_opcode;
> +}
>   /* Disassemble a section of the recorded instruction trace.  */
>
>   static void
> @@ -674,6 +696,7 @@ btrace_insn_history (struct ui_out *uiout,
>     struct gdbarch *gdbarch;
>     struct btrace_insn_iterator it;
>     struct btrace_line_range last_lines;
> +  int longest_opcode;
>
>     DEBUG ("itrace (0x%x): [%u; %u)", flags, btrace_insn_number (begin),
>   	 btrace_insn_number (end));
> @@ -692,6 +715,7 @@ btrace_insn_history (struct ui_out *uiout,
>        instructions corresponding to that line.  */
>     ui_item_chain = NULL;
>
> +  longest_opcode = btrace_get_longest_opcode(gdbarch, begin, end);
>     for (it = *begin; btrace_insn_cmp (&it, end) != 0; btrace_insn_next (&it, 1))
>       {
>         const struct btrace_insn *insn;
> @@ -745,7 +769,8 @@ btrace_insn_history (struct ui_out *uiout,
>   	  if ((insn->flags & BTRACE_INSN_FLAG_SPECULATIVE) != 0)
>   	    dinsn.is_speculative = 1;
>
> -	  gdb_pretty_print_insn (gdbarch, uiout, &di, &dinsn, flags, stb);
> +	  gdb_pretty_print_insn (gdbarch, uiout, &di, &dinsn, flags, stb,
> +				 longest_opcode);
>   	}
>       }
>
> diff --git a/gdb/testsuite/gdb.btrace/instruction_history.exp b/gdb/testsuite/gdb.btrace/instruction_history.exp
> index 58ae770..7de35d8 100644
> --- a/gdb/testsuite/gdb.btrace/instruction_history.exp
> +++ b/gdb/testsuite/gdb.btrace/instruction_history.exp
> @@ -97,6 +97,18 @@ gdb_test "record instruction-history /pf 3,7" [multi_line \
>     "7\t   0x\[0-9a-f\]+ <\\+\[0-9\]+>:\tje     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r" \
>     ]
>
> +# We don't know the exact encoding that is used but we know that the raw
> +# opcode bytes should be padded so they should all be the same length.
> +#
> +# To simplify things let's assume that the longest opcode is 3B.
> +gdb_test "record instruction-history /r 3,7" [multi_line \
> +  "3\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   je     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>" \
> +  "4\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   dec    %eax" \
> +  "5\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   jmp    0x\[0-9a-f\]+ <loop\\+\[0-9\]+>" \
> +  "6\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   cmp    \\\$0x0,%eax" \
> +  "7\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\t\[0-9a-f \]{8}   je     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r"
> +  ]
> +
>   gdb_test "record instruction-history 3,3" "3\t   0x\[0-9a-f\]+ <loop\\+\[0-9\]+>:\tje     0x\[0-9a-f\]+ <loop\\+\[0-9\]+>\r"
>
>   # the following tests are checking the iterators
> diff --git a/gdb/testsuite/gdb.disasm/disas_raw.exp b/gdb/testsuite/gdb.disasm/disas_raw.exp
> new file mode 100644
> index 0000000..a233bad
> --- /dev/null
> +++ b/gdb/testsuite/gdb.disasm/disas_raw.exp
> @@ -0,0 +1,39 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2016 Free Software Foundation, Inc.
> +#
> +# Contributed by Taller Technologies <leonardo.boquillon@tallertechnologies.com>
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test alignment of raw disassemble dump (Bugzilla #19768).
> +
> +if { ![istarget "x86_64-*-*"] } { return -1 }
> +
> +set asm_file "main.S"
> +
> +standard_testfile ${asm_file}
> +
> +if [prepare_for_testing ${testfile}.exp ${testfile} ${asm_file} {}] {
> +    return -1
> +}
> +
> +set asm_line ".*55                        push   %rbp"
> +append asm_line ".*48 89 e5                  mov    %rsp,%rbp"
> +append asm_line ".*b8 00 00 00 00            mov    \\\$0x0,%eax"
> +append asm_line ".*5d                        pop    %rbp"
> +append asm_line ".*c3                        retq   "
> +append asm_line ".*0f 1f 84 00 00 00 00 00   nopl   0x0\\(%rax,%rax,1\\).*"
> +
> +gdb_test "disass /r main" ${asm_line} "disass /r main"
> \ No newline at end of file
> diff --git a/gdb/testsuite/gdb.disasm/main.S b/gdb/testsuite/gdb.disasm/main.S
> new file mode 100644
> index 0000000..0178938
> --- /dev/null
> +++ b/gdb/testsuite/gdb.disasm/main.S
> @@ -0,0 +1,10 @@
> +	.file	"main.c"
> +	.text
> +	.globl	main
> +	.type	main, @function
> +main:
> +	pushq	%rbp
> +	movq	%rsp, %rbp
> +	movl	$0, %eax
> +	popq	%rbp
> +	ret
>

It is always nice to have a testcase, but in this particular case i 
don't see the benefit. It will only be exercised for a particular 
architecture, even though it is a general change, and it may fail if we 
try to exercise things under a different host. For example, Windows.

Should we agree on a display format and declare that the correct output 
mechanism?

Also, should we only adjust the output and calculate the longest opcode 
for architectures that have variable length instructions? Architectures 
with fixed length instructions will not benefit from the additional 
calculations, right?