Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Alan Hayward via Gdb-patches <gdb-patches@sourceware.org>
To: Luis Machado <luis.machado@linaro.org>
Cc: nd <nd@arm.com>,
	"gdb-patches\\@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
Date: Tue, 1 Dec 2020 11:28:03 +0000	[thread overview]
Message-ID: <D31B28FD-89FB-4502-80E5-28AC85F61997@arm.com> (raw)
In-Reply-To: <20201130185545.940242-1-luis.machado@linaro.org>



> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
> 
> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
> structure follows the target endianness, whereas the SVE dumps are
> endianness-independent (LE).
> 
> Therefore, when the system is in BE mode, we need to reverse the bytes
> for the FPSIMD data.
> 
> Given the V registers are larger than 64-bit, I've added a way for value
> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
> nicely with the unwinding *_got_bytes function and makes the trad-frame
> more flexible and capable of saving larger registers.
> 
> The memory for the bytes is allocated via the frame obstack, so it gets freed
> after we're done inspecting the frame.
> 
> gdb/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
> 	(aarch64_maybe_swab128): New function.
> 	(aarch64_sve_regs_copy_to_reg_buf)
> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
> 	the data field.
> 	(TF_REG_VALUE_BYTES): New enum value.
> 	(trad_frame_value_bytes_p): New function.
> 	(trad_frame_set_value_bytes): New function.
> 	(trad_frame_set_reg_value_bytes): New function.
> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
> 	(struct trad_frame_saved_reg) <data>: New field.
> 	(trad_frame_set_value_bytes): New prototype.
> 	(trad_frame_value_bytes_p): New prototype.
> ---
> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
> gdb/trad-frame.c                   |  46 +++++++++++-
> gdb/trad-frame.h                   |  19 +++++
> 4 files changed, 213 insertions(+), 24 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
> index c9898bdafd..108f96be71 100644
> --- a/gdb/aarch64-linux-tdep.c
> +++ b/gdb/aarch64-linux-tdep.c
> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>   return magic;
> }
> 
> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
> +   registers from a signal frame.
> +
> +   VREG_NUM is the number of the V register being restored, OFFSET is the
> +   address containing the register value, BYTE_ORDER is the endianness and
> +   HAS_SVE tells us if we have a valid SVE context or not.  */
> +
> +static void
> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
> +			    int vreg_num, CORE_ADDR offset,
> +			    enum bfd_endian byte_order, bool has_sve)
> +{
> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
> +     SVE state is laid out in an endianness-independent format (LE).
> +
> +     So we have a couple cases to consider:
> +
> +     1 - If the target is big endian, then SIMD state is big endian,
> +     requiring a byteswap.
> +
> +     2 - If the target is little endian, then SIMD state is little endian,
> +     which matches the SVE format, so no byteswap is needed. */
> +

With this function, we are only handling FPSIMD values, so no need to mention SVE.
As it is now, it makes the has_sve parts confusing because they are being treated
the same as the rest of the fpsimd.

The same comment is fine when used elsewhere in the patch.


> +  if (byte_order == BFD_ENDIAN_BIG)
> +    {
> +      gdb_byte buf[V_REGISTER_SIZE];
> +
> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
> +	{
> +	  size_t size = V_REGISTER_SIZE/2;
> +
> +	  /* Read the two halves of the V register in reverse byte order.  */
> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
> +						    byte_order);
> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
> +						    byte_order);
> +
> +	  /* Copy the reversed bytes to the buffer.  */
> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
> +
> +	  /* Now we can store the correct bytes for the V register.  */
> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
> +					  buf, V_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_Q0_REGNUM
> +					  + vreg_num, buf, Q_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_D0_REGNUM
> +					  + vreg_num, buf, D_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_S0_REGNUM
> +					  + vreg_num, buf, S_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_H0_REGNUM
> +					  + vreg_num, buf, H_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_B0_REGNUM
> +					  + vreg_num, buf, B_REGISTER_SIZE);
> +
> +	  if (has_sve)
> +	    trad_frame_set_reg_value_bytes (cache,
> +					    num_regs + AARCH64_SVE_V0_REGNUM
> +					    + vreg_num, buf, V_REGISTER_SIZE);
> +	}
> +      return;
> +    }
> +
> +  /* Little endian, just point at the address containing the register
> +     value.  */
> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
> +			   offset);
> +
> +  if (has_sve)
> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
> +			     + vreg_num, offset);
> +
> +}
> +
> /* Implement the "init" method of struct tramp_frame.  */
> 
> static void
> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
> 
>       /* If there was no SVE section then set up the V registers.  */
>       if (sve_regs == 0)
> -	for (int i = 0; i < 32; i++)
> -	  {
> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> +	{
> +	  for (int i = 0; i < 32; i++)
> +	    {
> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
> 
> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
> -	    if (tdep->has_sve ())
> -	      trad_frame_set_reg_addr (this_cache,
> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
> -				       offset);
> -	  }
> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
> +					  byte_order, tdep->has_sve ());
> +	    }
> +	}
>     }
> 
>   trad_frame_set_id (this_cache, frame_id_build (sp, func));
> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
> index 2ce90ccfd7..9ef5e91801 100644
> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
> @@ -26,6 +26,7 @@
> #include "arch/aarch64.h"
> #include "gdbsupport/common-regcache.h"
> #include "gdbsupport/byte-vector.h"
> +#include <endian.h>
> 
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>   return buf;
> }
> 
> +/* If we are running in BE mode, convert the contents
> +   of VALUE (a 16 byte buffer) to LE.  */
> +
> +static void
> +aarch64_maybe_swab128 (gdb_byte *value)
> +{
> +  gdb_assert (value != nullptr);
> +
> +#if (__BYTE_ORDER == __BIG_ENDIAN)
> +  gdb_byte copy[16];
> +
> +  /* Save the original value.  */
> +  memcpy (copy, value, 16);
> +
> +  for (int i = 0; i < 15; i++)
> +    value[i] = copy[15 - i];
> +#else
> +  /* Nothing to be done.  */
> +#endif
> +}
> +
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> void
> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>     }
>   else
>     {
> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	 while SVE state is laid out in an endianness-independent format (LE).
> +
> +	 So we have a couple cases to consider:
> +
> +	 1 - If the target is big endian, then SIMD state is big endian,
> +	 requiring a byteswap.
> +
> +	 2 - If the target is little endian, then SIMD state is little endian,
> +	 which matches the SVE format, so no byteswap is needed. */
> +
>       /* There is no SVE state yet - the register dump contains a fpsimd
> 	 structure instead.  These registers still exist in the hardware, but
> 	 the kernel has not yet initialised them, and so they will be null.  */
> 
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
> 
>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
> 	{
> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +	  aarch64_maybe_swab128 (zero_reg);

I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
it got reused as a general temp buffer.
It’s not shown in the diff, but we do:
* memset zero_reg to 0
* use zero_reg as a temp buffer for copying fpsimd values.
* use zero_reg as value 0 for fpsr and fpcr.

The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)

Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?


> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> 	}
> 
> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	 kernel, which is why we try to avoid it.  */
> 
>       bool has_sve_state = false;
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	 write out state and return.  */
>       if (!has_sve_state)
> 	{
> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	     while SVE state is laid out in an endianness-independent format
> +	     (LE).
> +
> +	     So we have a couple cases to consider:
> +
> +	     1 - If the target is big endian, then SIMD state is big endian,
> +	     requiring a byteswap.
> +
> +	     2 - If the target is little endian, then SIMD state is little
> +	     endian, which matches the SVE format, so no byteswap is needed. */
> +
> 	  /* The collects of the Z registers will overflow the size of a vreg.
> 	     There is enough space in the structure to allow for this, but we
> 	     cannot overflow into the next register as we might not be
> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
> 		{
> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +		  aarch64_maybe_swab128 (zero_reg);
> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
> 		}
> 	    }
> 
> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
> index a6a84790a9..8a1aa818ad 100644
> --- a/gdb/trad-frame.c
> +++ b/gdb/trad-frame.c
> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>     {
>       regs[regnum].realreg = regnum;
>       regs[regnum].addr = -1;
> +      regs[regnum].data = nullptr;
>     }
> }
> 
> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>   return trad_frame_alloc_saved_regs (gdbarch);
> }
> 
> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
> 
> int
> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 	  && this_saved_regs[regnum].addr == -1);
> }
> 
> +/* See trad-frame.h.  */
> +
> +bool
> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			  int regnum)
> +{
> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
> +	  && this_saved_regs[regnum].data != nullptr);
> +}
> +
> void
> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
> 		      int regnum, LONGEST val)
> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>   this_saved_regs[regnum].addr = -1;
> }
> 
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +			    int regnum, const gdb_byte *bytes,
> +			    size_t size)
> +{
> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
> +
> +  /* Allocate the space and copy the data bytes.  */
> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);

Am I right to assume this means data will be automatically unallocated when
the trad_frame_saved_reg goes out of scope?


> +  memcpy (this_saved_regs[regnum].data, bytes, size);
> +}
> +
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				int regnum, const gdb_byte *bytes,
> +				size_t size)
> +{
> +  /* External interface for users of trad_frame_cache
> +     (who cannot access the prev_regs object directly).  */
> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
> +			      size);
> +}
> +
> +
> +
> struct value *
> trad_frame_get_prev_register (struct frame_info *this_frame,
> 			      struct trad_frame_saved_reg this_saved_regs[],
> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>     /* The register's value is available.  */
>     return frame_unwind_got_constant (this_frame, regnum,
> 				      this_saved_regs[regnum].addr);
> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
> +    /* The register's value is available as a sequence of bytes.  */
> +    return frame_unwind_got_bytes (this_frame, regnum,
> +				   this_saved_regs[regnum].data);
>   else
>     return frame_unwind_got_optimized (this_frame, regnum);
> }
> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
> index 7b5785616e..38db439579 100644
> --- a/gdb/trad-frame.h
> +++ b/gdb/trad-frame.h
> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
> 			       int regnum, LONGEST val);
> 
> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
> +   contained in BYTES with size SIZE.  */
> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				     int regnum, const gdb_byte *bytes,
> +				     size_t size);
> +
> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
> 				       struct frame_info *this_frame,
> 				       int regnum);
> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
> {
>   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>   int realreg;
> +  /* Register data (for values that don't fit in ADDR).  */
> +  gdb_byte *data;
> };
> 
> /* Encode REGNUM value in the trad-frame.  */
> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
> 			     int regnum);
> 
> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
> +   useful when the value is larger than what primitive types can hold.  */
> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +				 int regnum, const gdb_byte *bytes,
> +				 size_t size);
> +
> /* Convenience functions, return non-zero if the register has been
>    encoded as specified.  */
> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 			  int regnum);
> 
> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
> +   otherwise.  */
> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			      int regnum);
> +
> /* Reset the save regs cache, setting register values to -1.  */
> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
> 				  struct trad_frame_saved_reg *regs);
> -- 
> 2.25.1
>

next prev parent reply	other threads:[~2020-12-01 11:28 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30 18:55 Luis Machado via Gdb-patches
2020-12-01 11:28 ` Alan Hayward via Gdb-patches [this message]
2020-12-01 12:19   ` Luis Machado via Gdb-patches
2020-12-01 17:38     ` Alan Hayward via Gdb-patches
2020-12-01 18:40       ` Luis Machado via Gdb-patches
2020-12-02  9:07         ` Alan Hayward via Gdb-patches
2020-12-02 17:57 ` [PATCH,v2] " Luis Machado via Gdb-patches
2020-12-03 17:35   ` Alan Hayward via Gdb-patches
2020-12-03 17:37     ` Luis Machado via Gdb-patches
2020-12-04 14:22   ` Luis Machado via Gdb-patches
2020-12-08 13:39     ` Luis Machado via Gdb-patches
2020-12-08 16:10     ` Simon Marchi via Gdb-patches
2020-12-08 19:22       ` Luis Machado via Gdb-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D31B28FD-89FB-4502-80E5-28AC85F61997@arm.com \
    --to=gdb-patches@sourceware.org \
    --cc=Alan.Hayward@arm.com \
    --cc=luis.machado@linaro.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox