[PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered

Mirror of the gdb-patches mailing list

* [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
@ 2025-10-23 21:42 Can Acar
  2025-10-23 21:44 ` Can Acar
  2025-10-24 13:43 ` Tom Tromey
  0 siblings, 2 replies; 5+ messages in thread
From: Can Acar @ 2025-10-23 21:42 UTC (permalink / raw)
  To: gdb-patches

DW_FORM_addr is converted (sign-extended) to a signed value when
the dwarf size is less than the size of unrelocated_addr, if the
target architecture "naturally" sign extends an address (bfd.c).

However, the same handling was not done for DW_FORM_addrx. As a
result, trying to `list` a function with an address >= 0x80000000
on (some?) 32-bit MIPS targets did not work when that address was
encoded using DW_FORM_addrx.

This patch fixes the issue by having read_addr_index_1 call
unit_head::read_address, the function already used to read
DW_FORM_addr values, so that this case is handled correctly.
---
 gdb/dwarf2/read.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)
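
To illustrate the difference described above, here is a minimal
standalone sketch of zero- versus sign-extending a 32-bit address to
64 bits. widen_address is a hypothetical helper written for this note;
it is not a GDB or BFD function, and the real logic lives in
unit_head::read_address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of GDB or BFD): widen a 32-bit address
// read from DWARF to 64 bits.  When SIGN_EXTEND is true, bit 31 is
// propagated into the upper 32 bits; otherwise the upper bits are zero.
static uint64_t
widen_address (uint32_t raw, bool sign_extend)
{
  uint64_t value = raw;
  if (sign_extend)
    {
      // Flip bit 31, then subtract it again.  In modular 64-bit
      // arithmetic this fills the upper bits with copies of bit 31.
      const uint64_t sign_bit = 0x80000000ull;
      value = (value ^ sign_bit) - sign_bit;
    }
  return value;
}

// Small self-check of the three interesting cases.
static void
run_examples ()
{
  // Zero extension keeps the address in the low 4 GiB.
  assert (widen_address (0x80001000u, false) == 0x0000000080001000ull);
  // Sign extension moves the same bits to the top of the 64-bit space.
  assert (widen_address (0x80001000u, true) == 0xffffffff80001000ull);
  // Addresses below 0x80000000 are unaffected either way.
  assert (widen_address (0x7fff0000u, true) == 0x000000007fff0000ull);
}
```

If one form is zero-extended and the other sign-extended, an address
such as 0x80001000 ends up as two different 64-bit values, which would
be consistent with the lookup problem described above.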

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 955893c5f0c..e83d6ca38ae 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -15350,12 +15350,14 @@ dwarf2_per_objfile::read_line_string (const gdb_byte *buf,
 
 static unrelocated_addr
 read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
-		   std::optional<ULONGEST> addr_base, int addr_size)
+		   std::optional<ULONGEST> addr_base, unit_head *cu_header)
 {
   struct objfile *objfile = per_objfile->objfile;
   bfd *abfd = objfile->obfd.get ();
   const gdb_byte *info_ptr;
   ULONGEST addr_base_or_zero = addr_base.has_value () ? *addr_base : 0;
+  unsigned int ignore_bytes_read;
+  unsigned char addr_size = cu_header->addr_size;
 
   per_objfile->per_bfd->addr.read (objfile);
   if (per_objfile->per_bfd->addr.buffer == NULL)
@@ -15368,10 +15370,7 @@ read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
 	   objfile_name (objfile));
   info_ptr = (per_objfile->per_bfd->addr.buffer + addr_base_or_zero
 	      + addr_index * addr_size);
-  if (addr_size == 4)
-    return (unrelocated_addr) bfd_get_32 (abfd, info_ptr);
-  else
-    return (unrelocated_addr) bfd_get_64 (abfd, info_ptr);
+  return cu_header->read_address(abfd, info_ptr, &ignore_bytes_read);
 }
 
 /* Given index ADDR_INDEX in .debug_addr, fetch the value.  */
@@ -15380,7 +15379,7 @@ static unrelocated_addr
 read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
 {
   return read_addr_index_1 (cu->per_objfile, addr_index,
-			    cu->addr_base, cu->header.addr_size);
+			    cu->addr_base, &cu->header);
 }
 
 /* Given a pointer to an leb128 value, fetch the value from .debug_addr.  */
@@ -15403,9 +15402,9 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
 {
   struct dwarf2_cu *cu = per_objfile->get_cu (per_cu);
   std::optional<ULONGEST> addr_base;
-  int addr_size;
+  unit_head *cu_header;
 
-  /* We need addr_base and addr_size.
+  /* We need addr_base and header (only some fields of which are used later).
      If we don't have PER_CU->cu, we have to get it.
      Nasty, but the alternative is storing the needed info in PER_CU,
      which at this point doesn't seem justified: it's not clear how frequently
@@ -15424,17 +15423,17 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
   if (cu != NULL)
     {
       addr_base = cu->addr_base;
-      addr_size = cu->header.addr_size;
+      cu_header = &cu->header;
     }
   else
     {
       cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
 			  language_minimal);
       addr_base = reader.cu ()->addr_base;
-      addr_size = reader.cu ()->header.addr_size;
+      cu_header = &reader.cu ()->header;
     }
 
-  return read_addr_index_1 (per_objfile, addr_index, addr_base, addr_size);
+  return read_addr_index_1 (per_objfile, addr_index, addr_base, cu_header);
 }
 
 /* Given a DW_FORM_GNU_str_index value STR_INDEX, fetch the string.
-- 
2.50.1 (Apple Git-155)



* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
  2025-10-23 21:42 [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered Can Acar
@ 2025-10-23 21:44 ` Can Acar
  2025-10-24 13:43 ` Tom Tromey
  1 sibling, 0 replies; 5+ messages in thread
From: Can Acar @ 2025-10-23 21:44 UTC (permalink / raw)
  To: gdb-patches

To be perfectly responsible here, I need to state that I have not run the
testsuite, because I couldn't get it working on my machine. It could be
possible for me to set up a virtual environment where I can run the test
suites if required.

Best regards,
Can


* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
  2025-10-23 21:42 [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered Can Acar
  2025-10-23 21:44 ` Can Acar
@ 2025-10-24 13:43 ` Tom Tromey
  2025-10-28  1:02   ` Can Acar
  1 sibling, 1 reply; 5+ messages in thread
From: Tom Tromey @ 2025-10-24 13:43 UTC (permalink / raw)
  To: Can Acar; +Cc: gdb-patches

>>>>> "Can" == Can Acar <canacar@imcan.dev> writes:

Can> DW_FORM_addr is converted (sign-extended) to a signed value when
Can> the dwarf size is less than the size of unrelocated_addr, if the
Can> target architecture "naturally" sign extends an address (bfd.c).

Thanks for the patch.

Can>  static unrelocated_addr
Can>  read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
Can> -		   std::optional<ULONGEST> addr_base, int addr_size)
Can> +		   std::optional<ULONGEST> addr_base, unit_head *cu_header)

I think it'd be better to merge read_addr_index_1 and read_addr_index,
i.e. have it be:

static unrelocated_addr
read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
...

Can> +  return cu_header->read_address(abfd, info_ptr, &ignore_bytes_read);

gdb puts a space before "("
 
Can>    else
Can>      {
Can>        cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
Can>  			  language_minimal);
Can>        addr_base = reader.cu ()->addr_base;
Can> -      addr_size = reader.cu ()->header.addr_size;
Can> +      cu_header = &reader.cu ()->header;

I'm not sure it is safe to hold this pointer after cutu_reader is
destroyed.

Instead this could be refactored slightly to:

std::optional<cutu_reader> reader;
if ()
...
else
  {
    reader.emplace (...)
    cu = reader->cu ();
  }
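
The lifetime point can be sketched in a standalone form. The types
below (cu_stub, reader_stub) are hypothetical stand-ins invented for
this illustration; GDB's cutu_reader and dwarf2_cu are much richer,
and this only assumes that the reader owns the cu it hands out.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for dwarf2_cu.
struct cu_stub
{
  int addr_size;
};

// Hypothetical stand-in for cutu_reader: it owns the cu it exposes, so
// a pointer returned by cu () dangles once the reader is destroyed.
struct reader_stub
{
  explicit reader_stub (int addr_size) : m_cu {addr_size} {}
  cu_stub *cu () { return &m_cu; }

private:
  cu_stub m_cu;
};

// Use CACHED if it is non-null; otherwise build a temporary reader.
// The std::optional is declared at function scope, so a reader
// emplaced inside the "if" block stays alive until the return.
static int
addr_size_for (cu_stub *cached, int size_for_new_reader)
{
  std::optional<reader_stub> reader;
  cu_stub *cu = cached;

  if (cu == nullptr)
    {
      reader.emplace (size_for_new_reader);
      cu = reader->cu ();
    }

  // CU is valid here in both cases: it points either at the caller's
  // object or into READER, which has not been destroyed yet.
  return cu->addr_size;
}
```

Declaring the reader as a plain local inside the else block would
instead destroy it at the closing brace, before the pointer is used.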

thanks,
Tom


* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
  2025-10-24 13:43 ` Tom Tromey
@ 2025-10-28  1:02   ` Can Acar
  2025-10-28  1:03     ` [PATCH v2] " Can Acar
  0 siblings, 1 reply; 5+ messages in thread
From: Can Acar @ 2025-10-28  1:02 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

CC'ing gdb-patches@sourceware.org as well this time. Sorry for the multiple
emails.

I've implemented your suggestions, and will be sending in a new patch
version in a moment.

Best regards,
Can


* [PATCH v2] Fix: Sign extensions for DW_FORM_addrx were never considered
  2025-10-28  1:02   ` Can Acar
@ 2025-10-28  1:03     ` Can Acar
  0 siblings, 0 replies; 5+ messages in thread
From: Can Acar @ 2025-10-28  1:03 UTC (permalink / raw)
  To: tom; +Cc: gdb-patches, canacar

From: "canacar@imcan.dev" <canacar@imcan.dev>

DW_FORM_addr is converted (sign-extended) to a signed value when
the dwarf size is less than the size of unrelocated_addr, if the
target architecture "naturally" sign extends an address (bfd.c).

However, the same handling was not done for DW_FORM_addrx. As a
result, trying to `list` a function with an address >= 0x80000000
on (some?) 32-bit MIPS targets did not work when that address was
encoded using DW_FORM_addrx.

This patch fixes the issue by having read_addr_index call
unit_head::read_address, the function already used to read
DW_FORM_addr values, so that this case is handled correctly.
---
Changes since v1:
  * read_addr_index_1 has been removed, merging its functionality into
  read_addr_index.
  * dwarf2_read_addr_index now holds the cutu_reader in a std::optional
  declared at function scope, so the newly-created cu is no longer
  referenced after its reader is destroyed.
  * fix a formatting issue in a call to read_address.

 gdb/dwarf2/read.c | 44 ++++++++++++++++----------------------------
 1 file changed, 16 insertions(+), 28 deletions(-)
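
For readers unfamiliar with the lookup being changed, here is a rough
standalone sketch of fetching one entry from a .debug_addr-style
table: the entry lives at addr_base + index * addr_size. The function
read_addr_entry and its bounds checks are assumptions made for this
illustration, not GDB code; GDB reads the bytes through BFD and
unit_head::read_address.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not GDB code): read entry INDEX of ADDR_SIZE
// bytes from SECTION, where the table starts at ADDR_BASE.  Returns
// std::nullopt if the entry would fall outside the section.
static std::optional<uint64_t>
read_addr_entry (const std::vector<uint8_t> &section, uint64_t addr_base,
		 unsigned int index, unsigned int addr_size,
		 bool big_endian)
{
  if (addr_size == 0 || addr_size > 8 || addr_base > section.size ())
    return std::nullopt;

  uint64_t offset = addr_base + static_cast<uint64_t> (index) * addr_size;
  if (offset > section.size () || section.size () - offset < addr_size)
    return std::nullopt;

  // Assemble the value starting from the most significant byte.
  uint64_t value = 0;
  for (unsigned int i = 0; i < addr_size; ++i)
    {
      std::size_t pos = big_endian ? i : addr_size - 1 - i;
      value = (value << 8) | section[offset + pos];
    }
  return value;
}
```

The value returned here is zero-extended; whether it should then be
sign-extended is the target-dependent step this patch is about.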

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 431b7c9ea2c..cca42722471 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -15352,14 +15352,20 @@ dwarf2_per_objfile::read_line_string (const gdb_byte *buf,
    ADDR_BASE is the DW_AT_addr_base (DW_AT_GNU_addr_base) attribute or zero.
    ADDR_SIZE is the size of addresses from the CU header.  */
 
+/* Given index ADDR_INDEX in .debug_addr, fetch the value.  */
+
 static unrelocated_addr
-read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
-		   std::optional<ULONGEST> addr_base, int addr_size)
+read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
 {
+  const gdb_byte *info_ptr;
+  struct dwarf2_per_objfile *per_objfile = cu->per_objfile;
+  std::optional<ULONGEST> addr_base = cu->addr_base;
+  struct unit_head *cu_header = &cu->header;
   struct objfile *objfile = per_objfile->objfile;
   bfd *abfd = objfile->obfd.get ();
-  const gdb_byte *info_ptr;
   ULONGEST addr_base_or_zero = addr_base.has_value () ? *addr_base : 0;
+  unsigned int ignore_bytes_read;
+  unsigned char addr_size = cu_header->addr_size;
 
   per_objfile->per_bfd->addr.read (objfile);
   if (per_objfile->per_bfd->addr.buffer == NULL)
@@ -15372,20 +15378,9 @@ read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
 	   objfile_name (objfile));
   info_ptr = (per_objfile->per_bfd->addr.buffer + addr_base_or_zero
 	      + addr_index * addr_size);
-  if (addr_size == 4)
-    return (unrelocated_addr) bfd_get_32 (abfd, info_ptr);
-  else
-    return (unrelocated_addr) bfd_get_64 (abfd, info_ptr);
+  return cu_header->read_address (abfd, info_ptr, &ignore_bytes_read);
 }
 
-/* Given index ADDR_INDEX in .debug_addr, fetch the value.  */
-
-static unrelocated_addr
-read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
-{
-  return read_addr_index_1 (cu->per_objfile, addr_index,
-			    cu->addr_base, cu->header.addr_size);
-}
 
 /* Given a pointer to an leb128 value, fetch the value from .debug_addr.  */
 
@@ -15405,11 +15400,10 @@ unrelocated_addr
 dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
 			unsigned int addr_index)
 {
+  std::optional<cutu_reader> reader;
   struct dwarf2_cu *cu = per_objfile->get_cu (per_cu);
-  std::optional<ULONGEST> addr_base;
-  int addr_size;
 
-  /* We need addr_base and addr_size.
+  /* read_addr_index requires some fields from cu.
      If we don't have PER_CU->cu, we have to get it.
      Nasty, but the alternative is storing the needed info in PER_CU,
      which at this point doesn't seem justified: it's not clear how frequently
@@ -15425,20 +15419,14 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
      IWBN to use the aging mechanism to let us lazily later discard the CU.
      For now we skip this optimization.  */
 
-  if (cu != NULL)
+  if (cu == NULL)
     {
-      addr_base = cu->addr_base;
-      addr_size = cu->header.addr_size;
-    }
-  else
-    {
-      cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
+      reader.emplace (*per_cu, *per_objfile, nullptr, nullptr, false,
 			  language_minimal);
-      addr_base = reader.cu ()->addr_base;
-      addr_size = reader.cu ()->header.addr_size;
+      cu = reader->cu ();
     }
 
-  return read_addr_index_1 (per_objfile, addr_index, addr_base, addr_size);
+  return read_addr_index (cu, addr_index);
 }
 
 /* Given a DW_FORM_GNU_str_index value STR_INDEX, fetch the string.
-- 
2.50.1 (Apple Git-155)

