* [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
From: Can Acar @ 2025-10-23 21:42 UTC
To: gdb-patches
When reading DW_FORM_addr, gdb sign-extends the value if the DWARF
address size is smaller than unrelocated_addr and the target
architecture "naturally" sign-extends addresses (bfd.c).
However, DW_FORM_addrx did not get the same handling. As a result,
for example, trying to `list` a function at an address >= 0x80000000
was broken on (some?) 32-bit MIPS targets when that address was
encoded using DW_FORM_addrx.
This patch fixes the issue by making read_addr_index_1 call
unit_head::read_address, the function already used to read
DW_FORM_addr values, which handles this case correctly.
---
gdb/dwarf2/read.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
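For readers unfamiliar with the failure mode described in the commit message, here is a minimal sketch of the difference between zero- and sign-extending a 32-bit address into a 64-bit value. The helper widen_addr32 is hypothetical and is not part of gdb or BFD; it only illustrates why an address such as 0x80001000 ends up as two different 64-bit values depending on which extension is applied.

```cpp
#include <cstdint>

// Hypothetical helper, not part of gdb or BFD: widen a 32-bit address
// to 64 bits, either sign- or zero-extending it.
constexpr std::uint64_t
widen_addr32 (std::uint32_t raw, bool sign_extend)
{
  // Flipping the sign bit and then subtracting it propagates bit 31
  // into bits 32..63 using only well-defined unsigned arithmetic.
  return (sign_extend
          ? (std::uint64_t {raw} ^ 0x80000000ull) - 0x80000000ull
          : std::uint64_t {raw});
}

static_assert (widen_addr32 (0x80001000u, false) == 0x0000000080001000ull,
               "zero-extension leaves the upper half clear");
static_assert (widen_addr32 (0x80001000u, true) == 0xffffffff80001000ull,
               "sign-extension fills the upper half with ones");
static_assert (widen_addr32 (0x00401000u, true) == 0x0000000000401000ull,
               "addresses below 0x80000000 are unaffected");
```

If one DWARF form is read with sign extension and another without, the same function address no longer compares equal across the two, which is consistent with the lookup failure reported above.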
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 955893c5f0c..e83d6ca38ae 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -15350,12 +15350,14 @@ dwarf2_per_objfile::read_line_string (const gdb_byte *buf,
static unrelocated_addr
read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
- std::optional<ULONGEST> addr_base, int addr_size)
+ std::optional<ULONGEST> addr_base, unit_head *cu_header)
{
struct objfile *objfile = per_objfile->objfile;
bfd *abfd = objfile->obfd.get ();
const gdb_byte *info_ptr;
ULONGEST addr_base_or_zero = addr_base.has_value () ? *addr_base : 0;
+ unsigned int ignore_bytes_read;
+ unsigned char addr_size = cu_header->addr_size;
per_objfile->per_bfd->addr.read (objfile);
if (per_objfile->per_bfd->addr.buffer == NULL)
@@ -15368,10 +15370,7 @@ read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
objfile_name (objfile));
info_ptr = (per_objfile->per_bfd->addr.buffer + addr_base_or_zero
+ addr_index * addr_size);
- if (addr_size == 4)
- return (unrelocated_addr) bfd_get_32 (abfd, info_ptr);
- else
- return (unrelocated_addr) bfd_get_64 (abfd, info_ptr);
+ return cu_header->read_address(abfd, info_ptr, &ignore_bytes_read);
}
/* Given index ADDR_INDEX in .debug_addr, fetch the value. */
@@ -15380,7 +15379,7 @@ static unrelocated_addr
read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
{
return read_addr_index_1 (cu->per_objfile, addr_index,
- cu->addr_base, cu->header.addr_size);
+ cu->addr_base, &cu->header);
}
/* Given a pointer to an leb128 value, fetch the value from .debug_addr. */
@@ -15403,9 +15402,9 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
{
struct dwarf2_cu *cu = per_objfile->get_cu (per_cu);
std::optional<ULONGEST> addr_base;
- int addr_size;
+ unit_head *cu_header;
- /* We need addr_base and addr_size.
+ /* We need addr_base and header (only some fields of which are used later).
If we don't have PER_CU->cu, we have to get it.
Nasty, but the alternative is storing the needed info in PER_CU,
which at this point doesn't seem justified: it's not clear how frequently
@@ -15424,17 +15423,17 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
if (cu != NULL)
{
addr_base = cu->addr_base;
- addr_size = cu->header.addr_size;
+ cu_header = &cu->header;
}
else
{
cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
language_minimal);
addr_base = reader.cu ()->addr_base;
- addr_size = reader.cu ()->header.addr_size;
+ cu_header = &reader.cu ()->header;
}
- return read_addr_index_1 (per_objfile, addr_index, addr_base, addr_size);
+ return read_addr_index_1 (per_objfile, addr_index, addr_base, cu_header);
}
/* Given a DW_FORM_GNU_str_index value STR_INDEX, fetch the string.
--
2.50.1 (Apple Git-155)
* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
From: Can Acar @ 2025-10-23 21:44 UTC
To: gdb-patches
For full disclosure: I have not run the testsuite, because I couldn't
get it working on my machine. I can set up a virtual environment to
run it if required.
Best regards,
Can
* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
From: Tom Tromey @ 2025-10-24 13:43 UTC
To: Can Acar; +Cc: gdb-patches
>>>>> "Can" == Can Acar <canacar@imcan.dev> writes:
Can> DW_FORM_addr is converted (sign-extended) to a signed value when
Can> the dwarf size is less than the size of unrelocated_addr, if the
Can> target architecture "naturally" sign extends an address (bfd.c).
Thanks for the patch.
Can> static unrelocated_addr
Can> read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
Can> - std::optional<ULONGEST> addr_base, int addr_size)
Can> + std::optional<ULONGEST> addr_base, unit_head *cu_header)
I think it'd be better to merge read_addr_index_1 and read_addr_index,
i.e. have it be:
static unrelocated_addr
read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
...
Can> + return cu_header->read_address(abfd, info_ptr, &ignore_bytes_read);
gdb puts a space before "("
Can> else
Can> {
Can> cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
Can> language_minimal);
Can> addr_base = reader.cu ()->addr_base;
Can> - addr_size = reader.cu ()->header.addr_size;
Can> + cu_header = &reader.cu ()->header;
I'm not sure it is safe to hold this pointer after the cutu_reader
is destroyed.
Instead this could be refactored slightly to:
std::optional<cutu_reader> reader;
if ()
...
else
{
reader.emplace (...)
cu = reader->cu ();
}
thanks,
Tom
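The lifetime concern raised in this review can be sketched in isolation. The types below (unit, scoped_reader) are hypothetical stand-ins, not gdb's classes, and the sketch assumes the reader owns the object it hands out; whether cutu_reader actually does so is exactly what the review leaves open. Declaring the std::optional at function scope keeps the owner alive for as long as the borrowed pointer is used.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Hypothetical stand-in for a compilation unit; not gdb's dwarf2_cu.
struct unit
{
  int addr_size;
};

// Hypothetical stand-in for a reader that owns the unit it creates.
class scoped_reader
{
public:
  explicit scoped_reader (int addr_size)
    : m_unit (std::make_unique<unit> (unit {addr_size}))
  {}

  // The returned pointer is only valid while this reader is alive.
  unit *get () { return m_unit.get (); }

private:
  std::unique_ptr<unit> m_unit;
};

// Return the address size, loading a unit on demand when CACHED is null.
int
addr_size_of (unit *cached)
{
  // Declared at function scope so that whatever it owns outlives every
  // use of U below.  A reader declared inside the "if" block would be
  // destroyed at the closing brace, leaving U dangling.
  std::optional<scoped_reader> reader;
  unit *u = cached;

  if (u == nullptr)
    {
      reader.emplace (4);   // 4 is an arbitrary illustrative value
      u = reader->get ();
    }

  assert (u != nullptr);
  return u->addr_size;
}
```

The same shape works whenever a function must use either a cached object or a temporary one: only the optional owner differs between the two paths, and the code after the branch stays identical.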
* Re: [PATCH] Fix: Sign extensions for DW_FORM_addrx were never considered
From: Can Acar @ 2025-10-28 1:02 UTC
To: Tom Tromey; +Cc: gdb-patches
CC'ing gdb-patches@sourceware.org as well this time. Sorry for the multiple
emails.
I've implemented your suggestions, and will be sending in a new patch
version in a moment.
Best regards,
Can
* [PATCH v2] Fix: Sign extensions for DW_FORM_addrx were never considered
From: Can Acar @ 2025-10-28 1:03 UTC
To: tom; +Cc: gdb-patches, canacar
From: "canacar@imcan.dev" <canacar@imcan.dev>
When reading DW_FORM_addr, gdb sign-extends the value if the DWARF
address size is smaller than unrelocated_addr and the target
architecture "naturally" sign-extends addresses (bfd.c).
However, DW_FORM_addrx did not get the same handling. As a result,
for example, trying to `list` a function at an address >= 0x80000000
was broken on (some?) 32-bit MIPS targets when that address was
encoded using DW_FORM_addrx.
This patch fixes the issue by making read_addr_index call
unit_head::read_address, the function already used to read
DW_FORM_addr values, which handles this case correctly.
---
Changes since v1:
* Removed read_addr_index_1, merging its functionality into
read_addr_index.
* Changed dwarf2_read_addr_index to keep the cutu_reader alive (via
std::optional) while the newly created cu is in use, which seems safer.
* Fixed a formatting issue (missing space before "(") in the call to
read_address.
gdb/dwarf2/read.c | 44 ++++++++++++++++----------------------------
1 file changed, 16 insertions(+), 28 deletions(-)
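As a rough illustration of the lookup that read_addr_index performs, the sketch below indexes a table of fixed-size entries at addr_base + addr_index * addr_size and widens the result. It is not gdb code: read_addr_entry is a hypothetical name, the table is assumed to be little-endian (gdb reads through BFD, which handles the target's byte order), and the bounds check is simplified.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not gdb code: fetch entry INDEX from a
// little-endian .debug_addr-style table starting at ADDR_BASE.
// Returns std::nullopt if the entry does not fit inside SECTION.
std::optional<std::uint64_t>
read_addr_entry (const std::vector<std::uint8_t> &section,
                 std::uint64_t addr_base, std::uint64_t index,
                 unsigned addr_size, bool sign_extend)
{
  if (addr_size == 0 || addr_size > 8)
    return std::nullopt;

  // Bounds check written so that no intermediate product can overflow.
  if (addr_base > section.size ())
    return std::nullopt;
  std::uint64_t avail = section.size () - addr_base;
  if (index >= avail / addr_size)
    return std::nullopt;

  std::size_t offset = static_cast<std::size_t> (addr_base
                                                 + index * addr_size);

  // Assemble the value byte by byte, least significant byte first.
  std::uint64_t value = 0;
  for (unsigned i = 0; i < addr_size; i++)
    value |= static_cast<std::uint64_t> (section[offset + i]) << (8 * i);

  // Optionally propagate the top bit of the entry into the upper bits.
  if (sign_extend && addr_size < 8)
    {
      std::uint64_t sign_bit = std::uint64_t {1} << (8 * addr_size - 1);
      value = (value ^ sign_bit) - sign_bit;
    }
  return value;
}
```

The point of the patch is that the final widening step should be decided in one place; here that is the sign_extend flag, whereas in gdb the decision is delegated to unit_head::read_address.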
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 431b7c9ea2c..cca42722471 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -15352,14 +15352,20 @@ dwarf2_per_objfile::read_line_string (const gdb_byte *buf,
ADDR_BASE is the DW_AT_addr_base (DW_AT_GNU_addr_base) attribute or zero.
ADDR_SIZE is the size of addresses from the CU header. */
+/* Given index ADDR_INDEX in .debug_addr, fetch the value. */
+
static unrelocated_addr
-read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
- std::optional<ULONGEST> addr_base, int addr_size)
+read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
{
+ const gdb_byte *info_ptr;
+ struct dwarf2_per_objfile *per_objfile = cu->per_objfile;
+ std::optional<ULONGEST> addr_base = cu->addr_base;
+ struct unit_head *cu_header = &cu->header;
struct objfile *objfile = per_objfile->objfile;
bfd *abfd = objfile->obfd.get ();
- const gdb_byte *info_ptr;
ULONGEST addr_base_or_zero = addr_base.has_value () ? *addr_base : 0;
+ unsigned int ignore_bytes_read;
+ unsigned char addr_size = cu_header->addr_size;
per_objfile->per_bfd->addr.read (objfile);
if (per_objfile->per_bfd->addr.buffer == NULL)
@@ -15372,20 +15378,9 @@ read_addr_index_1 (dwarf2_per_objfile *per_objfile, unsigned int addr_index,
objfile_name (objfile));
info_ptr = (per_objfile->per_bfd->addr.buffer + addr_base_or_zero
+ addr_index * addr_size);
- if (addr_size == 4)
- return (unrelocated_addr) bfd_get_32 (abfd, info_ptr);
- else
- return (unrelocated_addr) bfd_get_64 (abfd, info_ptr);
+ return cu_header->read_address (abfd, info_ptr, &ignore_bytes_read);
}
-/* Given index ADDR_INDEX in .debug_addr, fetch the value. */
-
-static unrelocated_addr
-read_addr_index (struct dwarf2_cu *cu, unsigned int addr_index)
-{
- return read_addr_index_1 (cu->per_objfile, addr_index,
- cu->addr_base, cu->header.addr_size);
-}
/* Given a pointer to an leb128 value, fetch the value from .debug_addr. */
@@ -15405,11 +15400,10 @@ unrelocated_addr
dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
unsigned int addr_index)
{
+ std::optional<cutu_reader> reader;
struct dwarf2_cu *cu = per_objfile->get_cu (per_cu);
- std::optional<ULONGEST> addr_base;
- int addr_size;
- /* We need addr_base and addr_size.
+ /* read_addr_index requires some fields from cu.
If we don't have PER_CU->cu, we have to get it.
Nasty, but the alternative is storing the needed info in PER_CU,
which at this point doesn't seem justified: it's not clear how frequently
@@ -15425,20 +15419,14 @@ dwarf2_read_addr_index (dwarf2_per_cu *per_cu, dwarf2_per_objfile *per_objfile,
IWBN to use the aging mechanism to let us lazily later discard the CU.
For now we skip this optimization. */
- if (cu != NULL)
+ if (cu == NULL)
{
- addr_base = cu->addr_base;
- addr_size = cu->header.addr_size;
- }
- else
- {
- cutu_reader reader (*per_cu, *per_objfile, nullptr, nullptr, false,
+ reader.emplace (*per_cu, *per_objfile, nullptr, nullptr, false,
language_minimal);
- addr_base = reader.cu ()->addr_base;
- addr_size = reader.cu ()->header.addr_size;
+ cu = reader->cu ();
}
- return read_addr_index_1 (per_objfile, addr_index, addr_base, addr_size);
+ return read_addr_index (cu, addr_index);
}
/* Given a DW_FORM_GNU_str_index value STR_INDEX, fetch the string.
--
2.50.1 (Apple Git-155)