From: simon.marchi@polymtl.ca
To: gdb-patches@sourceware.org
Cc: Simon Marchi <simon.marchi@efficios.com>
Subject: [PATCH 5/8] gdb/dwarf: generate foreign type units in .debug_names
Date: Mon, 16 Mar 2026 19:19:23 -0400
Message-ID: <20260316232042.368080-6-simon.marchi@polymtl.ca>
In-Reply-To: <20260316232042.368080-1-simon.marchi@polymtl.ca>
From: Simon Marchi <simon.marchi@efficios.com>
The DWARF 5 .debug_names index makes the distinction between local type
units and foreign type units. Local type units are those present
directly in the main file. Foreign type units are those present in .dwo
files. GDB only knows how to produce local type units today, which
leads to invalid indexes whenever there are type units in .dwo files.
To observe this, run test gdb.dwarf2/fission-with-type-unit.exp with
board cc-with-debug-names. The test passes, but we see this in the log:
(gdb) file
/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit^M
Reading symbols from
/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit...^M
warning: Section .debug_names has incorrect entry in TU table,
ignoring .debug_names.^M
These are the units involved in this test. The first and last CUs are
dummy CUs added by the DWARF assembler; they are not important.
- CU at offset 0x0 of .debug_info in fission-with-type-unit -- dummy
- CU at offset 0xc of .debug_info in fission-with-type-unit
- CU at offset 0x25 of .debug_info in fission-with-type-unit -- dummy
- TU at offset 0x0 of .debug_info.dwo in fission-with-type-unit-dw.dwo
This is the content of the produced .debug_names:
Contents of the .debug_names section:
$ readelf --debug-dump=gdb_index testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit
...
Version 5
Augmentation string: 47 44 42 33 00 00 00 00 ("GDB3")
CU table:
[ 0] 0
[ 1] 0xc
[ 2] 0x25
TU table:
[ 0] 0
Foreign TU table:
Used 0 of 0 buckets.
Symbol table:
[ 1] global_var: <0><1> DW_TAG_variable DW_IDX_compile_unit=(udata)=1 DW_IDX_die_offset=(ref_addr)=<0x1c> DW_IDX_GNU_language=(udata)=0 DW_IDX_GNU_internal=(flag_present)=1
[ 2] int:
<0x8><2> DW_TAG_base_type DW_IDX_compile_unit=(udata)=1 DW_IDX_die_offset=(ref_addr)=<0x15> DW_IDX_GNU_language=(udata)=0
<0xf><3> DW_TAG_base_type DW_IDX_type_unit=(udata)=0 DW_IDX_die_offset=(ref_addr)=<0x19> DW_IDX_GNU_language=(udata)=0
The TU table claims that there is a TU at offset 0 in the main file
(fission-with-type-unit). This is wrong: the TU is in
fission-with-type-unit-dw.dwo. It should be listed in the Foreign TU
table instead.
This patch therefore teaches GDB to use the foreign TU table for TUs in
.dwo files.
A note about foreign type units and skeletons: in the early history of
split DWARF and type units, gcc 4.x used to create skeletons for type
units in .dwo files, but subsequent versions don't. DWARF 5 doesn't
have support for type unit skeletons at all. So skeletons for type
units are mostly a historical curiosity at this point; the norm is to
not have them. But if for some reason a type unit in a .dwo file had a
matching skeleton in the main file, then it would be ok for that TU to
be listed in the "TU table". The offset would be that of the skeleton.
While the lists of CUs and local TUs contain the offset of each unit
within the .debug_info section, the foreign TU list only contains the
8-byte signature of each type. With just that, a reader wouldn't be
able to easily locate a .dwo file that contains the type with a given
signature.
To help with this, index entries for foreign type units may also include
a reference to a compilation unit that can be followed in order to find
a .dwo file containing the type. This patch implements it.
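As a rough illustration of why the foreign TU list alone is not enough
to locate a unit, here is a minimal sketch of how such a list could be
serialized. The names append_u64_le and build_foreign_tu_list are
hypothetical (GDB uses its own data_buf class), and little-endian byte
order is assumed for simplicity:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not GDB's data_buf): append VALUE to BUF as
// 8 little-endian bytes.
static void
append_u64_le (std::vector<std::uint8_t> &buf, std::uint64_t value)
{
  for (int i = 0; i < 8; ++i)
    buf.push_back (static_cast<std::uint8_t> (value >> (8 * i)));
}

// Build a foreign TU list: one 8-byte type signature per type unit.
// Note that no section offset or file name is recorded.
static std::vector<std::uint8_t>
build_foreign_tu_list (const std::vector<std::uint64_t> &signatures)
{
  std::vector<std::uint8_t> buf;

  for (std::uint64_t sig : signatures)
    append_u64_le (buf, sig);

  // Each entry occupies exactly 8 bytes.
  assert (buf.size () == signatures.size () * 8);
  return buf;
}
```

Nothing in the resulting bytes names a .dwo file, which is why the
extra CU reference in the index entries is useful.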
Implementation details
----------------------
The first change is the addition of the dwo_unit::per_cu field, which
allows going from the dwo_unit to the dwarf2_per_cu structure (which
describes the skeleton) that was used to look up this dwo_unit. This
field starts at nullptr, and it gets set in lookup_dwo_cutu whenever we
look up the dwo_unit for a given dwarf2_per_cu. This will come in handy
later. I made this field an std::atomic, because I think it would be
possible to craft a weird test case that would make two indexer threads
try to set the field on the same dwo_unit. During normal operation, we
expect the field for each dwo_unit representing a CU to be written
exactly once.
In dwarf2/index-write.c, change the get_unit_lists function to
segregate local and foreign type units. Then, update write_debug_names
to emit the list of foreign TUs in the .debug_names header. This
consists of a list of type signatures.
In debug_names::build, for foreign type units, emit a
DW_IDX_compile_unit field. This is the reference to the CU that can be
used to locate the .dwo file containing that type unit. To obtain the
value for this field, look up a CU in the same dwo_file that has its
dwo_unit::per_cu field set (typically there will be exactly one CU, and
the field will be set).
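The lookup described above amounts to a linear scan over the CUs of the
.dwo file. A minimal sketch, using hypothetical simplified types
(skeleton_cu, dwo_cu and this reduced dwo_file are illustration-only,
not GDB's actual structures):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the dwarf2_per_cu describing a skeleton in the main file.
struct skeleton_cu
{
  int cu_table_index;  // Position of the skeleton in the CU table.
};

// Stand-in for a dwo_unit that represents a CU.
struct dwo_cu
{
  // Skeleton used to reach this DWO CU, or nullptr if it was never
  // looked up.
  const skeleton_cu *per_cu = nullptr;
};

// Stand-in for a dwo_file.
struct dwo_file
{
  std::string name;
  std::vector<dwo_cu> cus;
};

// Return the skeleton of the first CU in FILE that was looked up, or
// nullptr if there is none.  The caller decides how to report failure.
static const skeleton_cu *
find_skeleton_for (const dwo_file &file)
{
  for (const dwo_cu &cu : file.cus)
    if (cu.per_cu != nullptr)
      {
	// A skeleton must have a valid position in the CU table.
	assert (cu.per_cu->cu_table_index >= 0);
	return cu.per_cu;
      }

  return nullptr;
}
```

The index of the returned skeleton is what would be written as the
DW_IDX_compile_unit value for every foreign TU in that .dwo file.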
With this patch, the index for the test case above looks like:
$ readelf --debug-dump=gdb_index testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit
...
Version 5
Augmentation string: 47 44 42 33 00 00 00 00 ("GDB3")
CU table:
[ 0] 0
[ 1] 0xc
[ 2] 0x25
TU table:
Foreign TU table:
[ 0] 000000000000cafe
Used 0 of 0 buckets.
Symbol table:
[ 1] global_var: <0><1> DW_TAG_variable DW_IDX_compile_unit=(udata)=1 DW_IDX_die_offset=(ref_addr)=<0x1c> DW_IDX_GNU_language=(udata)=0 DW_IDX_GNU_internal=(flag_present)=1
[ 2] int:
<0x8><2> DW_TAG_base_type DW_IDX_compile_unit=(udata)=1 DW_IDX_die_offset=(ref_addr)=<0x15> DW_IDX_GNU_language=(udata)=0
<0xf><3> DW_TAG_base_type DW_IDX_type_unit=(udata)=0 DW_IDX_compile_unit=(udata)=1 DW_IDX_die_offset=(ref_addr)=<0x19> DW_IDX_GNU_language=(udata)=0 ...
We can see that the TU is correctly placed in the foreign TU list, and
that the index entry (the last line) points to the TU at index 0, but
also to the CU at index 1, which is indeed the CU that the reader can
follow to find the type unit.
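Note that the DW_IDX_type_unit value of 0 refers to the foreign TU list
here only because the local TU list is empty: foreign type units are
numbered after local ones. A consumer could decode the value roughly
like this (a sketch with hypothetical names, not GDB's reader code):

```cpp
#include <cassert>
#include <cstddef>

enum class tu_kind { local, foreign };

// Result of decoding a DW_IDX_type_unit value.
struct tu_ref
{
  tu_kind kind;       // Which list the unit is in.
  std::size_t index;  // Index within that list.
};

// Map a DW_IDX_type_unit value IDX to a list and an index in that list.
// Foreign TUs are numbered following the local TUs.
static tu_ref
resolve_type_unit_index (std::size_t idx, std::size_t local_count,
			 std::size_t foreign_count)
{
  assert (idx < local_count + foreign_count);

  if (idx < local_count)
    return { tu_kind::local, idx };

  return { tu_kind::foreign, idx - local_count };
}
```

With the index shown above (zero local TUs, one foreign TU), a value of
0 resolves to entry 0 of the foreign TU list.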
With this patch, GDB still rejects the index:
(gdb) file /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit
Reading symbols from /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit...
warning: Section .debug_names in /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/fission-with-type-unit has unsupported 1 foreign TUs, ignoring .debug_names.
But at least we don't produce a bogus index anymore, which is already
an improvement. A following patch in this series implements the reading
side.
Change-Id: I311fd7b4ca57d9ff6d64ae08df805c6635961eed
---
gdb/dwarf2/index-write.c | 160 +++++++++++++++++++++++++++++++++------
gdb/dwarf2/read.c | 9 +++
gdb/dwarf2/read.h | 17 +++++
3 files changed, 162 insertions(+), 24 deletions(-)
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 0c8474b3e4ed..5edada5d16c3 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -633,6 +633,10 @@ write_address_map (const addrmap *addrmap, data_buf &addr_vec,
addrmap_index_data.previous_cu_index);
}
+/* Is this symbol a compile unit, local type unit (type unit in the main file)
+ or foreign type unit (type unit in a .dwo file)? */
+enum class unit_kind { cu, local_tu, foreign_tu };
+
/* Return true if TU is a foreign type unit, that is a type unit defined in a
.dwo file without a corresponding skeleton in the main file. */
@@ -645,6 +649,22 @@ is_foreign_tu (const signatured_type *tu)
return endswith (tu->section ()->get_name (), ".dwo");
}
+/* Determine the unit_kind for PER_CU. */
+
+static unit_kind
+classify_unit (const dwarf2_per_cu &per_cu)
+{
+ if (auto sig_type = per_cu.as_signatured_type (); sig_type != nullptr)
+ {
+ if (is_foreign_tu (sig_type))
+ return unit_kind::foreign_tu;
+ else
+ return unit_kind::local_tu;
+ }
+ else
+ return unit_kind::cu;
+}
+
/* DWARF-5 .debug_names builder. */
class debug_names
{
@@ -668,9 +688,6 @@ class debug_names
return dwarf5_is_dwarf64 ? 8 : 4;
}
- /* Is this symbol from DW_TAG_compile_unit or DW_TAG_type_unit? */
- enum class unit_kind { cu, tu };
-
/* Insert one symbol. */
void insert (const cooked_index_entry *entry)
{
@@ -723,9 +740,8 @@ class debug_names
for (const cooked_index_entry *entry : these_entries)
{
- unit_kind kind = (entry->per_cu->is_debug_types ()
- ? unit_kind::tu
- : unit_kind::cu);
+ unit_kind kind = classify_unit (*entry->per_cu);
+
/* Some Ada parentage is synthesized by the reader and so
must be ignored here. */
const cooked_index_entry *parent = entry->get_parent ();
@@ -752,6 +768,14 @@ class debug_names
: DW_IDX_type_unit);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_udata);
+ /* For foreign TUs only: the index of a CU that can be used to
+ locate the .dwo file containing that TU. */
+ if (kind == unit_kind::foreign_tu)
+ {
+ m_abbrev_table.append_unsigned_leb128 (DW_IDX_compile_unit);
+ m_abbrev_table.append_unsigned_leb128 (DW_FORM_udata);
+ }
+
/* DIE offset. */
m_abbrev_table.append_unsigned_leb128 (DW_IDX_die_offset);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_ref_addr);
@@ -809,6 +833,67 @@ class debug_names
gdb_assert (it != m_cu_index_htab.cend ());
m_entry_pool.append_unsigned_leb128 (it->second);
+ /* For foreign TUs only: the index of a CU that can be used to
+ locate the .dwo file containing that TU.
+
+ This code implements this snippet of the DWARF 5 standard, from
+ section 6.1.1.2, "Structure of the Name Index":
+
+ When an index entry refers to a foreign type unit, it may have
+ attributes for both CU and (foreign) TU. For such entries, the
+ CU attribute gives the consumer a reference to the CU that may
+ be used to locate a split DWARF object file that contains the
+ type unit
+
+ So, the idea here is to find a CU in the same .dwo file as the
+ type unit (there will typically be one) and write the index of
+ its skeleton. */
+ if (kind == unit_kind::foreign_tu)
+ {
+ /* If we know about this TU that is only listed in some .dwo
+ file, it means that we looked up the .dwo file from a CU
+ skeleton at some point during indexing, and recorded it in
+ ENTRY->PER_CU. */
+ gdb_assert (entry->per_cu != nullptr);
+
+ signatured_type *sig_type
+ = entry->per_cu->as_signatured_type ();
+
+ /* We know it's a TU, because it has been classified as
+ foreign_tu. */
+ gdb_assert (sig_type != nullptr);
+
+ /* We know it has a dwo_unit, because it's foreign. */
+ gdb_assert (sig_type->dwo_unit != nullptr);
+
+ /* Find a CU in the same .dwo file as the TU. */
+ dwarf2_per_cu *per_cu = nullptr;
+ for (auto &dwo_cu : sig_type->dwo_unit->dwo_file->cus)
+ if (dwo_cu->per_cu != nullptr)
+ {
+ per_cu = dwo_cu->per_cu;
+ break;
+ }
+
+ /* It would be really weird to not find a CU that we looked up
+ in this .dwo file. Otherwise, how would we know about this
+ foreign TU? But it might be possible to craft a bad faith
+ .dwo file that does this by having one TU with a skeleton
+ and one TU without a skeleton in the same .dwo file (i.e.
+ non-standard DWARF 5). So, just in case, we error out
+ gracefully. */
+ if (per_cu == nullptr)
+ error (_("Could not find a CU for foreign type unit in DWO "
+ "file %s"),
+ sig_type->dwo_unit->dwo_file->dwo_name.c_str ());
+
+ /* This is the CU that can be used to find the .dwo file
+ containing the type unit; write its index. */
+ const auto cu_it = m_cu_index_htab.find (per_cu);
+ gdb_assert (cu_it != m_cu_index_htab.cend ());
+ m_entry_pool.append_unsigned_leb128 (cu_it->second);
+ }
+
/* DIE offset. */
m_entry_pool.append_uint (dwarf5_offset_size (),
m_dwarf5_byte_order,
@@ -1333,8 +1418,9 @@ struct unit_lists
/* Compilation units. */
std::vector<const dwarf2_per_cu *> comp;
- /* Type units. */
- std::vector<const signatured_type *> type;
+ /* Type units, local and foreign. */
+ std::vector<const signatured_type *> local_type;
+ std::vector<const signatured_type *> foreign_type;
};
/* Get sorted (by section offset) lists of comp units and type units. */
@@ -1345,16 +1431,28 @@ get_unit_lists (const dwarf2_per_bfd &per_bfd)
unit_lists lists;
for (const auto &unit : per_bfd.all_units)
- if (const signatured_type *sig_type = unit->as_signatured_type ();
- sig_type != nullptr)
- lists.type.emplace_back (sig_type);
- else
- lists.comp.emplace_back (unit.get ());
+ switch (classify_unit (*unit))
+ {
+ case unit_kind::cu:
+ lists.comp.emplace_back (unit.get ());
+ break;
+
+ case unit_kind::local_tu:
+ lists.local_type.emplace_back (unit->as_signatured_type ());
+ break;
+
+ case unit_kind::foreign_tu:
+ lists.foreign_type.emplace_back (unit->as_signatured_type ());
+ break;
+ }
auto by_sect_off = [] (const dwarf2_per_cu *lhs, const dwarf2_per_cu *rhs)
{ return lhs->sect_off () < rhs->sect_off (); };
- /* Sort both lists, even though it is technically not always required:
+ auto by_sig = [] (const signatured_type *lhs, const signatured_type *rhs)
+ { return lhs->signature < rhs->signature; };
+
+ /* Sort the lists, even though it is technically not always required:
- while .gdb_index requires the CU list to be sorted, DWARF 5 doesn't
say anything about the order of CUs in .debug_names.
@@ -1364,7 +1462,8 @@ get_unit_lists (const dwarf2_per_bfd &per_bfd)
However, it helps make sure that GDB produce a stable and predictable
output, which is nice. */
std::sort (lists.comp.begin (), lists.comp.end (), by_sect_off);
- std::sort (lists.type.begin (), lists.type.end (), by_sect_off);
+ std::sort (lists.local_type.begin (), lists.local_type.end (), by_sect_off);
+ std::sort (lists.foreign_type.begin (), lists.foreign_type.end (), by_sig);
return lists;
}
@@ -1392,10 +1491,9 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
that DWARF 5's .debug_names does with "foreign type units". If the
executable has such skeletonless type units, refuse to produce an index,
instead of producing a bogus one. */
- for (const signatured_type *tu : units.type)
- if (is_foreign_tu (tu))
- error (_("Found foreign (skeletonless) type unit, unable to produce "
- ".gdb_index. Consider using .debug_names instead."));
+ if (!units.foreign_type.empty ())
+ error (_("Found foreign (skeletonless) type unit, unable to produce "
+ ".gdb_index. Consider using .debug_names instead."));
int counter = 0;
@@ -1421,7 +1519,7 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
/* Write type units. */
data_buf types_cu_list;
- for (const signatured_type *sig_type : units.type)
+ for (const signatured_type *sig_type : units.local_type)
{
const auto insertpair = cu_index_htab.emplace (sig_type, counter);
gdb_assert (insertpair.second);
@@ -1494,7 +1592,7 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
data_buf type_unit_list;
- for (auto [i, per_cu] : gdb::enumerate (units.type))
+ for (auto [i, per_cu] : gdb::enumerate (units.local_type))
{
nametable.add_cu (per_cu, i);
type_unit_list.append_uint (nametable.dwarf5_offset_size (),
@@ -1502,9 +1600,21 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
to_underlying (per_cu->sect_off ()));
}
+ data_buf foreign_type_unit_list;
+
+ for (auto [i, sig_type] : gdb::enumerate (units.foreign_type))
+ {
+ /* The numbering of foreign type units follows the numbering of local type
+ units. */
+ nametable.add_cu (sig_type, units.local_type.size () + i);
+ foreign_type_unit_list.append_uint (8, dwarf5_byte_order,
+ sig_type->signature);
+ }
+
/* Verify that all units are represented. */
gdb_assert (units.comp.size () == per_bfd->num_comp_units);
- gdb_assert (units.type.size () == per_bfd->num_type_units);
+ gdb_assert (units.local_type.size () + units.foreign_type.size ()
+ == per_bfd->num_type_units);
for (const cooked_index_entry *entry : table->all_entries ())
nametable.insert (entry);
@@ -1521,6 +1631,7 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
expected_bytes += bytes_of_header;
expected_bytes += comp_unit_list.size ();
expected_bytes += type_unit_list.size ();
+ expected_bytes += foreign_type_unit_list.size ();
expected_bytes += nametable.bytes ();
data_buf header;
@@ -1547,11 +1658,11 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
/* local_type_unit_count - The number of TUs in the local TU
list. */
- header.append_uint (4, dwarf5_byte_order, units.type.size ());
+ header.append_uint (4, dwarf5_byte_order, units.local_type.size ());
/* foreign_type_unit_count - The number of TUs in the foreign TU
list. */
- header.append_uint (4, dwarf5_byte_order, 0);
+ header.append_uint (4, dwarf5_byte_order, units.foreign_type.size ());
/* bucket_count - The number of hash buckets in the hash lookup
table. GDB does not use the hash table, so there's also no need
@@ -1577,6 +1688,7 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
header.file_write (out_file);
comp_unit_list.file_write (out_file);
type_unit_list.file_write (out_file);
+ foreign_type_unit_list.file_write (out_file);
nametable.file_write (out_file, out_file_str);
assert_file_size (out_file, expected_bytes);
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index fd1c37ad8e89..af7ef8d8f3db 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7232,6 +7232,15 @@ cutu_reader::lookup_dwo_cutu (dwarf2_cu *cu, const char *dwo_name,
("DWO %s %s(%s) found: @%s", kind, dwo_name,
hex_string (signature),
host_address_to_string (dwo_unit_it->get ()));
+
+ /* Record the dwarf2_per_cu that was used to look up this
+ dwo_unit. There will typically be exactly one skeleton
+ pointing to each DWO CU. But if for some mysterious reason
+ there are multiple skeletons pointing to the same DWO CU, it's
+ fine, we just need to remember one. This will keep the last
+ one seen. */
+ (*dwo_unit_it)->per_cu = cu->per_cu;
+
return dwo_unit_it->get ();
}
}
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 5c61e91870b4..1687bc52e432 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -259,6 +259,7 @@ struct dwarf2_per_cu
/* If this dwarf2_per_cu is a signatured_type, return "this" cast to
signatured_type. Otherwise, return nullptr. */
signatured_type *as_signatured_type ();
+ const signatured_type *as_signatured_type () const;
dwarf2_per_bfd *per_bfd () const
{ return m_per_bfd; }
@@ -455,6 +456,17 @@ dwarf2_per_cu::as_signatured_type ()
return nullptr;
}
+/* See dwarf2_per_cu declaration. */
+
+inline const signatured_type *
+dwarf2_per_cu::as_signatured_type () const
+{
+ if (m_is_debug_types)
+ return static_cast<const signatured_type *> (this);
+
+ return nullptr;
+}
+
/* Hash a signatured_type object based on its signature. */
struct signatured_type_hash
@@ -496,6 +508,11 @@ struct dwo_unit
/* Backlink to the containing struct dwo_file. */
struct dwo_file *dwo_file = nullptr;
+ /* For compile units only, the per-CU structure that represents the skeleton
+ that we used to reach this dwo_unit. It starts as nullptr, and gets set
+ in cutu_reader::lookup_dwo_cutu. */
+ std::atomic<dwarf2_per_cu *> per_cu = nullptr;
+
/* The "id" that distinguishes this CU/TU.
.debug_info calls this "dwo_id", .debug_types calls this "signature".
Since signatures came first, we stick with it for consistency. */
--
2.53.0