* [PATCH 1/5] gdb/dwarf: merge one CU/TU code path
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
@ 2026-01-17 6:02 ` simon.marchi
2026-01-17 6:02 ` [PATCH 2/5] gdb/dwarf: move DWP htab nullptr check to lookup_dwo_unit_in_dwp simon.marchi
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: simon.marchi @ 2026-01-17 6:02 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>
Simplify the code by merging the paths for CUs and TUs. It is also not
necessary to check if the maps are empty.
Change-Id: I43e2cf4bb1217a72dda8a03957ed733f87bf57b2
---
gdb/dwarf2/read.c | 31 ++++++++++---------------------
1 file changed, 10 insertions(+), 21 deletions(-)
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 98cc1aa5b941..331f1b9c5261 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7560,30 +7560,19 @@ cutu_reader::lookup_dwo_cutu (dwarf2_cu *cu, const char *dwo_name,
dwo_file = add_dwo_file (per_bfd, std::move (new_dwo_file));
}
- if (dwo_file != NULL)
+ if (dwo_file != nullptr)
{
- struct dwo_unit *dwo_cutu = NULL;
+ dwo_unit_set &dwo_units
+ = is_debug_types ? dwo_file->tus : dwo_file->cus;
- if (is_debug_types && !dwo_file->tus.empty ())
+ if (auto dwo_unit_it = dwo_units.find (signature);
+ dwo_unit_it != dwo_units.end ())
{
- if (auto it = dwo_file->tus.find (signature);
- it != dwo_file->tus.end ())
- dwo_cutu = it->get ();
- }
- else if (!is_debug_types && !dwo_file->cus.empty ())
- {
- if (auto it = dwo_file->cus.find (signature);
- it != dwo_file->cus.end ())
- dwo_cutu = it->get ();
- }
-
- if (dwo_cutu != NULL)
- {
- dwarf_read_debug_printf ("DWO %s %s(%s) found: @%s",
- kind, dwo_name, hex_string (signature),
- host_address_to_string (dwo_cutu));
-
- return dwo_cutu;
+ dwarf_read_debug_printf
+ ("DWO %s %s(%s) found: @%s", kind, dwo_name,
+ hex_string (signature),
+ host_address_to_string (dwo_unit_it->get ()));
+ return dwo_unit_it->get ();
}
}
}
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 2/5] gdb/dwarf: move DWP htab nullptr check to lookup_dwo_unit_in_dwp
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
2026-01-17 6:02 ` [PATCH 1/5] gdb/dwarf: merge one CU/TU code path simon.marchi
@ 2026-01-17 6:02 ` simon.marchi
2026-01-17 6:02 ` [PATCH 3/5] gdb/dwarf: add comments to debug_names::build simon.marchi
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: simon.marchi @ 2026-01-17 6:02 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>
cutu_reader::lookup_dwo_cutu gets hold of the right htab for the kind of
unit to look up (CU or TU), but then just uses the pointer to know if it
should call lookup_dwo_unit_in_dwp or not. lookup_dwo_unit_in_dwp then
uses the same condition to obtain the correct htab. I think it would
make more sense to return early from lookup_dwo_unit_in_dwp instead, if
the required htab does not exist.
Change-Id: I6f8f96aba2452f8261b3c60667e72fb21b97c89d
---
gdb/dwarf2/read.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 331f1b9c5261..1b728764f711 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -6967,8 +6967,12 @@ lookup_dwo_unit_in_dwp (dwarf2_per_bfd *per_bfd,
struct dwp_file *dwp_file, const char *comp_dir,
ULONGEST signature, int is_debug_types)
{
- const struct dwp_hash_table *dwp_htab =
- is_debug_types ? dwp_file->tus : dwp_file->cus;
+ const dwp_hash_table *dwp_htab
+ = is_debug_types ? dwp_file->tus : dwp_file->cus;
+
+ if (dwp_htab == nullptr)
+ return nullptr;
+
bfd *dbfd = dwp_file->dbfd.get ();
uint32_t mask = dwp_htab->nr_slots - 1;
uint32_t hash = signature & mask;
@@ -7522,25 +7526,19 @@ cutu_reader::lookup_dwo_cutu (dwarf2_cu *cu, const char *dwo_name,
dwp_file *dwp_file = per_objfile->per_bfd->dwp_file.get ();
- if (dwp_file != NULL)
+ if (dwp_file != nullptr)
{
- const struct dwp_hash_table *dwp_htab =
- is_debug_types ? dwp_file->tus : dwp_file->cus;
+ dwo_unit *dwo_cutu
+ = lookup_dwo_unit_in_dwp (per_bfd, dwp_file, comp_dir, signature,
+ is_debug_types);
- if (dwp_htab != NULL)
+ if (dwo_cutu != nullptr)
{
- struct dwo_unit *dwo_cutu =
- lookup_dwo_unit_in_dwp (per_bfd, dwp_file, comp_dir, signature,
- is_debug_types);
+ dwarf_read_debug_printf ("Virtual DWO %s %s found: @%s",
+ kind, hex_string (signature),
+ host_address_to_string (dwo_cutu));
- if (dwo_cutu != NULL)
- {
- dwarf_read_debug_printf ("Virtual DWO %s %s found: @%s",
- kind, hex_string (signature),
- host_address_to_string (dwo_cutu));
-
- return dwo_cutu;
- }
+ return dwo_cutu;
}
}
else
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 3/5] gdb/dwarf: add comments to debug_names::build
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
2026-01-17 6:02 ` [PATCH 1/5] gdb/dwarf: merge one CU/TU code path simon.marchi
2026-01-17 6:02 ` [PATCH 2/5] gdb/dwarf: move DWP htab nullptr check to lookup_dwo_unit_in_dwp simon.marchi
@ 2026-01-17 6:02 ` simon.marchi
2026-01-17 6:02 ` [PATCH 4/5] gdb/dwarf: rename some abbrev-related things in debug_names writer simon.marchi
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: simon.marchi @ 2026-01-17 6:02 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>
These help me follow what is going on, when following with the spec on
the side.
Change-Id: I0da0047eefd233e8f7635118d67622f07c29ce01
---
gdb/dwarf2/index-write.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index d31e7b631b86..39910cd3f71c 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -728,33 +728,49 @@ class debug_names
if (idx == 0)
{
idx = next_abbrev++;
+
+ /* Abbrev number and tag. */
m_abbrev_table.append_unsigned_leb128 (idx);
m_abbrev_table.append_unsigned_leb128 (entry->tag);
+
+ /* Unit index. */
m_abbrev_table.append_unsigned_leb128
(kind == unit_kind::cu
? DW_IDX_compile_unit
: DW_IDX_type_unit);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_udata);
+
+ /* DIE offset. */
m_abbrev_table.append_unsigned_leb128 (DW_IDX_die_offset);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_ref_addr);
+
+ /* Language. */
m_abbrev_table.append_unsigned_leb128 (DW_IDX_GNU_language);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_udata);
+
+ /* Internal linkage flag. */
if (!tag_is_type (entry->tag)
&& (entry->flags & IS_STATIC) != 0)
{
m_abbrev_table.append_unsigned_leb128 (DW_IDX_GNU_internal);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_flag_present);
}
+
+ /* Main subprogram flag. */
if ((entry->flags & IS_MAIN) != 0)
{
m_abbrev_table.append_unsigned_leb128 (DW_IDX_GNU_main);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_flag_present);
}
+
+ /* Linkage name flag. */
if ((entry->flags & IS_LINKAGE) != 0)
{
m_abbrev_table.append_unsigned_leb128 (DW_IDX_GNU_linkage_name);
m_abbrev_table.append_unsigned_leb128 (DW_FORM_flag_present);
}
+
+ /* Parent offset. */
if (parent != nullptr)
{
m_abbrev_table.append_unsigned_leb128 (DW_IDX_parent);
@@ -773,19 +789,23 @@ class debug_names
.second);
gdb_assert (offset_inserted);
- /* Write the entry to the pool. */
+ /* Write the entry to the pool, starting with the abbrev number. */
m_entry_pool.append_unsigned_leb128 (idx);
+ /* Unit index. */
const auto it = m_cu_index_htab.find (entry->per_cu);
gdb_assert (it != m_cu_index_htab.cend ());
m_entry_pool.append_unsigned_leb128 (it->second);
+ /* DIE offset. */
m_entry_pool.append_uint (dwarf5_offset_size (),
m_dwarf5_byte_order,
to_underlying (entry->die_offset));
+ /* Language. */
m_entry_pool.append_unsigned_leb128 (entry->per_cu->dw_lang ());
+ /* Parent offset. */
if (parent != nullptr)
{
m_offsets_to_patch.emplace_back (m_entry_pool.size (), parent);
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 4/5] gdb/dwarf: rename some abbrev-related things in debug_names writer
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
` (2 preceding siblings ...)
2026-01-17 6:02 ` [PATCH 3/5] gdb/dwarf: add comments to debug_names::build simon.marchi
@ 2026-01-17 6:02 ` simon.marchi
2026-01-17 6:02 ` [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer simon.marchi
2026-01-20 17:00 ` [PATCH 0/5] Some semi-random DWARF cleanups Tom Tromey
5 siblings, 0 replies; 11+ messages in thread
From: simon.marchi @ 2026-01-17 6:02 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>
I think it would be clearer for some of these things to be called
"abbrev", rather than "idx".
Change-Id: Ic128a77dc7ce14a6179c049a2de21f3f9636405d
---
gdb/dwarf2/index-write.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 39910cd3f71c..c4bd424d4340 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -720,17 +720,17 @@ class debug_names
if (parent != nullptr && (parent->flags & IS_SYNTHESIZED) != 0)
parent = nullptr;
- int &idx = m_indexkey_to_idx[index_key (entry->tag,
- kind,
- entry->flags,
- entry->lang,
- parent != nullptr)];
- if (idx == 0)
+ int &abbrev = m_indexkey_to_abbrev[index_key (entry->tag,
+ kind,
+ entry->flags,
+ entry->lang,
+ parent != nullptr)];
+ if (abbrev == 0)
{
- idx = next_abbrev++;
+ abbrev = next_abbrev++;
/* Abbrev number and tag. */
- m_abbrev_table.append_unsigned_leb128 (idx);
+ m_abbrev_table.append_unsigned_leb128 (abbrev);
m_abbrev_table.append_unsigned_leb128 (entry->tag);
/* Unit index. */
@@ -790,7 +790,7 @@ class debug_names
gdb_assert (offset_inserted);
/* Write the entry to the pool, starting with the abbrev number. */
- m_entry_pool.append_unsigned_leb128 (idx);
+ m_entry_pool.append_unsigned_leb128 (abbrev);
/* Unit index. */
const auto it = m_cu_index_htab.find (entry->per_cu);
@@ -1107,8 +1107,8 @@ class debug_names
debug_str_lookup m_debugstrlookup;
/* Map each used .debug_names abbreviation tag parameter to its
- index value. */
- gdb::unordered_map<index_key, int, index_key_hasher> m_indexkey_to_idx;
+ abbrev value. */
+ gdb::unordered_map<index_key, int, index_key_hasher> m_indexkey_to_abbrev;
/* .debug_names abbreviation table. */
data_buf m_abbrev_table;
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
` (3 preceding siblings ...)
2026-01-17 6:02 ` [PATCH 4/5] gdb/dwarf: rename some abbrev-related things in debug_names writer simon.marchi
@ 2026-01-17 6:02 ` simon.marchi
2026-01-20 17:00 ` Tom Tromey
2026-01-20 17:00 ` [PATCH 0/5] Some semi-random DWARF cleanups Tom Tromey
5 siblings, 1 reply; 11+ messages in thread
From: simon.marchi @ 2026-01-17 6:02 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>
I think it makes the code more readable than the pair of vector. Also,
I'm considering adding a third list (foreigh type units), which will be
easier with the structure.
Change-Id: I38ec4ddf8f786a2ba10c5b371cfe04c2baaa7da9
---
gdb/dwarf2/index-write.c | 34 +++++++++++++++++++---------------
1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index c4bd424d4340..ffbd3777b4c7 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1314,21 +1314,25 @@ write_shortcuts_table (cooked_index *table, data_buf &shortcuts,
shortcuts.append_offset (main_name_offset);
}
+struct unit_lists
+{
+ std::vector<const dwarf2_per_cu *> comp;
+ std::vector<const signatured_type *> type;
+};
+
/* Get sorted (by section offset) lists of comp units and type units. */
-static std::pair<std::vector<const dwarf2_per_cu *>,
- std::vector<const signatured_type *>>
+static unit_lists
get_unit_lists (const dwarf2_per_bfd &per_bfd)
{
- std::vector<const dwarf2_per_cu *> comp_units;
- std::vector<const signatured_type *> type_units;
+ unit_lists lists;
for (const auto &unit : per_bfd.all_units)
if (const signatured_type *sig_type = unit->as_signatured_type ();
sig_type != nullptr)
- type_units.emplace_back (sig_type);
+ lists.type.emplace_back (sig_type);
else
- comp_units.emplace_back (unit.get ());
+ lists.comp.emplace_back (unit.get ());
auto by_sect_off = [] (const dwarf2_per_cu *lhs, const dwarf2_per_cu *rhs)
{ return lhs->sect_off () < rhs->sect_off (); };
@@ -1342,10 +1346,10 @@ get_unit_lists (const dwarf2_per_bfd &per_bfd)
However, it helps make sure that GDB produce a stable and predictable
output, which is nice. */
- std::sort (comp_units.begin (), comp_units.end (), by_sect_off);
- std::sort (type_units.begin (), type_units.end (), by_sect_off);
+ std::sort (lists.comp.begin (), lists.comp.end (), by_sect_off);
+ std::sort (lists.type.begin (), lists.type.end (), by_sect_off);
- return {std::move (comp_units), std::move (type_units)};
+ return lists;
}
/* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
@@ -1365,14 +1369,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
cu_index_map cu_index_htab;
cu_index_htab.reserve (per_bfd->all_units.size ());
- auto [comp_units, type_units] = get_unit_lists (*per_bfd);
+ unit_lists units = get_unit_lists (*per_bfd);
int counter = 0;
/* Write comp units. */
data_buf objfile_cu_list;
data_buf dwz_cu_list;
- for (const dwarf2_per_cu *per_cu : comp_units)
+ for (const dwarf2_per_cu *per_cu : units.comp)
{
const auto insertpair = cu_index_htab.emplace (per_cu, counter);
gdb_assert (insertpair.second);
@@ -1390,7 +1394,7 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
/* Write type units. */
data_buf types_cu_list;
- for (const signatured_type *sig_type : type_units)
+ for (const signatured_type *sig_type : units.type)
{
const auto insertpair = cu_index_htab.emplace (sig_type, counter);
gdb_assert (insertpair.second);
@@ -1449,12 +1453,12 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
const enum bfd_endian dwarf5_byte_order
= bfd_big_endian (per_bfd->obfd) ? BFD_ENDIAN_BIG : BFD_ENDIAN_LITTLE;
- auto [comp_units, type_units] = get_unit_lists (*per_bfd);
+ unit_lists units = get_unit_lists (*per_bfd);
debug_names nametable (per_bfd, dwarf5_is_dwarf64, dwarf5_byte_order);
data_buf comp_unit_list;
int comp_unit_counter = 0;
- for (const auto per_cu : comp_units)
+ for (const auto per_cu : units.comp)
{
nametable.add_cu (per_cu, comp_unit_counter);
comp_unit_list.append_uint (nametable.dwarf5_offset_size (),
@@ -1466,7 +1470,7 @@ write_debug_names (dwarf2_per_bfd *per_bfd, cooked_index *table,
data_buf type_unit_list;
int type_unit_counter = 0;
- for (const auto per_cu : type_units)
+ for (const auto per_cu : units.type)
{
nametable.add_cu (per_cu, type_unit_counter);
type_unit_list.append_uint (nametable.dwarf5_offset_size (),
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer
2026-01-17 6:02 ` [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer simon.marchi
@ 2026-01-20 17:00 ` Tom Tromey
2026-01-20 17:33 ` Simon Marchi
0 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2026-01-20 17:00 UTC (permalink / raw)
To: simon.marchi; +Cc: gdb-patches
>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
Simon> From: Simon Marchi <simon.marchi@polymtl.ca>
Simon> I think it makes the code more readable than the pair of vector. Also,
Simon> I'm considering adding a third list (foreigh type units), which will be
Typo, "foreign"
Simon> +struct unit_lists
Simon> +{
Simon> + std::vector<const dwarf2_per_cu *> comp;
Simon> + std::vector<const signatured_type *> type;
Simon> +};
I think it would be good to have comments here.
Tom
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer
2026-01-20 17:00 ` Tom Tromey
@ 2026-01-20 17:33 ` Simon Marchi
2026-01-20 20:42 ` Tom Tromey
0 siblings, 1 reply; 11+ messages in thread
From: Simon Marchi @ 2026-01-20 17:33 UTC (permalink / raw)
To: Tom Tromey; +Cc: gdb-patches
On 2026-01-20 12:00, Tom Tromey wrote:
>>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
>
> Simon> From: Simon Marchi <simon.marchi@polymtl.ca>
> Simon> I think it makes the code more readable than the pair of vector. Also,
> Simon> I'm considering adding a third list (foreigh type units), which will be
>
> Typo, "foreign"
>
>
> Simon> +struct unit_lists
> Simon> +{
> Simon> + std::vector<const dwarf2_per_cu *> comp;
> Simon> + std::vector<const signatured_type *> type;
> Simon> +};
>
> I think it would be good to have comments here.
>
> Tom
Is this good?
---
/* Lists of units, split by kind. Each list is sorted by section offset. */
struct unit_lists
{
/* Compilation units. */
std::vector<const dwarf2_per_cu *> comp;
/* Type units. */
std::vector<const signatured_type *> type;
};
---
Simon
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/5] Some semi-random DWARF cleanups
2026-01-17 6:02 [PATCH 0/5] Some semi-random DWARF cleanups simon.marchi
` (4 preceding siblings ...)
2026-01-17 6:02 ` [PATCH 5/5] gdb/dwarf: add unit_lists structure to index writer simon.marchi
@ 2026-01-20 17:00 ` Tom Tromey
5 siblings, 0 replies; 11+ messages in thread
From: Tom Tromey @ 2026-01-20 17:00 UTC (permalink / raw)
To: simon.marchi; +Cc: gdb-patches
>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
Simon> From: Simon Marchi <simon.marchi@polymtl.ca>
Simon> I am working on a patch to handle foreign TUs when generating the
Simon> .debug_names index. These are the cleanups I accumulated, that I think
Simon> make sense on their own.
Some nits on patch 5 but thanks, this is all ok by me.
Approved-By: Tom Tromey <tom@tromey.com>
Tom
^ permalink raw reply [flat|nested] 11+ messages in thread