Mirror of the gdb-patches mailing list
From: Tom de Vries <tdevries@suse.de>
To: gdb-patches@sourceware.org
Subject: Re: [RFC 2/2] [gdb/symtab] Don't expand type units for breakpoints
Date: Wed, 8 Apr 2026 14:07:27 +0200
Message-ID: <e70b9693-f85c-4b1a-bf9e-5c85c06e0a00@suse.de>
In-Reply-To: <20260408110827.2249311-3-tdevries@suse.de>

On 4/8/26 1:08 PM, Tom de Vries wrote:
> Consider a cc1plus exec compiled with GCC and LTO [1].
> 
> If we try to set a breakpoint on do_rpo_vn:
> ...
> $ gdb -q -batch cc1plus \
>      -ex "b do_rpo_vn"
> Breakpoint 1 at 0xfd75f0: do_rpo_vn. (2 locations)
> ...
> we expand 435 units:
> ...
> $ gdb -q -batch cc1plus \
>      -ex "b do_rpo_vn" \
>      -ex "maint print statistics" \
>    | grep "Number of read units"
>    Number of read units: 435
> ...
> 
> [ When compiling with LTO, GCC generates debug info in two phases:
> - in the early phase, it generates a unit for each source file, containing the
>    types in the source file.  Let's call these type-only units.
> - in the late phase, it generates a so-called artificial unit for each
>    optimized code blob.  The artificial unit doesn't contain types, but
>    imports them from the type-only units.
> 
> The cc1plus exec contains 1150 units, 128 of which are artificial units:
> ...
> $ readelf -wi --dwarf-depth=1 cc1plus | grep -c "Compilation Unit @"
> 1150
> $ readelf -wi --dwarf-depth=1 cc1plus | grep -c "<artificial>"
> 128
> ... ]
> 
> Most of the expanded units are type-only units, which have no relevant
> information when we're looking for function symbols.
> 
> They're merely expanded because the artificial units import them.
> 
> So the question is: can we skip expanding the type-only units when looking for
> function symbols?
> 
> This patch contains a naive implementation of that approach, in process_queue.
> 
> Using it, we expand just 7 artificial units:
> ...
> $ gdb -q -batch cc1plus \
>      -ex "b do_rpo_vn" \
>      -ex "maint print statistics" \
>    | grep "Number of read units"
>    Number of read units: 7
> ...
> 
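
To make the effect of the skip concrete, here is a toy model of the
expansion queue.  It is only a sketch, not the code in process_queue:
Unit, has_addresses, imports and expand_count are hypothetical names,
with has_addresses standing in for per_cu->addresses_seen.  The model
also simplifies by not following the imports of a skipped unit.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

/* Hypothetical stand-in for a DWARF unit; not a GDB type.  */
struct Unit
{
  /* Plays the role of per_cu->addresses_seen.  */
  bool has_addresses;
  /* Indices of the units this unit imports.  */
  std::vector<std::size_t> imports;
};

/* Return the number of units expanded when starting at START and
   following imports.  If FUNCTIONS_ONLY, a queued unit without
   addresses is dropped instead of expanded, and its own imports are
   not followed (a simplification made for this sketch).  */

std::size_t
expand_count (const std::vector<Unit> &units, std::size_t start,
              bool functions_only)
{
  assert (start < units.size ());

  std::vector<bool> queued (units.size (), false);
  std::queue<std::size_t> worklist;
  std::size_t expanded = 0;

  worklist.push (start);
  queued[start] = true;

  while (!worklist.empty ())
    {
      std::size_t idx = worklist.front ();
      worklist.pop ();

      /* The skip: type-only units are dropped from the queue.  */
      if (functions_only && !units[idx].has_addresses)
        continue;

      ++expanded;
      for (std::size_t imp : units[idx].imports)
        if (!queued[imp])
          {
            queued[imp] = true;
            worklist.push (imp);
          }
    }

  return expanded;
}
```

For one artificial unit importing two type-only units, this model
expands all three units without the skip, and only the artificial unit
with it.
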
> But there's a problem with this implementation, which we can illustrate using
> DWZ:
> ...
> $ cat test.c
> struct s {
>    int x0, x1, x2, x3, x4, x5, x6, x7, x8, x9;
> };
> struct s var1;
> int main (void) {
>    return 0;
> }
> $ gcc -g test.c; rm -f ab.dwz; cp a.out b.out; dwz -m ab.dwz a.out b.out
> $ gdb -q -batch a.out -ex "b main" -ex "ptype struct s"
> Breakpoint 1 at 0x40111a: file test.c, line 6.
> No struct type named s.
> ...
> 
> First, we use DWZ to move struct s to a partial unit.
> 
> Then in the GDB session the following happens:
> - setting a breakpoint on main expands the test.c CU
> - the test.c CU has an import of the partial CU, but the partial CU is not
>    expanded, because it contains only types
> - printing struct s attempts to expand test.c CU, finds that it's already
>    expanded, and the partial CU containing struct s remains unexpanded.
> - gdb tries to find struct s in the expanded CUs, and struct s is not found.
> 
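
The sequence of events listed above can be reproduced in a small model.
This is a sketch under assumptions, not GDB code: Unit, Session, expand
and lookup_type are hypothetical names, and the single "expanded" set
plays the role of the per-CU "already has a symtab" check.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

/* Hypothetical stand-in for a DWARF unit; not a GDB type.  */
struct Unit
{
  bool has_addresses;              /* False for a type-only unit.  */
  std::vector<int> imports;        /* Indices of imported units.  */
  std::vector<std::string> types;  /* Names of types defined here.  */
};

struct Session
{
  std::vector<Unit> units;
  std::set<int> expanded;
  /* True models the behavior with the patch applied.  */
  bool skip_type_only;

  /* Expand unit IDX and, recursively, the units it imports.  */
  void expand (int idx, bool for_functions)
  {
    assert (idx >= 0 && idx < static_cast<int> (units.size ()));

    /* Already expanded: the imports are not revisited.  */
    if (expanded.count (idx) != 0)
      return;

    if (for_functions && skip_type_only && !units[idx].has_addresses)
      return;

    expanded.insert (idx);
    for (int imp : units[idx].imports)
      expand (imp, for_functions);
  }

  /* Look up type NAME, expanding unit CU first if needed.  */
  bool lookup_type (int cu, const std::string &name)
  {
    expand (cu, false);
    for (int idx : expanded)
      for (const std::string &type : units[idx].types)
        if (type == name)
          return true;
    return false;
  }
};
```

With unit 0 as the test.c CU importing unit 1 (the partial unit that
defines struct s), expanding unit 0 for functions and then looking up
"s" fails when skip_type_only is true, and succeeds when it is false.
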

I tried bypassing this problem for DWZ using:
...
@@ -3910,6 +3910,7 @@ process_queue (dwarf2_per_objfile *per_objfile, domain_search_flags domain)
        gdb_assert (!per_objfile->compunit_symtab_set_p (per_cu));

        bool skip = (domain == SEARCH_FUNCTION_DOMAIN
+		   && cu->header.unit_type != DW_UT_partial
  		   && !per_cu->addresses_seen);
        if (skip)
  	{
...
and that does fix the DWZ counter-example, but not the testsuite 
regressions mentioned below.  It would be nice, though, to have this work 
for partial units as well.
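
Spelled out as a stand-alone predicate, the condition in the hunk above
looks roughly like this.  The enums and the name should_skip are
hypothetical simplifications made for this sketch; the real code tests
domain_search_flags, cu->header.unit_type and per_cu->addresses_seen.

```cpp
#include <cassert>

/* Hypothetical, simplified stand-ins for DW_UT_* unit types and for
   domain_search_flags.  */
enum class UnitType { compile, partial, type };
enum class Domain { function, other };

/* Return true if a queued unit should be dropped rather than
   expanded: only for function searches, never for partial units, and
   only if no addresses were seen for the unit.  */

bool
should_skip (Domain domain, UnitType unit_type, bool addresses_seen)
{
  return (domain == Domain::function
          && unit_type != UnitType::partial
          && !addresses_seen);
}
```
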

> RFC: what is the best way to fix this problem?
> 
> I wondered about expanding function compunit_symtab_set_p et al with a
> domain_search_flags parameter, giving us a way to express that a CU is
> expanded for SEARCH_FUNCTION_DOMAIN, but not for other domains.
> 
> But there may be a better way to achieve this.
> 
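
One possible shape for the per-domain idea quoted above, as a sketch
only.  ExpansionState, mark_expanded, expanded_p and the flag constants
are hypothetical names, loosely modeled on domain_search_flags and
compunit_symtab_set_p; this is not how GDB stores that state today.

```cpp
#include <cassert>
#include <unordered_map>

/* Hypothetical bit flags, loosely modeled on domain_search_flags.  */
using DomainFlags = unsigned int;
constexpr DomainFlags FUNCTION_DOMAIN = 1u << 0;
constexpr DomainFlags TYPE_DOMAIN = 1u << 1;

/* Tracks, per unit, the set of domains the unit was expanded for.  */
class ExpansionState
{
public:
  /* Record that unit UNIT_ID has been expanded for DOMAINS.  */
  void mark_expanded (int unit_id, DomainFlags domains)
  {
    assert (domains != 0);
    m_expanded[unit_id] |= domains;
  }

  /* Return true if UNIT_ID has been expanded for every domain in
     DOMAINS.  */
  bool expanded_p (int unit_id, DomainFlags domains) const
  {
    auto it = m_expanded.find (unit_id);
    return (it != m_expanded.end ()
            && (it->second & domains) == domains);
  }

private:
  std::unordered_map<int, DomainFlags> m_expanded;
};
```

With something like this, a CU expanded for "b main" would report
expanded_p as true for FUNCTION_DOMAIN only, so a later "ptype struct s"
would still trigger expansion for the type domain.
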
> [ FWIW, I've tested this on x86_64-linux, and ran into a number of FAILs:
> ...
>        9 gdb.cp/cpexprs-debug-types.exp:
>        1 gdb.dlang/circular.exp:
>        2 gdb.dwarf2/ada-cold-name.exp:
>        4 gdb.dwarf2/ada-linkage-name.exp:
>       14 gdb.dwarf2/dw2-bad-abstract-origin.exp:
>        1 gdb.dwarf2/dw2-inline-bt.exp:
>        1 gdb.dwarf2/dw2-prologue-end-2.exp:
>        1 gdb.dwarf2/dw2-ranges-psym.exp:
>        7 gdb.dwarf2/dw2-step-between-different-inline-functions.exp:
>        6 gdb.dwarf2/dw2-step-between-inline-func-blocks.exp:
>       60 gdb.dwarf2/dw2-unexpected-entry-pc.exp:
>        1 gdb.dwarf2/inlined_subroutine-inheritance.exp:
>        1 gdb.dwarf2/struct-decl.exp:
> ... ]
>

I investigated the last failure (gdb.dwarf2/struct-decl.exp).  The 
problem is that addresses_seen is false in this CU:
...
  <0><2dc>: Abbrev Number: 1 (DW_TAG_compile_unit)
     <2dd>   DW_AT_language    : 4       (C++)
     <2de>   DW_AT_name        : main.c
     <2e5>   DW_AT_comp_dir    : /tmp
  <1><2ea>: Abbrev Number: 2 (DW_TAG_structure_type)
     <2eb>   DW_AT_byte_size   : 8
     <2ec>   DW_AT_encoding    : 5       (signed)
     <2ed>   DW_AT_name        : the_type
     <2f6>   DW_AT_declaration : 1
  <2><2f6>: Abbrev Number: 3 (DW_TAG_subprogram)
     <2f7>   DW_AT_name        : method
     <2fe>   DW_AT_declaration : 1
  <2><2fe>: Abbrev Number: 0
  <1><2ff>: Abbrev Number: 4 (DW_TAG_subprogram)
     <300>   DW_AT_specification: <0x2f6>
     <304>   DW_AT_low_pc      : 0x401116
     <30c>   DW_AT_high_pc     : 0x401121
  <1><314>: Abbrev Number: 0
...
because the top-level DIE does not contain a PC range, despite the CU 
containing a subprogram with a PC range.
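
The difference between the two views of this CU can be shown with a
small model.  Die and both functions are hypothetical names made up for
this sketch; it only contrasts "the top-level DIE has a PC range" with
"some DIE in the unit has a PC range", and does not claim to be how
addresses_seen is computed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

/* Hypothetical, heavily simplified DIE; not a GDB type.  */
struct Die
{
  bool has_pc_range;           /* DW_AT_low_pc/DW_AT_high_pc present.  */
  std::uint64_t low_pc;
  std::uint64_t high_pc;
  std::vector<Die> children;
};

/* Return true if DIE itself carries a PC range.  */

bool
top_level_has_pc_range (const Die &die)
{
  assert (!die.has_pc_range || die.low_pc <= die.high_pc);
  return die.has_pc_range;
}

/* Return true if DIE or any DIE nested below it carries a PC range.  */

bool
any_die_has_pc_range (const Die &die)
{
  if (top_level_has_pc_range (die))
    return true;

  for (const Die &child : die.children)
    if (any_die_has_pc_range (child))
      return true;

  return false;
}
```

For the CU above, the first check is false (the DW_TAG_compile_unit DIE
has no range), while the second is true because of the subprogram at
0x401116-0x401121.
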

Thanks,
- Tom

> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33922
> ---
>   gdb/dwarf2/read.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index afd0b60006e..475dc03f0b0 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -1945,7 +1945,8 @@ dwarf2_base_index_functions::search_one
>   
>     compunit_symtab *symtab
>       = dw2_instantiate_symtab (per_cu, per_objfile, false, domain);
> -  gdb_assert (symtab != nullptr);
> +  if (symtab == nullptr)
> +    return true;
>   
>     if (listener != nullptr)
>       {
> @@ -3908,6 +3909,15 @@ process_queue (dwarf2_per_objfile *per_objfile, domain_search_flags domain)
>   
>         gdb_assert (!per_objfile->compunit_symtab_set_p (per_cu));
>   
> +      bool skip = (domain == SEARCH_FUNCTION_DOMAIN
> +		   && !per_cu->addresses_seen);
> +      if (skip)
> +	{
> +	  cu->queued = false;
> +	  per_objfile->queue->pop ();
> +	  continue;
> +	}
> +
>         namespace chr = std::chrono;
>   
>         unsigned int debug_print_threshold;



Thread overview:
2026-04-08 11:08 [RFC 0/2] " Tom de Vries
2026-04-08 11:08 ` [RFC 1/2] [gdb/symtab] Propagate search domain to process_queue Tom de Vries
2026-04-08 11:08 ` [RFC 2/2] [gdb/symtab] Don't expand type units for breakpoints Tom de Vries
2026-04-08 12:07   ` Tom de Vries [this message]