From: Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb/dwarf: fix reading subprogram with DW_AT_specification (PR gdb/26693)
Date: Tue, 20 Oct 2020 11:13:58 -0400 [thread overview]
Message-ID: <4613bd56-8d18-0c60-40dd-878f2a6765be@polymtl.ca> (raw)
In-Reply-To: <20201013141849.24640-1-simon.marchi@polymtl.ca>
Ping. This is the last blocker for the release.
Simon
On 2020-10-13 10:18 a.m., Simon Marchi wrote:
> Fix a regression introduced by commit 7188ed02d2a7 ("Replace
> dwarf2_per_cu_data::cu backlink with per-objfile map").
>
> This patch targets both master and gdb-10-branch, since this is a
> regression from GDB 9.
>
> Analysis
> --------
>
> The DWARF generated by the included test case looks like:
>
> 0x0000000b: DW_TAG_compile_unit
> DW_AT_language [DW_FORM_sdata] (4)
>
> 0x0000000d: DW_TAG_base_type
> DW_AT_name [DW_FORM_string] ("int")
> DW_AT_byte_size [DW_FORM_data1] (0x04)
> DW_AT_encoding [DW_FORM_sdata] (5)
>
> 0x00000014: DW_TAG_subprogram
> DW_AT_name [DW_FORM_string] ("apply")
>
> 0x0000001b: DW_TAG_subprogram
> DW_AT_specification [DW_FORM_ref4] (0x00000014 "apply")
> DW_AT_low_pc [DW_FORM_addr] (0x0000000000001234)
> DW_AT_high_pc [DW_FORM_data8] (0x0000000000000020)
>
> 0x00000030: DW_TAG_template_type_parameter
> DW_AT_name [DW_FORM_string] ("T")
> DW_AT_type [DW_FORM_ref4] (0x0000000d "int")
>
> 0x00000037: NULL
>
> 0x00000038: NULL
>
> Simply loading the file in GDB makes it crash:
>
> $ ./gdb -nx --data-directory=data-directory testsuite/outputs/gdb.dwarf2/pr26693/pr26693
> [1] 15188 abort (core dumped) ./gdb -nx --data-directory=data-directory
>
> The crash happens here, where htab (a dwarf2_cu::die_hash field) is
> unexpectedly NULL while generating partial symbols:
>
> #0 0x000055555fa28188 in htab_find_with_hash (htab=0x0, element=0x7fffffffbfa0, hash=27) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:591
> #1 0x000055555cb4eb2e in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22951
> #2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
> #3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
> #4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
> #5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
> #6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
> #7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
> #8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
> #9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
>
> The special thing about this DWARF is that the subprogram at 0x1b is a
> template specialization described with DW_AT_specification, and has no
> DW_AT_name in itself. To compute the name of this subprogram,
> partial_die_full_name needs to load the full DIE for this partial DIE.
> The name is generated from the templated function name and the actual
> tempalate parameter values of the specialization.
>
> To load the full DIE, partial_die_full_name creates a dummy DWARF
> attribute of form DW_FORM_ref_addr that points to our subprogram's DIE,
> and calls follow_die_ref on it. This eventually causes
> load_full_comp_unit to be called for the exact same CU we are currently
> making partial symbols for:
>
> #0 load_full_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, skip_partial=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9238
> #1 0x000055555cb4e943 in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22942
> #2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
> #3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
> #4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
> #5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
> #6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
> #7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
> #8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
> #9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
>
> load_full_comp_unit creates a cutu_reader for the CU. Since a dwarf2_cu
> object already exists for the CU, load_full_comp_unit is expected to
> find it and pass it to cutu_reader, so that cutu_reader doesn't create
> a new dwarf2_cu for the CU.
>
> And this is the difference between before and after the regression.
> Before commit 7188ed02d2a7, the dwarf2_per_cu_data -> dwarf2_cu link was
> a simple pointer in dwarf2_per_cu_data. This pointer was set up when
> starting the read the partial symbols. So it was already available at
> that point where load_full_comp_unit gets called. Post-7188ed02d2a7,
> this link is per-objfile, kept in the dwarf2_per_objfile::m_dwarf2_cus
> hash map. The entry is only put in the hash map once the partial
> symbols have been successfully read, when cutu_reader::keep is called.
> Therefore, it is _not_ set at the point load_full_comp_unit is called.
>
> As a consequence, a new dwarf2_cu object gets created and initialized by
> load_full_comp_unit (including initializing that dwarf2_cu::die_hash
> field). Meanwhile, the dwarf2_cu object created and used by the callers
> up the stack does not get initialized for full symbol reading, and the
> dwarf2_cu::die_hash field stays unexpectedly NULL.
>
> Solution
> --------
>
> Since the caller of load_full_comp_unit knows about the existing
> dwarf2_cu object for the CU we are reading (the one load_full_comp_unit
> is expected to find), we can simply make it pass it down, instead of
> having load_full_comp_unit look up the per-objfile map.
>
> load_full_comp_unit therefore gets a new `existing_cu` parameter. All
> other callers get updated to pass `per_objfile->get_cu (per_cu)`, so the
> behavior shouldn't change for them, compared to the current HEAD.
>
> A test is added, which is the bare minimum to reproduce the issue.
>
> Notes
> -----
>
> The original problem was reproduced by downloading
>
> https://github.com/oneapi-src/oneTBB/releases/download/v2020.3/tbb-2020.3-lin.tgz
>
> and loading libtbb.so in GDB. This code was compiled with the Intel
> C/C++ compiler. I was not able to reproduce the issue using GCC, I
> think because GCC puts a DW_AT_name in the specialized subprogram, so
> there's no need for partial_die_full_name to load the full DIE of the
> subprogram, and the faulty code doesn't execute.
>
> gdb/ChangeLog:
>
> PR gdb/26693
> * dwarf2/read.c (load_full_comp_unit): Add existing_cu
> parameter.
> (load_cu): Pass existing CU.
> (process_imported_unit_die): Likewise.
> (follow_die_offset): Likewise.
>
> gdb/testsuite/ChangeLog:
>
> PR gdb/26693
> * gdb.dwarf2/template-specification-full-name.c: New test.
> * gdb.dwarf2/template-specification-full-name.exp: New test.
>
> Change-Id: I57c8042f96c45f15797a3848e4d384181c56bb44
> ---
> gdb/dwarf2/read.c | 15 ++--
> .../template-specification-full-name.c | 22 ++++++
> .../template-specification-full-name.exp | 75 +++++++++++++++++++
> 3 files changed, 107 insertions(+), 5 deletions(-)
> create mode 100644 gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
> create mode 100644 gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
>
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 2ec3789135d6..a8b48374200c 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -1606,6 +1606,7 @@ static int create_all_type_units (dwarf2_per_objfile *per_objfile);
>
> static void load_full_comp_unit (dwarf2_per_cu_data *per_cu,
> dwarf2_per_objfile *per_objfile,
> + dwarf2_cu *existing_cu,
> bool skip_partial,
> enum language pretend_language);
>
> @@ -2385,7 +2386,8 @@ load_cu (dwarf2_per_cu_data *per_cu, dwarf2_per_objfile *per_objfile,
> if (per_cu->is_debug_types)
> load_full_type_unit (per_cu, per_objfile);
> else
> - load_full_comp_unit (per_cu, per_objfile, skip_partial, language_minimal);
> + load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
> + skip_partial, language_minimal);
>
> dwarf2_cu *cu = per_objfile->get_cu (per_cu);
> if (cu == nullptr)
> @@ -9230,12 +9232,12 @@ die_eq (const void *item_lhs, const void *item_rhs)
> static void
> load_full_comp_unit (dwarf2_per_cu_data *this_cu,
> dwarf2_per_objfile *per_objfile,
> + dwarf2_cu *existing_cu,
> bool skip_partial,
> enum language pretend_language)
> {
> gdb_assert (! this_cu->is_debug_types);
>
> - dwarf2_cu *existing_cu = per_objfile->get_cu (this_cu);
> cutu_reader reader (this_cu, per_objfile, NULL, existing_cu, skip_partial);
> if (reader.dummy_p)
> return;
> @@ -10101,7 +10103,8 @@ process_imported_unit_die (struct die_info *die, struct dwarf2_cu *cu)
>
> /* If necessary, add it to the queue and load its DIEs. */
> if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
> - load_full_comp_unit (per_cu, per_objfile, false, cu->language);
> + load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
> + false, cu->language);
>
> cu->per_cu->imported_symtabs_push (per_cu);
> }
> @@ -22931,7 +22934,8 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
>
> /* If necessary, add it to the queue and load its DIEs. */
> if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
> - load_full_comp_unit (per_cu, per_objfile, false, cu->language);
> + load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
> + false, cu->language);
>
> target_cu = per_objfile->get_cu (per_cu);
> }
> @@ -22939,7 +22943,8 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
> {
> /* We're loading full DIEs during partial symbol reading. */
> gdb_assert (per_objfile->per_bfd->reading_partial_symbols);
> - load_full_comp_unit (cu->per_cu, per_objfile, false, language_minimal);
> + load_full_comp_unit (cu->per_cu, per_objfile, cu, false,
> + language_minimal);
> }
>
> *ref_cu = target_cu;
> diff --git a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
> new file mode 100644
> index 000000000000..9d7b2f1a4c28
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
> @@ -0,0 +1,22 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2020 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +int
> +main (void)
> +{
> + return 0;
> +}
> diff --git a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
> new file mode 100644
> index 000000000000..87b3604e0760
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
> @@ -0,0 +1,75 @@
> +# Copyright 2020 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test to reproduce the crash described in PR 26693. The following DWARF was
> +# crashing GDB while loading partial symbols: a DW_TAG_subprogram with a
> +# DW_AT_specification (pointing to another subprogram), without a DW_AT_name
> +# and with a template parameter (DW_TAG_template_type_parameter). More
> +# precisely, the crash was happening when trying to compute the full name of the
> +# subprogram.
> +
> +load_lib dwarf.exp
> +
> +if {![dwarf2_support]} {
> + return 0
> +}
> +
> +standard_testfile .c .S
> +
> +set asm_file [standard_output_file $srcfile2]
> +Dwarf::assemble $asm_file {
> + cu {} {
> + DW_TAG_compile_unit {
> + {DW_AT_language @DW_LANG_C_plus_plus}
> + } {
> + declare_labels templated_subprogram int
> +
> + int: DW_TAG_base_type {
> + {DW_AT_name "int"}
> + {DW_AT_byte_size 4 DW_FORM_data1}
> + {DW_AT_encoding @DW_ATE_signed}
> + }
> +
> + # The templated subprogram.
> + templated_subprogram: DW_TAG_subprogram {
> + {DW_AT_name "apply"}
> + }
> +
> + # The template specialization (the low and high PC are phony).
> + DW_TAG_subprogram {
> + {DW_AT_specification :$templated_subprogram}
> + {DW_AT_low_pc 0x1234 DW_FORM_addr}
> + {DW_AT_high_pc 0x20 DW_FORM_data8}
> + } {
> + DW_TAG_template_type_param {
> + {DW_AT_name "T"}
> + {DW_AT_type :$int DW_FORM_ref4}
> + }
> + }
> + }
> + }
> +}
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} \
> + [list $srcfile $asm_file] {nodebug}] } {
> + return -1
> +}
> +
> +if ![runto_main] {
> + return -1
> +}
> +
> +# Just a sanity check to make sure GDB slurped the symbols correctly.
> +gdb_test "print apply<int>" " = {void \\(void\\)} $hex <apply<int>\\(\\)>"
> --
> 2.28.0
>
next prev parent reply other threads:[~2020-10-20 15:14 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-13 14:18 Simon Marchi via Gdb-patches
2020-10-20 15:13 ` Simon Marchi via Gdb-patches [this message]
2020-10-20 15:55 ` Strasuns, Mihails via Gdb-patches
2020-10-20 16:29 ` Tom de Vries
2020-10-20 16:48 ` Simon Marchi via Gdb-patches
2020-10-20 16:50 ` [PATCH v2] " Simon Marchi via Gdb-patches
2020-10-21 20:42 ` Tom Tromey
2020-10-22 2:37 ` Simon Marchi via Gdb-patches
2020-10-22 14:18 ` Tom Tromey
2020-10-22 14:51 ` Simon Marchi via Gdb-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4613bd56-8d18-0c60-40dd-878f2a6765be@polymtl.ca \
--to=gdb-patches@sourceware.org \
--cc=simon.marchi@polymtl.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox