From: Andrew Burgess <aburgess@redhat.com>
To: Lancelot SIX <lancelot.six@amd.com>
Cc: gdb-patches@sourceware.org, Tom Tromey <tom@tromey.com>
Subject: Re: [PATCHv2 2/3] gdb: make structured core file mappings processing global
Date: Mon, 13 Oct 2025 15:37:34 +0100 [thread overview]
Message-ID: <87jz0yc1g1.fsf@redhat.com> (raw)
In-Reply-To: <azpbye6xzjodiz33sp3oirjyjxxv266vxmb7eyaauuxspwgica@6omdz5klkmla>
Lancelot SIX <lancelot.six@amd.com> writes:
> Hi Andrew,
>
> I think there is an issue with this patch. When running with this and
> address sanitizer, I end-up with use-after-free errors.
>
> See below for what I think is the issue.
>
> On Tue, Sep 23, 2025 at 02:44:07PM +0100, Andrew Burgess wrote:
>> In corelow.c, within core_target::build_file_mappings, we have code
>> that wraps around a call to gdbarch_read_core_file_mappings and
>> provides more structure to the results.
>>
>> Specifically, gdbarch_read_core_file_mappings calls a callback once
>> for every region of every mapped file. The wrapper code groups all of
>> the mappings for one file into an instance of 'struct mapped_file',
>> this allows all of the mapped regions to be associated with the
>> build-id and filename of a file.
>>
>> In the next commit I plan to make this information available via the
>> Python API, and so I need to allow access to this structured wrapping
>> outside of corelow.c.
>>
>> This commit renames 'struct mapped_file' to 'struct core_mapped_file'
>> and moves the struct into gdbcore.h. Then a new global function
>> gdb_read_core_file_mappings is created into which I move the code to
>> build the structured data.
>>
>> Then corelow.c is updated to call gdb_read_core_file_mappings.
>>
>> This commit does not extend the Python API, that is for the next
>> commit.
>>
>> There should be no user visible changes after this commit.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32844
>>
>> Approved-By: Tom Tromey <tom@tromey.com>
>> ---
>> gdb/corelow.c | 218 +++++++++++++++++++++++++++-----------------------
>> gdb/gdbcore.h | 43 ++++++++++
>> 2 files changed, 160 insertions(+), 101 deletions(-)
>>
>> diff --git a/gdb/corelow.c b/gdb/corelow.c
>> index 2f202dc1fbf..ee97e29556e 100644
>> --- a/gdb/corelow.c
>> +++ b/gdb/corelow.c
>> @@ -366,108 +366,27 @@ core_target::core_target ()
>> void
>> core_target::build_file_mappings ()
>> {
>> - /* Type holding information about a single file mapped into the inferior
>> - at the point when the core file was created. Associates a build-id
>> - with the list of regions the file is mapped into. */
>> - struct mapped_file
>> - {
>> - /* Type for a region of a file that was mapped into the inferior when
>> - the core file was generated. */
>> - struct region
>> - {
>> - /* Constructor. See member variables for argument descriptions. */
>> - region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
>> - : start (start_),
>> - end (end_),
>> - file_ofs (file_ofs_)
>> - { /* Nothing. */ }
>> -
>> - /* The inferior address for the start of the mapped region. */
>> - CORE_ADDR start;
>> -
>> - /* The inferior address immediately after the mapped region. */
>> - CORE_ADDR end;
>> -
>> - /* The offset within the mapped file for this content. */
>> - CORE_ADDR file_ofs;
>> - };
>> -
>> - /* If not nullptr, then this is the build-id associated with this
>> - file. */
>> - const bfd_build_id *build_id = nullptr;
>> -
>> - /* If true then we have seen multiple different build-ids associated
>> - with the same filename. The build_id field will have been set back
>> - to nullptr, and we should not set build_id in future. */
>> - bool ignore_build_id_p = false;
>> -
>> - /* All the mapped regions of this file. */
>> - std::vector<region> regions;
>> - };
>> -
>> gdb::unordered_map<std::string, struct bfd *> bfd_map;
>> gdb::unordered_set<std::string> unavailable_paths;
>>
>> /* All files mapped into the core file. The key is the filename. */
>> - gdb::unordered_map<std::string, mapped_file> mapped_files;
>> + std::vector<core_mapped_file> mapped_files
>> + = gdb_read_core_file_mappings (m_core_gdbarch,
>> + current_program_space->core_bfd ());
>>
>> - /* See linux_read_core_file_mappings() in linux-tdep.c for an example
>> - read_core_file_mappings method. */
>> - gdbarch_read_core_file_mappings (m_core_gdbarch,
>> - current_program_space->core_bfd (),
>> -
>> - /* After determining the number of mappings, read_core_file_mappings
>> - will invoke this lambda. */
>> - [&] (ULONGEST)
>> - {
>> - },
>> -
>> - /* read_core_file_mappings will invoke this lambda for each mapping
>> - that it finds. */
>> - [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
>> - const char *filename, const bfd_build_id *build_id)
>> - {
>> - /* Architecture-specific read_core_mapping methods are expected to
>> - weed out non-file-backed mappings. */
>> - gdb_assert (filename != nullptr);
>> -
>> - /* Add this mapped region to the data for FILENAME. */
>> - mapped_file &file_data = mapped_files[filename];
>> - file_data.regions.emplace_back (start, end, file_ofs);
>> - if (build_id != nullptr && !file_data.ignore_build_id_p)
>> - {
>> - if (file_data.build_id == nullptr)
>> - file_data.build_id = build_id;
>> - else if (!build_id_equal (build_id, file_data.build_id))
>> - {
>> - warning (_("Multiple build-ids found for %ps"),
>> - styled_string (file_name_style.style (), filename));
>> - file_data.build_id = nullptr;
>> - file_data.ignore_build_id_p = true;
>> - }
>> - }
>> - });
>> -
>> - /* Get the build-id of the core file. */
>> - const bfd_build_id *core_build_id
>> - = build_id_bfd_get (current_program_space->core_bfd ());
>> -
>> - for (const auto &[filename, file_data] : mapped_files)
>> + for (const core_mapped_file &file_data : mapped_files)
>> {
>> - /* If this mapped file has the same build-id as was discovered for
>> - the core-file itself, then we assume this is the main
>> - executable. Record the filename as we can use this later. */
>> - if (file_data.build_id != nullptr
>> - && m_expected_exec_filename.empty ()
>> - && build_id_equal (file_data.build_id, core_build_id))
>> - m_expected_exec_filename = filename;
>> + /* If this mapped file is marked as the main executable then record
>> + the filename as we can use this later. */
>> + if (file_data.is_main_exec && m_expected_exec_filename.empty ())
>> + m_expected_exec_filename = file_data.filename;
>>
>> /* Use exec_file_find() to do sysroot expansion. It'll
>> also strip the potential sysroot "target:" prefix. If
>> there is no sysroot, an equivalent (possibly more
>> canonical) pathname will be provided. */
>> gdb::unique_xmalloc_ptr<char> expanded_fname
>> - = exec_file_find (filename.c_str (), nullptr);
>> + = exec_file_find (file_data.filename.c_str (), nullptr);
>>
>> bool build_id_mismatch = false;
>> if (expanded_fname != nullptr && file_data.build_id != nullptr)
>> @@ -509,7 +428,7 @@ core_target::build_file_mappings ()
>> {
>> abfd = find_objfile_by_build_id (current_program_space,
>> file_data.build_id,
>> - filename.c_str ());
>> + file_data.filename.c_str ());
>>
>> if (abfd != nullptr)
>> {
>> @@ -527,7 +446,7 @@ core_target::build_file_mappings ()
>> }
>>
>> std::vector<mem_range> ranges;
>> - for (const mapped_file::region ®ion : file_data.regions)
>> + for (const core_mapped_file::region ®ion : file_data.regions)
>> ranges.emplace_back (region.start, region.end - region.start);
>>
>> if (expanded_fname == nullptr
>> @@ -545,7 +464,7 @@ core_target::build_file_mappings ()
>> bool content_is_in_core_file_p = true;
>>
>> /* Record all regions for this file as unavailable. */
>> - for (const mapped_file::region ®ion : file_data.regions)
>> + for (const core_mapped_file::region ®ion : file_data.regions)
>> {
>> /* Check to see if the region is available within the core
>> file. */
>> @@ -577,33 +496,33 @@ core_target::build_file_mappings ()
>> if (build_id_mismatch)
>> {
>> if (expanded_fname == nullptr
>> - || filename == expanded_fname.get ())
>> + || file_data.filename == expanded_fname.get ())
>> warning (_("File %ps doesn't match build-id from core-file "
>> "during file-backed mapping processing"),
>> styled_string (file_name_style.style (),
>> - filename.c_str ()));
>> + file_data.filename.c_str ()));
>> else
>> warning (_("File %ps which was expanded to %ps, doesn't match "
>> "build-id from core-file during file-backed "
>> "mapping processing"),
>> styled_string (file_name_style.style (),
>> - filename.c_str ()),
>> + file_data.filename.c_str ()),
>> styled_string (file_name_style.style (),
>> expanded_fname.get ()));
>> }
>> else if (!content_is_in_core_file_p)
>> {
>> if (expanded_fname == nullptr
>> - || filename == expanded_fname.get ())
>> + || file_data.filename == expanded_fname.get ())
>> warning (_("Can't open file %ps during file-backed mapping "
>> "note processing"),
>> styled_string (file_name_style.style (),
>> - filename.c_str ()));
>> + file_data.filename.c_str ()));
>> else
>> warning (_("Can't open file %ps which was expanded to %ps "
>> "during file-backed mapping note processing"),
>> styled_string (file_name_style.style (),
>> - filename.c_str ()),
>> + file_data.filename.c_str ()),
>> styled_string (file_name_style.style (),
>> expanded_fname.get ()));
>> }
>> @@ -617,7 +536,7 @@ core_target::build_file_mappings ()
>> abfd.get ());
>>
>> /* Create sections for each mapped region. */
>> - for (const mapped_file::region ®ion : file_data.regions)
>> + for (const core_mapped_file::region ®ion : file_data.regions)
>> {
>> /* Make new BFD section. All sections have the same name,
>> which is permitted by bfd_make_section_anyway(). */
>> @@ -653,7 +572,7 @@ core_target::build_file_mappings ()
>> soname = gdb_bfd_read_elf_soname (actual_filename);
>> }
>>
>> - m_mapped_file_info.add (soname.get (), filename.c_str (),
>> + m_mapped_file_info.add (soname.get (), file_data.filename.c_str (),
>> actual_filename, std::move (ranges),
>> file_data.build_id);
>> }
>> @@ -2163,6 +2082,103 @@ mapped_file_info::lookup (const char *filename,
>>
>> /* See gdbcore.h. */
>>
>> +std::vector<core_mapped_file>
>> +gdb_read_core_file_mappings (struct gdbarch *gdbarch, struct bfd *cbfd)
>> +{
>> + std::vector<core_mapped_file> results;
>> +
>> + /* A map entry used while building RESULTS. */
>> + struct map_entry
>> + {
>> + explicit map_entry (core_mapped_file *ptr)
>> + : file_data (ptr)
>> + { /* Nothing. */ }
>> +
>> + /* Points to an entry in RESULTS, this allows entries to be quickly
>> + looked up and updated as new mappings are read. */
>> + core_mapped_file *file_data = nullptr;
>> +
>> + /* If true then we have seen multiple different build-ids associated
>> + with the filename of FILE_DATA. The FILE_DATA->build_id field will
>> + have been set to nullptr, and we should not set FILE_DATA->build_id
>> + in future. */
>> + bool ignore_build_id_p = false;
>> + };
>> +
>> + /* All files mapped into the core file. The key is the filename. */
>> + gdb::unordered_map<std::string, map_entry> mapped_files;
>> +
>> + /* Get the build-id of the core file. At least on Linux, this will be
>> + the build-id for the main executable. If other targets add the
>> + gdbarch_read_core_file_mappings method, then it might turn out that
>> + this logic is no longer true, in which case this might need to move
>> + into the gdbarch_read_core_file_mappings method. */
>> + const bfd_build_id *core_build_id = build_id_bfd_get (cbfd);
>> +
>> + /* See linux_read_core_file_mappings() in linux-tdep.c for an example
>> + read_core_file_mappings method. */
>> + gdbarch_read_core_file_mappings (gdbarch, cbfd,
>> + /* After determining the number of mappings, read_core_file_mappings
>> + will invoke this lambda. */
>> + [&] (ULONGEST)
>> + {
>> + },
>> +
>> + /* read_core_file_mappings will invoke this lambda for each mapping
>> + that it finds. */
>> + [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
>> + const char *filename, const bfd_build_id *build_id)
>> + {
>> + /* Architecture-specific read_core_mapping methods are expected to
>> + weed out non-file-backed mappings. */
>> + gdb_assert (filename != nullptr);
>> +
>> + /* Add this mapped region to the data for FILENAME. */
>> + auto iter = mapped_files.find (filename);
>> + if (iter == mapped_files.end ())
>> + {
>> + /* Create entry in results list. */
>> + results.emplace_back ();
>> +
>> + /* The entry to be added to the lookup map. */
>> + map_entry entry (&results.back ());
>
> This bit is (I think) invalid. You are taking the address of an element
> in the vector to place it in the map_entry cache (mapped_files).
> However, taking the address of the element of a vector in this case is
> wrong. As you grow the vector (results.emplace_back), the backing
> storage might be grown, and as a consequence values moved and existing
> pointers/references/iterators are invalidated.
>
> Using a list could be a drop-in replacement, or storing the index in the
> vector rather than the element address would fix the issue.
>
> I can submit a patch doing either of those, depending on what you
> prefer.
>
Thanks for tracking this down.
If you have a patch or want to create the patch, then feel free to go
with whatever approach you feel is best.
But I don't want to dump my mess on you. I'm happy to fix this up.
Just let me know.
Thanks,
Andrew
next prev parent reply other threads:[~2025-10-13 14:38 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-02 16:03 [PATCH 0/3] Core file Python API Andrew Burgess
2025-09-02 16:03 ` [PATCH 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-09-02 16:26 ` Eli Zaretskii
2025-09-16 17:25 ` Tom Tromey
2025-09-23 13:50 ` Andrew Burgess
2025-09-02 16:03 ` [PATCH 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-09-16 17:28 ` Tom Tromey
2025-09-02 16:03 ` [PATCH 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-09-16 17:54 ` Tom Tromey
2025-09-23 13:52 ` Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 0/3] Core file Python API Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-10-03 18:56 ` Tom Tromey
2025-10-06 8:54 ` Andrew Burgess
2025-10-06 15:39 ` Tom Tromey
2025-10-06 16:13 ` Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-10-13 13:57 ` Lancelot SIX
2025-10-13 14:37 ` Andrew Burgess [this message]
2025-10-13 15:16 ` Six, Lancelot
2025-10-14 9:12 ` Lancelot SIX
2025-09-23 13:44 ` [PATCHv2 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-10-03 19:15 ` Tom Tromey
2025-10-07 6:24 ` Tom de Vries
2025-10-07 12:21 ` Andrew Burgess
2025-10-07 13:08 ` Tom de Vries
2025-10-07 13:26 ` Andrew Burgess
2025-10-07 14:38 ` Andrew Burgess
2025-10-07 15:43 ` Tom de Vries
2025-10-07 16:28 ` Andrew Burgess
2025-10-08 9:29 ` Andrew Burgess
2025-10-08 10:36 ` Tom de Vries
2025-10-08 14:14 ` Andrew Burgess
2025-10-08 15:43 ` Tom de Vries
2025-10-08 16:03 ` Andrew Burgess
2025-10-16 20:00 ` Tom Tromey
2025-10-17 10:02 ` Andrew Burgess
2025-10-17 13:32 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87jz0yc1g1.fsf@redhat.com \
--to=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=lancelot.six@amd.com \
--cc=tom@tromey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox