From: Lancelot SIX <lancelot.six@amd.com>
To: Andrew Burgess <aburgess@redhat.com>
Cc: <gdb-patches@sourceware.org>, Tom Tromey <tom@tromey.com>
Subject: Re: [PATCHv2 2/3] gdb: make structured core file mappings processing global
Date: Mon, 13 Oct 2025 14:57:48 +0100 [thread overview]
Message-ID: <azpbye6xzjodiz33sp3oirjyjxxv266vxmb7eyaauuxspwgica@6omdz5klkmla> (raw)
In-Reply-To: <b269dfdee405a946074bf1756634c20558e343d8.1758634958.git.aburgess@redhat.com>
Hi Andrew,
I think there is an issue with this patch. When running with this and
address sanitizer, I end-up with use-after-free errors.
See below for what I think is the issue.
On Tue, Sep 23, 2025 at 02:44:07PM +0100, Andrew Burgess wrote:
> In corelow.c, within core_target::build_file_mappings, we have code
> that wraps around a call to gdbarch_read_core_file_mappings and
> provides more structure to the results.
>
> Specifically, gdbarch_read_core_file_mappings calls a callback once
> for every region of every mapped file. The wrapper code groups all of
> the mappings for one file into an instance of 'struct mapped_file',
> this allows all of the mapped regions to be associated with the
> build-id and filename of a file.
>
> In the next commit I plan to make this information available via the
> Python API, and so I need to allow access to this structured wrapping
> outside of corelow.c.
>
> This commit renames 'struct mapped_file' to 'struct core_mapped_file'
> and moves the struct into gdbcore.h. Then a new global function
> gdb_read_core_file_mappings is created into which I move the code to
> build the structured data.
>
> Then corelow.c is updated to call gdb_read_core_file_mappings.
>
> This commit does not extend the Python API, that is for the next
> commit.
>
> There should be no user visible changes after this commit.
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32844
>
> Approved-By: Tom Tromey <tom@tromey.com>
> ---
> gdb/corelow.c | 218 +++++++++++++++++++++++++++-----------------------
> gdb/gdbcore.h | 43 ++++++++++
> 2 files changed, 160 insertions(+), 101 deletions(-)
>
> diff --git a/gdb/corelow.c b/gdb/corelow.c
> index 2f202dc1fbf..ee97e29556e 100644
> --- a/gdb/corelow.c
> +++ b/gdb/corelow.c
> @@ -366,108 +366,27 @@ core_target::core_target ()
> void
> core_target::build_file_mappings ()
> {
> - /* Type holding information about a single file mapped into the inferior
> - at the point when the core file was created. Associates a build-id
> - with the list of regions the file is mapped into. */
> - struct mapped_file
> - {
> - /* Type for a region of a file that was mapped into the inferior when
> - the core file was generated. */
> - struct region
> - {
> - /* Constructor. See member variables for argument descriptions. */
> - region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
> - : start (start_),
> - end (end_),
> - file_ofs (file_ofs_)
> - { /* Nothing. */ }
> -
> - /* The inferior address for the start of the mapped region. */
> - CORE_ADDR start;
> -
> - /* The inferior address immediately after the mapped region. */
> - CORE_ADDR end;
> -
> - /* The offset within the mapped file for this content. */
> - CORE_ADDR file_ofs;
> - };
> -
> - /* If not nullptr, then this is the build-id associated with this
> - file. */
> - const bfd_build_id *build_id = nullptr;
> -
> - /* If true then we have seen multiple different build-ids associated
> - with the same filename. The build_id field will have been set back
> - to nullptr, and we should not set build_id in future. */
> - bool ignore_build_id_p = false;
> -
> - /* All the mapped regions of this file. */
> - std::vector<region> regions;
> - };
> -
> gdb::unordered_map<std::string, struct bfd *> bfd_map;
> gdb::unordered_set<std::string> unavailable_paths;
>
> /* All files mapped into the core file. The key is the filename. */
> - gdb::unordered_map<std::string, mapped_file> mapped_files;
> + std::vector<core_mapped_file> mapped_files
> + = gdb_read_core_file_mappings (m_core_gdbarch,
> + current_program_space->core_bfd ());
>
> - /* See linux_read_core_file_mappings() in linux-tdep.c for an example
> - read_core_file_mappings method. */
> - gdbarch_read_core_file_mappings (m_core_gdbarch,
> - current_program_space->core_bfd (),
> -
> - /* After determining the number of mappings, read_core_file_mappings
> - will invoke this lambda. */
> - [&] (ULONGEST)
> - {
> - },
> -
> - /* read_core_file_mappings will invoke this lambda for each mapping
> - that it finds. */
> - [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
> - const char *filename, const bfd_build_id *build_id)
> - {
> - /* Architecture-specific read_core_mapping methods are expected to
> - weed out non-file-backed mappings. */
> - gdb_assert (filename != nullptr);
> -
> - /* Add this mapped region to the data for FILENAME. */
> - mapped_file &file_data = mapped_files[filename];
> - file_data.regions.emplace_back (start, end, file_ofs);
> - if (build_id != nullptr && !file_data.ignore_build_id_p)
> - {
> - if (file_data.build_id == nullptr)
> - file_data.build_id = build_id;
> - else if (!build_id_equal (build_id, file_data.build_id))
> - {
> - warning (_("Multiple build-ids found for %ps"),
> - styled_string (file_name_style.style (), filename));
> - file_data.build_id = nullptr;
> - file_data.ignore_build_id_p = true;
> - }
> - }
> - });
> -
> - /* Get the build-id of the core file. */
> - const bfd_build_id *core_build_id
> - = build_id_bfd_get (current_program_space->core_bfd ());
> -
> - for (const auto &[filename, file_data] : mapped_files)
> + for (const core_mapped_file &file_data : mapped_files)
> {
> - /* If this mapped file has the same build-id as was discovered for
> - the core-file itself, then we assume this is the main
> - executable. Record the filename as we can use this later. */
> - if (file_data.build_id != nullptr
> - && m_expected_exec_filename.empty ()
> - && build_id_equal (file_data.build_id, core_build_id))
> - m_expected_exec_filename = filename;
> + /* If this mapped file is marked as the main executable then record
> + the filename as we can use this later. */
> + if (file_data.is_main_exec && m_expected_exec_filename.empty ())
> + m_expected_exec_filename = file_data.filename;
>
> /* Use exec_file_find() to do sysroot expansion. It'll
> also strip the potential sysroot "target:" prefix. If
> there is no sysroot, an equivalent (possibly more
> canonical) pathname will be provided. */
> gdb::unique_xmalloc_ptr<char> expanded_fname
> - = exec_file_find (filename.c_str (), nullptr);
> + = exec_file_find (file_data.filename.c_str (), nullptr);
>
> bool build_id_mismatch = false;
> if (expanded_fname != nullptr && file_data.build_id != nullptr)
> @@ -509,7 +428,7 @@ core_target::build_file_mappings ()
> {
> abfd = find_objfile_by_build_id (current_program_space,
> file_data.build_id,
> - filename.c_str ());
> + file_data.filename.c_str ());
>
> if (abfd != nullptr)
> {
> @@ -527,7 +446,7 @@ core_target::build_file_mappings ()
> }
>
> std::vector<mem_range> ranges;
> - for (const mapped_file::region ®ion : file_data.regions)
> + for (const core_mapped_file::region ®ion : file_data.regions)
> ranges.emplace_back (region.start, region.end - region.start);
>
> if (expanded_fname == nullptr
> @@ -545,7 +464,7 @@ core_target::build_file_mappings ()
> bool content_is_in_core_file_p = true;
>
> /* Record all regions for this file as unavailable. */
> - for (const mapped_file::region ®ion : file_data.regions)
> + for (const core_mapped_file::region ®ion : file_data.regions)
> {
> /* Check to see if the region is available within the core
> file. */
> @@ -577,33 +496,33 @@ core_target::build_file_mappings ()
> if (build_id_mismatch)
> {
> if (expanded_fname == nullptr
> - || filename == expanded_fname.get ())
> + || file_data.filename == expanded_fname.get ())
> warning (_("File %ps doesn't match build-id from core-file "
> "during file-backed mapping processing"),
> styled_string (file_name_style.style (),
> - filename.c_str ()));
> + file_data.filename.c_str ()));
> else
> warning (_("File %ps which was expanded to %ps, doesn't match "
> "build-id from core-file during file-backed "
> "mapping processing"),
> styled_string (file_name_style.style (),
> - filename.c_str ()),
> + file_data.filename.c_str ()),
> styled_string (file_name_style.style (),
> expanded_fname.get ()));
> }
> else if (!content_is_in_core_file_p)
> {
> if (expanded_fname == nullptr
> - || filename == expanded_fname.get ())
> + || file_data.filename == expanded_fname.get ())
> warning (_("Can't open file %ps during file-backed mapping "
> "note processing"),
> styled_string (file_name_style.style (),
> - filename.c_str ()));
> + file_data.filename.c_str ()));
> else
> warning (_("Can't open file %ps which was expanded to %ps "
> "during file-backed mapping note processing"),
> styled_string (file_name_style.style (),
> - filename.c_str ()),
> + file_data.filename.c_str ()),
> styled_string (file_name_style.style (),
> expanded_fname.get ()));
> }
> @@ -617,7 +536,7 @@ core_target::build_file_mappings ()
> abfd.get ());
>
> /* Create sections for each mapped region. */
> - for (const mapped_file::region ®ion : file_data.regions)
> + for (const core_mapped_file::region ®ion : file_data.regions)
> {
> /* Make new BFD section. All sections have the same name,
> which is permitted by bfd_make_section_anyway(). */
> @@ -653,7 +572,7 @@ core_target::build_file_mappings ()
> soname = gdb_bfd_read_elf_soname (actual_filename);
> }
>
> - m_mapped_file_info.add (soname.get (), filename.c_str (),
> + m_mapped_file_info.add (soname.get (), file_data.filename.c_str (),
> actual_filename, std::move (ranges),
> file_data.build_id);
> }
> @@ -2163,6 +2082,103 @@ mapped_file_info::lookup (const char *filename,
>
> /* See gdbcore.h. */
>
> +std::vector<core_mapped_file>
> +gdb_read_core_file_mappings (struct gdbarch *gdbarch, struct bfd *cbfd)
> +{
> + std::vector<core_mapped_file> results;
> +
> + /* A map entry used while building RESULTS. */
> + struct map_entry
> + {
> + explicit map_entry (core_mapped_file *ptr)
> + : file_data (ptr)
> + { /* Nothing. */ }
> +
> + /* Points to an entry in RESULTS, this allows entries to be quickly
> + looked up and updated as new mappings are read. */
> + core_mapped_file *file_data = nullptr;
> +
> + /* If true then we have seen multiple different build-ids associated
> + with the filename of FILE_DATA. The FILE_DATA->build_id field will
> + have been set to nullptr, and we should not set FILE_DATA->build_id
> + in future. */
> + bool ignore_build_id_p = false;
> + };
> +
> + /* All files mapped into the core file. The key is the filename. */
> + gdb::unordered_map<std::string, map_entry> mapped_files;
> +
> + /* Get the build-id of the core file. At least on Linux, this will be
> + the build-id for the main executable. If other targets add the
> + gdbarch_read_core_file_mappings method, then it might turn out that
> + this logic is no longer true, in which case this might need to move
> + into the gdbarch_read_core_file_mappings method. */
> + const bfd_build_id *core_build_id = build_id_bfd_get (cbfd);
> +
> + /* See linux_read_core_file_mappings() in linux-tdep.c for an example
> + read_core_file_mappings method. */
> + gdbarch_read_core_file_mappings (gdbarch, cbfd,
> + /* After determining the number of mappings, read_core_file_mappings
> + will invoke this lambda. */
> + [&] (ULONGEST)
> + {
> + },
> +
> + /* read_core_file_mappings will invoke this lambda for each mapping
> + that it finds. */
> + [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
> + const char *filename, const bfd_build_id *build_id)
> + {
> + /* Architecture-specific read_core_mapping methods are expected to
> + weed out non-file-backed mappings. */
> + gdb_assert (filename != nullptr);
> +
> + /* Add this mapped region to the data for FILENAME. */
> + auto iter = mapped_files.find (filename);
> + if (iter == mapped_files.end ())
> + {
> + /* Create entry in results list. */
> + results.emplace_back ();
> +
> + /* The entry to be added to the lookup map. */
> + map_entry entry (&results.back ());
This bit is (I think) invalid. You are taking the address of an element
in the vector to place it in the map_entry cache (mapped_files).
However, taking the address of the element of a vector in this case is
wrong. As you grow the vector (results.emplace_back), the backing
storage might be grown, and as a consequence values moved and existing
pointers/references/iterators are invalidated.
Using a list could be a drop-in replacement, or storing the index in the
vector rather than the element address would fix the issue.
I can submit a patch doing either of those, depending on what you
prefer.
Best,
Lancelot.
> + entry.file_data->filename = filename;
> +
> + /* Add entry to the quick lookup map and update ITER. */
> + auto inserted_result
> + = mapped_files.insert ({filename, std::move (entry)});
> + gdb_assert (inserted_result.second);
> + iter = inserted_result.first;
> + }
> +
> + core_mapped_file &file_data = *iter->second.file_data;
> + bool &ignore_build_id_p = iter->second.ignore_build_id_p;
> +
> + file_data.regions.emplace_back (start, end, file_ofs);
> + if (build_id != nullptr && !ignore_build_id_p)
> + {
> + if (file_data.build_id == nullptr)
> + file_data.build_id = build_id;
> + else if (!build_id_equal (build_id, file_data.build_id))
> + {
> + warning (_("Multiple build-ids found for %ps"),
> + styled_string (file_name_style.style (), filename));
> + file_data.build_id = nullptr;
> + ignore_build_id_p = true;
> + }
> + }
> +
> + if (build_id != nullptr
> + && core_build_id != nullptr
> + && build_id_equal (build_id, core_build_id))
> + file_data.is_main_exec = true;
> + });
> +
> + return results;
> +}
> +
> +/* See gdbcore.h. */
> +
> std::optional <core_target_mapped_file_info>
> core_target_find_mapped_file (const char *filename,
> std::optional<CORE_ADDR> addr)
> diff --git a/gdb/gdbcore.h b/gdb/gdbcore.h
> index e0e3843c97b..23432ecbd07 100644
> --- a/gdb/gdbcore.h
> +++ b/gdb/gdbcore.h
> @@ -258,4 +258,47 @@ std::optional<core_target_mapped_file_info>
> core_target_find_mapped_file (const char *filename,
> std::optional<CORE_ADDR> addr);
>
> +/* Type holding information about a single file mapped into the inferior
> + at the point when the core file was created. Associates a build-id
> + with the list of regions the file is mapped into. */
> +struct core_mapped_file
> +{
> + /* Type for a region of a file that was mapped into the inferior when
> + the core file was generated. */
> + struct region
> + {
> + /* Constructor. See member variables for argument descriptions. */
> + region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
> + : start (start_),
> + end (end_),
> + file_ofs (file_ofs_)
> + { /* Nothing. */ }
> +
> + /* The inferior address for the start of the mapped region. */
> + CORE_ADDR start;
> +
> + /* The inferior address immediately after the mapped region. */
> + CORE_ADDR end;
> +
> + /* The offset within the mapped file for this content. */
> + CORE_ADDR file_ofs;
> + };
> +
> + /* The filename as recorded in the core file. */
> + std::string filename;
> +
> + /* If not nullptr, then this is the build-id associated with this
> + file. */
> + const bfd_build_id *build_id = nullptr;
> +
> + /* All the mapped regions of this file. */
> + std::vector<region> regions;
> +
> + /* True if this is the main executable. */
> + bool is_main_exec = false;
> +};
> +
> +extern std::vector<core_mapped_file> gdb_read_core_file_mappings
> + (struct gdbarch *gdbarch, struct bfd *cbfd);
> +
> #endif /* GDB_GDBCORE_H */
> --
> 2.47.1
>
>
next prev parent reply other threads:[~2025-10-13 13:58 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-02 16:03 [PATCH 0/3] Core file Python API Andrew Burgess
2025-09-02 16:03 ` [PATCH 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-09-02 16:26 ` Eli Zaretskii
2025-09-16 17:25 ` Tom Tromey
2025-09-23 13:50 ` Andrew Burgess
2025-09-02 16:03 ` [PATCH 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-09-16 17:28 ` Tom Tromey
2025-09-02 16:03 ` [PATCH 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-09-16 17:54 ` Tom Tromey
2025-09-23 13:52 ` Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 0/3] Core file Python API Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-10-03 18:56 ` Tom Tromey
2025-10-06 8:54 ` Andrew Burgess
2025-10-06 15:39 ` Tom Tromey
2025-10-06 16:13 ` Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-10-13 13:57 ` Lancelot SIX [this message]
2025-10-13 14:37 ` Andrew Burgess
2025-10-13 15:16 ` Six, Lancelot
2025-10-14 9:12 ` Lancelot SIX
2025-09-23 13:44 ` [PATCHv2 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-10-03 19:15 ` Tom Tromey
2025-10-07 6:24 ` Tom de Vries
2025-10-07 12:21 ` Andrew Burgess
2025-10-07 13:08 ` Tom de Vries
2025-10-07 13:26 ` Andrew Burgess
2025-10-07 14:38 ` Andrew Burgess
2025-10-07 15:43 ` Tom de Vries
2025-10-07 16:28 ` Andrew Burgess
2025-10-08 9:29 ` Andrew Burgess
2025-10-08 10:36 ` Tom de Vries
2025-10-08 14:14 ` Andrew Burgess
2025-10-08 15:43 ` Tom de Vries
2025-10-08 16:03 ` Andrew Burgess
2025-10-16 20:00 ` Tom Tromey
2025-10-17 10:02 ` Andrew Burgess
2025-10-17 13:32 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=azpbye6xzjodiz33sp3oirjyjxxv266vxmb7eyaauuxspwgica@6omdz5klkmla \
--to=lancelot.six@amd.com \
--cc=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=tom@tromey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox