Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Lancelot SIX <lancelot.six@amd.com>
Cc: gdb-patches@sourceware.org, Tom Tromey <tom@tromey.com>
Subject: Re: [PATCHv2 2/3] gdb: make structured core file mappings processing global
Date: Mon, 13 Oct 2025 15:37:34 +0100	[thread overview]
Message-ID: <87jz0yc1g1.fsf@redhat.com> (raw)
In-Reply-To: <azpbye6xzjodiz33sp3oirjyjxxv266vxmb7eyaauuxspwgica@6omdz5klkmla>

Lancelot SIX <lancelot.six@amd.com> writes:

> Hi Andrew,
>
> I think there is an issue with this patch.  When running with this and
> address sanitizer, I end-up with use-after-free errors.
>
> See below for what I think is the issue.
>
> On Tue, Sep 23, 2025 at 02:44:07PM +0100, Andrew Burgess wrote:
>> In corelow.c, within core_target::build_file_mappings, we have code
>> that wraps around a call to gdbarch_read_core_file_mappings and
>> provides more structure to the results.
>> 
>> Specifically, gdbarch_read_core_file_mappings calls a callback once
>> for every region of every mapped file.  The wrapper code groups all of
>> the mappings for one file into an instance of 'struct mapped_file',
>> this allows all of the mapped regions to be associated with the
>> build-id and filename of a file.
>> 
>> In the next commit I plan to make this information available via the
>> Python API, and so I need to allow access to this structured wrapping
>> outside of corelow.c.
>> 
>> This commit renames 'struct mapped_file' to 'struct core_mapped_file'
>> and moves the struct into gdbcore.h.  Then a new global function
>> gdb_read_core_file_mappings is created into which I move the code to
>> build the structured data.
>> 
>> Then corelow.c is updated to call gdb_read_core_file_mappings.
>> 
>> This commit does not extend the Python API, that is for the next
>> commit.
>> 
>> There should be no user visible changes after this commit.
>> 
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32844
>> 
>> Approved-By: Tom Tromey <tom@tromey.com>
>> ---
>>  gdb/corelow.c | 218 +++++++++++++++++++++++++++-----------------------
>>  gdb/gdbcore.h |  43 ++++++++++
>>  2 files changed, 160 insertions(+), 101 deletions(-)
>> 
>> diff --git a/gdb/corelow.c b/gdb/corelow.c
>> index 2f202dc1fbf..ee97e29556e 100644
>> --- a/gdb/corelow.c
>> +++ b/gdb/corelow.c
>> @@ -366,108 +366,27 @@ core_target::core_target ()
>>  void
>>  core_target::build_file_mappings ()
>>  {
>> -  /* Type holding information about a single file mapped into the inferior
>> -     at the point when the core file was created.  Associates a build-id
>> -     with the list of regions the file is mapped into.  */
>> -  struct mapped_file
>> -  {
>> -    /* Type for a region of a file that was mapped into the inferior when
>> -       the core file was generated.  */
>> -    struct region
>> -    {
>> -      /* Constructor.   See member variables for argument descriptions.  */
>> -      region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
>> -	: start (start_),
>> -	  end (end_),
>> -	  file_ofs (file_ofs_)
>> -      { /* Nothing.  */ }
>> -
>> -      /* The inferior address for the start of the mapped region.  */
>> -      CORE_ADDR start;
>> -
>> -      /* The inferior address immediately after the mapped region.  */
>> -      CORE_ADDR end;
>> -
>> -      /* The offset within the mapped file for this content.  */
>> -      CORE_ADDR file_ofs;
>> -    };
>> -
>> -    /* If not nullptr, then this is the build-id associated with this
>> -       file.  */
>> -    const bfd_build_id *build_id = nullptr;
>> -
>> -    /* If true then we have seen multiple different build-ids associated
>> -       with the same filename.  The build_id field will have been set back
>> -       to nullptr, and we should not set build_id in future.  */
>> -    bool ignore_build_id_p = false;
>> -
>> -    /* All the mapped regions of this file.  */
>> -    std::vector<region> regions;
>> -  };
>> -
>>    gdb::unordered_map<std::string, struct bfd *> bfd_map;
>>    gdb::unordered_set<std::string> unavailable_paths;
>>  
>>    /* All files mapped into the core file.  The key is the filename.  */
>> -  gdb::unordered_map<std::string, mapped_file> mapped_files;
>> +  std::vector<core_mapped_file> mapped_files
>> +    = gdb_read_core_file_mappings (m_core_gdbarch,
>> +				   current_program_space->core_bfd ());
>>  
>> -  /* See linux_read_core_file_mappings() in linux-tdep.c for an example
>> -     read_core_file_mappings method.  */
>> -  gdbarch_read_core_file_mappings (m_core_gdbarch,
>> -				   current_program_space->core_bfd (),
>> -
>> -    /* After determining the number of mappings, read_core_file_mappings
>> -       will invoke this lambda.  */
>> -    [&] (ULONGEST)
>> -      {
>> -      },
>> -
>> -    /* read_core_file_mappings will invoke this lambda for each mapping
>> -       that it finds.  */
>> -    [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
>> -	 const char *filename, const bfd_build_id *build_id)
>> -      {
>> -	/* Architecture-specific read_core_mapping methods are expected to
>> -	   weed out non-file-backed mappings.  */
>> -	gdb_assert (filename != nullptr);
>> -
>> -	/* Add this mapped region to the data for FILENAME.  */
>> -	mapped_file &file_data = mapped_files[filename];
>> -	file_data.regions.emplace_back (start, end, file_ofs);
>> -	if (build_id != nullptr && !file_data.ignore_build_id_p)
>> -	  {
>> -	    if (file_data.build_id == nullptr)
>> -	      file_data.build_id = build_id;
>> -	    else if (!build_id_equal (build_id, file_data.build_id))
>> -	      {
>> -		warning (_("Multiple build-ids found for %ps"),
>> -			 styled_string (file_name_style.style (), filename));
>> -		file_data.build_id = nullptr;
>> -		file_data.ignore_build_id_p = true;
>> -	      }
>> -	  }
>> -      });
>> -
>> -  /* Get the build-id of the core file.  */
>> -  const bfd_build_id *core_build_id
>> -    = build_id_bfd_get (current_program_space->core_bfd ());
>> -
>> -  for (const auto &[filename, file_data] : mapped_files)
>> +  for (const core_mapped_file &file_data : mapped_files)
>>      {
>> -      /* If this mapped file has the same build-id as was discovered for
>> -	 the core-file itself, then we assume this is the main
>> -	 executable.  Record the filename as we can use this later.  */
>> -      if (file_data.build_id != nullptr
>> -	  && m_expected_exec_filename.empty ()
>> -	  && build_id_equal (file_data.build_id, core_build_id))
>> -	m_expected_exec_filename = filename;
>> +      /* If this mapped file is marked as the main executable then record
>> +	 the filename as we can use this later.  */
>> +      if (file_data.is_main_exec && m_expected_exec_filename.empty ())
>> +	m_expected_exec_filename = file_data.filename;
>>  
>>        /* Use exec_file_find() to do sysroot expansion.  It'll
>>  	 also strip the potential sysroot "target:" prefix.  If
>>  	 there is no sysroot, an equivalent (possibly more
>>  	 canonical) pathname will be provided.  */
>>        gdb::unique_xmalloc_ptr<char> expanded_fname
>> -	= exec_file_find (filename.c_str (), nullptr);
>> +	= exec_file_find (file_data.filename.c_str (), nullptr);
>>  
>>        bool build_id_mismatch = false;
>>        if (expanded_fname != nullptr && file_data.build_id != nullptr)
>> @@ -509,7 +428,7 @@ core_target::build_file_mappings ()
>>  	{
>>  	  abfd = find_objfile_by_build_id (current_program_space,
>>  					   file_data.build_id,
>> -					   filename.c_str ());
>> +					   file_data.filename.c_str ());
>>  
>>  	  if (abfd != nullptr)
>>  	    {
>> @@ -527,7 +446,7 @@ core_target::build_file_mappings ()
>>  	}
>>  
>>        std::vector<mem_range> ranges;
>> -      for (const mapped_file::region &region : file_data.regions)
>> +      for (const core_mapped_file::region &region : file_data.regions)
>>  	ranges.emplace_back (region.start, region.end - region.start);
>>  
>>        if (expanded_fname == nullptr
>> @@ -545,7 +464,7 @@ core_target::build_file_mappings ()
>>  	  bool content_is_in_core_file_p = true;
>>  
>>  	  /* Record all regions for this file as unavailable.  */
>> -	  for (const mapped_file::region &region : file_data.regions)
>> +	  for (const core_mapped_file::region &region : file_data.regions)
>>  	    {
>>  	      /* Check to see if the region is available within the core
>>  		 file.  */
>> @@ -577,33 +496,33 @@ core_target::build_file_mappings ()
>>  	  if (build_id_mismatch)
>>  	    {
>>  	      if (expanded_fname == nullptr
>> -		  || filename == expanded_fname.get ())
>> +		  || file_data.filename == expanded_fname.get ())
>>  		warning (_("File %ps doesn't match build-id from core-file "
>>  			   "during file-backed mapping processing"),
>>  			 styled_string (file_name_style.style (),
>> -					filename.c_str ()));
>> +					file_data.filename.c_str ()));
>>  	      else
>>  		warning (_("File %ps which was expanded to %ps, doesn't match "
>>  			   "build-id from core-file during file-backed "
>>  			   "mapping processing"),
>>  			 styled_string (file_name_style.style (),
>> -					filename.c_str ()),
>> +					file_data.filename.c_str ()),
>>  			 styled_string (file_name_style.style (),
>>  					expanded_fname.get ()));
>>  	    }
>>  	  else if (!content_is_in_core_file_p)
>>  	    {
>>  	      if (expanded_fname == nullptr
>> -		  || filename == expanded_fname.get ())
>> +		  || file_data.filename == expanded_fname.get ())
>>  		warning (_("Can't open file %ps during file-backed mapping "
>>  			   "note processing"),
>>  			 styled_string (file_name_style.style (),
>> -					filename.c_str ()));
>> +					file_data.filename.c_str ()));
>>  	      else
>>  		warning (_("Can't open file %ps which was expanded to %ps "
>>  			   "during file-backed mapping note processing"),
>>  			 styled_string (file_name_style.style (),
>> -					filename.c_str ()),
>> +					file_data.filename.c_str ()),
>>  			 styled_string (file_name_style.style (),
>>  					expanded_fname.get ()));
>>  	    }
>> @@ -617,7 +536,7 @@ core_target::build_file_mappings ()
>>  				    abfd.get ());
>>  
>>  	  /* Create sections for each mapped region.  */
>> -	  for (const mapped_file::region &region : file_data.regions)
>> +	  for (const core_mapped_file::region &region : file_data.regions)
>>  	    {
>>  	      /* Make new BFD section.  All sections have the same name,
>>  		 which is permitted by bfd_make_section_anyway().  */
>> @@ -653,7 +572,7 @@ core_target::build_file_mappings ()
>>  	      soname = gdb_bfd_read_elf_soname (actual_filename);
>>  	    }
>>  
>> -	  m_mapped_file_info.add (soname.get (), filename.c_str (),
>> +	  m_mapped_file_info.add (soname.get (), file_data.filename.c_str (),
>>  				  actual_filename, std::move (ranges),
>>  				  file_data.build_id);
>>  	}
>> @@ -2163,6 +2082,103 @@ mapped_file_info::lookup (const char *filename,
>>  
>>  /* See gdbcore.h.  */
>>  
>> +std::vector<core_mapped_file>
>> +gdb_read_core_file_mappings (struct gdbarch *gdbarch, struct bfd *cbfd)
>> +{
>> +  std::vector<core_mapped_file> results;
>> +
>> +  /* A map entry used while building RESULTS.  */
>> +  struct map_entry
>> +  {
>> +    explicit map_entry (core_mapped_file *ptr)
>> +      : file_data (ptr)
>> +    { /* Nothing.  */ }
>> +
>> +    /* Points to an entry in RESULTS, this allows entries to be quickly
>> +       looked up and updated as new mappings are read.  */
>> +    core_mapped_file *file_data = nullptr;
>> +
>> +    /* If true then we have seen multiple different build-ids associated
>> +       with the filename of FILE_DATA.  The FILE_DATA->build_id field will
>> +       have been set to nullptr, and we should not set FILE_DATA->build_id
>> +       in future.  */
>> +    bool ignore_build_id_p = false;
>> +  };
>> +
>> +  /* All files mapped into the core file.  The key is the filename.  */
>> +  gdb::unordered_map<std::string, map_entry> mapped_files;
>> +
>> +  /* Get the build-id of the core file.  At least on Linux, this will be
>> +     the build-id for the main executable.  If other targets add the
>> +     gdbarch_read_core_file_mappings method, then it might turn out that
>> +     this logic is no longer true, in which case this might need to move
>> +     into the gdbarch_read_core_file_mappings method.  */
>> +  const bfd_build_id *core_build_id = build_id_bfd_get (cbfd);
>> +
>> +  /* See linux_read_core_file_mappings() in linux-tdep.c for an example
>> +     read_core_file_mappings method.  */
>> +  gdbarch_read_core_file_mappings (gdbarch, cbfd,
>> +    /* After determining the number of mappings, read_core_file_mappings
>> +       will invoke this lambda.  */
>> +    [&] (ULONGEST)
>> +      {
>> +      },
>> +
>> +    /* read_core_file_mappings will invoke this lambda for each mapping
>> +       that it finds.  */
>> +    [&] (int num, ULONGEST start, ULONGEST end, ULONGEST file_ofs,
>> +	 const char *filename, const bfd_build_id *build_id)
>> +      {
>> +	/* Architecture-specific read_core_mapping methods are expected to
>> +	   weed out non-file-backed mappings.  */
>> +	gdb_assert (filename != nullptr);
>> +
>> +	/* Add this mapped region to the data for FILENAME.  */
>> +	auto iter = mapped_files.find (filename);
>> +	if (iter == mapped_files.end ())
>> +	  {
>> +	    /* Create entry in results list.  */
>> +	    results.emplace_back ();
>> +
>> +	    /* The entry to be added to the lookup map.  */
>> +	    map_entry entry (&results.back ());
>
> This bit is (I think) invalid.  You are taking the address of an element
> in the vector to place it in the map_entry cache (mapped_files).
> However, taking the address of the element of a vector in this case is
> wrong.  As you grow the vector (results.emplace_back), the backing
> storage might be grown, and as a consequence values moved and existing
> pointers/references/iterators are invalidated.
>
> Using a list could be a drop-in replacement, or storing the index in the
> vector rather than the element address would fix the issue.
>
> I can submit a patch doing either of those, depending on what you
> prefer.
>

Thanks for tracking this down.

If you have a patch or want to create the patch, then feel free to go
with whatever approach you feel is best.

But I don't want to dump my mess on you.  I'm happy to fix this up.

Just let me know.

Thanks,
Andrew



  reply	other threads:[~2025-10-13 14:38 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-02 16:03 [PATCH 0/3] Core file Python API Andrew Burgess
2025-09-02 16:03 ` [PATCH 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-09-02 16:26   ` Eli Zaretskii
2025-09-16 17:25   ` Tom Tromey
2025-09-23 13:50     ` Andrew Burgess
2025-09-02 16:03 ` [PATCH 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-09-16 17:28   ` Tom Tromey
2025-09-02 16:03 ` [PATCH 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-09-16 17:54   ` Tom Tromey
2025-09-23 13:52     ` Andrew Burgess
2025-09-23 13:44 ` [PATCHv2 0/3] Core file Python API Andrew Burgess
2025-09-23 13:44   ` [PATCHv2 1/3] gdb/python: introduce gdb.Corefile API Andrew Burgess
2025-10-03 18:56     ` Tom Tromey
2025-10-06  8:54       ` Andrew Burgess
2025-10-06 15:39         ` Tom Tromey
2025-10-06 16:13           ` Andrew Burgess
2025-09-23 13:44   ` [PATCHv2 2/3] gdb: make structured core file mappings processing global Andrew Burgess
2025-10-13 13:57     ` Lancelot SIX
2025-10-13 14:37       ` Andrew Burgess [this message]
2025-10-13 15:16         ` Six, Lancelot
2025-10-14  9:12         ` Lancelot SIX
2025-09-23 13:44   ` [PATCHv2 3/3] gdb/python: add Corefile.mapped_files method Andrew Burgess
2025-10-03 19:15     ` Tom Tromey
2025-10-07  6:24       ` Tom de Vries
2025-10-07 12:21         ` Andrew Burgess
2025-10-07 13:08           ` Tom de Vries
2025-10-07 13:26             ` Andrew Burgess
2025-10-07 14:38               ` Andrew Burgess
2025-10-07 15:43                 ` Tom de Vries
2025-10-07 16:28                   ` Andrew Burgess
2025-10-08  9:29                     ` Andrew Burgess
2025-10-08 10:36                       ` Tom de Vries
2025-10-08 14:14                         ` Andrew Burgess
2025-10-08 15:43                           ` Tom de Vries
2025-10-08 16:03                             ` Andrew Burgess
2025-10-16 20:00           ` Tom Tromey
2025-10-17 10:02             ` Andrew Burgess
2025-10-17 13:32               ` Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jz0yc1g1.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=lancelot.six@amd.com \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox