Re: [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Andrew Burgess <aburgess@redhat.com>
To: Lancelot SIX <lsix@lancelotsix.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c
Date: Fri, 27 Oct 2023 09:49:05 +0100	[thread overview]
Message-ID: <878r7o30xq.fsf@redhat.com> (raw)
In-Reply-To: <20231026090601.awjsf5mrc2beyatn@octopus>

Lancelot SIX <lsix@lancelotsix.com> writes:

> On Wed, Oct 25, 2023 at 04:08:17PM +0100, Andrew Burgess wrote:
>> In the following commit I ran into a problem.  The next commit aims to
>> improve GDB's handling of the main executable being a file on a remote
>> target (i.e. one with a 'target:' prefix).
>> 
>> To do this I have replace a system 'stat' call with a bfd_stat call.
>> 
>> However, doing this caused a regression in gdb.base/attach.exp.
>> 
>> The problem is that the bfd library caches open FILE* handles for bfd
>> objects that it has accessed, which is great for short-lived, non
>> interactive programs (e.g. the assembler, or objcopy, etc), however,
>> for GDB this caching causes us a problem.
>> 
>> If we open the main executable as a bfd then the bfd library will
>> cache the open FILE*.  If some time passes, maybe just sat at the GDB
>> prompt, or with the inferior running, and then later we use bfd_stat
>> to check if the underlying, on-disk file has changed, then the bfd
>> library will actually use fstat on the underlying file descriptor.
>> This is of course slightly different than using system stat on with
>> the on-disk file name.
>> 
>> If the on-disk file has changed then system stat will give results for
>> the current on-disk file.  But, if the bfd cache is still holding open
>> the file descriptor for the original on-disk file (from before the
>> change) then fstat will return a result based on the original file,
>> and so show no change as having happened.
>> 
>> This is a known problem in GDB, and so far this has been solved by
>> scattering bfd_cache_close_all() calls throughout GDB.  But, as I
>> said, in the next commit I've made a change and run into a
>> problem (gdb.base/attach.exp) where we are apparently missing a
>> bfd_cache_close_all() call.
>> 
>> Now I could solve this problem by adding a bfd_cache_close_all() call
>> before the bfd_stat call that I plan to add in the next commit, that
>> would for sure solve the problem, but feels a little crude.
>> 
>> Better I think would be to track down where the bfd is being opened
>> and add a corresponding bfd_cache_close_all() call elsewhere in GDB
>> once we've finished doing whatever it is that caused us to open the
>> bfd in the first place.
>> 
>> This second solution felt like the better choice, so I tracked the
>> problem down to elf_locate_base and fixed that.  But that just exposed
>> another problem in gdb_bfd_map_section which was also re-opening the
>> bfd, so I fixed this (with another bfd_cache_close_all() call), and
>> that exposed another issue in gdbarch_lookup_osabi... and at this
>> point I wondered if I was approaching this problem the wrong way...
>> 
>> .... And so, I wonder, is there a _better_ way to handle these
>> bfd_cache_close_all() calls?
>> 
>> I see two problems with the current approach:
>> 
>>   1. It's fragile.  Folk aren't always aware that they need to clear
>>   the bfd cache, and this feels like something that is easy to
>>   overlook in review.  So adding new code to GDB can innocently touch
>>   a bfd, which populates the cache, which will then be a bug that can
>>   lie hidden until an on-disk file just happens to change at the wrong
>>   time ... and GDB fails to spot the change.  Additionally,
>> 
>>   2. It's in efficient.  The caching is intended to stop the bfd
>>   library from continually having to re-open the on-disk file.  If we
>>   have a function that touches a bfd then often that function is the
>>   obvious place to call bfd_cache_close_all.  But if a single GDB
>>   command calls multiple functions, each of which touch the bfd, then
>>   we will end up opening and closing the same on-disk file multiple
>>   times.  It feels like we would be better postponing the
>>   bfd_cache_close_all call until some later point, then we can benefit
>>   from the bfd cache.
>> 
>> So, in this commit I propose a new approach.  We now clear the bfd
>> cache in two places:
>> 
>>   (a) Just before we display a GDB prompt.  We display a prompt after
>>   completing a command, and GDB is about to enter an idle state
>>   waiting for further input from the user (or in async mode, for an
>>   inferior event).  If while we are in this idle state the user
>>   changes the on-disk file(s) then we would like GDB to notice this
>>   the next time it leaves its idle state, e.g. the next time the user
>>   executes a command, or when an inferior event arrives,
>> 
>>   (b) When we resume the inferior.  In synchronous mode, resuming the
>>   inferior is another time when GDB is blocked and sitting idle, but
>>   in this case we don't display a prompt.  As with (a) above, when an
>>   inferior event arrives we want GDB to notice any changes to on-disk
>>   files.
>> 
>> Nicely, there are existing observers for both of these
>> cases (before_prompt and target_resumed respectively), so, in
>> gdb_bfd.c I've hooked into both of these and arranged to clear the bfd
>> cache.
>> 
>> With this commit in place the next commit no longer shows any issues
>> with gdb.base/attach.exp.
>> 
>> One possible problem that I see with this commit is: what if some
>> other user of one of the observers I've hooked into causes a bfd to be
>> opened after my new observers have run?  If this happened then we
>> would proceed with a populated bfd cache, and this might causes
>> problems.
>> 
>> Right now, the only other users of the observers that I'm connecting
>> too are the extension languages, specifically, Python.  I suspect it
>> must be possible for Python to carry out actions that can cause the
>> bfd cache to be populated, so maybe there is a risk here.
>> 
>> To counter this risk, we could move the bfd_cache_close_all() calls
>> out of observers, and move them into GDB core close too, but after the
>> two observers in questions are actually called, so into
>> top_level_prompt for the before_prompt observer and into
>> notify_target_resumed for the target_resumed observer.
>> 
>> Another choice would be to add a bfd_cache_close_all() into
>> gdbpy_enter::~gdbpy_enter, so whenever we call into a Python
>> extension, we always clear the bfd cache once we're done.
>> 
>> For now I haven't made either of these changes, maybe I'm worrying
>> about nothing?  I'd appreciate peoples thoughts.
>
> Hi Andrew,
>
> I think there are a couple of places in GDB where observers are not
> notified directly, but via a notify_ helper function.  Those functions
> ensure that some step are take before/after the observers listeners
> execute.
>
> Instances of this are notify_about_to_proceed, notify_about_to_proceed,
> notify_normal_stop or notify_user_selected_context_changed (in
> infrunc.c).
>
> For this problem, could you replace
> "gdb::observers::target_resumed.notify (ptid);" and
> "gdb::observers::before_prompt.notify (get_prompt ().c_str ())" calls
> with such helper function?  This way, you could ensure that listeners
> are all executed before the "bfd_cache_close_all ()" call.
>
> This approach does require that future change do not insert new direct
> notify calls to the observers, but at least should solve the problem of
> observers being notified in an arbitrary order.

Thanks.  I think this is probably the best approach.  I'll re-roll this
patch along these lines.

Thanks,
Andrew


>
> Best,
> Lancelot.
>
>> 
>> This commit also removes all of the other bfd_cache_close_all() calls
>> from GDB.  My claim is that these are no longer needed.
>> ---
>>  gdb/corefile.c |  5 -----
>>  gdb/exec.c     |  2 --
>>  gdb/gdb_bfd.c  | 20 ++++++++++++++++++++
>>  gdb/infcmd.c   |  1 -
>>  gdb/inferior.c |  2 --
>>  gdb/symfile.c  |  1 -
>>  gdb/target.c   |  5 -----
>>  7 files changed, 20 insertions(+), 16 deletions(-)
>> 
>> diff --git a/gdb/corefile.c b/gdb/corefile.c
>> index c27061a3ae3..19a96bc6f86 100644
>> --- a/gdb/corefile.c
>> +++ b/gdb/corefile.c
>> @@ -120,11 +120,6 @@ reopen_exec_file (void)
>>        && current_program_space->ebfd_mtime
>>        && current_program_space->ebfd_mtime != st.st_mtime)
>>      exec_file_attach (filename.c_str (), 0);
>> -  else
>> -    /* If we accessed the file since last opening it, close it now;
>> -       this stops GDB from holding the executable open after it
>> -       exits.  */
>> -    bfd_cache_close_all ();
>>  }
>>  \f
>>  /* If we have both a core file and an exec file,
>> diff --git a/gdb/exec.c b/gdb/exec.c
>> index 5956012338f..59965b84d55 100644
>> --- a/gdb/exec.c
>> +++ b/gdb/exec.c
>> @@ -500,8 +500,6 @@ exec_file_attach (const char *filename, int from_tty)
>>  	(*deprecated_exec_file_display_hook) (filename);
>>      }
>>  
>> -  bfd_cache_close_all ();
>> -
>>    /* Are are loading the same executable?  */
>>    bfd *prev_bfd = exec_bfd_holder.get ();
>>    bfd *curr_bfd = current_program_space->exec_bfd ();
>> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
>> index 56a4c5ecc91..9b7a06da231 100644
>> --- a/gdb/gdb_bfd.c
>> +++ b/gdb/gdb_bfd.c
>> @@ -33,6 +33,7 @@
>>  #include "gdbsupport/fileio.h"
>>  #include "inferior.h"
>>  #include "cli/cli-style.h"
>> +#include "observable.h"
>>  #include <unordered_map>
>>  
>>  /* An object of this type is stored in the section's user data when
>> @@ -1171,10 +1172,29 @@ gdb_bfd_error_handler (const char *fmt, va_list ap)
>>    (*default_bfd_error_handler) (fmt, ap);
>>  }
>>  
>> +/* A before_prompt observer.  */
>> +
>> +static void
>> +gdb_bfd_before_prompt (const char * /* ignored */)
>> +{
>> +  bfd_cache_close_all ();
>> +}
>> +
>> +/* A target_resumed observer.  */
>> +
>> +static void
>> +gdb_bfd_target_resumed (ptid_t /* ignored */)
>> +{
>> +  bfd_cache_close_all ();
>> +}
>> +
>>  void _initialize_gdb_bfd ();
>>  void
>>  _initialize_gdb_bfd ()
>>  {
>> +  gdb::observers::target_resumed.attach (gdb_bfd_target_resumed, "gdb-bfd");
>> +  gdb::observers::before_prompt.attach (gdb_bfd_before_prompt, "gdb-bfd");
>> +
>>    all_bfds = htab_create_alloc (10, htab_hash_pointer, htab_eq_pointer,
>>  				NULL, xcalloc, xfree);
>>  
>> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
>> index cf8cd527955..5153843dde8 100644
>> --- a/gdb/infcmd.c
>> +++ b/gdb/infcmd.c
>> @@ -2498,7 +2498,6 @@ kill_command (const char *arg, int from_tty)
>>    int infnum = current_inferior ()->num;
>>  
>>    target_kill ();
>> -  bfd_cache_close_all ();
>>  
>>    update_previous_thread ();
>>  
>> diff --git a/gdb/inferior.c b/gdb/inferior.c
>> index 1778723863e..927c5f16ae2 100644
>> --- a/gdb/inferior.c
>> +++ b/gdb/inferior.c
>> @@ -714,8 +714,6 @@ kill_inferior_command (const char *args, int from_tty)
>>  
>>        target_kill ();
>>      }
>> -
>> -  bfd_cache_close_all ();
>>  }
>>  
>>  /* See inferior.h.  */
>> diff --git a/gdb/symfile.c b/gdb/symfile.c
>> index eebc5ea44b9..24570372316 100644
>> --- a/gdb/symfile.c
>> +++ b/gdb/symfile.c
>> @@ -1124,7 +1124,6 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name,
>>  
>>    gdb::observers::new_objfile.notify (objfile);
>>  
>> -  bfd_cache_close_all ();
>>    return objfile;
>>  }
>>  
>> diff --git a/gdb/target.c b/gdb/target.c
>> index f688ff33e3b..aeb53dd91d0 100644
>> --- a/gdb/target.c
>> +++ b/gdb/target.c
>> @@ -2729,11 +2729,6 @@ target_mourn_inferior (ptid_t ptid)
>>  {
>>    gdb_assert (ptid.pid () == inferior_ptid.pid ());
>>    current_inferior ()->top_target ()->mourn_inferior ();
>> -
>> -  /* We no longer need to keep handles on any of the object files.
>> -     Make sure to release them to avoid unnecessarily locking any
>> -     of them while we're not actually debugging.  */
>> -  bfd_cache_close_all ();
>>  }
>>  
>>  /* Look for a target which can describe architectural features, starting
>> -- 
>> 2.25.4
>>

next prev parent reply	other threads:[~2023-10-27  8:49 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 15:08 [PATCH 0/2] BFD cache management And Exec file with target: prefix Andrew Burgess
2023-10-25 15:08 ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-26  9:06   ` Lancelot SIX
2023-10-27  8:49     ` Andrew Burgess [this message]
2023-10-27 19:30   ` Tom Tromey
2023-10-30 10:20     ` Andrew Burgess
2023-10-31 18:28       ` Tom Tromey
2023-11-01 10:46         ` Andrew Burgess
2023-11-12 23:38           ` Tom Tromey
2023-10-25 15:08 ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-10-27 18:39   ` Tom Tromey
2023-10-30 13:41 ` [PATCH 0/2] BFD cache management And Exec file " Andrew Burgess
2023-10-30 13:41   ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-30 13:41   ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-11-12 23:40   ` [PATCH 0/2] BFD cache management And Exec file " Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878r7o30xq.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=lsix@lancelotsix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox