From: Andrew Burgess <aburgess@redhat.com>
To: Lancelot SIX <lsix@lancelotsix.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c
Date: Fri, 27 Oct 2023 09:49:05 +0100 [thread overview]
Message-ID: <878r7o30xq.fsf@redhat.com> (raw)
In-Reply-To: <20231026090601.awjsf5mrc2beyatn@octopus>
Lancelot SIX <lsix@lancelotsix.com> writes:
> On Wed, Oct 25, 2023 at 04:08:17PM +0100, Andrew Burgess wrote:
>> In the following commit I ran into a problem. The next commit aims to
>> improve GDB's handling of the main executable being a file on a remote
>> target (i.e. one with a 'target:' prefix).
>>
>> To do this I have replace a system 'stat' call with a bfd_stat call.
>>
>> However, doing this caused a regression in gdb.base/attach.exp.
>>
>> The problem is that the bfd library caches open FILE* handles for bfd
>> objects that it has accessed, which is great for short-lived, non
>> interactive programs (e.g. the assembler, or objcopy, etc), however,
>> for GDB this caching causes us a problem.
>>
>> If we open the main executable as a bfd then the bfd library will
>> cache the open FILE*. If some time passes, maybe just sat at the GDB
>> prompt, or with the inferior running, and then later we use bfd_stat
>> to check if the underlying, on-disk file has changed, then the bfd
>> library will actually use fstat on the underlying file descriptor.
>> This is of course slightly different than using system stat on with
>> the on-disk file name.
>>
>> If the on-disk file has changed then system stat will give results for
>> the current on-disk file. But, if the bfd cache is still holding open
>> the file descriptor for the original on-disk file (from before the
>> change) then fstat will return a result based on the original file,
>> and so show no change as having happened.
>>
>> This is a known problem in GDB, and so far this has been solved by
>> scattering bfd_cache_close_all() calls throughout GDB. But, as I
>> said, in the next commit I've made a change and run into a
>> problem (gdb.base/attach.exp) where we are apparently missing a
>> bfd_cache_close_all() call.
>>
>> Now I could solve this problem by adding a bfd_cache_close_all() call
>> before the bfd_stat call that I plan to add in the next commit, that
>> would for sure solve the problem, but feels a little crude.
>>
>> Better I think would be to track down where the bfd is being opened
>> and add a corresponding bfd_cache_close_all() call elsewhere in GDB
>> once we've finished doing whatever it is that caused us to open the
>> bfd in the first place.
>>
>> This second solution felt like the better choice, so I tracked the
>> problem down to elf_locate_base and fixed that. But that just exposed
>> another problem in gdb_bfd_map_section which was also re-opening the
>> bfd, so I fixed this (with another bfd_cache_close_all() call), and
>> that exposed another issue in gdbarch_lookup_osabi... and at this
>> point I wondered if I was approaching this problem the wrong way...
>>
>> .... And so, I wonder, is there a _better_ way to handle these
>> bfd_cache_close_all() calls?
>>
>> I see two problems with the current approach:
>>
>> 1. It's fragile. Folk aren't always aware that they need to clear
>> the bfd cache, and this feels like something that is easy to
>> overlook in review. So adding new code to GDB can innocently touch
>> a bfd, which populates the cache, which will then be a bug that can
>> lie hidden until an on-disk file just happens to change at the wrong
>> time ... and GDB fails to spot the change. Additionally,
>>
>> 2. It's in efficient. The caching is intended to stop the bfd
>> library from continually having to re-open the on-disk file. If we
>> have a function that touches a bfd then often that function is the
>> obvious place to call bfd_cache_close_all. But if a single GDB
>> command calls multiple functions, each of which touch the bfd, then
>> we will end up opening and closing the same on-disk file multiple
>> times. It feels like we would be better postponing the
>> bfd_cache_close_all call until some later point, then we can benefit
>> from the bfd cache.
>>
>> So, in this commit I propose a new approach. We now clear the bfd
>> cache in two places:
>>
>> (a) Just before we display a GDB prompt. We display a prompt after
>> completing a command, and GDB is about to enter an idle state
>> waiting for further input from the user (or in async mode, for an
>> inferior event). If while we are in this idle state the user
>> changes the on-disk file(s) then we would like GDB to notice this
>> the next time it leaves its idle state, e.g. the next time the user
>> executes a command, or when an inferior event arrives,
>>
>> (b) When we resume the inferior. In synchronous mode, resuming the
>> inferior is another time when GDB is blocked and sitting idle, but
>> in this case we don't display a prompt. As with (a) above, when an
>> inferior event arrives we want GDB to notice any changes to on-disk
>> files.
>>
>> Nicely, there are existing observers for both of these
>> cases (before_prompt and target_resumed respectively), so, in
>> gdb_bfd.c I've hooked into both of these and arranged to clear the bfd
>> cache.
>>
>> With this commit in place the next commit no longer shows any issues
>> with gdb.base/attach.exp.
>>
>> One possible problem that I see with this commit is: what if some
>> other user of one of the observers I've hooked into causes a bfd to be
>> opened after my new observers have run? If this happened then we
>> would proceed with a populated bfd cache, and this might causes
>> problems.
>>
>> Right now, the only other users of the observers that I'm connecting
>> too are the extension languages, specifically, Python. I suspect it
>> must be possible for Python to carry out actions that can cause the
>> bfd cache to be populated, so maybe there is a risk here.
>>
>> To counter this risk, we could move the bfd_cache_close_all() calls
>> out of observers, and move them into GDB core close too, but after the
>> two observers in questions are actually called, so into
>> top_level_prompt for the before_prompt observer and into
>> notify_target_resumed for the target_resumed observer.
>>
>> Another choice would be to add a bfd_cache_close_all() into
>> gdbpy_enter::~gdbpy_enter, so whenever we call into a Python
>> extension, we always clear the bfd cache once we're done.
>>
>> For now I haven't made either of these changes, maybe I'm worrying
>> about nothing? I'd appreciate peoples thoughts.
>
> Hi Andrew,
>
> I think there are a couple of places in GDB where observers are not
> notified directly, but via a notify_ helper function. Those functions
> ensure that some step are take before/after the observers listeners
> execute.
>
> Instances of this are notify_about_to_proceed, notify_about_to_proceed,
> notify_normal_stop or notify_user_selected_context_changed (in
> infrunc.c).
>
> For this problem, could you replace
> "gdb::observers::target_resumed.notify (ptid);" and
> "gdb::observers::before_prompt.notify (get_prompt ().c_str ())" calls
> with such helper function? This way, you could ensure that listeners
> are all executed before the "bfd_cache_close_all ()" call.
>
> This approach does require that future change do not insert new direct
> notify calls to the observers, but at least should solve the problem of
> observers being notified in an arbitrary order.
Thanks. I think this is probably the best approach. I'll re-roll this
patch along these lines.
Thanks,
Andrew
>
> Best,
> Lancelot.
>
>>
>> This commit also removes all of the other bfd_cache_close_all() calls
>> from GDB. My claim is that these are no longer needed.
>> ---
>> gdb/corefile.c | 5 -----
>> gdb/exec.c | 2 --
>> gdb/gdb_bfd.c | 20 ++++++++++++++++++++
>> gdb/infcmd.c | 1 -
>> gdb/inferior.c | 2 --
>> gdb/symfile.c | 1 -
>> gdb/target.c | 5 -----
>> 7 files changed, 20 insertions(+), 16 deletions(-)
>>
>> diff --git a/gdb/corefile.c b/gdb/corefile.c
>> index c27061a3ae3..19a96bc6f86 100644
>> --- a/gdb/corefile.c
>> +++ b/gdb/corefile.c
>> @@ -120,11 +120,6 @@ reopen_exec_file (void)
>> && current_program_space->ebfd_mtime
>> && current_program_space->ebfd_mtime != st.st_mtime)
>> exec_file_attach (filename.c_str (), 0);
>> - else
>> - /* If we accessed the file since last opening it, close it now;
>> - this stops GDB from holding the executable open after it
>> - exits. */
>> - bfd_cache_close_all ();
>> }
>> \f
>> /* If we have both a core file and an exec file,
>> diff --git a/gdb/exec.c b/gdb/exec.c
>> index 5956012338f..59965b84d55 100644
>> --- a/gdb/exec.c
>> +++ b/gdb/exec.c
>> @@ -500,8 +500,6 @@ exec_file_attach (const char *filename, int from_tty)
>> (*deprecated_exec_file_display_hook) (filename);
>> }
>>
>> - bfd_cache_close_all ();
>> -
>> /* Are are loading the same executable? */
>> bfd *prev_bfd = exec_bfd_holder.get ();
>> bfd *curr_bfd = current_program_space->exec_bfd ();
>> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
>> index 56a4c5ecc91..9b7a06da231 100644
>> --- a/gdb/gdb_bfd.c
>> +++ b/gdb/gdb_bfd.c
>> @@ -33,6 +33,7 @@
>> #include "gdbsupport/fileio.h"
>> #include "inferior.h"
>> #include "cli/cli-style.h"
>> +#include "observable.h"
>> #include <unordered_map>
>>
>> /* An object of this type is stored in the section's user data when
>> @@ -1171,10 +1172,29 @@ gdb_bfd_error_handler (const char *fmt, va_list ap)
>> (*default_bfd_error_handler) (fmt, ap);
>> }
>>
>> +/* A before_prompt observer. */
>> +
>> +static void
>> +gdb_bfd_before_prompt (const char * /* ignored */)
>> +{
>> + bfd_cache_close_all ();
>> +}
>> +
>> +/* A target_resumed observer. */
>> +
>> +static void
>> +gdb_bfd_target_resumed (ptid_t /* ignored */)
>> +{
>> + bfd_cache_close_all ();
>> +}
>> +
>> void _initialize_gdb_bfd ();
>> void
>> _initialize_gdb_bfd ()
>> {
>> + gdb::observers::target_resumed.attach (gdb_bfd_target_resumed, "gdb-bfd");
>> + gdb::observers::before_prompt.attach (gdb_bfd_before_prompt, "gdb-bfd");
>> +
>> all_bfds = htab_create_alloc (10, htab_hash_pointer, htab_eq_pointer,
>> NULL, xcalloc, xfree);
>>
>> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
>> index cf8cd527955..5153843dde8 100644
>> --- a/gdb/infcmd.c
>> +++ b/gdb/infcmd.c
>> @@ -2498,7 +2498,6 @@ kill_command (const char *arg, int from_tty)
>> int infnum = current_inferior ()->num;
>>
>> target_kill ();
>> - bfd_cache_close_all ();
>>
>> update_previous_thread ();
>>
>> diff --git a/gdb/inferior.c b/gdb/inferior.c
>> index 1778723863e..927c5f16ae2 100644
>> --- a/gdb/inferior.c
>> +++ b/gdb/inferior.c
>> @@ -714,8 +714,6 @@ kill_inferior_command (const char *args, int from_tty)
>>
>> target_kill ();
>> }
>> -
>> - bfd_cache_close_all ();
>> }
>>
>> /* See inferior.h. */
>> diff --git a/gdb/symfile.c b/gdb/symfile.c
>> index eebc5ea44b9..24570372316 100644
>> --- a/gdb/symfile.c
>> +++ b/gdb/symfile.c
>> @@ -1124,7 +1124,6 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name,
>>
>> gdb::observers::new_objfile.notify (objfile);
>>
>> - bfd_cache_close_all ();
>> return objfile;
>> }
>>
>> diff --git a/gdb/target.c b/gdb/target.c
>> index f688ff33e3b..aeb53dd91d0 100644
>> --- a/gdb/target.c
>> +++ b/gdb/target.c
>> @@ -2729,11 +2729,6 @@ target_mourn_inferior (ptid_t ptid)
>> {
>> gdb_assert (ptid.pid () == inferior_ptid.pid ());
>> current_inferior ()->top_target ()->mourn_inferior ();
>> -
>> - /* We no longer need to keep handles on any of the object files.
>> - Make sure to release them to avoid unnecessarily locking any
>> - of them while we're not actually debugging. */
>> - bfd_cache_close_all ();
>> }
>>
>> /* Look for a target which can describe architectural features, starting
>> --
>> 2.25.4
>>
next prev parent reply other threads:[~2023-10-27 8:49 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-25 15:08 [PATCH 0/2] BFD cache management And Exec file with target: prefix Andrew Burgess
2023-10-25 15:08 ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-26 9:06 ` Lancelot SIX
2023-10-27 8:49 ` Andrew Burgess [this message]
2023-10-27 19:30 ` Tom Tromey
2023-10-30 10:20 ` Andrew Burgess
2023-10-31 18:28 ` Tom Tromey
2023-11-01 10:46 ` Andrew Burgess
2023-11-12 23:38 ` Tom Tromey
2023-10-25 15:08 ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-10-27 18:39 ` Tom Tromey
2023-10-30 13:41 ` [PATCH 0/2] BFD cache management And Exec file " Andrew Burgess
2023-10-30 13:41 ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-30 13:41 ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-11-12 23:40 ` [PATCH 0/2] BFD cache management And Exec file " Tom Tromey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878r7o30xq.fsf@redhat.com \
--to=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=lsix@lancelotsix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox