Re: [PATCH RFC 5/5] Add DWARF index cache

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Simon Marchi <simon.marchi@ericsson.com>
To: Tom Tromey <tom@tromey.com>
Cc: <gdb-patches@sourceware.org>
Subject: Re: [PATCH RFC 5/5] Add DWARF index cache
Date: Mon, 09 Jul 2018 20:56:00 -0000	[thread overview]
Message-ID: <e3590eaf-c6c5-ca64-ad26-e16a22fd1846@ericsson.com> (raw)
In-Reply-To: <87tvrfyzf2.fsf@tromey.com>

On 2018-05-10 05:27 PM, Tom Tromey wrote:
>>>>>> "Simon" == Simon Marchi <simon.marchi@ericsson.com> writes:
> 
> Simon> To help make using the DWARF index more transparent, this patch
> Simon> introduces a DWARF index cache.  When enabled, loading an index-less
> Simon> binary in GDB will automatically save an index file in ~/.cache/gdb.
> Simon> When loading that same object file again, the index file will be looked
> Simon> up and used to load the DWARF index.
> 
> Thanks for doing this.  I like this feature.
> 
> I read through the patch, but I didn't read the tests.  Some notes
> below.
> 
> I'm sure you know, since it is an RFC, but the patch needs some more
> comments here and there.

I didn't realize it at the time, but reading it now, I concur.  I added some,
hopefully it helps.

> I have a wacky idea, too -- I'm testing a patch to move psymtabs out of
> the objfile obstack and into a "self managed" object.  With this in
> place, and because psymtabs are effectively immutable, it seems to me
> that the index writing could be done in a background thread.  (I haven't
> looked to see how the DWARF 5 index stuff works, I only really know
> about .gdb_index, so this may not apply there.)

Yes, sounds like a good idea.
> Simon> - The cache is just a directory with files named after the object
> Simon>   file's build-id.  It is not possible to save/load the index for an
> Simon>   object file without build-id in the cache.  Note that Clang does not
> Simon>   add build-ids by default.
> 
> One thing that would be nice is also having some way to associate file
> names with the build-ids.  Then gdb could notice when a build-id is
> stale and unlink that cache entry.
> 
> For example you could have a symlink whose name is the some mangled form
> of the objfile name, and whose target is the cache file.  Then if gdb
> sees an objfile with that name but with a different link target, it can
> remove the link target.  This might be a pain to implement in a non-racy
> way, though.

Good idea, but as you said it could be racy.  There needs to be more thought
put into this, so I'll leave this for later.

> I wonder if gdb should write out a short README into that directory when
> it is first created.

Good idea.  In particular, it could say what these files are used for, and say
that it's perfectly safe to delete them if they take too much space (and refer
to the GDB manual about how to manage the cache size limit, eventually).

I'll keep this idea in mind for later.

> Simon> +/* A cheap (as in low-quality) recursive mkdir.  Try to create all the parents
> Simon> +   directories up to DIR and DIR itself.  Stop if we hit an error along the way.
> Simon> +   There is no attempt to remove created directories in case of failure.  */
> Simon> +
> Simon> +static void
> Simon> +mkdir_recursive (const char *dir)
> 
> gnulib has a 'mkdir-p' module which could perhaps be used instead.

One problem of the 'mkdir-p' module is that it pulls the xmalloc module, and we end
up with multiple definitions of xmalloc, xrealloc, etc.  However, the mkdir-p module
uses the mkancesdirs module internally, which is simpler and would probably be enough
for our needs.  However, I couldn't quite figure out how to use the savewd argument...

Also, the mkancesdirs module is not C++-friendly yet, so there would need to be some
patches to gnulib first, then we would need to update our gnulib import.  I would leave
it as-is for the initial submission to keep it simple, it can always be improved later.

> Simon> +index_cache_resource::~index_cache_resource () = default;
> 
> Simon> +struct index_cache_resource
> Simon> +{
> Simon> +  virtual ~index_cache_resource () = 0;
> Simon> +};
> 
> I don't understand how to reconcile these two things - probably just
> another bit of C++ knowledge I am missing.  Would you mind explaining
> it?

The "= 0" part means the method (or destructor, in this case) is pure.  This
is to prevent somebody instantiating this class directly.

The "= default" one is to ask the compiler to generate the default version
of the destructor for this class.  If you remove it, you'll get some
undefined reference to "~index_cache_resource".  I think it's equivalent
to

  index_cache_resource::~index_cache_resource () {}

but more C++11-y.  We often see "= default" used directly in the class
definition, but AFAIK it's not possible to put both "= 0" and "= default"
on the same declaration, hence the need to place it in the .c file.

> 
> Simon> +std::string
> Simon> +index_cache::make_index_filename (const bfd_build_id *build_id,
> Simon> +				  const char *suffix) const
> Simon> +{
> Simon> +  std::string build_id_str = build_id_to_string (build_id);
> Simon> +
> Simon> +  return string_printf ("%s%s%s%s", m_dir.c_str (), SLASH_STRING,
> Simon> +			build_id_str.c_str (), suffix);
> 
> I think this particular one would be clearer just using "+".

Done.
> Simon> +      /* Err...  I guess /tmp is better than nothing.  */
> Simon> +      index_cache_directory = xstrdup ("/tmp");
> Simon> +      global_index_cache.set_directory ("/tmp");
> 
> I tend to think nothing would be better.  Using /tmp means that gdb will
> be reading files with (sometimes) predictable names that could be
> created by anybody.  Better, IMO, to issue some sort of warning and
> disable the feature.

Done.

Thanks for looking into it!  Since I didn't receive too many negative comments,
I'll send a non-RFC version soon.

Simon

next prev parent reply	other threads:[~2018-07-09 20:56 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-09 21:57 [PATCH RFC 0/5] Add a " Simon Marchi
2018-05-09 21:27 ` [PATCH RFC 3/5] Make index reading functions more modular Simon Marchi
2018-05-10 21:02   ` Tom Tromey
2018-05-10 21:18     ` Simon Marchi
2018-05-09 21:27 ` [PATCH RFC 4/5] Introduce scoped_mmapped_file Simon Marchi
2018-05-10 21:04   ` Tom Tromey
2018-05-10 21:27     ` Simon Marchi
2018-05-12 15:49       ` Tom Tromey
2018-06-13  1:36       ` Simon Marchi
2018-05-09 21:27 ` [PATCH RFC 5/5] Add DWARF index cache Simon Marchi
2018-05-10 22:24   ` Tom Tromey
2018-07-09 20:56     ` Simon Marchi [this message]
2018-05-09 21:27 ` [PATCH RFC 0/5] Add a " Simon Marchi
2018-05-09 21:27 ` [PATCH RFC 1/5] Rename some functions, index -> gdb_index Simon Marchi
2018-05-10 20:40   ` Tom Tromey
2018-06-12  2:03     ` Simon Marchi
2018-05-09 23:08 ` [PATCH RFC 2/5] Remove mapped_index::total_size Simon Marchi
2018-05-09 21:27   ` Simon Marchi
2018-05-10 20:54   ` Tom Tromey
2018-05-18 20:26     ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e3590eaf-c6c5-ca64-ad26-e16a22fd1846@ericsson.com \
    --to=simon.marchi@ericsson.com \
    --cc=gdb-patches@sourceware.org \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox