Re: [PATCH v2 3/4] Add DWARF index cache

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Simon Marchi <simon.marchi@polymtl.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Simon Marchi <simon.marchi@ericsson.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 3/4] Add DWARF index cache
Date: Fri, 27 Jul 2018 14:01:00 -0000	[thread overview]
Message-ID: <e0035566b1c76e4592790b62cd319b34@polymtl.ca> (raw)
In-Reply-To: <83o9etrqsj.fsf@gnu.org>

On 2018-07-27 05:06, Eli Zaretskii wrote:
>> From: Simon Marchi <simon.marchi@ericsson.com>
>> CC: Simon Marchi <simon.marchi@ericsson.com>
>> Date: Wed, 25 Jul 2018 18:47:03 -0400
>> 
>> GDB can generate indexes for DWARF debug information, which, when
>> integrated in the original binary, can speed up loading object files.
>> This can be done using the gdb-add-index script or directly by the
>> linker itself.
> 
> The information in the last sentence is not mentioned in the
> documentation patch you provided.  I think we should mention that.

I think this is already well covered by the rest of the "18.5 Index 
Files Speed Up GDB" section, just before the content I added, isn't it?

>> To help make using the DWARF index more transparent, this patch
>> introduces a DWARF index cache.  When enabled, loading an index-less
>> binary in GDB will automatically save an index file in ~/.cache/gdb.
>> When loading that same object file again, the index file will be 
>> looked
>> up and used to load the DWARF index.  You therefore get the benefit of
>> the DWARF index without having to do additional manual steps or
>> modifying your build system.  When an index section is already present
>> in the file, GDB will prefer that one over looking up the cache.
> 
> Does GDB check whether the cache became stale, i.e. the program was
> modified since the last save?  This should be documented.  And if it
> does check, what happens if the cache is outdated? are there any user
> level messages?

The cache refers to files using their "build-id", which is different for 
every build.  If you modify a source file and rebuild, the resulting 
binary will have a new build-id, for which there is no entry in the 
cache.  So the first time you load that new binary, a new index will be 
recomputed.

>> - The saved index file is exactly the same as the output of the "save
>>   gdb-index" command.  It is therefore the exact same content that 
>> would
>>   be found in the .gdb_index or .debug_names section.  We just leave 
>> it
>>   as a standalone file instead of merging it in the binary.
> 
> This should be mentioned in the documentation, IMO.

Ok, I thought this was an internal GDB implementation detail (which 
could be subject to change), but I can add it if you think it is 
important.

>> - The code was made to follow the XDG specification: if the
>>   XDG_CACHE_HOME environment variable, it is used, otherwise it falls
>>   back to ~/.cache/gdb.  It is possible to change it using "set
>>   index-cache directory".  On other OSes than GNU/Linux, ~/.cache may
>>   not be the best place to put such data.  On macOS it should probably
>>   default to ~/Library/Caches/...  On Windows, %LocalAppData%/...  I
>>   don't intend to do this part, but further patches are welcome.
> 
> IMO, we should only use LocalAppData on Windows if we already do that
> for .gdbinit etc.  (And the platform standards on Windows actually
> frown on putting files into that directory; you are supposed to put
> them in an application-specific subdirectory.)

Thanks for the info.  I don't really know how things work on Windows...

>> - I think that we need to be careful that multiple instances of GDB
>>   don't interfere with each other (not far fetched at all if you run 
>> GDB
>>   in some automated script) and the cache is always coherent (either 
>> the
>>   file is not found, or it is found and entirely valid).  Writing the
>>   file directly to its final location seems like a recipe for failure.
>>   One GDB could read a file in the index while it is being written by
>>   another GDB.  To mitigate this, I made write_psymtabs_to_index write
>>   to temporary files and rename them once it's done.  Two GDB 
>> instances
>>   writing the index for the same file should not step on each other's
>>   toes (the last file to be renamed will stay).  A GDB looking up a 
>> file
>>   will only see a complete file or no file.  Also, if GDB crashes 
>> while
>>   generating the index file, it will leave a work-in-progress file, 
>> but
>>   it won't be picked up by other instances looking up in the cache.
> 
> Maybe we should consider some kind of interlocking, like Emacs does.

That may be needed when we implement some more advanced things, like 
auto-deletion of entries when the cache has grown past a certain size.

>> +  while (1)
>> +    {
>> +      /* Find the beginning of the next component.  */
>> +      while (*component_start == '/')
>> +	component_start++;
>> +
>> +      /* Are we done?  */
>> +      if (*component_start == '\0')
>> +	return;
>> +
>> +      /* Find the slash or null-terminator after this component.  */
>> +      component_end = component_start;
>> +      while (*component_end != '/' && *component_end != '\0')
>> +	component_end++;
> 
> I believe literal uses of '/' are non-portable; you should use
> IS_DIR_SEPARATOR (defined in filenames.h) instead.  You also cannot
> portably assume an absolute file name always starts with a slash.

Ok.

>> +      /* If we get EEXIST and the existing path is a directory, then 
>> we're
>> +         happy.  If it exists, but it's a regular file and this is 
>> not the last
>> +         component, we'll fail at the next component.  If this is the 
>> last
>> +         component, the caller will fail with ENOTDIR when trying to
>> +         open/create a file under that path.  */
>> +      if (mkdir (start, 0700) != 0)
> 
> This will not work on Windows.

Which part of it?

>> +	if (errno != EEXIST)
>> +	  return;
> 
> I think on Windows the value of errno is different.

Ok.

>> +  std::string filename = make_index_filename (build_id, 
>> INDEX4_SUFFIX);
>> +
>> +  TRY
>> +    {
>> +      if (debug_index_cache)
>> +        printf_unfiltered ("index cache: trying to read %s\n",
>> +			   filename.c_str ());
>> +
>> +      /* Try to map that file.  */
>> +      index_cache_resource_mmap *mmap_resource
>> +	= new index_cache_resource_mmap (filename.c_str ());
>> +
>> +      /* Yay, it worked!  Hand the resource to the caller.  */
>> +      resource->reset (mmap_resource);
>> +
>> +      return gdb::array_view<const gdb_byte>
>> +	  ((const gdb_byte *) mmap_resource->mapping.get (),
>> +	   mmap_resource->mapping.size ());
>> +    }
>> +  CATCH (except, RETURN_MASK_ERROR)
>> +    {
>> +      if (debug_index_cache)
>> +	printf_unfiltered ("index cache: couldn't read %s: %s\n",
>> +			 filename.c_str (), except.message);
>> +    }
>> +  END_CATCH
> 
> Does this mean this feature will only work on systems with mmap?

Indeed.  There is an equivalent API for Windows though I think:

https://docs.microsoft.com/en-us/windows/desktop/Memory/file-mapping

I can't really do the Windows bits, because I just don't have the 
competence to write software for Windows.  Every time I tried to build 
GDB for Windows, I failed miserably (do we have a guide for that?).  So 
my thought (including for the mkdir_recursive function) was that 
somebody who used Windows could later implement the missing parts.  
Also, I guess that somebody debugging native executables on Windows 
wouldn't have a use for the cache, because indices only work for DWARF 
debug info.

Simon

next prev parent reply	other threads:[~2018-07-27 14:01 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-25 22:47 [PATCH v2 0/4] Add a " Simon Marchi
2018-07-25 22:47 ` [PATCH v2 2/4] Introduce mmap_file function Simon Marchi
2018-07-25 22:47 ` [PATCH v2 4/4] Add doc and news for DWARF index cache Simon Marchi
2018-07-27  8:50   ` Eli Zaretskii
2018-07-30 19:04     ` Simon Marchi
2018-07-30 19:23       ` Eli Zaretskii
2018-07-25 22:48 ` [PATCH v2 3/4] Add " Simon Marchi
2018-07-27  9:06   ` Eli Zaretskii
2018-07-27 14:01     ` Simon Marchi [this message]
2018-07-27 21:25       ` Eli Zaretskii
2018-07-30 15:33         ` Simon Marchi
2018-07-30 15:45           ` Simon Marchi
2018-07-30 15:57           ` Eli Zaretskii
2018-07-30 17:05             ` Simon Marchi
2018-07-25 22:48 ` [PATCH v2 1/4] Make index reading functions more modular Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e0035566b1c76e4592790b62cd319b34@polymtl.ca \
    --to=simon.marchi@polymtl.ca \
    --cc=eliz@gnu.org \
    --cc=gdb-patches@sourceware.org \
    --cc=simon.marchi@ericsson.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox