From: Simon Marchi <simon.marchi@polymtl.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Simon Marchi <simon.marchi@ericsson.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 3/4] Add DWARF index cache
Date: Fri, 27 Jul 2018 14:01:00 -0000 [thread overview]
Message-ID: <e0035566b1c76e4592790b62cd319b34@polymtl.ca> (raw)
In-Reply-To: <83o9etrqsj.fsf@gnu.org>
On 2018-07-27 05:06, Eli Zaretskii wrote:
>> From: Simon Marchi <simon.marchi@ericsson.com>
>> CC: Simon Marchi <simon.marchi@ericsson.com>
>> Date: Wed, 25 Jul 2018 18:47:03 -0400
>>
>> GDB can generate indexes for DWARF debug information, which, when
>> integrated in the original binary, can speed up loading object files.
>> This can be done using the gdb-add-index script or directly by the
>> linker itself.
>
> The information in the last sentence is not mentioned in the
> documentation patch you provided. I think we should mention that.
I think this is already well covered by the rest of the "18.5 Index
Files Speed Up GDB" section, just before the content I added, isn't it?
>> To help make using the DWARF index more transparent, this patch
>> introduces a DWARF index cache. When enabled, loading an index-less
>> binary in GDB will automatically save an index file in ~/.cache/gdb.
>> When loading that same object file again, the index file will be
>> looked
>> up and used to load the DWARF index. You therefore get the benefit of
>> the DWARF index without having to do additional manual steps or
>> modifying your build system. When an index section is already present
>> in the file, GDB will prefer that one over looking up the cache.
>
> Does GDB check whether the cache became stale, i.e. the program was
> modified since the last save? This should be documented. And if it
> does check, what happens if the cache is outdated? are there any user
> level messages?
The cache refers to files using their "build-id", which is different for
every build. If you modify a source file and rebuild, the resulting
binary will have a new build-id, for which there is no entry in the
cache. So the first time you load that new binary, a new index will be
recomputed.
>> - The saved index file is exactly the same as the output of the "save
>> gdb-index" command. It is therefore the exact same content that
>> would
>> be found in the .gdb_index or .debug_names section. We just leave
>> it
>> as a standalone file instead of merging it in the binary.
>
> This should be mentioned in the documentation, IMO.
Ok, I thought this was an internal GDB implementation detail (which
could be subject to change), but I can add it if you think it is
important.
>> - The code was made to follow the XDG specification: if the
>> XDG_CACHE_HOME environment variable, it is used, otherwise it falls
>> back to ~/.cache/gdb. It is possible to change it using "set
>> index-cache directory". On other OSes than GNU/Linux, ~/.cache may
>> not be the best place to put such data. On macOS it should probably
>> default to ~/Library/Caches/... On Windows, %LocalAppData%/... I
>> don't intend to do this part, but further patches are welcome.
>
> IMO, we should only use LocalAppData on Windows if we already do that
> for .gdbinit etc. (And the platform standards on Windows actually
> frown on putting files into that directory; you are supposed to put
> them in an application-specific subdirectory.)
Thanks for the info. I don't really know how things work on Windows...
>> - I think that we need to be careful that multiple instances of GDB
>> don't interfere with each other (not far fetched at all if you run
>> GDB
>> in some automated script) and the cache is always coherent (either
>> the
>> file is not found, or it is found and entirely valid). Writing the
>> file directly to its final location seems like a recipe for failure.
>> One GDB could read a file in the index while it is being written by
>> another GDB. To mitigate this, I made write_psymtabs_to_index write
>> to temporary files and rename them once it's done. Two GDB
>> instances
>> writing the index for the same file should not step on each other's
>> toes (the last file to be renamed will stay). A GDB looking up a
>> file
>> will only see a complete file or no file. Also, if GDB crashes
>> while
>> generating the index file, it will leave a work-in-progress file,
>> but
>> it won't be picked up by other instances looking up in the cache.
>
> Maybe we should consider some kind of interlocking, like Emacs does.
That may be needed when we implement some more advanced things, like
auto-deletion of entries when the cache has grown past a certain size.
>> + while (1)
>> + {
>> + /* Find the beginning of the next component. */
>> + while (*component_start == '/')
>> + component_start++;
>> +
>> + /* Are we done? */
>> + if (*component_start == '\0')
>> + return;
>> +
>> + /* Find the slash or null-terminator after this component. */
>> + component_end = component_start;
>> + while (*component_end != '/' && *component_end != '\0')
>> + component_end++;
>
> I believe literal uses of '/' are non-portable; you should use
> IS_DIR_SEPARATOR (defined in filenames.h) instead. You also cannot
> portably assume an absolute file name always starts with a slash.
Ok.
>> + /* If we get EEXIST and the existing path is a directory, then
>> we're
>> + happy. If it exists, but it's a regular file and this is
>> not the last
>> + component, we'll fail at the next component. If this is the
>> last
>> + component, the caller will fail with ENOTDIR when trying to
>> + open/create a file under that path. */
>> + if (mkdir (start, 0700) != 0)
>
> This will not work on Windows.
Which part of it?
>> + if (errno != EEXIST)
>> + return;
>
> I think on Windows the value of errno is different.
Ok.
>> + std::string filename = make_index_filename (build_id,
>> INDEX4_SUFFIX);
>> +
>> + TRY
>> + {
>> + if (debug_index_cache)
>> + printf_unfiltered ("index cache: trying to read %s\n",
>> + filename.c_str ());
>> +
>> + /* Try to map that file. */
>> + index_cache_resource_mmap *mmap_resource
>> + = new index_cache_resource_mmap (filename.c_str ());
>> +
>> + /* Yay, it worked! Hand the resource to the caller. */
>> + resource->reset (mmap_resource);
>> +
>> + return gdb::array_view<const gdb_byte>
>> + ((const gdb_byte *) mmap_resource->mapping.get (),
>> + mmap_resource->mapping.size ());
>> + }
>> + CATCH (except, RETURN_MASK_ERROR)
>> + {
>> + if (debug_index_cache)
>> + printf_unfiltered ("index cache: couldn't read %s: %s\n",
>> + filename.c_str (), except.message);
>> + }
>> + END_CATCH
>
> Does this mean this feature will only work on systems with mmap?
Indeed. There is an equivalent API for Windows though I think:
https://docs.microsoft.com/en-us/windows/desktop/Memory/file-mapping
I can't really do the Windows bits, because I just don't have the
competence to write software for Windows. Every time I tried to build
GDB for Windows, I failed miserably (do we have a guide for that?). So
my thought (including for the mkdir_recursive function) was that
somebody who used Windows could later implement the missing parts.
Also, I guess that somebody debugging native executables on Windows
wouldn't have a use for the cache, because indices only work for DWARF
debug info.
Simon
next prev parent reply other threads:[~2018-07-27 14:01 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-25 22:47 [PATCH v2 0/4] Add a " Simon Marchi
2018-07-25 22:47 ` [PATCH v2 2/4] Introduce mmap_file function Simon Marchi
2018-07-25 22:47 ` [PATCH v2 4/4] Add doc and news for DWARF index cache Simon Marchi
2018-07-27 8:50 ` Eli Zaretskii
2018-07-30 19:04 ` Simon Marchi
2018-07-30 19:23 ` Eli Zaretskii
2018-07-25 22:48 ` [PATCH v2 3/4] Add " Simon Marchi
2018-07-27 9:06 ` Eli Zaretskii
2018-07-27 14:01 ` Simon Marchi [this message]
2018-07-27 21:25 ` Eli Zaretskii
2018-07-30 15:33 ` Simon Marchi
2018-07-30 15:45 ` Simon Marchi
2018-07-30 15:57 ` Eli Zaretskii
2018-07-30 17:05 ` Simon Marchi
2018-07-25 22:48 ` [PATCH v2 1/4] Make index reading functions more modular Simon Marchi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e0035566b1c76e4592790b62cd319b34@polymtl.ca \
--to=simon.marchi@polymtl.ca \
--cc=eliz@gnu.org \
--cc=gdb-patches@sourceware.org \
--cc=simon.marchi@ericsson.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox