From: Mike Gulick <mgulick@mathworks.com>
To: Andrew Burgess <andrew.burgess@embecosm.com>
Cc: Mike Gulick <mgulick@mathworks.com>, Eli Zaretskii <eliz@gnu.org>,
"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [RFC] Apply compilation dir to source_path lookup
Date: Mon, 16 Sep 2019 15:53:00 -0000 [thread overview]
Message-ID: <dad14acc-1fce-5eec-390c-51c2b8256112@mathworks.com> (raw)
In-Reply-To: <20190915040101.GZ6076@embecosm.com>
On 9/15/19 12:01 AM, Andrew Burgess wrote:
> * Mike Gulick <mgulick@mathworks.com> [2019-09-15 02:07:37 +0000]:
>
>> On 9/14/19 11:54 AM, Eli Zaretskii wrote:
>>>> Date: Sat, 14 Sep 2019 11:28:47 -0400
>>>> From: Andrew Burgess <andrew.burgess@embecosm.com>
>>>> Cc: mgulick@mathworks.com, gdb-patches@sourceware.org
>>>>
>>>> If we have:
>>>>
>>>> DW_AT_name: c:/project/foo.c
>>>> DW_AT_comp_dir: d:/project/build/
>>>>
>>>> And if the directory search path is:
>>>>
>>>> g:/mnt/project;$cdir;$cwd
>>>>
>>>> And the current directory is:
>>>>
>>>> h:/home/
>>>>
>>>> Then this is what I think the search order will be:
>>>>
>>>> c:/project/foo.c
>>>> g:/mnt/project/project/foo.c
>>>> d:/project/build/project/foo.c
>>>> h:/home/project/foo.c
>>>>
>>>> d:/project/build/project/foo.c
>>>> g:/mnt/project/project/build/project/foo.c
>>>> d:/project/build/project/build/project/foo.c
>>>> h:/home/project/build/project/foo.c
>>>>
>>>> g:/mnt/project/foo.c
>>>> d:/project/build/foo.c
>>>> h:/home/foo.c
>>>>
>>>> The first and third block is what we currently (pre-patch) do, and the
>>>> second block is what I think the patch should provide after Mike's
>>>> last suggested update.
>>>
>>> SGTM, thanks.
>>>
>>
>> Thanks Andrew and Eli for the clarification on the proper behavior on
>> DOS paths. The only remaining questionable case for me is whether to
>> check $cdir/COMP_DIR/FILENAME, which in your example above is
>> 'd:/project/build/project/build/project/foo.c'. It seems reasonable
>> to me to leave this out of the search, although one could argue
>> leaving it on the source path may make the documentation slightly more
>> straightforward (one less special case).
>>
>> Please let me know if you would like me to provide an updated version
>> of the patch.
>
> Personally I don't see it as a problem. I agree it is unlikely to be
> something that ever makes sense, but we are already trying to access
> lots of paths that don't exist, I think throwing one more into the mix
> isn't going to hurt.
>
> I took another stab at the documentation, this time I've moved the
> explanations for $cdir and $cwd earlier in the section - previously
> these were described really late on, and it made writing the earlier
> text hard as I was trying not to reference these things before they
> were introduced.
>
> The docs now focus more on some worked examples rather than trying to
> just explain the algorithm in text. I hope this is better. I've also
> included an example of how MS drive letters are handled.
Thanks. There's a lot of possible paths to search now, so I think
seeing then written out is helpful. I have some minor corrections and
suggestions annotated in the code below.
> On the code side I have made one change from Mike's last patch,
> looking in openp at the code that glues filenames onto directories it
> had three preprocessing steps:
>
> 1. Remove drive letters (for dos),
> 2. Remove leading slashes (/foo.c -> foo.c), and
> 3. Remove leading './' sequences (./foo.c -> foo.c).
>
> I've now factored these three steps out and we share them between
> openp and open_and_find_source, otherwise the code is unchanged from
> Mike's last patch.
I also have a minor suggestion here that may handle one additional edge
case, please see below.
>
> Let me know what you think.
>
> Thanks,
> Andrew
>
> ---
>
> commit 70bbf4fc4358a8c10b20faf565ab3b0dee2a20c7
> Author: Mike Gulick <mgulick@mathworks.com>
> Date: Thu Sep 12 11:16:06 2019 -0400
>
> gdb: Look for compilation directory relative to directory search path
>
> The 'directory' command allows the user to provide a list of filesystem
> directories in which to search for source code. The directories in this
> search path are used as the base directory for the source filename from
> the debug information (DW_AT_name). Thus the directory search path
> provides alternatives to the existing compilation directory from the
> debug information (DW_AT_comp_dir). Generally speaking, DW_AT_name
> stores the filename argument passed to the compiler (including any
> directory components), and DW_AT_comp_dir stores the current working
> directory from which the compiler was executed. For example:
>
> $ cd /path/to/project/subdir1
> $ gcc -c a/test.c -g
>
> The corresponding debug information will look like this:
>
> DW_AT_name : a/test.c
> DW_AT_comp_dir : /path/to/project/subdir1
>
> When compiling with the -fdebug-prefix-map GCC option, the compilation
> directory can be arbitrarily rewritten. In the above example, we may
> rewrite the compilation directory as follows:
>
> $ gcc -c a/test.c -g -fdebug-prefix-map=/path/to/project=
>
> In this case, the corresponding debug information will look like:
>
> DW_AT_name : a/test.c
> DW_AT_comp_dir : /subdir1
>
> This prevents GDB from finding the corresponding source code based on
> the debug information alone. In some cases, a substitute-path command
> can be used to re-map a consistent prefix in the rewritten compilation
> directory to the real filesystem path. However, there may not be a
> consistent prefix remaining in the debug symbols (for example in a
> project that has source code in many subdirectories under the project's
> root), thereby requiring multiple substitute-path rules. In this case,
> it is easier to add the missing prefix to the directory search path via
> the 'directory' command.
>
> The function find_and_open_source currently searches in:
>
> SEARCH_PATH/FILENAME
>
> where SEARCH_PATH corresponds to each individual entry in the directory
> search path (which is guaranteed to contain the compilation directory
> from the debug information, as well as the current working directory).
> FILENAME corresponds to the source filename (DW_AT_name), which may have
> directory components in it. In addition, GDB searches in:
>
> SEARCH_PATH/FILE_BASENAME
>
> where FILE_BASENAME is the basename of the DW_AT_name entry.
>
> This change modifies find_and_open_source to additionally search in:
>
> SEARCH_PATH/COMP_DIR/FILENAME
>
> where COMP_DIR is the compilation directory from the debug symbols. In
> the example given earlier, running:
>
> (gdb) directory /path/to/project
>
> will now allow GDB to correctly locate the source code from the debug
> information.
>
> gdb/ChangeLog:
>
> * source.c (prepare_path_for_appending): New function.
> (openp): Make use of new function.
> (find_and_open_source): Search for the compilation directory and
> source file as a relative path beneath the directory search path.
>
> gdb/doc/ChangeLog:
>
> * gdb.texinfo (Source Path): Additional text to better describe
> how the source path directory list is used when searching for
> source files.
>
> gdb/testsuite/ChangeLog:
>
> * gdb.base/source-dir.exp: Add extra test for mapped compilation
> directory.
>
> diff --git a/gdb/ChangeLog b/gdb/ChangeLog
> index 6d5f19d04bb..a173113929a 100644
> --- a/gdb/ChangeLog
> +++ b/gdb/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-09-12 Mike Gulick <mgulick@mathworks.com>
> +
> + * source.c (prepare_path_for_appending): New function.
> + (openp): Make use of new function.
> + (find_and_open_source): Search for the compilation directory and
> + source file as a relative path beneath the directory search path.
> +
> 2019-09-12 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
>
> * procfs.c (procfs_target::wait) <PR_FAULTED>: Get signal from
> diff --git a/gdb/doc/ChangeLog b/gdb/doc/ChangeLog
> index 96c04091973..97c99663985 100644
> --- a/gdb/doc/ChangeLog
> +++ b/gdb/doc/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-09-12 Andrew Burgess <andrew.burgess@embecosm.com>
> +
> + * gdb.texinfo (Source Path): Additional text to better describe
> + how the source path directory list is used when searching for
> + source files.
> +
> 2019-09-10 Tom Tromey <tromey@adacore.com>
>
> * gdb.texinfo (Index Files): Update Ada text.
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 79824a0226a..20ab1e576ec 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -8954,11 +8954,20 @@
> in the list, until it finds a file with the desired name.
>
> For example, suppose an executable references the file
> -@file{/usr/src/foo-1.0/lib/foo.c}, and our source path is
> -@file{/mnt/cross}. The file is first looked up literally; if this
> -fails, @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c} is tried; if this
> -fails, @file{/mnt/cross/foo.c} is opened; if this fails, an error
> -message is printed. @value{GDBN} does not look up the parts of the
> +@file{/usr/src/foo-1.0/lib/foo.c} and does not record a compilation
^^^^^^^^^^^^^
, does not
> +directory, the @dfn{source path} is @file{/mnt/cross}. @value{GDBN}
^^^^
and the
> +would look for the source file in the following locations:
> +
> +@enumerate
> +
> +@item @file{/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +
> +@end enumerate
> +
> +If the source file is not present at any of the above locations then
> +an error is printed. @value{GDBN} does not look up the parts of the
> source file name, such as @file{/mnt/cross/src/foo-1.0/lib/foo.c}.
> Likewise, the subdirectories of the source path are not searched: if
> the source path is @file{/mnt/cross}, and the binary refers to
> @@ -8966,11 +8975,91 @@
> @file{/mnt/cross/usr/src/foo-1.0/lib}.
>
> Plain file names, relative file names with leading directories, file
> -names containing dots, etc.@: are all treated as described above; for
> -instance, if the source path is @file{/mnt/cross}, and the source file
> -is recorded as @file{../lib/foo.c}, @value{GDBN} would first try
> -@file{../lib/foo.c}, then @file{/mnt/cross/../lib/foo.c}, and after
> -that---@file{/mnt/cross/foo.c}.
> +names containing dots, etc.@: are all treated as described above,
> +except that non-absolute file names are not looked up literally. If
> +the @dfn{source path} is @file{/mnt/cross}, and the source file is
^^^
no 'and' here
> +recorded as @file{../lib/foo.c} and no compilation directory is
^
comma here
> +recorded then @value{GDBN} will search in the following locations:
^
comma here
> +
> +@enumerate
> +
> +@item @file{/mnt/cross/../lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +
> +@end enumerate
> +
> +@kindex cdir
> +@kindex cwd
> +@vindex $cdir@r{, convenience variable}
> +@vindex $cwd@r{, convenience variable}
> +@cindex compilation directory
> +@cindex current directory
> +@cindex working directory
> +@cindex directory, current
> +@cindex directory, compilation
> +The @dfn{source path} will always include two special entries
> +@samp{$cdir} and @samp{$cwd}, these refer to the compilation directory
> +(if one is recorded) and the current workig directory respectively.
^^^^^^^
working
> +
> +@samp{cdir} causes @value{GDBN} to search within the compilation
^^^^
$cdir
> +directory, if one is recorded in the debug information. If no
> +compilation directory is recorded in the debug information then
> +@samp{cdir} is ignored.
^^^^
$cdir
> +
> +@samp{$cwd} is not the same as @samp{.}---the former tracks the
> +current working directory as it changes during your @value{GDBN}
> +session, while the latter is immediately expanded to the current
> +directory at the time you add an entry to the source path.
> +
> +If a compilation directory is recorded in the debug information, and
> +@value{GDBN} has not found the source file after the first search
> +using @dfn{source path}, then @value{GDBN} will combine the
> +compilation directory and the filename, and then search for the source
> +file again using the @dfn{source path}.
> +
> +For example, if the executable records the source file as
> +@file{/usr/src/foo-1.0/lib/foo.c} the compilation directory is
^
comma here
> +recorded as @file{/project/build}, and the @dfn{source path} is
> +@file{/mnt/cross;$cdir;$cwd} while the current working directory of
^^^^^^^^^^^^^^^^^^^^^
/mnt/cross:$cdir:$cwd
It is probably better to use colons rather than semicolons since the
example uses unix paths.
> +the @value{GDBN} session is @file{/home/user}, then @value{GDBN} will
> +search for the source file in the following loctions:
> +
> +@enumerate
> +
> +@item @file{/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/home/user/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/project/build/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/home/user/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +@item @file{/project/build/foo.c}
> +@item @file{/home/user/foo.c}
> +
> +@end enumerate
> +
> +If the file name in the previous example had been recorded in the
> +executable as a relative path rather than an absolute path, then the
> +first look up would not have occurred, but all of the remaining steps
> +would be similar.
> +
> +When searching for source files on MS-DOS and MS-Windows, where
> +absolute paths start with a drive letter (e.g.
> +@file{C:/project/foo.c}), @value{GDBN} will remove the drive letter
> +from the file name before appending it to a search directory from
> +@dfn{source path}; for instance if the executable references the
> +source file @file{C:/project/foo.c} and @dfn{source path} is set to
> +@file{D:/mnt/cross}, then @value{GDBN} will search in the following
> +locations for the source file:
> +
> +@enumerate
> +
> +@item @file{C:/project/foo.c}
> +@item @file{D:/mnt/cross/project/foo.c}
> +@item @file{D:/mnt/cross/foo.c}
> +
> +@end enumerate
>
> Note that the executable search path is @emph{not} used to locate the
> source files.
> @@ -9058,21 +9147,12 @@
> whitespace. You may specify a directory that is already in the source
> path; this moves it forward, so @value{GDBN} searches it sooner.
>
> -@kindex cdir
> -@kindex cwd
> -@vindex $cdir@r{, convenience variable}
> -@vindex $cwd@r{, convenience variable}
> -@cindex compilation directory
> -@cindex current directory
> -@cindex working directory
> -@cindex directory, current
> -@cindex directory, compilation
> -You can use the string @samp{$cdir} to refer to the compilation
> -directory (if one is recorded), and @samp{$cwd} to refer to the current
> -working directory. @samp{$cwd} is not the same as @samp{.}---the former
> -tracks the current working directory as it changes during your @value{GDBN}
> -session, while the latter is immediately expanded to the current
> -directory at the time you add an entry to the source path.
> +The special strings @samp{$cdir} (to refer to the compilation
> +directory, if one is recorded), and @samp{$cwd} (to refer to the
> +current working directory) can also be included in the list of
> +directories @var{dirname}. Though these will already be in the source
> +path they will be moved forward in the list so @value{GDBN} searches
> +them sooner.
>
> @item directory
> Reset the source path to its default value (@samp{$cdir:$cwd} on Unix systems). This requires confirmation.
I would also change the other spot that uses the names 'cdir' and 'cwd'
to be '$cdir' and '$cwd'. I don't see any value in referring to
these special names without the leading '$':
@@ -9070,8 +9071,8 @@ each line is in the file.
@kindex directory
@kindex dir
-When you start @value{GDBN}, its source path includes only @samp{cdir}
-and @samp{cwd}, in that order.
+When you start @value{GDBN}, its source path includes only @samp{$cdir}
+and @samp{$cwd}, in that order.
To add other directories, use the @code{directory} command.
The search path is used to find both program source files and @value{GDBN}
> diff --git a/gdb/source.c b/gdb/source.c
> index b27f210802c..ece1ad3d16b 100644
> --- a/gdb/source.c
> +++ b/gdb/source.c
> @@ -654,6 +654,29 @@ info_source_command (const char *ignore, int from_tty)
> }
> \f
>
> +/* Helper function to remove characters from the start of PATH so that
> + PATH can then be appended to a directory name. We remove leading drive
> + letters (for dos) as well as leading '/' characters and './'
> + sequences. */
> +
> +const char *
> +prepare_path_for_appending (const char *path)
> +{
> + /* For dos paths, d:/foo -> /foo, and d:foo -> foo. */
> + if (HAS_DRIVE_SPEC (path))
> + path = STRIP_DRIVE_SPEC (path);
> +
> + /* /foo => foo, to avoid multiple slashes that Emacs doesn't like. */
> + while (IS_DIR_SEPARATOR(path[0]))
> + path++;
> +
> + /* ./foo => foo */
> + while (path[0] == '.' && IS_DIR_SEPARATOR (path[1]))
> + path += 2;
> +
> + return path;
> +}
> +
This may be a little pedantic, but looping around the removal of the '/'
and './' would handle a case like './/':
/* Helper function to remove characters from the start of PATH so that
PATH can then be appended to a directory name. We remove leading drive
letters (for dos) as well as leading '/' characters and './'
sequences. */
const char *
prepare_path_for_appending (const char *path)
{
/* For dos paths, d:/foo -> /foo, and d:foo -> foo. */
if (HAS_DRIVE_SPEC (path))
path = STRIP_DRIVE_SPEC (path);
const char *oldpath;
/* Remove any sequence of '/' and './'. */
do {
oldpath = path;
/* /foo => foo, to avoid multiple slashes that Emacs doesn't like. */
if (IS_DIR_SEPARATOR(path[0]))
path++;
/* ./foo => foo */
if (path[0] == '.' && IS_DIR_SEPARATOR (path[1]))
path += 2;
} while (path != oldpath);
return path;
}
Feel free to use your discretion regarding these changes, I'm fine
either way. Thanks a lot for all of the help. Again, let me know if
there's anything else I should do.
Thanks,
Mike
next prev parent reply other threads:[~2019-09-16 15:53 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-05 22:40 Mike Gulick
2019-09-07 23:51 ` Andrew Burgess
2019-09-09 22:41 ` Mike Gulick
2019-09-13 1:38 ` Andrew Burgess
2019-09-13 6:36 ` Eli Zaretskii
2019-09-13 7:28 ` Eli Zaretskii
2019-09-13 22:45 ` Andrew Burgess
2019-09-13 22:52 ` Mike Gulick
2019-09-14 7:11 ` Eli Zaretskii
2019-09-17 20:22 ` Andrew Burgess
2019-09-17 20:39 ` Mike Gulick
2019-09-14 6:56 ` Eli Zaretskii
2019-09-14 15:28 ` Andrew Burgess
2019-09-14 15:54 ` Eli Zaretskii
2019-09-15 2:07 ` Mike Gulick
2019-09-15 4:01 ` Andrew Burgess
2019-09-15 15:29 ` Eli Zaretskii
2019-09-16 15:53 ` Mike Gulick [this message]
2019-09-13 22:11 ` Mike Gulick
2019-09-13 22:41 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dad14acc-1fce-5eec-390c-51c2b8256112@mathworks.com \
--to=mgulick@mathworks.com \
--cc=andrew.burgess@embecosm.com \
--cc=eliz@gnu.org \
--cc=gdb-patches@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox