Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Mike Gulick <mgulick@mathworks.com>
To: Andrew Burgess <andrew.burgess@embecosm.com>
Cc: Mike Gulick <mgulick@mathworks.com>, Eli Zaretskii <eliz@gnu.org>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [RFC] Apply compilation dir to source_path lookup
Date: Mon, 16 Sep 2019 15:53:00 -0000	[thread overview]
Message-ID: <dad14acc-1fce-5eec-390c-51c2b8256112@mathworks.com> (raw)
In-Reply-To: <20190915040101.GZ6076@embecosm.com>

On 9/15/19 12:01 AM, Andrew Burgess wrote:
> * Mike Gulick <mgulick@mathworks.com> [2019-09-15 02:07:37 +0000]:
> 
>> On 9/14/19 11:54 AM, Eli Zaretskii wrote:
>>>> Date: Sat, 14 Sep 2019 11:28:47 -0400
>>>> From: Andrew Burgess <andrew.burgess@embecosm.com>
>>>> Cc: mgulick@mathworks.com, gdb-patches@sourceware.org
>>>>
>>>> If we have:
>>>>
>>>>    DW_AT_name:  	c:/project/foo.c
>>>>    DW_AT_comp_dir:	d:/project/build/
>>>>
>>>> And if the directory search path is:
>>>>
>>>>    g:/mnt/project;$cdir;$cwd
>>>>
>>>> And the current directory is:
>>>>
>>>>    h:/home/
>>>>
>>>> Then this is what I think the search order will be:
>>>>
>>>>    c:/project/foo.c
>>>>    g:/mnt/project/project/foo.c
>>>>    d:/project/build/project/foo.c
>>>>    h:/home/project/foo.c
>>>>
>>>>    d:/project/build/project/foo.c
>>>>    g:/mnt/project/project/build/project/foo.c
>>>>    d:/project/build/project/build/project/foo.c
>>>>    h:/home/project/build/project/foo.c
>>>>
>>>>    g:/mnt/project/foo.c
>>>>    d:/project/build/foo.c
>>>>    h:/home/foo.c
>>>>
>>>> The first and third block is what we currently (pre-patch) do, and the
>>>> second block is what I think the patch should provide after Mike's
>>>> last suggested update.
>>>
>>> SGTM, thanks.
>>>
>>
>> Thanks Andrew and Eli for the clarification on the proper behavior on
>> DOS paths.  The only remaining questionable case for me is whether to
>> check $cdir/COMP_DIR/FILENAME, which in your example above is
>> 'd:/project/build/project/build/project/foo.c'.  It seems reasonable
>> to me to leave this out of the search, although one could argue
>> leaving it on the source path may make the documentation slightly more
>> straightforward (one less special case).
>>
>> Please let me know if you would like me to provide an updated version
>> of the patch.
> 
> Personally I don't see it as a problem.  I agree it is unlikely to be
> something that ever makes sense, but we are already trying to access
> lots of paths that don't exist, I think throwing one more into the mix
> isn't going to hurt.
> 
> I took another stab at the documentation, this time I've moved the
> explanations for $cdir and $cwd earlier in the section - previously
> these were described really late on, and it made writing the earlier
> text hard as I was trying not to reference these things before they
> were introduced.
> 
> The docs now focus more on some worked examples rather than trying to
> just explain the algorithm in text.  I hope this is better.  I've also
> included an example of how MS drive letters are handled.

Thanks.  There's a lot of possible paths to search now, so I think
seeing then written out is helpful.  I have some minor corrections and
suggestions annotated in the code below.

> On the code side I have made one change from Mike's last patch,
> looking in openp at the code that glues filenames onto directories it
> had three preprocessing steps:
> 
>   1. Remove drive letters (for dos),
>   2. Remove leading slashes (/foo.c -> foo.c), and
>   3. Remove leading './' sequences (./foo.c -> foo.c).
> 
> I've now factored these three steps out and we share them between
> openp and open_and_find_source, otherwise the code is unchanged from
> Mike's last patch.

I also have a minor suggestion here that may handle one additional edge
case, please see below.

> 
> Let me know what you think.
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit 70bbf4fc4358a8c10b20faf565ab3b0dee2a20c7
> Author: Mike Gulick <mgulick@mathworks.com>
> Date:   Thu Sep 12 11:16:06 2019 -0400
> 
>     gdb: Look for compilation directory relative to directory search path
>     
>     The 'directory' command allows the user to provide a list of filesystem
>     directories in which to search for source code.  The directories in this
>     search path are used as the base directory for the source filename from
>     the debug information (DW_AT_name).  Thus the directory search path
>     provides alternatives to the existing compilation directory from the
>     debug information (DW_AT_comp_dir).  Generally speaking, DW_AT_name
>     stores the filename argument passed to the compiler (including any
>     directory components), and DW_AT_comp_dir stores the current working
>     directory from which the compiler was executed.  For example:
>     
>         $ cd /path/to/project/subdir1
>         $ gcc -c a/test.c -g
>     
>     The corresponding debug information will look like this:
>     
>         DW_AT_name      : a/test.c
>         DW_AT_comp_dir  : /path/to/project/subdir1
>     
>     When compiling with the -fdebug-prefix-map GCC option, the compilation
>     directory can be arbitrarily rewritten.  In the above example, we may
>     rewrite the compilation directory as follows:
>     
>         $ gcc -c a/test.c -g -fdebug-prefix-map=/path/to/project=
>     
>     In this case, the corresponding debug information will look like:
>     
>         DW_AT_name      : a/test.c
>         DW_AT_comp_dir  : /subdir1
>     
>     This prevents GDB from finding the corresponding source code based on
>     the debug information alone.  In some cases, a substitute-path command
>     can be used to re-map a consistent prefix in the rewritten compilation
>     directory to the real filesystem path.  However, there may not be a
>     consistent prefix remaining in the debug symbols (for example in a
>     project that has source code in many subdirectories under the project's
>     root), thereby requiring multiple substitute-path rules.  In this case,
>     it is easier to add the missing prefix to the directory search path via
>     the 'directory' command.
>     
>     The function find_and_open_source currently searches in:
>     
>         SEARCH_PATH/FILENAME
>     
>     where SEARCH_PATH corresponds to each individual entry in the directory
>     search path (which is guaranteed to contain the compilation directory
>     from the debug information, as well as the current working directory).
>     FILENAME corresponds to the source filename (DW_AT_name), which may have
>     directory components in it.  In addition, GDB searches in:
>     
>         SEARCH_PATH/FILE_BASENAME
>     
>     where FILE_BASENAME is the basename of the DW_AT_name entry.
>     
>     This change modifies find_and_open_source to additionally search in:
>     
>         SEARCH_PATH/COMP_DIR/FILENAME
>     
>     where COMP_DIR is the compilation directory from the debug symbols.  In
>     the example given earlier, running:
>     
>         (gdb) directory /path/to/project
>     
>     will now allow GDB to correctly locate the source code from the debug
>     information.
>     
>     gdb/ChangeLog:
>     
>             * source.c (prepare_path_for_appending): New function.
>             (openp): Make use of new function.
>             (find_and_open_source): Search for the compilation directory and
>             source file as a relative path beneath the directory search path.
>     
>     gdb/doc/ChangeLog:
>     
>             * gdb.texinfo (Source Path): Additional text to better describe
>             how the source path directory list is used when searching for
>             source files.
>     
>     gdb/testsuite/ChangeLog:
>     
>             * gdb.base/source-dir.exp: Add extra test for mapped compilation
>             directory.
> 
> diff --git a/gdb/ChangeLog b/gdb/ChangeLog
> index 6d5f19d04bb..a173113929a 100644
> --- a/gdb/ChangeLog
> +++ b/gdb/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-09-12  Mike Gulick  <mgulick@mathworks.com>
> +
> +	* source.c (prepare_path_for_appending): New function.
> +	(openp): Make use of new function.
> +	(find_and_open_source): Search for the compilation directory and
> +	source file as a relative path beneath the directory search path.
> +
>  2019-09-12  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>  
>  	* procfs.c (procfs_target::wait) <PR_FAULTED>: Get signal from
> diff --git a/gdb/doc/ChangeLog b/gdb/doc/ChangeLog
> index 96c04091973..97c99663985 100644
> --- a/gdb/doc/ChangeLog
> +++ b/gdb/doc/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-09-12  Andrew Burgess  <andrew.burgess@embecosm.com>
> +
> +	* gdb.texinfo (Source Path): Additional text to better describe
> +	how the source path directory list is used when searching for
> +	source files.
> +
>  2019-09-10  Tom Tromey  <tromey@adacore.com>
>  
>  	* gdb.texinfo (Index Files): Update Ada text.
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 79824a0226a..20ab1e576ec 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -8954,11 +8954,20 @@
>  in the list, until it finds a file with the desired name.
>  
>  For example, suppose an executable references the file
> -@file{/usr/src/foo-1.0/lib/foo.c}, and our source path is
> -@file{/mnt/cross}.  The file is first looked up literally; if this
> -fails, @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c} is tried; if this
> -fails, @file{/mnt/cross/foo.c} is opened; if this fails, an error
> -message is printed.  @value{GDBN} does not look up the parts of the
> +@file{/usr/src/foo-1.0/lib/foo.c} and does not record a compilation
                                    ^^^^^^^^^^^^^
                                    , does not

> +directory, the @dfn{source path} is @file{/mnt/cross}. @value{GDBN}
              ^^^^
              and the

> +would look for the source file in the following locations:
> +
> +@enumerate
> +
> +@item @file{/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +
> +@end enumerate
> +
> +If the source file is not present at any of the above locations then
> +an error is printed.  @value{GDBN} does not look up the parts of the
>  source file name, such as @file{/mnt/cross/src/foo-1.0/lib/foo.c}.
>  Likewise, the subdirectories of the source path are not searched: if
>  the source path is @file{/mnt/cross}, and the binary refers to
> @@ -8966,11 +8975,91 @@
>  @file{/mnt/cross/usr/src/foo-1.0/lib}.
>  
>  Plain file names, relative file names with leading directories, file
> -names containing dots, etc.@: are all treated as described above; for
> -instance, if the source path is @file{/mnt/cross}, and the source file
> -is recorded as @file{../lib/foo.c}, @value{GDBN} would first try
> -@file{../lib/foo.c}, then @file{/mnt/cross/../lib/foo.c}, and after
> -that---@file{/mnt/cross/foo.c}.
> +names containing dots, etc.@: are all treated as described above,
> +except that non-absolute file names are not looked up literally.  If
> +the @dfn{source path} is @file{/mnt/cross}, and the source file is
                                               ^^^
                                               no 'and' here

> +recorded as @file{../lib/foo.c} and no compilation directory is
                                  ^
                                  comma here

> +recorded then @value{GDBN} will search in the following locations:
           ^
           comma here
> +
> +@enumerate
> +
> +@item @file{/mnt/cross/../lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +
> +@end enumerate
> +
> +@kindex cdir
> +@kindex cwd
> +@vindex $cdir@r{, convenience variable}
> +@vindex $cwd@r{, convenience variable}
> +@cindex compilation directory
> +@cindex current directory
> +@cindex working directory
> +@cindex directory, current
> +@cindex directory, compilation
> +The @dfn{source path} will always include two special entries
> +@samp{$cdir} and @samp{$cwd}, these refer to the compilation directory
> +(if one is recorded) and the current workig directory respectively.
                                        ^^^^^^^
                                        working
> +
> +@samp{cdir} causes @value{GDBN} to search within the compilation
         ^^^^
         $cdir

> +directory, if one is recorded in the debug information.  If no
> +compilation directory is recorded in the debug information then
> +@samp{cdir} is ignored.
         ^^^^
         $cdir

> +
> +@samp{$cwd} is not the same as @samp{.}---the former tracks the
> +current working directory as it changes during your @value{GDBN}
> +session, while the latter is immediately expanded to the current
> +directory at the time you add an entry to the source path.
> +
> +If a compilation directory is recorded in the debug information, and
> +@value{GDBN} has not found the source file after the first search
> +using @dfn{source path}, then @value{GDBN} will combine the
> +compilation directory and the filename, and then search for the source
> +file again using the @dfn{source path}.
> +
> +For example, if the executable records the source file as
> +@file{/usr/src/foo-1.0/lib/foo.c} the compilation directory is
                                    ^
                                    comma here

> +recorded as @file{/project/build}, and the @dfn{source path} is
> +@file{/mnt/cross;$cdir;$cwd} while the current working directory of
         ^^^^^^^^^^^^^^^^^^^^^
         /mnt/cross:$cdir:$cwd

It is probably better to use colons rather than semicolons since the
example uses unix paths.

> +the @value{GDBN} session is @file{/home/user}, then @value{GDBN} will
> +search for the source file in the following loctions:
> +
> +@enumerate
> +
> +@item @file{/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/home/user/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/project/build/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/home/user/project/build/usr/src/foo-1.0/lib/foo.c}
> +@item @file{/mnt/cross/foo.c}
> +@item @file{/project/build/foo.c}
> +@item @file{/home/user/foo.c}
> +
> +@end enumerate
> +
> +If the file name in the previous example had been recorded in the
> +executable as a relative path rather than an absolute path, then the
> +first look up would not have occurred, but all of the remaining steps
> +would be similar.
> +
> +When searching for source files on MS-DOS and MS-Windows, where
> +absolute paths start with a drive letter (e.g.
> +@file{C:/project/foo.c}), @value{GDBN} will remove the drive letter
> +from the file name before appending it to a search directory from
> +@dfn{source path}; for instance if the executable references the
> +source file @file{C:/project/foo.c} and @dfn{source path} is set to
> +@file{D:/mnt/cross}, then @value{GDBN} will search in the following
> +locations for the source file:
> +
> +@enumerate
> +
> +@item @file{C:/project/foo.c}
> +@item @file{D:/mnt/cross/project/foo.c}
> +@item @file{D:/mnt/cross/foo.c}
> +
> +@end enumerate
>  
>  Note that the executable search path is @emph{not} used to locate the
>  source files.

> @@ -9058,21 +9147,12 @@
>  whitespace.  You may specify a directory that is already in the source
>  path; this moves it forward, so @value{GDBN} searches it sooner.
>  
> -@kindex cdir
> -@kindex cwd
> -@vindex $cdir@r{, convenience variable}
> -@vindex $cwd@r{, convenience variable}
> -@cindex compilation directory
> -@cindex current directory
> -@cindex working directory
> -@cindex directory, current
> -@cindex directory, compilation
> -You can use the string @samp{$cdir} to refer to the compilation
> -directory (if one is recorded), and @samp{$cwd} to refer to the current
> -working directory.  @samp{$cwd} is not the same as @samp{.}---the former
> -tracks the current working directory as it changes during your @value{GDBN}
> -session, while the latter is immediately expanded to the current
> -directory at the time you add an entry to the source path.
> +The special strings @samp{$cdir} (to refer to the compilation
> +directory, if one is recorded), and @samp{$cwd} (to refer to the
> +current working directory) can also be included in the list of
> +directories @var{dirname}.  Though these will already be in the source
> +path they will be moved forward in the list so @value{GDBN} searches
> +them sooner.
>  
>  @item directory
>  Reset the source path to its default value (@samp{$cdir:$cwd} on Unix systems).  This requires confirmation.

I would also change the other spot that uses the names 'cdir' and 'cwd'
to be '$cdir' and '$cwd'.  I don't see any value in referring to
these special names without the leading '$':

@@ -9070,8 +9071,8 @@ each line is in the file.
 
 @kindex directory
 @kindex dir
-When you start @value{GDBN}, its source path includes only @samp{cdir}
-and @samp{cwd}, in that order.
+When you start @value{GDBN}, its source path includes only @samp{$cdir}
+and @samp{$cwd}, in that order.
 To add other directories, use the @code{directory} command.
 
 The search path is used to find both program source files and @value{GDBN}

> diff --git a/gdb/source.c b/gdb/source.c
> index b27f210802c..ece1ad3d16b 100644
> --- a/gdb/source.c
> +++ b/gdb/source.c
> @@ -654,6 +654,29 @@ info_source_command (const char *ignore, int from_tty)
>  }
>  \f
>  
> +/* Helper function to remove characters from the start of PATH so that
> +   PATH can then be appended to a directory name.  We remove leading drive
> +   letters (for dos) as well as leading '/' characters and './'
> +   sequences.  */
> +
> +const char *
> +prepare_path_for_appending (const char *path)
> +{
> +  /* For dos paths, d:/foo -> /foo, and d:foo -> foo.  */
> +  if (HAS_DRIVE_SPEC (path))
> +    path = STRIP_DRIVE_SPEC (path);
> +
> +  /* /foo => foo, to avoid multiple slashes that Emacs doesn't like.  */
> +  while (IS_DIR_SEPARATOR(path[0]))
> +    path++;
> +
> +  /* ./foo => foo */
> +  while (path[0] == '.' && IS_DIR_SEPARATOR (path[1]))
> +    path += 2;
> +
> +  return path;
> +}
> +

This may be a little pedantic, but looping around the removal of the '/'
and './' would handle a case like './/':

/* Helper function to remove characters from the start of PATH so that
   PATH can then be appended to a directory name.  We remove leading drive
   letters (for dos) as well as leading '/' characters and './'
   sequences.  */

const char *
prepare_path_for_appending (const char *path)
{
  /* For dos paths, d:/foo -> /foo, and d:foo -> foo.  */
  if (HAS_DRIVE_SPEC (path))
    path = STRIP_DRIVE_SPEC (path);

  const char *oldpath;
  /* Remove any sequence of '/' and './'.  */
  do {
    oldpath = path;
    /* /foo => foo, to avoid multiple slashes that Emacs doesn't like.  */
    if (IS_DIR_SEPARATOR(path[0]))
      path++;

    /* ./foo => foo */
    if (path[0] == '.' && IS_DIR_SEPARATOR (path[1]))
      path += 2;
  } while (path != oldpath);

  return path;
}


Feel free to use your discretion regarding these changes, I'm fine
either way.  Thanks a lot for all of the help.  Again, let me know if
there's anything else I should do.

Thanks,
Mike


  parent reply	other threads:[~2019-09-16 15:53 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-05 22:40 Mike Gulick
2019-09-07 23:51 ` Andrew Burgess
2019-09-09 22:41   ` Mike Gulick
2019-09-13  1:38     ` Andrew Burgess
2019-09-13  6:36       ` Eli Zaretskii
2019-09-13  7:28         ` Eli Zaretskii
2019-09-13 22:45           ` Andrew Burgess
2019-09-13 22:52             ` Mike Gulick
2019-09-14  7:11               ` Eli Zaretskii
2019-09-17 20:22               ` Andrew Burgess
2019-09-17 20:39                 ` Mike Gulick
2019-09-14  6:56             ` Eli Zaretskii
2019-09-14 15:28               ` Andrew Burgess
2019-09-14 15:54                 ` Eli Zaretskii
2019-09-15  2:07                   ` Mike Gulick
2019-09-15  4:01                     ` Andrew Burgess
2019-09-15 15:29                       ` Eli Zaretskii
2019-09-16 15:53                       ` Mike Gulick [this message]
2019-09-13 22:11         ` Mike Gulick
2019-09-13 22:41           ` Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dad14acc-1fce-5eec-390c-51c2b8256112@mathworks.com \
    --to=mgulick@mathworks.com \
    --cc=andrew.burgess@embecosm.com \
    --cc=eliz@gnu.org \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox