From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14916 invoked by alias); 8 Jan 2008 05:46:03 -0000 Received: (qmail 14905 invoked by uid 22791); 8 Jan 2008 05:46:02 -0000 X-Spam-Check-By: sourceware.org Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 08 Jan 2008 05:45:29 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id A37B82A9681; Tue, 8 Jan 2008 00:45:27 -0500 (EST) Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id ZsnU4t7ZTnkS; Tue, 8 Jan 2008 00:45:27 -0500 (EST) Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id 89DE32A967E; Tue, 8 Jan 2008 00:45:25 -0500 (EST) Received: by joel.gnat.com (Postfix, from userid 1000) id 20629E7ACB; Mon, 7 Jan 2008 21:45:18 -0800 (PST) Date: Tue, 08 Jan 2008 05:46:00 -0000 From: Joel Brobecker To: Doug Evans Cc: Aleksandar Ristovski , gdb@sourceware.org, Ryan Mansfield Subject: Re: gdb_realpath: dealing with ./ and ../ Message-ID: <20080108054518.GB20580@adacore.com> References: <2F6320727174C448A52CEB63D85D11F40A3F@nova.ott.qnx.com> <20080103170846.GA17263@caradoc.them.org> <20080107143212.GA5923@adacore.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="mP3DRpeJDSE+ciuQ" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i Mailing-List: contact gdb-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sourceware.org X-SW-Source: 2008-01/txt/msg00048.txt.bz2 --mP3DRpeJDSE+ciuQ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-length: 3047 Doug, > > Does this sound like it would work? > > It seems like directory names need to be included in the file name > comparison by start_subfile to some extent, otherwise headers with the > same basename will match each other. Yes, it very much looks like I've been pretty naive in my first proposal :). I have thought this over some more, and it looks like indeed manual rewriting of the paths will be needed if we want to be able to handle all the situations currently discussed. So here is a new proposal that adds the dirname matching using a new function that will do the dirname matching first unmodified, and then "normalized". Now, back to your comment regarding changing the DWARF reader: > To what extent is the dwarf patch more complex because it handles the > possibility of multiple matches with the basename of the main source > file. Assume there's only one file with the basename of the main file > name and things become a lot simpler. See > http://sourceware.org/ml/gdb-patches/2008-01/msg00090.html. Maybe I shouldn't have talked about complexity as this may be just me needing time to understand the purpose of your patch. So I withdraw that comment. > Handling the issue in each debug info reader is an important > consideration and may or may not be a problem. One would need to > examine to what extent the issue exists in the other readers, and to > what extent start_subfile can solve it and still be debug-format > independent without being more complex. My suggestion has two ideas behind it: I reallly think that the wrapper around start_subfile that adjusts NAME and DIRNAME so that NAME is always a basename is a good step forward, because it reduces the number of combinations we need to handle during the matching phase later on. We don't have to handle the case where NAME is a full path, or a relative path of a basename. Second, I still believe at this point that the problem should be handled outside of the debug info reader. I know that at least stabs should have the same issues as DWARF. It would be very nice to have the problems fixed in both cases without having to duplicate the algorithms we're developing. > One could rewrite that patch to keep dwarf2_start_subfile, but one > would have to pass an additional parameter so it would know if it's > dealing with the main source file. The patch as submitted just > reorganizes things so that other solutions(/heuristics) are possible > without major reworking of the code (the patch itself had to do some > reworking, but once that's done it's done (in the problem space being > discussed)). Plus it simplifies all call sites to > dwarf2_start_subfile/start_subfile. Previously, each call site had to > process fe->dir_index, and there are three of them. I think that the reorganization will not be necessary. I'd like to make the subfile layer work in a way that the debug info reader would only have to pass the name and dirname as-is, and be confident that it'll just work. -- Joel --mP3DRpeJDSE+ciuQ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="buildsym.c.diff" Content-length: 4628 Index: buildsym.c =================================================================== --- buildsym.c (revision 44) +++ buildsym.c (working copy) @@ -574,13 +574,32 @@ make_blockvector (struct objfile *objfil return (blockvector); } + +static int +paths_likely_equal_p (const char *path1, const char *path2) +{ + /* The purpose of this function is to handle situations where + the path contains ".", or ".." or double-directory separators. + Before doing the comparison, we manually resolve the paths in + so that ".", ".." and double directory separators no longer + appear. This will allow us to performe a pure string comparison + to determine whether two paths are identical or not. */ + + /* If the paths are already identical, then obviously they are equal. */ + if (FILENAME_CMP (path1, path2) == 0) + return 1; + + /* Normalize path1 and path2, and redo the comparison. */ + [...] +} + /* Start recording information about source code that came from an included (or otherwise merged-in) source file with a different name. NAME is the name of the file (cannot be NULL), DIRNAME is the directory in which the file was compiled (or NULL if not known). */ void -start_subfile (char *name, char *dirname) +start_subfile_1 (char *name, char *dirname) { struct subfile *subfile; @@ -589,27 +608,24 @@ start_subfile (char *name, char *dirname for (subfile = subfiles; subfile; subfile = subfile->next) { - char *subfile_name; - - /* If NAME is an absolute path, and this subfile is not, then - attempt to create an absolute path to compare. */ - if (IS_ABSOLUTE_PATH (name) - && !IS_ABSOLUTE_PATH (subfile->name) - && subfile->dirname != NULL) - subfile_name = concat (subfile->dirname, SLASH_STRING, - subfile->name, NULL); - else - subfile_name = subfile->name; - - if (FILENAME_CMP (subfile_name, name) == 0) - { + /* First, verify that the basename matches. */ + if (FILENAME_CMP (subfile->name, name) != 0) + continue; + + /* Next, verify that the directory name matches too. Some programs + may include two files with the same name but in a different + directory. + + If either DIRNAME or SUBFILE->DIRNAME is NULL, then we cannot + do the verification. In that case, we cannot do the verification, + but the safest bet is that the files are identical. */ + if (dirname == NULL + || subfile->dirname == NULL + || paths_likely_equal_p (dirname, subfile->dirname)) + { current_subfile = subfile; - if (subfile_name != subfile->name) - xfree (subfile_name); return; - } - if (subfile_name != subfile->name) - xfree (subfile_name); + } } /* This subfile is not known. Add an entry for it. Make an entry @@ -681,6 +697,53 @@ start_subfile (char *name, char *dirname } } +/* This function is a wrapper around start_subfile_1 which rewrites + a bit NAME and DIRNAME to make sure that we are always consistent + in how we create subfiles: Turn NAME into a basename, and adjust + DIRNAME accordingly. */ + +void +start_subfile (char *name, char *dirname) +{ + char *new_name, new_dir_name; + + new_name = lbasename (name); + + /* The simple usual case: NAME is the name of a file without any path + (in other words, it's already a basename). In that case, we use + DIR_NAME as is. */ + if (FILENAME_CMP (new_name, name) == 0) + new_dir_name = dir_name; + + /* Another simple case: NAME is an absolute PATH. In that case, + we can ignore the compilation dir, and use the path inside NAME + as the dir name. */ + else if (IS_ABSOLUTE_PATH (name)) + new_dir_name = ldirname (name); + + /* Otherwise, NAME is a relative PATH, so we extract the relative + path from name, and append it at the end of DIRNAME. IF DIRNAME + is NULL, then we use that relative path as the DIRNAME. */ + else + { + char *name_relative_path = ldirname (name); + + if (dirname != NULL) + { + new_dir_name = concat (dirname, SLASH_STRING, name_relative_path, + NULL); + xfree (name_relative_path); + } + else + new_dir_name = name_relative_path; + } + + start_subfile_1 (new_name, new_dir_name); + + xfree (new_name); + xfree (new_dir_name); +} + /* For stabs readers, the first N_SO symbol is assumed to be the source file name, and the subfile struct is initialized using that assumption. If another N_SO symbol is later seen, immediately --mP3DRpeJDSE+ciuQ--