From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-30635-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 14916 invoked by alias); 8 Jan 2008 05:46:03 -0000
Received: (qmail 14905 invoked by uid 22791); 8 Jan 2008 05:46:02 -0000
X-Spam-Check-By: sourceware.org
Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 08 Jan 2008 05:45:29 +0000
Received: from localhost (localhost.localdomain [127.0.0.1]) 	by filtered-rock.gnat.com (Postfix) with ESMTP id A37B82A9681; 	Tue,  8 Jan 2008 00:45:27 -0500 (EST)
Received: from rock.gnat.com ([127.0.0.1]) 	by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) 	with LMTP id ZsnU4t7ZTnkS; Tue,  8 Jan 2008 00:45:27 -0500 (EST)
Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) 	by rock.gnat.com (Postfix) with ESMTP id 89DE32A967E; 	Tue,  8 Jan 2008 00:45:25 -0500 (EST)
Received: by joel.gnat.com (Postfix, from userid 1000) 	id 20629E7ACB; Mon,  7 Jan 2008 21:45:18 -0800 (PST)
Date: Tue, 08 Jan 2008 05:46:00 -0000
From: Joel Brobecker <brobecker@adacore.com>
To: Doug Evans <dje@google.com>
Cc: Aleksandar Ristovski <ARistovski@qnx.com>, gdb@sourceware.org, 	Ryan Mansfield <RMansfield@qnx.com>
Subject: Re: gdb_realpath: dealing with ./ and ../
Message-ID: <20080108054518.GB20580@adacore.com>
References: <2F6320727174C448A52CEB63D85D11F40A3F@nova.ott.qnx.com> <20080103170846.GA17263@caradoc.them.org> <20080107143212.GA5923@adacore.com> <e394668d0801070900y6ea0df1ek5a654383376bc7f3@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="mP3DRpeJDSE+ciuQ"
Content-Disposition: inline
In-Reply-To: <e394668d0801070900y6ea0df1ek5a654383376bc7f3@mail.gmail.com>
User-Agent: Mutt/1.4.2.2i
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2008-01/txt/msg00048.txt.bz2


--mP3DRpeJDSE+ciuQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 3047

Doug,

> > Does this sound like it would work?
> 
> It seems like directory names need to be included in the file name
> comparison by start_subfile to some extent, otherwise headers with the
> same basename will match each other.

Yes, it very much looks like I've been pretty naive in my first
proposal :). I have thought this over some more, and it looks like
indeed manual rewriting of the paths will be needed if we want to
be able to handle all the situations currently discussed. So here is
a new proposal that adds the dirname matching using a new function that
will do the dirname matching first unmodified, and then "normalized".

Now, back to your comment regarding changing the DWARF reader:

> To what extent is the dwarf patch more complex because it handles the
> possibility of multiple matches with the basename of the main source
> file.  Assume there's only one file with the basename of the main file
> name and things become a lot simpler.  See
> http://sourceware.org/ml/gdb-patches/2008-01/msg00090.html.

Maybe I shouldn't have talked about complexity as this may be just me
needing time to understand the purpose of your patch. So I withdraw
that comment.

> Handling the issue in each debug info reader is an important
> consideration and may or may not be a problem.  One would need to
> examine to what extent the issue exists in the other readers, and to
> what extent start_subfile can solve it and still be debug-format
> independent without being more complex.

My suggestion has two ideas behind it:

    I reallly think that the wrapper around start_subfile that adjusts
    NAME and DIRNAME so that NAME is always a basename is a good step
    forward, because it reduces the number of combinations we need to
    handle during the matching phase later on.  We don't have to handle
    the case where NAME is a full path, or a relative path of a basename.
    
    Second, I still believe at this point that the problem should be
    handled outside of the debug info reader.  I know that at least
    stabs should have the same issues as DWARF. It would be very nice
    to have the problems fixed in both cases without having to duplicate
    the algorithms we're developing.

> One could rewrite that patch to keep dwarf2_start_subfile, but one
> would have to pass an additional parameter so it would know if it's
> dealing with the main source file.  The patch as submitted just
> reorganizes things so that other solutions(/heuristics) are possible
> without major reworking of the code (the patch itself had to do some
> reworking, but once that's done it's done (in the problem space being
> discussed)).  Plus it simplifies all call sites to
> dwarf2_start_subfile/start_subfile.  Previously, each call site had to
> process fe->dir_index, and there are three of them.

I think that the reorganization will not be necessary.  I'd like to
make the subfile layer work in a way that the debug info reader would
only have to pass the name and dirname as-is, and be confident that
it'll just work.

-- 
Joel

--mP3DRpeJDSE+ciuQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="buildsym.c.diff"
Content-length: 4628

Index: buildsym.c
===================================================================
--- buildsym.c	(revision 44)
+++ buildsym.c	(working copy)
@@ -574,13 +574,32 @@ make_blockvector (struct objfile *objfil
   return (blockvector);
 }
 
+
+static int
+paths_likely_equal_p (const char *path1, const char *path2)
+{
+  /* The purpose of this function is to handle situations where
+     the path contains ".", or ".." or double-directory separators.
+     Before doing the comparison, we manually resolve the paths in
+     so that ".", ".." and double directory separators no longer
+     appear.  This will allow us to performe a pure string comparison
+     to determine whether two paths are identical or not.  */
+
+  /* If the paths are already identical, then obviously they are equal.  */
+  if (FILENAME_CMP (path1, path2) == 0)
+    return 1;
+
+  /* Normalize path1 and path2, and redo the comparison.  */
+  [...]
+}
+
 /* Start recording information about source code that came from an
    included (or otherwise merged-in) source file with a different
    name.  NAME is the name of the file (cannot be NULL), DIRNAME is
    the directory in which the file was compiled (or NULL if not known).  */
 
 void
-start_subfile (char *name, char *dirname)
+start_subfile_1 (char *name, char *dirname)
 {
   struct subfile *subfile;
 
@@ -589,27 +608,24 @@ start_subfile (char *name, char *dirname
 
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
-      char *subfile_name;
-
-      /* If NAME is an absolute path, and this subfile is not, then
-	 attempt to create an absolute path to compare.  */
-      if (IS_ABSOLUTE_PATH (name)
-	  && !IS_ABSOLUTE_PATH (subfile->name)
-	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
-			       subfile->name, NULL);
-      else
-	subfile_name = subfile->name;
-
-      if (FILENAME_CMP (subfile_name, name) == 0)
-	{
+      /* First, verify that the basename matches.  */
+      if (FILENAME_CMP (subfile->name, name) != 0)
+        continue;
+
+      /* Next, verify that the directory name matches too.  Some programs
+         may include two files with the same name but in a different
+         directory.
+         
+         If either DIRNAME or SUBFILE->DIRNAME is NULL, then we cannot
+         do the verification.  In that case, we cannot do the verification,
+         but the safest bet is that the files are identical.  */
+      if (dirname == NULL
+          || subfile->dirname == NULL
+          || paths_likely_equal_p (dirname, subfile->dirname))
+        {
 	  current_subfile = subfile;
-	  if (subfile_name != subfile->name)
-	    xfree (subfile_name);
 	  return;
-	}
-      if (subfile_name != subfile->name)
-	xfree (subfile_name);
+        }
     }
 
   /* This subfile is not known.  Add an entry for it. Make an entry
@@ -681,6 +697,53 @@ start_subfile (char *name, char *dirname
     }
 }
 
+/* This function is a wrapper around start_subfile_1 which rewrites
+   a bit NAME and DIRNAME to make sure that we are always consistent
+   in how we create subfiles: Turn NAME into a basename, and adjust
+   DIRNAME accordingly.  */
+
+void
+start_subfile (char *name, char *dirname)
+{
+  char *new_name, new_dir_name;
+
+  new_name = lbasename (name);
+
+  /* The simple usual case: NAME is the name of a file without any path
+     (in other words, it's already a basename).  In that case, we use
+     DIR_NAME as is.  */
+  if (FILENAME_CMP (new_name, name) == 0)
+    new_dir_name = dir_name;
+  
+  /* Another simple case: NAME is an absolute PATH.  In that case,
+     we can ignore the compilation dir, and use the path inside NAME
+     as the dir name.  */
+  else if (IS_ABSOLUTE_PATH (name))
+    new_dir_name = ldirname (name);
+
+  /* Otherwise, NAME is a relative PATH, so we extract the relative
+     path from name, and append it at the end of DIRNAME.  IF DIRNAME
+     is NULL, then we use that relative path as the DIRNAME.  */
+  else
+    {
+      char *name_relative_path = ldirname (name);
+
+      if (dirname != NULL)
+        {
+          new_dir_name = concat (dirname, SLASH_STRING, name_relative_path,
+                                 NULL);
+          xfree (name_relative_path);
+        }
+      else
+        new_dir_name = name_relative_path;
+    }
+
+  start_subfile_1 (new_name, new_dir_name);
+
+  xfree (new_name);
+  xfree (new_dir_name);
+}
+
 /* For stabs readers, the first N_SO symbol is assumed to be the
    source file name, and the subfile struct is initialized using that
    assumption.  If another N_SO symbol is later seen, immediately

--mP3DRpeJDSE+ciuQ--