Mirror of the gdb-patches mailing list
* [RFC] Avoid calling gdb_realpath if basenames are different
@ 2011-11-06  6:31 Doug Evans
  2011-11-06  9:07 ` asmwarrior
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Doug Evans @ 2011-11-06  6:31 UTC (permalink / raw)
  To: gdb-patches

Hi.
This patch has been brought up before (by others).
E.g., http://sourceware.org/ml/gdb-patches/2010-04/msg00466.html
I'm hoping we can get this in now.
We're paying a real and significant cost for what is mostly a
theoretical concern.
[E.g., How often is one source file referred to by the user using a basename
that is different than what's recorded in the debug info?]

If people are concerned about breaking someone's usage,
we could default basenames-may-differ to true in 7.4,
with a warning that it will be set to false in 7.5 (or some such).
[We could leave the default set to true, especially if someone knew
of at least some minimally common usage this would break.
I'd hate to otherwise penalize the vast majority of users if not.]

I'll write docs and a NEWS entry if the basics of the patch are approved.

Comments?

2011-11-05  Doug Evans  <dje@google.com>

	* dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
	! basenames_may_differ.
	* psymtab.c (lookup_partial_symtab): Ditto.
	* symtab.c (lookup_symtab): Ditto.
	(basenames_may_differ): New global.
	(_initialize_symtab): New parameter basenames-may-differ.
	* symtab.h (basenames_may_differ): Declare.

Index: dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.578
diff -u -p -r1.578 dwarf2read.c
--- dwarf2read.c	20 Oct 2011 23:13:01 -0000	1.578
+++ dwarf2read.c	6 Nov 2011 06:25:11 -0000
@@ -2444,7 +2444,8 @@ dw2_lookup_symtab (struct objfile *objfi
 		   struct symtab **result)
 {
   int i;
-  int check_basename = lbasename (name) == name;
+  const char *name_basename = lbasename (name);
+  int check_basename = name_basename == name;
   struct dwarf2_per_cu_data *base_cu = NULL;
 
   dw2_setup (objfile);
@@ -2477,6 +2478,12 @@ dw2_lookup_symtab (struct objfile *objfi
 	      && FILENAME_CMP (lbasename (this_name), name) == 0)
 	    base_cu = per_cu;
 
+	  /* Before we invoke realpath, which can get expensive when many
+	     files are involved, do a quick comparison of the basenames.  */
+	  if (! basenames_may_differ
+	      && FILENAME_CMP (lbasename (this_name), name_basename) != 0)
+	    continue;
+
 	  if (full_path != NULL)
 	    {
 	      const char *this_real_name = dw2_get_real_path (objfile,
Index: psymtab.c
===================================================================
RCS file: /cvs/src/src/gdb/psymtab.c,v
retrieving revision 1.31
diff -u -p -r1.31 psymtab.c
--- psymtab.c	28 Oct 2011 17:29:37 -0000	1.31
+++ psymtab.c	6 Nov 2011 06:25:11 -0000
@@ -134,6 +134,7 @@ lookup_partial_symtab (struct objfile *o
 		       const char *full_path, const char *real_path)
 {
   struct partial_symtab *pst;
+  const char *name_basename = lbasename (name);
 
   ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
   {
@@ -142,6 +143,12 @@ lookup_partial_symtab (struct objfile *o
 	return (pst);
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (name_basename, lbasename (pst->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
     if (full_path != NULL)
@@ -172,7 +179,7 @@ lookup_partial_symtab (struct objfile *o
 
   /* Now, search for a matching tail (only if name doesn't have any dirs).  */
 
-  if (lbasename (name) == name)
+  if (name_basename == name)
     ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
     {
       if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.285
diff -u -p -r1.285 symtab.c
--- symtab.c	29 Oct 2011 07:26:07 -0000	1.285
+++ symtab.c	6 Nov 2011 06:25:11 -0000
@@ -112,6 +112,8 @@ void _initialize_symtab (void);
 
 /* */
 
+int basenames_may_differ = 1;
+
 /* Allow the user to configure the debugger behavior with respect
    to multiple-choice menus when more than one symbol matches during
    a symbol lookup.  */
@@ -155,6 +157,7 @@ lookup_symtab (const char *name)
   char *real_path = NULL;
   char *full_path = NULL;
   struct cleanup *cleanup;
+  const char* base_name = lbasename (name);
 
   cleanup = make_cleanup (null_cleanup, NULL);
 
@@ -180,6 +183,12 @@ got_symtab:
 	return s;
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (base_name, lbasename (s->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
 
@@ -4883,5 +4892,20 @@ Show how the debugger handles ambiguitie
 Valid values are \"ask\", \"all\", \"cancel\", and the default is \"all\"."),
                         NULL, NULL, &setlist, &showlist);
 
+  add_setshow_boolean_cmd ("basenames-may-differ", class_files,
+			   &basenames_may_differ, _("\
+Set whether a source file may have multiple base names."), _("\
+Show whether a source file may have multiple names."), _("\
+When doing file name based lookups, gdb will canonicalize file names\n\
+(e.g., expand symlinks) before comparing them, which is an expensive\n\
+operation.\n\
+If set, gdb cannot assume a file is known by one base name, and thus\n\
+it cannot optimize file name comparisons by skipping the canonicalization\n\
+step if the base names are different.\n\
+If not set, all source files must be known by one base name,\n\
+and gdb will do file name comparisons more efficiently."),
+			   NULL, NULL,
+			   &setlist, &showlist);
+
   observer_attach_executable_changed (symtab_observer_executable_changed);
 }
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.190
diff -u -p -r1.190 symtab.h
--- symtab.h	9 Oct 2011 19:34:18 -0000	1.190
+++ symtab.h	6 Nov 2011 06:25:11 -0000
@@ -1307,4 +1307,6 @@ void fixup_section (struct general_symbo
 
 struct objfile *lookup_objfile_from_block (const struct block *block);
 
+extern int basenames_may_differ;
+
 #endif /* !defined(SYMTAB_H) */
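
As a rough illustration of the idea (not gdb's actual code), the basename pre-check can be sketched as a standalone C function. The names base_name, same_file, and realpath_calls are made up for this sketch; gdb itself uses lbasename and FILENAME_CMP. The sketch assumes POSIX-style paths and a realpath that accepts a NULL buffer.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* For realpath (path, NULL).  */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter, only here to show how often we canonicalize.  */
static int realpath_calls = 0;

/* Simplified stand-in for lbasename: return a pointer to the part of
   PATH after the last '/', or PATH itself if there is no '/'.  */
static const char *
base_name (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}

/* Return nonzero if A and B name the same file.  When
   BASENAMES_MAY_DIFFER is zero, names whose base names differ are
   rejected before any call to realpath.  */
static int
same_file (const char *a, const char *b, int basenames_may_differ)
{
  char *real_a, *real_b;
  int result;

  /* Cheap string comparison first; plain strcmp is an assumption here,
     gdb's FILENAME_CMP also handles host-specific file name rules.  */
  if (!basenames_may_differ && strcmp (base_name (a), base_name (b)) != 0)
    return 0;

  /* Expensive path: canonicalize both names and compare the results.
     realpath returns NULL for names it cannot resolve.  */
  realpath_calls += 2;
  real_a = realpath (a, NULL);
  real_b = realpath (b, NULL);
  result = real_a != NULL && real_b != NULL && strcmp (real_a, real_b) == 0;
  free (real_a);
  free (real_b);
  return result;
}
```

With basenames_may_differ set to 0, comparing /usr/include/stdio.h against /home/user/hello.c returns before any call to realpath; with it set to 1, both names are canonicalized first.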



* Re: [RFC] Avoid calling gdb_realpath if basenames are different
  2011-11-06  6:31 [RFC] Avoid calling gdb_realpath if basenames are different Doug Evans
@ 2011-11-06  9:07 ` asmwarrior
  2011-11-07 17:06 ` Joel Brobecker
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: asmwarrior @ 2011-11-06  9:07 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches

On 2011-11-6 14:30, Doug Evans wrote:
> Hi.
> This patch has been brought up before (by others).
> E.g., http://sourceware.org/ml/gdb-patches/2010-04/msg00466.html
> I'm hoping we can get this in now.
> We're paying a real and significant cost for what is mostly a
> theoretical concern.
> [E.g., How often is one source file referred to by the user using a basename
> that is different than what's recorded in the debug info?]
> 
> If people are concerned about breaking someone's usage,
> we could default basenames-may-differ to true in 7.4,
> with a warning that it will be set to false in 7.5 (or some such).
> [We could leave the default set to true, especially if someone knew
> of at least some minimally common usage this would break.
> I'd hate to otherwise penalize the vast majority of users if not.]
> 
> I'll write docs,NEWS if the basics of the patch are approved.
> 
> Comments?
> 

Hi, Doug.
I applied your patch to my local copy of the gdb source on Windows XP and built gdb with msys + mingw gcc.

I also changed:

int basenames_may_differ = 0;

From my point of view this is safe: I don't have symbolic links to source files under Windows XP.

I just tested a large project (which loads many DLLs at startup). I set the breakpoint before starting the app, so the breakpoint is pending when the app starts.

Without your patch, it takes more than 90 seconds to start the app.
With your patch, it takes only about 15 seconds.

BTW: if I do not set any pending breakpoints before starting the app, the startup time is also about 15 seconds. Many Windows gdb users have complained about the lag caused by pending breakpoints.

So the conclusion is that your patch is great! It avoids many unnecessary file I/O operations under Windows XP.

asmwarrior
ollydbg from codeblocks' forum




* Re: [RFC] Avoid calling gdb_realpath if basenames are different
  2011-11-06  6:31 [RFC] Avoid calling gdb_realpath if basenames are different Doug Evans
  2011-11-06  9:07 ` asmwarrior
@ 2011-11-07 17:06 ` Joel Brobecker
  2011-11-08 17:18 ` Tom Tromey
  2011-11-11  0:57 ` [RFA, doc RFA] " Doug Evans
  3 siblings, 0 replies; 10+ messages in thread
From: Joel Brobecker @ 2011-11-07 17:06 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches

> If people are concerned about breaking someone's usage,
> we could default basenames-may-differ to true in 7.4,
> with a warning that it will be set to false in 7.5 (or some such).
> [We could leave the default set to true, especially if someone knew
> of at least some minimally common usage this would break.
> I'd hate to otherwise penalize the vast majority of users if not.]

If it was my choice, I would go with setting it to false right now.
As you say, I think that the vast majority of people are going to
benefit from it immediately.

> 2011-11-05  Doug Evans  <dje@google.com>
> 
> 	* dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
> 	! basenames_may_differ.
> 	* psymtab.c (lookup_partial_symtab): Ditto.
> 	* symtab.c (lookup_partial_symtab): Ditto.
> 	* symtab.c (lookup_symtab): Ditto.
> 	(basenames_may_differ): New global.
> 	(_initialize_symtab): New parameter basenames-may-differ.
> 	* symtab.h (basenames_may_differ): Declare.
-- 
Joel



* Re: [RFC] Avoid calling gdb_realpath if basenames are different
  2011-11-06  6:31 [RFC] Avoid calling gdb_realpath if basenames are different Doug Evans
  2011-11-06  9:07 ` asmwarrior
  2011-11-07 17:06 ` Joel Brobecker
@ 2011-11-08 17:18 ` Tom Tromey
  2011-11-11  0:57 ` [RFA, doc RFA] " Doug Evans
  3 siblings, 0 replies; 10+ messages in thread
From: Tom Tromey @ 2011-11-08 17:18 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches

>>>>> "Doug" == Doug Evans <dje@google.com> writes:

Doug> This patch has been brought up before (by others).
Doug> E.g., http://sourceware.org/ml/gdb-patches/2010-04/msg00466.html

I didn't read the patch yet, but I wonder if it fixes:

http://sourceware.org/bugzilla/show_bug.cgi?id=8367
http://sourceware.org/bugzilla/show_bug.cgi?id=9831

This is a frequently reported problem...

Tom



* [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-06  6:31 [RFC] Avoid calling gdb_realpath if basenames are different Doug Evans
                   ` (2 preceding siblings ...)
  2011-11-08 17:18 ` Tom Tromey
@ 2011-11-11  0:57 ` Doug Evans
  2011-11-11  8:49   ` Eli Zaretskii
  3 siblings, 1 reply; 10+ messages in thread
From: Doug Evans @ 2011-11-11  0:57 UTC (permalink / raw)
  To: gdb-patches, Tom Tromey, Joel Brobecker, Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]

On Sat, Nov 5, 2011 at 11:30 PM, Doug Evans <dje@google.com> wrote:
> Hi.
> This patch has been brought up before (by others).
> E.g., http://sourceware.org/ml/gdb-patches/2010-04/msg00466.html
> I'm hoping we can get this in now.
> We're paying a real and significant cost for what is mostly a
> theoretical concern.
> [E.g., How often is one source file referred to by the user using a basename
> that is different than what's recorded in the debug info?]
>
> If people are concerned about breaking someone's usage,
> we could default basenames-may-differ to true in 7.4,
> with a warning that it will be set to false in 7.5 (or some such).
> [We could leave the default set to true, especially if someone knew
> of at least some minimally common usage this would break.
> I'd hate to otherwise penalize the vast majority of users if not.]

Hi.
Ok to check in?

Note: I set the default to be the common case (speed up gdb by
assuming basenames never differ).
Let me know if you want the default changed.

Tom: I'll look at the bugs you mentioned separately.

2011-11-10  Doug Evans  <dje@google.com>

        * NEWS: Mention new parameter basenames-may-differ.
        * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
        ! basenames_may_differ.
        * psymtab.c (lookup_partial_symtab): Ditto.
        * symtab.c (lookup_symtab): Ditto.
        (basenames_may_differ): New global.
        (_initialize_symtab): New parameter basenames-may-differ.
        * symtab.h (basenames_may_differ): Declare.

        doc/
        * gdb.texinfo (Files): Document basenames-may-differ.

[-- Attachment #2: gdb-111110-basenames-may-differ-2.patch.txt --]
[-- Type: text/plain, Size: 9537 bytes --]

2011-11-10  Doug Evans  <dje@google.com>

	* NEWS: Mention new parameter basenames-may-differ.
	* dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
	! basenames_may_differ.
	* psymtab.c (lookup_partial_symtab): Ditto.
	* symtab.c (lookup_symtab): Ditto.
	(basenames_may_differ): New global.
	(_initialize_symtab): New parameter basenames-may-differ.
	* symtab.h (basenames_may_differ): Declare.

	doc/
	* gdb.texinfo (Files): Document basenames-may-differ.

Index: NEWS
===================================================================
RCS file: /cvs/src/src/gdb/NEWS,v
retrieving revision 1.464
diff -u -p -r1.464 NEWS
--- NEWS	2 Nov 2011 23:44:19 -0000	1.464
+++ NEWS	10 Nov 2011 23:49:26 -0000
@@ -150,6 +150,20 @@ show debug entry-values
   Control display of debugging info for determining frame argument values at
   function entry and virtual tail call frames.
 
+set basenames-may-differ
+show basenames-may-differ
+  Set whether a source file may have multiple base names.
+  A "base name" is the name of a file with the directory part removed.
+  Example: The base name of "/home/user/hello.c" is "hello.c".
+  When doing file name based lookups, gdb will canonicalize file names
+  (e.g., expand symlinks) before comparing them, which is an expensive
+  operation.
+  If set, gdb will not assume a file is known by one base name, and thus
+  it cannot optimize file name comparisons by skipping the canonicalization
+  step if the base names are different.
+  If not set, all source files must be known by one base name,
+  and gdb will do file name comparisons more efficiently.
+
 * New remote packets
 
 QTEnable
Index: dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.579
diff -u -p -r1.579 dwarf2read.c
--- dwarf2read.c	10 Nov 2011 20:21:27 -0000	1.579
+++ dwarf2read.c	10 Nov 2011 23:49:26 -0000
@@ -2445,7 +2445,8 @@ dw2_lookup_symtab (struct objfile *objfi
 		   struct symtab **result)
 {
   int i;
-  int check_basename = lbasename (name) == name;
+  const char *name_basename = lbasename (name);
+  int check_basename = name_basename == name;
   struct dwarf2_per_cu_data *base_cu = NULL;
 
   dw2_setup (objfile);
@@ -2478,6 +2479,12 @@ dw2_lookup_symtab (struct objfile *objfi
 	      && FILENAME_CMP (lbasename (this_name), name) == 0)
 	    base_cu = per_cu;
 
+	  /* Before we invoke realpath, which can get expensive when many
+	     files are involved, do a quick comparison of the basenames.  */
+	  if (! basenames_may_differ
+	      && FILENAME_CMP (lbasename (this_name), name_basename) != 0)
+	    continue;
+
 	  if (full_path != NULL)
 	    {
 	      const char *this_real_name = dw2_get_real_path (objfile,
Index: psymtab.c
===================================================================
RCS file: /cvs/src/src/gdb/psymtab.c,v
retrieving revision 1.31
diff -u -p -r1.31 psymtab.c
--- psymtab.c	28 Oct 2011 17:29:37 -0000	1.31
+++ psymtab.c	10 Nov 2011 23:49:26 -0000
@@ -134,6 +134,7 @@ lookup_partial_symtab (struct objfile *o
 		       const char *full_path, const char *real_path)
 {
   struct partial_symtab *pst;
+  const char *name_basename = lbasename (name);
 
   ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
   {
@@ -142,6 +143,12 @@ lookup_partial_symtab (struct objfile *o
 	return (pst);
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (name_basename, lbasename (pst->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
     if (full_path != NULL)
@@ -172,7 +179,7 @@ lookup_partial_symtab (struct objfile *o
 
   /* Now, search for a matching tail (only if name doesn't have any dirs).  */
 
-  if (lbasename (name) == name)
+  if (name_basename == name)
     ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
     {
       if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.285
diff -u -p -r1.285 symtab.c
--- symtab.c	29 Oct 2011 07:26:07 -0000	1.285
+++ symtab.c	10 Nov 2011 23:49:26 -0000
@@ -112,6 +112,11 @@ void _initialize_symtab (void);
 
 /* */
 
+/* Non-zero if a file may be known by two different basenames.
+   This is the uncommon case, and significantly slows down gdb.
+   Default set to "off" to not slow down the common case.  */
+int basenames_may_differ = 0;
+
 /* Allow the user to configure the debugger behavior with respect
    to multiple-choice menus when more than one symbol matches during
    a symbol lookup.  */
@@ -155,6 +160,7 @@ lookup_symtab (const char *name)
   char *real_path = NULL;
   char *full_path = NULL;
   struct cleanup *cleanup;
+  const char* base_name = lbasename (name);
 
   cleanup = make_cleanup (null_cleanup, NULL);
 
@@ -180,6 +186,12 @@ got_symtab:
 	return s;
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (base_name, lbasename (s->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
 
@@ -4883,5 +4897,22 @@ Show how the debugger handles ambiguitie
 Valid values are \"ask\", \"all\", \"cancel\", and the default is \"all\"."),
                         NULL, NULL, &setlist, &showlist);
 
+  add_setshow_boolean_cmd ("basenames-may-differ", class_obscure,
+			   &basenames_may_differ, _("\
+Set whether a source file may have multiple base names."), _("\
+Show whether a source file may have multiple base names."), _("\
+A \"base name\" is the name of a file with the directory part removed.\n\
+Example: The base name of \"/home/user/hello.c\" is \"hello.c\".\n\
+When doing file name based lookups, gdb will canonicalize file names\n\
+(e.g., expand symlinks) before comparing them, which is an expensive\n\
+operation.\n\
+If set, gdb will not assume a file is known by one base name, and thus\n\
+it cannot optimize file name comparisons by skipping the canonicalization\n\
+step if the base names are different.\n\
+If not set, all source files must be known by one base name,\n\
+and gdb will do file name comparisons much more efficiently."),
+			   NULL, NULL,
+			   &setlist, &showlist);
+
   observer_attach_executable_changed (symtab_observer_executable_changed);
 }
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.191
diff -u -p -r1.191 symtab.h
--- symtab.h	10 Nov 2011 20:21:28 -0000	1.191
+++ symtab.h	10 Nov 2011 23:49:26 -0000
@@ -1306,4 +1306,6 @@ void fixup_section (struct general_symbo
 
 struct objfile *lookup_objfile_from_block (const struct block *block);
 
+extern int basenames_may_differ;
+
 #endif /* !defined(SYMTAB_H) */
Index: doc/gdb.texinfo
===================================================================
RCS file: /cvs/src/src/gdb/doc/gdb.texinfo,v
retrieving revision 1.890
diff -u -p -r1.890 gdb.texinfo
--- doc/gdb.texinfo	8 Nov 2011 21:34:18 -0000	1.890
+++ doc/gdb.texinfo	10 Nov 2011 23:49:28 -0000
@@ -15680,6 +15680,47 @@ This is the default.
 @end table
 @end table
 
+@cindex file name canonicalization
+@cindex base name differences
+When processing file names provided by the user,
+@value{GDBN} will canonicalize them and remove symbolic links.
+This ensures that @value{GDBN} will find the right file,
+even if the debug information specifies an alternate path.
+However, with large programs this canonicalization can noticeably slow
+down @value{GDBN}.  To compensate, @value{GDBN} will try to avoid
+this canonicalization wherever possible.  One way it can do so
+is by first comparing the @samp{base name} of a file.
+The @samp{base name} of a file is simply the file's name without
+any directory information.  For example, the base name of
+@file{/home/user/hello.c} is @file{hello.c}.
+By doing this @value{GDBN} can skip, for example,
+@file{/usr/include/stdio.h} without having to first canonicalize
+and then compare the directory names.
+This works great, except when the base name of a file
+can have multiple names due to symbolic links.
+For example, if @file{/home/user/bar.c} is a symbolic link to
+@file{/home/user/foo.c} then @value{GDBN} cannot just look at
+the base name of two files, it must canonicalize them, expand
+all symbolic links, and @emph{then} compare the file names
+to see if they match.
+Fortunately, having one file known by two different base names
+does not generally occur in practice.
+Should it occur, however, @value{GDBN} provides an escape hatch
+to allow this to work.
+By setting @code{basenames-may-differ} to @code{true}
+@value{GDBN} will always canonicalize file names before
+comparing them, thus ensuring that one file known by multiple
+base names is treated as the same file.
+
+@table @code
+@item set basenames-may-differ
+@kindex set basenames-may-differ
+Set whether a source file may have multiple base names.
+
+@item show basenames-may-differ
+@kindex show basenames-may-differ
+Show whether a source file may have multiple base names.
+@end table
 
 @node Separate Debug Files
 @section Debugging Information in Separate Files
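
The symlink case described in the texinfo text above can be reproduced with a small standalone sketch. It is not part of the patch: the helper names and the /tmp location are assumptions for illustration, plain strcmp stands in for FILENAME_CMP, and cleanup of the temporary directory is omitted for brevity.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* For realpath, mkdtemp, symlink.  */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Compare only the final path components, which is all the fast path
   looks at when basenames-may-differ is off.  */
static int
basenames_equal (const char *a, const char *b)
{
  const char *sa = strrchr (a, '/');
  const char *sb = strrchr (b, '/');

  return strcmp (sa != NULL ? sa + 1 : a, sb != NULL ? sb + 1 : b) == 0;
}

/* Compare the fully canonicalized names, with symlinks expanded.  */
static int
realpaths_equal (const char *a, const char *b)
{
  char *ra = realpath (a, NULL);
  char *rb = realpath (b, NULL);
  int equal = ra != NULL && rb != NULL && strcmp (ra, rb) == 0;

  free (ra);
  free (rb);
  return equal;
}

/* Create DIR/foo.c and a symlink DIR/bar.c -> foo.c in a fresh
   temporary directory.  Store the two names in FOO and BAR (each of
   SIZE bytes) and return 0 on success, -1 on failure.  */
static int
make_symlinked_pair (char *foo, char *bar, size_t size)
{
  char dir[] = "/tmp/basenames-XXXXXX";
  FILE *f;

  if (mkdtemp (dir) == NULL)
    return -1;
  snprintf (foo, size, "%s/foo.c", dir);
  snprintf (bar, size, "%s/bar.c", dir);
  f = fopen (foo, "w");
  if (f == NULL)
    return -1;
  fclose (f);
  return symlink ("foo.c", bar);
}
```

For the pair it creates, basenames_equal returns 0 while realpaths_equal returns nonzero, which is the situation basenames-may-differ exists for.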


* Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-11  0:57 ` [RFA, doc RFA] " Doug Evans
@ 2011-11-11  8:49   ` Eli Zaretskii
  2011-11-11  9:00     ` Doug Evans
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2011-11-11  8:49 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches, tromey, brobecker

> Date: Thu, 10 Nov 2011 15:58:46 -0800
> From: Doug Evans <dje@google.com>
> 
> 2011-11-10  Doug Evans  <dje@google.com>
> 
>         * NEWS: Mention new parameter basenames-may-differ.
>         * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
>         ! basenames_may_differ.
>         * psymtab.c (lookup_partial_symtab): Ditto.
>         * symtab.c (lookup_symtab): Ditto.
>         (basenames_may_differ): New global.
>         (_initialize_symtab): New parameter basenames-may-differ.
>         * symtab.h (basenames_may_differ): Declare.
> 
>         doc/
>         * gdb.texinfo (Files): Document basenames-may-differ.

Thanks.

> +set basenames-may-differ
> +show basenames-may-differ
> +  Set whether a source file may have multiple base names.
> +  A "base name" is the name of a file with the directory part removed.
> +  Example: The base name of "/home/user/hello.c" is "hello.c".
> +  When doing file name based lookups, gdb will canonicalize file names
> +  (e.g., expand symlinks) before comparing them, which is an expensive
> +  operation.
> +  If set, gdb will not assume a file is known by one base name, and thus
> +  it cannot optimize file name comparisons by skipping the canonicalization
> +  step if the base names are different.
> +  If not set, all source files must be known by one base name,
> +  and gdb will do file name comparisons more efficiently.

I suggest rearranging the text so that the parts describing what
happens when the option is set are kept together.  Like this:

  Set whether a source file may have multiple base names.
  (A "base name" is the name of a file with the directory part removed.
  Example: The base name of "/home/user/hello.c" is "hello.c".)
  If set, GDB will canonicalize file names (e.g., expand symlinks)
  before comparing them.  Canonicalization is an expensive operation,
  but it allows the same file to be known by more than one base name.
  If not set (the default), all source files are assumed to have just
  one base name, and GDB will do file name comparisons more efficiently.

OK?

> +When processing file names provided by the user,
> +@value{GDBN} will canonicalize them and remove symbolic links.
> +This ensures that @value{GDBN} will find the right file,
> +even if the debug information specifies an alternate path.
> +However, with large programs this canonicalization can noticeably slow
> +down @value{GDBN}.  To compensate, @value{GDBN} will try to avoid
> +this canonicalization wherever possible.  One way it can do so
> +is by first comparing the @samp{base name} of a file.
> +The @samp{base name} of a file is simply the file's name without
> +any directory information.  For example, the base name of
> +@file{/home/user/hello.c} is @file{hello.c}.
> +By doing this @value{GDBN} can skip, for example,
> +@file{/usr/include/stdio.h} without having to first canonicalize
> +and then compare the directory names.
> +This works great, except when the base name of a file
> +can have multiple names due to symbolic links.
> +For example, if @file{/home/user/bar.c} is a symbolic link to
> +@file{/home/user/foo.c} then @value{GDBN} cannot just look at
> +the base name of two files, it must canonicalize them, expand
> +all symbolic links, and @emph{then} compare the file names
> +to see if they match.
> +Fortunately, having one file known by two different base names
> +does not generally occur in practice.
> +Should it occur, however, @value{GDBN} provides an escape hatch
> +to allow this to work.
> +By setting @code{basenames-may-differ} to @code{true}
> +@value{GDBN} will always canonicalize file names before
> +comparing them, thus ensuring that one file known by multiple
> +base names is treated as the same file.

This is written mostly as an apology for having this option.  That is
the wrong angle for describing features in a user manual, because the
user generally trusts the developers by default to DTRT.  So I would
reword it as follows:

  When processing file names provided by the user, @value{GDBN}
  frequently needs to compare them to the file names recorded in the
  program's debug info.  Normally, @value{GDBN} compares just the
  @dfn{base names} of the files as strings, which is reasonably fast
  even for very large programs.  (The base name of a file is the last
  portion of its name, after stripping all the leading directories.)
  This shortcut in comparison is based upon the assumption that files
  cannot have more than one base name.  This is usually true, but
  references to files that use symlinks or similar filesystem
  facilities violate that assumption.  If your program records files
  using such facilities, or if you provide file names to @value{GDBN}
  using symlinks etc., you can set @code{basenames-may-differ} to
  @code{true} to instruct @value{GDBN} to completely canonicalize each
  pair of file names it needs to compare.  This will make file-name
  comparisons accurate, but at a price of a significant slowdown.

Do you agree with this wording?
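
To make the trade-off described above concrete, here is a minimal C sketch,
not GDB's actual code.  The helper names base_name_of and same_file are
hypothetical; base_name_of only stands in for lbasename and handles
POSIX-style paths, and the may_differ argument plays the role of the
basenames-may-differ setting.  It assumes a POSIX.1-2008 realpath that
accepts a NULL buffer.

```c
#define _XOPEN_SOURCE 700	/* for realpath (path, NULL) */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for lbasename: return the part of PATH after
   the last '/'.  POSIX-style paths only.  */
static const char *
base_name_of (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}

/* Return true if A and B name the same existing file.
   If MAY_DIFFER is false, two names whose base names differ are
   rejected immediately, without touching the file system.
   If MAY_DIFFER is true, both names are always canonicalized.  */
static bool
same_file (const char *a, const char *b, bool may_differ)
{
  char *real_a, *real_b;
  bool same;

  assert (a != NULL && b != NULL);

  /* Cheap string comparison first; this is the shortcut.  */
  if (!may_differ && strcmp (base_name_of (a), base_name_of (b)) != 0)
    return false;

  /* Expensive part: resolve symlinks, "." and ".." components.  */
  real_a = realpath (a, NULL);
  real_b = realpath (b, NULL);
  same = real_a != NULL && real_b != NULL && strcmp (real_a, real_b) == 0;
  free (real_a);
  free (real_b);
  return same;
}
```

With may_differ false, a pair such as "/" and "/." is rejected on base names
alone ("" versus "."), although both canonicalize to "/".  A symlink bar.c
pointing at foo.c would be missed in the same way, which is the case the
option exists for.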



* Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-11  8:49   ` Eli Zaretskii
@ 2011-11-11  9:00     ` Doug Evans
  2011-11-15  4:46       ` Doug Evans
  0 siblings, 1 reply; 10+ messages in thread
From: Doug Evans @ 2011-11-11  9:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches, tromey, brobecker

On Fri, Nov 11, 2011 at 12:47 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Thu, 10 Nov 2011 15:58:46 -0800
>> From: Doug Evans <dje@google.com>
>>
>> 2011-11-10  Doug Evans  <dje@google.com>
>>
>>         * NEWS: Mention new parameter basenames-may-differ.
>>         * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
>>         ! basenames_may_differ.
>>         * psymtab.c (lookup_partial_symtab): Ditto.
>>         * symtab.c (lookup_symtab): Ditto.
>>         (basenames_may_differ): New global.
>>         (_initialize_symtab): New parameter basenames-may-differ.
>>         * symtab.h (basenames_may_differ): Declare.
>>
>>         doc/
>>         * gdb.texinfo (Files): Document basenames-may-differ.
>
> Thanks.
>
>> +set basenames-may-differ
>> +show basenames-may-differ
>> +  Set whether a source file may have multiple base names.
>> +  A "base name" is the name of a file with the directory part removed.
>> +  Example: The base name of "/home/user/hello.c" is "hello.c".
>> +  When doing file name based lookups, gdb will canonicalize file names
>> +  (e.g., expand symlinks) before comparing them, which is an expensive
>> +  operation.
>> +  If set, gdb will not assume a file is known by one base name, and thus
>> +  it cannot optimize file name comparisons by skipping the canonicalization
>> +  step if the base names are different.
>> +  If not set, all source files must be known by one base name,
>> +  and gdb will do file name comparisons more efficiently.
>
> I suggest rearranging the text so that the parts describing what happens
> when the option is set are kept together.  Like this:
>
>  Set whether a source file may have multiple base names.
>  (A "base name" is the name of a file with the directory part removed.
>  Example: The base name of "/home/user/hello.c" is "hello.c".)
>  If set, GDB will canonicalize file names (e.g., expand symlinks)
>  before comparing them.  Canonicalization is an expensive operation,
>  but it allows the same file to be known by more than one base name.
>  If not set (the default), all source files are assumed to have just
>  one base name, and GDB will do file name comparisons more efficiently.
>
> OK?
>
>> +When processing file names provided by the user,
>> +@value{GDBN} will canonicalize them and remove symbolic links.
>> +This ensures that @value{GDBN} will find the right file,
>> +even if the debug information specifies an alternate path.
>> +However, with large programs this canonicalization can noticeably slow
>> +down @value{GDBN}.  To compensate, @value{GDBN} will try to avoid
>> +this canonicalization wherever possible.  One way it can do so
>> +is by first comparing the @samp{base name} of a file.
>> +The @samp{base name} of a file is simply the file's name without
>> +any directory information.  For example, the base name of
>> +@file{/home/user/hello.c} is @file{hello.c}.
>> +By doing this @value{GDBN} can skip, for example,
>> +@file{/usr/include/stdio.h} without having to first canonicalize
>> +and then compare the directory names.
>> +This works great, except when the base name of a file
>> +can have multiple names due to symbolic links.
>> +For example, if @file{/home/user/bar.c} is a symbolic link to
>> +@file{/home/user/foo.c} then @value{GDBN} cannot just look at
>> +the base name of two files, it must canonicalize them, expand
>> +all symbolic links, and @emph{then} compare the file names
>> +to see if they match.
>> +Fortunately, having one file known by two different base names
>> +does not generally occur in practice.
>> +Should it occur, however, @value{GDBN} provides an escape hatch
>> +to allow this to work.
>> +By setting @code{basenames-may-differ} to @code{true}
>> +@value{GDBN} will always canonicalize file names before
>> +comparing them, thus ensuring that one file known by multiple
>> +base names is treated as the same file.
>
> This is written mostly as an apology for having this option.  That is
> the wrong angle for describing features in a user manual, because the
> user generally trusts the developers by default to DTRT.  So I would
> reword it as follows:
>
>  When processing file names provided by the user, @value{GDBN}
>  frequently needs to compare them to the file names recorded in the
>  program's debug info.  Normally, @value{GDBN} compares just the
>  @dfn{base names} of the files as strings, which is reasonably fast
>  even for very large programs.  (The base name of a file is the last
>  portion of its name, after stripping all the leading directories.)
>  This shortcut in comparison is based upon the assumption that files
>  cannot have more than one base name.  This is usually true, but
>  references to files that use symlinks or similar filesystem
>  facilities violate that assumption.  If your program records files
>  using such facilities, or if you provide file names to @value{GDBN}
>  using symlinks etc., you can set @code{basenames-may-differ} to
>  @code{true} to instruct @value{GDBN} to completely canonicalize each
>  pair of file names it needs to compare.  This will make file-name
>  comparisons accurate, but at a price of a significant slowdown.
>
> Do you agree with this wording?
>

I'm happy if you're happy.
Thanks for the suggested wording.



* Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-11  9:00     ` Doug Evans
@ 2011-11-15  4:46       ` Doug Evans
  2011-11-15  6:00         ` Eli Zaretskii
  2011-11-15 14:23         ` Joel Brobecker
  0 siblings, 2 replies; 10+ messages in thread
From: Doug Evans @ 2011-11-15  4:46 UTC (permalink / raw)
  To: Eli Zaretskii, Tom Tromey, Joel Brobecker, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 6102 bytes --]

On Fri, Nov 11, 2011 at 12:53 AM, Doug Evans <dje@google.com> wrote:
> On Fri, Nov 11, 2011 at 12:47 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>>> Date: Thu, 10 Nov 2011 15:58:46 -0800
>>> From: Doug Evans <dje@google.com>
>>>
>>> 2011-11-10  Doug Evans  <dje@google.com>
>>>
>>>         * NEWS: Mention new parameter basenames-may-differ.
>>>         * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
>>>         ! basenames_may_differ.
>>>         * psymtab.c (lookup_partial_symtab): Ditto.
>>>         * symtab.c (lookup_symtab): Ditto.
>>>         (basenames_may_differ): New global.
>>>         (_initialize_symtab): New parameter basenames-may-differ.
>>>         * symtab.h (basenames_may_differ): Declare.
>>>
>>>         doc/
>>>         * gdb.texinfo (Files): Document basenames-may-differ.
>>
>> Thanks.
>>
>>> +set basenames-may-differ
>>> +show basenames-may-differ
>>> +  Set whether a source file may have multiple base names.
>>> +  A "base name" is the name of a file with the directory part removed.
>>> +  Example: The base name of "/home/user/hello.c" is "hello.c".
>>> +  When doing file name based lookups, gdb will canonicalize file names
>>> +  (e.g., expand symlinks) before comparing them, which is an expensive
>>> +  operation.
>>> +  If set, gdb will not assume a file is known by one base name, and thus
>>> +  it cannot optimize file name comparisons by skipping the canonicalization
>>> +  step if the base names are different.
>>> +  If not set, all source files must be known by one base name,
>>> +  and gdb will do file name comparisons more efficiently.
>>
>> I suggest rearranging the text so that the parts describing what happens
>> when the option is set are kept together.  Like this:
>>
>>  Set whether a source file may have multiple base names.
>>  (A "base name" is the name of a file with the directory part removed.
>>  Example: The base name of "/home/user/hello.c" is "hello.c".)
>>  If set, GDB will canonicalize file names (e.g., expand symlinks)
>>  before comparing them.  Canonicalization is an expensive operation,
>>  but it allows the same file to be known by more than one base name.
>>  If not set (the default), all source files are assumed to have just
>>  one base name, and GDB will do file name comparisons more efficiently.
>>
>> OK?
>>
>>> +When processing file names provided by the user,
>>> +@value{GDBN} will canonicalize them and remove symbolic links.
>>> +This ensures that @value{GDBN} will find the right file,
>>> +even if the debug information specifies an alternate path.
>>> +However, with large programs this canonicalization can noticeably slow
>>> +down @value{GDBN}.  To compensate, @value{GDBN} will try to avoid
>>> +this canonicalization wherever possible.  One way it can do so
>>> +is by first comparing the @samp{base name} of a file.
>>> +The @samp{base name} of a file is simply the file's name without
>>> +any directory information.  For example, the base name of
>>> +@file{/home/user/hello.c} is @file{hello.c}.
>>> +By doing this @value{GDBN} can skip, for example,
>>> +@file{/usr/include/stdio.h} without having to first canonicalize
>>> +and then compare the directory names.
>>> +This works great, except when the base name of a file
>>> +can have multiple names due to symbolic links.
>>> +For example, if @file{/home/user/bar.c} is a symbolic link to
>>> +@file{/home/user/foo.c} then @value{GDBN} cannot just look at
>>> +the base name of two files, it must canonicalize them, expand
>>> +all symbolic links, and @emph{then} compare the file names
>>> +to see if they match.
>>> +Fortunately, having one file known by two different base names
>>> +does not generally occur in practice.
>>> +Should it occur, however, @value{GDBN} provides an escape hatch
>>> +to allow this to work.
>>> +By setting @code{basenames-may-differ} to @code{true}
>>> +@value{GDBN} will always canonicalize file names before
>>> +comparing them, thus ensuring that one file known by multiple
>>> +base names is treated as the same file.
>>
>> This is written mostly as an apology for having this option.  That is
>> the wrong angle for describing features in a user manual, because the
>> user generally trusts the developers by default to DTRT.  So I would
>> reword it as follows:
>>
>>  When processing file names provided by the user, @value{GDBN}
>>  frequently needs to compare them to the file names recorded in the
>>  program's debug info.  Normally, @value{GDBN} compares just the
>>  @dfn{base names} of the files as strings, which is reasonably fast
>>  even for very large programs.  (The base name of a file is the last
>>  portion of its name, after stripping all the leading directories.)
>>  This shortcut in comparison is based upon the assumption that files
>>  cannot have more than one base name.  This is usually true, but
>>  references to files that use symlinks or similar filesystem
>>  facilities violate that assumption.  If your program records files
>>  using such facilities, or if you provide file names to @value{GDBN}
>>  using symlinks etc., you can set @code{basenames-may-differ} to
>>  @code{true} to instruct @value{GDBN} to completely canonicalize each
>>  pair of file names it needs to compare.  This will make file-name
>>  comparisons accurate, but at a price of a significant slowdown.
>>
>> Do you agree with this wording?
>>
>
> I'm happy if you're happy.
> Thanks for the suggested wording.

Ok to check in?

2011-11-14  Doug Evans  <dje@google.com>

        * NEWS: Mention new parameter basenames-may-differ.
        * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
        ! basenames_may_differ.
        * psymtab.c (lookup_partial_symtab): Ditto.
        * symtab.c (lookup_symtab): Ditto.
        (basenames_may_differ): New global.
        (_initialize_symtab): New parameter basenames-may-differ.
        * symtab.h (basenames_may_differ): Declare.

        doc/
        * gdb.texinfo (Files): Document basenames-may-differ.

[-- Attachment #2: gdb-111114-basenames-may-differ-3.patch.txt --]
[-- Type: text/plain, Size: 8690 bytes --]

2011-11-14  Doug Evans  <dje@google.com>

	* NEWS: Mention new parameter basenames-may-differ.
	* dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
	! basenames_may_differ.
	* psymtab.c (lookup_partial_symtab): Ditto.
	* symtab.c (lookup_symtab): Ditto.
	(basenames_may_differ): New global.
	(_initialize_symtab): New parameter basenames-may-differ.
	* symtab.h (basenames_may_differ): Declare.

	doc/
	* gdb.texinfo (Files): Document basenames-may-differ.

Index: NEWS
===================================================================
RCS file: /cvs/src/src/gdb/NEWS,v
retrieving revision 1.466
diff -u -p -r1.466 NEWS
--- NEWS	14 Nov 2011 20:07:20 -0000	1.466
+++ NEWS	15 Nov 2011 04:19:38 -0000
@@ -159,6 +159,17 @@ show debug entry-values
   Control display of debugging info for determining frame argument values at
   function entry and virtual tail call frames.
 
+set basenames-may-differ
+show basenames-may-differ
+  Set whether a source file may have multiple base names.
+  (A "base name" is the name of a file with the directory part removed.
+  Example: The base name of "/home/user/hello.c" is "hello.c".)
+  If set, GDB will canonicalize file names (e.g., expand symlinks)
+  before comparing them.  Canonicalization is an expensive operation,
+  but it allows the same file to be known by more than one base name.
+  If not set (the default), all source files are assumed to have just
+  one base name, and GDB will do file name comparisons more efficiently.
+
 * New remote packets
 
 QTEnable
Index: dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.580
diff -u -p -r1.580 dwarf2read.c
--- dwarf2read.c	11 Nov 2011 00:43:03 -0000	1.580
+++ dwarf2read.c	15 Nov 2011 04:19:39 -0000
@@ -2445,7 +2445,8 @@ dw2_lookup_symtab (struct objfile *objfi
 		   struct symtab **result)
 {
   int i;
-  int check_basename = lbasename (name) == name;
+  const char *name_basename = lbasename (name);
+  int check_basename = name_basename == name;
   struct dwarf2_per_cu_data *base_cu = NULL;
 
   dw2_setup (objfile);
@@ -2478,6 +2479,12 @@ dw2_lookup_symtab (struct objfile *objfi
 	      && FILENAME_CMP (lbasename (this_name), name) == 0)
 	    base_cu = per_cu;
 
+	  /* Before we invoke realpath, which can get expensive when many
+	     files are involved, do a quick comparison of the basenames.  */
+	  if (! basenames_may_differ
+	      && FILENAME_CMP (lbasename (this_name), name_basename) != 0)
+	    continue;
+
 	  if (full_path != NULL)
 	    {
 	      const char *this_real_name = dw2_get_real_path (objfile,
Index: psymtab.c
===================================================================
RCS file: /cvs/src/src/gdb/psymtab.c,v
retrieving revision 1.33
diff -u -p -r1.33 psymtab.c
--- psymtab.c	11 Nov 2011 00:43:04 -0000	1.33
+++ psymtab.c	15 Nov 2011 04:19:39 -0000
@@ -134,6 +134,7 @@ lookup_partial_symtab (struct objfile *o
 		       const char *full_path, const char *real_path)
 {
   struct partial_symtab *pst;
+  const char *name_basename = lbasename (name);
 
   ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
   {
@@ -142,6 +143,12 @@ lookup_partial_symtab (struct objfile *o
 	return (pst);
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (name_basename, lbasename (pst->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
     if (full_path != NULL)
@@ -172,7 +179,7 @@ lookup_partial_symtab (struct objfile *o
 
   /* Now, search for a matching tail (only if name doesn't have any dirs).  */
 
-  if (lbasename (name) == name)
+  if (name_basename == name)
     ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
     {
       if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.286
diff -u -p -r1.286 symtab.c
--- symtab.c	11 Nov 2011 00:43:04 -0000	1.286
+++ symtab.c	15 Nov 2011 04:19:39 -0000
@@ -112,6 +112,11 @@ void _initialize_symtab (void);
 
 /* */
 
+/* Non-zero if a file may be known by two different basenames.
+   This is the uncommon case, and significantly slows down gdb.
+   Default set to "off" to not slow down the common case.  */
+int basenames_may_differ = 0;
+
 /* Allow the user to configure the debugger behavior with respect
    to multiple-choice menus when more than one symbol matches during
    a symbol lookup.  */
@@ -155,6 +160,7 @@ lookup_symtab (const char *name)
   char *real_path = NULL;
   char *full_path = NULL;
   struct cleanup *cleanup;
+  const char *base_name = lbasename (name);
 
   cleanup = make_cleanup (null_cleanup, NULL);
 
@@ -180,6 +186,12 @@ got_symtab:
 	return s;
       }
 
+    /* Before we invoke realpath, which can get expensive when many
+       files are involved, do a quick comparison of the basenames.  */
+    if (! basenames_may_differ
+	&& FILENAME_CMP (base_name, lbasename (s->filename)) != 0)
+      continue;
+
     /* If the user gave us an absolute path, try to find the file in
        this symtab and use its absolute path.  */
 
@@ -4885,5 +4897,19 @@ Show how the debugger handles ambiguitie
 Valid values are \"ask\", \"all\", \"cancel\", and the default is \"all\"."),
                         NULL, NULL, &setlist, &showlist);
 
+  add_setshow_boolean_cmd ("basenames-may-differ", class_obscure,
+			   &basenames_may_differ, _("\
+Set whether a source file may have multiple base names."), _("\
+Show whether a source file may have multiple base names."), _("\
+(A \"base name\" is the name of a file with the directory part removed.\n\
+Example: The base name of \"/home/user/hello.c\" is \"hello.c\".)\n\
+If set, GDB will canonicalize file names (e.g., expand symlinks)\n\
+before comparing them.  Canonicalization is an expensive operation,\n\
+but it allows the same file to be known by more than one base name.\n\
+If not set (the default), all source files are assumed to have just\n\
+one base name, and GDB will do file name comparisons more efficiently."),
+			   NULL, NULL,
+			   &setlist, &showlist);
+
   observer_attach_executable_changed (symtab_observer_executable_changed);
 }
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.191
diff -u -p -r1.191 symtab.h
--- symtab.h	10 Nov 2011 20:21:28 -0000	1.191
+++ symtab.h	15 Nov 2011 04:19:39 -0000
@@ -1306,4 +1306,6 @@ void fixup_section (struct general_symbo
 
 struct objfile *lookup_objfile_from_block (const struct block *block);
 
+extern int basenames_may_differ;
+
 #endif /* !defined(SYMTAB_H) */
Index: doc/gdb.texinfo
===================================================================
RCS file: /cvs/src/src/gdb/doc/gdb.texinfo,v
retrieving revision 1.895
diff -u -p -r1.895 gdb.texinfo
--- doc/gdb.texinfo	14 Nov 2011 20:07:23 -0000	1.895
+++ doc/gdb.texinfo	15 Nov 2011 04:19:39 -0000
@@ -15702,6 +15702,33 @@ This is the default.
 @end table
 @end table
 
+@cindex file name canonicalization
+@cindex base name differences
+When processing file names provided by the user, @value{GDBN}
+frequently needs to compare them to the file names recorded in the
+program's debug info.  Normally, @value{GDBN} compares just the
+@dfn{base names} of the files as strings, which is reasonably fast
+even for very large programs.  (The base name of a file is the last
+portion of its name, after stripping all the leading directories.)
+This shortcut in comparison is based upon the assumption that files
+cannot have more than one base name.  This is usually true, but
+references to files that use symlinks or similar filesystem
+facilities violate that assumption.  If your program records files
+using such facilities, or if you provide file names to @value{GDBN}
+using symlinks etc., you can set @code{basenames-may-differ} to
+@code{true} to instruct @value{GDBN} to completely canonicalize each
+pair of file names it needs to compare.  This will make file-name
+comparisons accurate, but at a price of a significant slowdown.
+
+@table @code
+@item set basenames-may-differ
+@kindex set basenames-may-differ
+Set whether a source file may have multiple base names.
+
+@item show basenames-may-differ
+@kindex show basenames-may-differ
+Show whether a source file may have multiple base names.
+@end table
 
 @node Separate Debug Files
 @section Debugging Information in Separate Files
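
The lookup loops touched by the patch above all follow the same pattern:
compare base names first and `continue' past any entry that cannot match,
so that only the survivors reach the realpath-based comparison.  The sketch
below illustrates that pattern in isolation; it is an assumption-laden toy,
not code from the patch.  The names tail_of and count_expensive_compares
are hypothetical, plain strcmp is used where GDB uses FILENAME_CMP, and the
function merely counts how many entries would still need canonicalizing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for lbasename (POSIX-style paths only).  */
static const char *
tail_of (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}

/* Walk the N file names in RECORDED, looking for NAME, and return how
   many entries would still need the expensive canonicalizing compare.
   MAY_DIFFER mirrors the basenames_may_differ flag: when it is zero,
   entries whose base name differs from NAME's are skipped cheaply.  */
static size_t
count_expensive_compares (const char *name, const char *const *recorded,
			  size_t n, int may_differ)
{
  /* Computed once, outside the loop, as the patch does.  */
  const char *name_tail = tail_of (name);
  size_t expensive = 0;
  size_t i;

  assert (name != NULL && recorded != NULL);

  for (i = 0; i < n; i++)
    {
      /* Quick rejection on base names before any realpath call.  */
      if (!may_differ && strcmp (name_tail, tail_of (recorded[i])) != 0)
	continue;

      /* A real lookup would canonicalize and compare here.  */
      expensive++;
    }

  return expensive;
}
```

For a table holding "/usr/include/stdio.h", "/home/user/hello.c" and
"/home/user/foo.c", looking up "hello.c" needs one expensive comparison
with the flag off and three with it on.  The saving grows with the number
of recorded source files, which is why the default favors the shortcut.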


* Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-15  4:46       ` Doug Evans
@ 2011-11-15  6:00         ` Eli Zaretskii
  2011-11-15 14:23         ` Joel Brobecker
  1 sibling, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2011-11-15  6:00 UTC (permalink / raw)
  To: Doug Evans; +Cc: tromey, brobecker, gdb-patches

> Date: Mon, 14 Nov 2011 20:46:23 -0800
> From: Doug Evans <dje@google.com>
> 
> Ok to check in?

The documentation parts are OK, thanks.



* Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
  2011-11-15  4:46       ` Doug Evans
  2011-11-15  6:00         ` Eli Zaretskii
@ 2011-11-15 14:23         ` Joel Brobecker
  1 sibling, 0 replies; 10+ messages in thread
From: Joel Brobecker @ 2011-11-15 14:23 UTC (permalink / raw)
  To: Doug Evans; +Cc: Eli Zaretskii, Tom Tromey, gdb-patches

> Ok to check in?
> 
> 2011-11-14  Doug Evans  <dje@google.com>
> 
>         * NEWS: Mention new parameter basenames-may-differ.
>         * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
>         ! basenames_may_differ.
>         * psymtab.c (lookup_partial_symtab): Ditto.
>         * symtab.c (lookup_symtab): Ditto.
>         (basenames_may_differ): New global.
>         (_initialize_symtab): New parameter basenames-may-differ.
>         * symtab.h (basenames_may_differ): Declare.

Looks good to me.

-- 
Joel



end of thread, other threads:[~2011-11-15 14:23 UTC | newest]

Thread overview: 10+ messages
2011-11-06  6:31 [RFC] Avoid calling gdb_realpath if basenames are different Doug Evans
2011-11-06  9:07 ` asmwarrior
2011-11-07 17:06 ` Joel Brobecker
2011-11-08 17:18 ` Tom Tromey
2011-11-11  0:57 ` [RFA, doc RFA] " Doug Evans
2011-11-11  8:49   ` Eli Zaretskii
2011-11-11  9:00     ` Doug Evans
2011-11-15  4:46       ` Doug Evans
2011-11-15  6:00         ` Eli Zaretskii
2011-11-15 14:23         ` Joel Brobecker
