Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [RFC/patch] Sort global symbols in psymtabs only if needed
@ 2008-02-11 16:20 Aleksandar Ristovski
  2008-02-11 16:43 ` Daniel Jacobowitz
  0 siblings, 1 reply; 3+ messages in thread
From: Aleksandar Ristovski @ 2008-02-11 16:20 UTC (permalink / raw)
  To: gdb-patches

[-- Attachment #1: Type: text/plain, Size: 1934 bytes --]

Hello,

In dwarf2read.c we sort global symbols automatically during the read. This, on 
large projects, causes long delay at startup.

For example, a project our customer has is a C++ project and is fairly large. 
Here is some statistics to give you an idea:

Statistics:
     Number of psymtabs         :    16659
     Number of global symbols   :  9134742
     Number of static symbols   :  1021345

The attached patch sorts on first lookup instead of on load.

Startup time (i.e. time until I get control when stopped in main) was before the 
patch:
   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
  65.93     21.38    21.38       16     1.34     1.34  strcmp_iw_ordered
   5.70     23.23     1.85 10765999     0.00     0.00  bcache_data
...
And after the patch:
   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
  20.43      1.80     1.80 10765999     0.00     0.00  bcache_data
  11.80      2.84     1.04 11137416     0.00     0.00  htab_hash_string
   8.74      3.61     0.77 14851934     0.00     0.00  htab_find_slot_with_hash
...

Note that 'strcmp_iw_ordered' comes from sort_pst_symbols (and indirectly from 
qsort); the number '16' in 'calls' column is due to wrap-around. Number of calls 
is much higher.

Testing on linux, no regressions.

---
Aleksandar Ristovski
QNX Software Systems


ChangeLog
2008-02-11  Aleksandar Ristovski  <aristovski@qnx.com>

	* dwarf2read.c (dwarf2_build_psymtabs_hard): Remove call to
	sort_pst_symbols. It will be done on first lookup.
	* symfile.c (compare_psymbols): Set globals_sorted flag.
	* symmisc.c (maintenance_info_psymtabs): Add some statistics to the
	output.
	* symtab.c (lookup_partial_symbol): Sort global symbols if not
	already sorted, then go on with the appropriate lookup.
	* symtab.h (struct partial_symtab): Add new field globals_sorted.


[-- Attachment #2: sortpsymtabsonfirstlookup.diff --]
[-- Type: text/plain, Size: 7031 bytes --]

Index: gdb/dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.251
diff -u -5 -p -r1.251 dwarf2read.c
--- gdb/dwarf2read.c	1 Feb 2008 22:45:13 -0000	1.251
+++ gdb/dwarf2read.c	11 Feb 2008 15:53:37 -0000
@@ -1550,11 +1550,10 @@ dwarf2_build_psymtabs_hard (struct objfi
 
       pst->n_global_syms = objfile->global_psymbols.next -
 	(objfile->global_psymbols.list + pst->globals_offset);
       pst->n_static_syms = objfile->static_psymbols.next -
 	(objfile->static_psymbols.list + pst->statics_offset);
-      sort_pst_symbols (pst);
 
       /* If there is already a psymtab or symtab for a file of this
          name, remove it. (If there is a symtab, more drastic things
          also happen.) This happens in VxWorks.  */
       free_named_symtabs (pst->filename);
Index: gdb/symfile.c
===================================================================
RCS file: /cvs/src/src/gdb/symfile.c,v
retrieving revision 1.198
diff -u -5 -p -r1.198 symfile.c
--- gdb/symfile.c	29 Jan 2008 22:47:20 -0000	1.198
+++ gdb/symfile.c	11 Feb 2008 15:53:39 -0000
@@ -210,14 +210,15 @@ compare_psymbols (const void *s1p, const
 
 void
 sort_pst_symbols (struct partial_symtab *pst)
 {
   /* Sort the global list; don't sort the static list */
-
+  gdb_assert (!pst->globals_sorted);
   qsort (pst->objfile->global_psymbols.list + pst->globals_offset,
 	 pst->n_global_syms, sizeof (struct partial_symbol *),
 	 compare_psymbols);
+  pst->globals_sorted = 1;
 }
 
 /* Make a null terminated copy of the string at PTR with SIZE characters in
    the obstack pointed to by OBSTACKP .  Returns the address of the copy.
    Note that the string at PTR does not have to be null terminated, I.E. it
Index: gdb/symmisc.c
===================================================================
RCS file: /cvs/src/src/gdb/symmisc.c,v
retrieving revision 1.47
diff -u -5 -p -r1.47 symmisc.c
--- gdb/symmisc.c	11 Jan 2008 13:34:15 -0000	1.47
+++ gdb/symmisc.c	11 Feb 2008 15:53:39 -0000
@@ -1051,10 +1051,14 @@ maintenance_info_symtabs (char *regexp, 
 /* List all the partial symbol tables whose names match REGEXP (optional).  */
 void
 maintenance_info_psymtabs (char *regexp, int from_tty)
 {
   struct objfile *objfile;
+  int sum_global_syms = 0;
+  int sum_static_syms  = 0;
+  int sum_num_sorts = 0;
+  int sum_num_psymtabs = 0;
 
   if (regexp)
     re_comp (regexp);
 
   ALL_OBJFILES (objfile)
@@ -1067,10 +1071,11 @@ maintenance_info_psymtabs (char *regexp,
 
       ALL_OBJFILE_PSYMTABS (objfile, psymtab)
         if (! regexp
             || re_exec (psymtab->filename))
           {
+	    sum_num_psymtabs++;
             if (! printed_objfile_start)
               {
                 printf_filtered ("{ objfile %s ", objfile->name);
                 wrap_here ("  ");
                 printf_filtered ("((struct objfile *) %p)\n", objfile);
@@ -1094,20 +1099,26 @@ maintenance_info_psymtabs (char *regexp,
               {
                 printf_filtered ("(* (struct partial_symbol **) %p @ %d)\n",
                                  (psymtab->objfile->global_psymbols.list
                                   + psymtab->globals_offset),
                                  psymtab->n_global_syms);
+		printf_filtered ("    sorted %s\n",
+				 psymtab->globals_sorted ? "yes" : "no");
+		sum_global_syms += psymtab->n_global_syms;
+		if (psymtab->globals_sorted)
+		  sum_num_sorts++;
               }
             else
               printf_filtered ("(none)\n");
             printf_filtered ("    statics ");
             if (psymtab->n_static_syms)
               {
                 printf_filtered ("(* (struct partial_symbol **) %p @ %d)\n",
                                  (psymtab->objfile->static_psymbols.list
                                   + psymtab->statics_offset),
                                  psymtab->n_static_syms);
+		sum_static_syms += psymtab->n_static_syms;
               }
             else
               printf_filtered ("(none)\n");
             printf_filtered ("    dependencies ");
             if (psymtab->number_of_dependencies)
@@ -1132,10 +1143,17 @@ maintenance_info_psymtabs (char *regexp,
           }
 
       if (printed_objfile_start)
         printf_filtered ("}\n");
     }
+
+  /* Print statistics.  */
+  printf_filtered ("Statistics:\n%s%9d\n%s%9d\n%s%9d\n%s%9d\n",
+		   "    Number of psymtabs         :", sum_num_psymtabs,
+		   "    Number of global symbols   :", sum_global_syms,
+		   "    Number of static symbols   :", sum_static_syms,
+		   "    Number of glob. sym. sorts :", sum_num_sorts);
 }
 
 
 /* Check consistency of psymtabs and symtabs.  */
 
Index: gdb/symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.173
diff -u -5 -p -r1.173 symtab.c
--- gdb/symtab.c	5 Feb 2008 22:17:40 -0000	1.173
+++ gdb/symtab.c	11 Feb 2008 15:53:41 -0000
@@ -1594,19 +1594,20 @@ lookup_partial_symbol (struct partial_sy
   struct partial_symbol **top, **real_top, **bottom, **center;
   int length = (global ? pst->n_global_syms : pst->n_static_syms);
   int do_linear_search = 1;
   
   if (length == 0)
-    {
       return (NULL);
-    }
   start = (global ?
 	   pst->objfile->global_psymbols.list + pst->globals_offset :
 	   pst->objfile->static_psymbols.list + pst->statics_offset);
+  if (global && !pst->globals_sorted)
+    sort_pst_symbols (pst);
   
-  if (global)			/* This means we can use a binary search. */
+  if (global && pst->globals_sorted)
     {
+      /* This means we can use a binary search.  */
       do_linear_search = 0;
 
       /* Binary search.  This search is guaranteed to end with center
          pointing at the earliest partial symbol whose name might be
          correct.  At that point *all* partial symbols with an
Index: gdb/symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.112
diff -u -5 -p -r1.112 symtab.h
--- gdb/symtab.h	5 Feb 2008 22:17:40 -0000	1.112
+++ gdb/symtab.h	11 Feb 2008 15:53:41 -0000
@@ -921,17 +921,19 @@ struct partial_symtab
 
   struct partial_symtab **dependencies;
 
   int number_of_dependencies;
 
-  /* Global symbol list.  This list will be sorted after readin to
+  /* Global symbol list.  This list will be sorted on first lookup to 
      improve access.  Binary search will be the usual method of
      finding a symbol within it. globals_offset is an integer offset
-     within global_psymbols[].  */
+     within global_psymbols[]. globals_sorted is set to non zero when
+     sorted.  */
 
   int globals_offset;
   int n_global_syms;
+  int globals_sorted;
 
   /* Static symbol list.  This list will *not* be sorted after readin;
      to find a symbol in it, exhaustive search must be used.  This is
      reasonable because searches through this list will eventually
      lead to either the read in of a files symbols for real (assumed

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC/patch] Sort global symbols in psymtabs only if needed
  2008-02-11 16:20 [RFC/patch] Sort global symbols in psymtabs only if needed Aleksandar Ristovski
@ 2008-02-11 16:43 ` Daniel Jacobowitz
  2008-02-11 17:43   ` Aleksandar Ristovski
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Jacobowitz @ 2008-02-11 16:43 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb-patches

On Mon, Feb 11, 2008 at 11:19:46AM -0500, Aleksandar Ristovski wrote:
> The attached patch sorts on first lookup instead of on load.

Does this help if you benchmark any non-trivial debug sessions?
You've sped up startup, but the first call to lookup_symbol is going
to search a large number of such psymtabs.

Also, you probably know this, but be careful using gprof for this sort
of thing.  I recommend oprofile.  gprof artificially inflates the cost
of small frequent functions like strcmp_iw_ordered, because of all the
overhead.

The patch may still be a good idea.  Can we eliminate the binary
search / sort entirely?  We do hash table lookups for full symtabs;
that should be faster than sorting.  See dict_create_hashed.

Those are all hardwired to "struct symbol" at the moment.  But
I think it only uses SYMBOL_SEARCH_NAME, so it could work with
psymbols too if we passed the ginfo around.


-- 
Daniel Jacobowitz
CodeSourcery


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC/patch] Sort global symbols in psymtabs only if needed
  2008-02-11 16:43 ` Daniel Jacobowitz
@ 2008-02-11 17:43   ` Aleksandar Ristovski
  0 siblings, 0 replies; 3+ messages in thread
From: Aleksandar Ristovski @ 2008-02-11 17:43 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches

Daniel Jacobowitz wrote:
> On Mon, Feb 11, 2008 at 11:19:46AM -0500, Aleksandar Ristovski wrote:
>> The attached patch sorts on first lookup instead of on load.
> 
> Does this help if you benchmark any non-trivial debug sessions?
> You've sped up startup, but the first call to lookup_symbol is going
> to search a large number of such psymtabs.
The full price will probably be paid with longer debug sessions, but it 
definitely improves time to "stopped in main".

> 
> Also, you probably know this, but be careful using gprof for this sort
> of thing.  I recommend oprofile.  gprof artificially inflates the cost
> of small frequent functions like strcmp_iw_ordered, because of all the
> overhead.
No, the delay really comes from there, I used wall-clock timings as well, but 
thought this is more representative.
> 
> The patch may still be a good idea.  Can we eliminate the binary
> search / sort entirely?  We do hash table lookups for full symtabs;
> that should be faster than sorting.  See dict_create_hashed.
> 
> Those are all hardwired to "struct symbol" at the moment.  But
> I think it only uses SYMBOL_SEARCH_NAME, so it could work with
> psymbols too if we passed the ginfo around.
> 
> 
I haven't investigated this, but I will take a look... So we would only hash and 
use hash lookup for both psymtabs and symtabs. It sounds to me as something doable.




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2008-02-11 17:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-11 16:20 [RFC/patch] Sort global symbols in psymtabs only if needed Aleksandar Ristovski
2008-02-11 16:43 ` Daniel Jacobowitz
2008-02-11 17:43   ` Aleksandar Ristovski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox