Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [PATCH] gdb script performance
@ 2006-11-30  0:33 Jean-Marc Saffroy
  2006-11-30  1:04 ` Jim Blandy
  2006-11-30  3:13 ` Daniel Jacobowitz
  0 siblings, 2 replies; 7+ messages in thread
From: Jean-Marc Saffroy @ 2006-11-30  0:33 UTC (permalink / raw)
  To: gdb-patches

Hi folks,

It seems the patches I posted yesterday on the gdb list have gone 
completely unnoticed, so I guess I should resend them here.

Back in October, I had reported performance problems I observed when 
scripting gdb to analyze a kdump core, and even after applying the patch 
gdb-lookup-internal-first-3, gprof showed something like this:
   http://jeanmarc.saffroy.free.fr/gdb/gdb-baseline.html

Daniel Jacobowitz suggested the following:

On Thu, 5 Oct 2006, Daniel Jacobowitz wrote:

>  26.14    381.31   381.31   141867     2.69     3.50  find_pc_sect_psymtab
>  18.54    651.82   270.51   331876     0.82     0.82  find_pc_sect_section
>  17.88    912.69   260.87 144847060     0.00     0.00  lookup_partial_symbol
>  11.01   1073.33   160.64 1254595136     0.00     0.00  strcmp_iw
>   4.98   1145.97    72.64    34010     2.14    16.50  lookup_symbol_aux_psymtabs
>   4.78   1215.73    69.75     9584     7.28     8.12  fixup_section
>   4.46   1280.79    65.06                             lbasename
>   3.42   1330.64    49.85    12183     4.09     4.09  lookup_partial_symtab
>   1.83   1357.34    26.70 33844664     0.00     0.00  strcmp_iw_ordered
>   1.68   1381.82    24.48 1288426966     0.00     0.00  symbol_natural_name
>   1.09   1397.71    15.89 1288426966     0.00     0.00  symbol_search_name
> 
> Well then.  Observations:
>   - We're looking up partial symbols way too often.
>   - It's taking way too long.
>   - And oh my lord is that a lot of calls to the (fairly inefficient)
>     strcmp_iw.
> 
> It looks like only about three calls to find_pc_sect_psymtab per
> expression evaluation, which is bad but not too bad - sounds like
> a last-searched-item cache may be useful.

Recently I took some time and followed Daniel's advice, ie. I implemented 
a (trivial) cache for the last-searched PC, as well as another (more 
convoluted) for the last-searched symbol. The symbol cache patch gives 
this profile:
   http://jeanmarc.saffroy.free.fr/gdb/gdb-symcache.html
With both patches applied:
   http://jeanmarc.saffroy.free.fr/gdb/gdb-symcache-pc_cache.html

On a regular build (no profiling), my script now runs in 23 seconds 
instead of 158. :-)

This is still unfinished (hooks for invalidating the caches are missing, 
and I'm sure performance can still be enhanced significantly), but before 
going further, I'd like to know how you feel about integrating such 
changes.


Cheers,

-- 
saffroy@gmail.com

Index: gdb-6.5-prof/gdb/symtab.c
===================================================================
--- gdb-6.5-prof.orig/gdb/symtab.c	2006-11-29 03:27:18.000000000 +0100
+++ gdb-6.5-prof/gdb/symtab.c	2006-11-29 03:27:20.000000000 +0100
@@ -2022,8 +2022,8 @@
  /* Find the symtab associated with PC and SECTION.  Look through the
     psymtabs and read in another symtab if necessary. */

-struct symtab *
-find_pc_sect_symtab (CORE_ADDR pc, asection *section)
+static struct symtab *
+find_pc_sect_symtab_uncached (CORE_ADDR pc, asection *section)
  {
    struct block *b;
    struct blockvector *bv;
@@ -2123,6 +2123,24 @@
    return (s);
  }

+struct symtab *
+find_pc_sect_symtab (CORE_ADDR pc, asection *section)
+{
+  static CORE_ADDR cached_pc;
+  static asection *cached_section;
+  static struct symtab *cached_symtab;
+  static int try_cache = 0;
+
+  if (try_cache && (cached_pc == pc) && (cached_section == section))
+    return cached_symtab;
+
+  try_cache = 1;
+  cached_pc = pc;
+  cached_section = section;
+  cached_symtab = find_pc_sect_symtab_uncached (pc, section);
+  return cached_symtab;
+}
+
  /* Find the symtab associated with PC.  Look through the psymtabs and
     read in another symtab if necessary.  Backward compatibility, no section */

Index: gdb-6.5-prof/gdb/symtab.c
===================================================================
--- gdb-6.5-prof.orig/gdb/symtab.c	2006-11-27 03:01:22.000000000 +0100
+++ gdb-6.5-prof/gdb/symtab.c	2006-11-28 01:21:28.000000000 +0100
@@ -94,6 +94,20 @@
  					struct symtab **symtab);

  static
+int symcache_lookup (const char *name,
+		     const char *linkage_name,
+		     const domain_enum domain,
+		     struct symbol **sym,
+		     struct symtab **symtab);
+
+static
+void symcache_add (struct symbol *sym,
+		   const char *name,
+		   const char *linkage_name,
+		   const domain_enum domain,
+		   struct symtab **symtab);
+
+static
  struct symbol *lookup_symbol_aux_symtabs (int block_index,
  					  const char *name,
  					  const char *linkage_name,
@@ -1113,6 +1127,9 @@
  	}
      }

+  if (symcache_lookup (name, linkage_name, domain, &sym, symtab))
+    return sym;
+
    /* Now do whatever is appropriate for the current language to look
       up static and global variables.  */

@@ -1120,7 +1137,7 @@
  						     block, domain,
  						     symtab);
    if (sym != NULL)
-    return sym;
+    goto out;

    /* Now search all static file-level symbols.  Not strictly correct,
       but more useful than an error.  Do the symtabs first, then check
@@ -1131,16 +1148,19 @@
    sym = lookup_symbol_aux_symtabs (STATIC_BLOCK, name, linkage_name,
  				   domain, symtab);
    if (sym != NULL)
-    return sym;
- 
+    goto out;
+
    sym = lookup_symbol_aux_psymtabs (STATIC_BLOCK, name, linkage_name,
  				    domain, symtab);
    if (sym != NULL)
-    return sym;
+    goto out;

    if (symtab != NULL)
      *symtab = NULL;
-  return NULL;
+
+ out:
+  symcache_add (sym, name, linkage_name, domain, symtab);
+  return sym;
  }

  /* Check to see if the symbol is defined in BLOCK or its superiors.
@@ -1215,6 +1235,177 @@
    return NULL;
  }

+/* Check for a cached symbol lookup.
+ */
+
+struct symcache_entry {
+  const char *name;
+  const char *linkage_name;
+  domain_enum domain;
+  struct symbol *sym;
+  struct symtab *symtab;
+  struct symcache_entry *next, *prev; /* mru list */
+};
+
+/* max number of entries in symcache */
+#define SYMCACHE_SIZE 256
+
+/* fixed-size hash table of symcache entries */
+static htab_t symcache_hashtab;
+
+/* most recently accessed symcache entry */
+static struct symcache_entry *symcache_newest;
+
+static hashval_t
+symcache_hash(const void *e)
+{
+  struct symcache_entry *entry = (struct symcache_entry *)e;
+
+  return htab_hash_string (entry->name) + entry->domain;
+}
+
+static int
+symcache_eq(const void *e, const void *f)
+{
+  struct symcache_entry *a = (struct symcache_entry *)e;
+  struct symcache_entry *b = (struct symcache_entry *)f;
+
+  return ! strcmp (a->name, b->name)
+	  // FIXME && !strcmp(a->linkage_name, b->linkage_name)
+	  && (a->domain == b->domain);
+}
+
+static void
+symcache_list_add(struct symcache_entry *entry)
+{
+  if (symcache_newest == NULL)
+    {
+      /* init mru */
+      entry->prev = entry;
+      entry->next = entry;
+      symcache_newest = entry;
+    }
+  else
+    {
+      /* add entry at head of mru */
+      entry->next = symcache_newest;
+      entry->prev = symcache_newest->prev;
+      entry->prev->next = entry;
+      entry->next->prev = entry;
+      symcache_newest = entry;
+    }
+}
+
+static void
+symcache_list_del(struct symcache_entry *entry)
+{
+  /* remove entry from mru */
+  entry->prev->next = entry->next;
+  entry->next->prev = entry->prev;
+}
+
+static void
+symcache_del(void *e)
+{
+  struct symcache_entry *entry = (struct symcache_entry *)e;
+
+  symcache_list_del (entry);
+  free (entry->name);
+  free (entry->linkage_name);
+  free (entry);
+}
+
+static void
+symcache_add (struct symbol *sym,
+		   const char *name,
+		   const char *linkage_name,
+		   const domain_enum domain,
+		   struct symtab **symtab)
+{
+  struct symcache_entry request, *entry, **slot;
+
+  /* initialize on first use */
+  if (!symcache_hashtab)
+    symcache_hashtab = htab_create_alloc (SYMCACHE_SIZE,
+					  symcache_hash,
+					  symcache_eq,
+					  symcache_del,
+					  xcalloc,
+					  free);
+
+  /* sym can be known, but with unknown symtab: add it if possible */
+  request.name = name;
+  request.linkage_name = linkage_name;
+  request.domain = domain;
+  entry = htab_find (symcache_hashtab, &request);
+  if (entry != HTAB_EMPTY_ENTRY)
+    {
+      if (symtab != NULL)
+	entry->symtab = *symtab;
+      return;
+    }
+
+  /* when cache is full, remove oldest element */
+  if (htab_elements (symcache_hashtab) == SYMCACHE_SIZE)
+    {
+      struct symcache_entry *oldest;
+
+      oldest = symcache_newest->prev;
+      htab_remove_elt (symcache_hashtab, oldest);
+    }
+
+  /* this is really a new entry */
+  entry = xcalloc (1, sizeof(*entry));
+  entry->name = strdup (name);
+  entry->linkage_name = linkage_name ? strdup (linkage_name) : NULL;
+  entry->domain = domain;
+  entry->sym = sym;
+  if (symtab != NULL)
+    entry->symtab = *symtab;
+  symcache_list_add (entry);
+
+  slot = htab_find_slot (symcache_hashtab, entry, INSERT);
+  *slot = entry;
+}
+
+static int
+symcache_lookup (const char *name,
+				const char *linkage_name,
+				const domain_enum domain,
+				struct symbol **sym,
+				struct symtab **symtab)
+{
+  struct symcache_entry request, *entry;
+
+  if (!symcache_hashtab)
+    return 0;
+
+  request.name = name;
+  request.linkage_name = linkage_name;
+  request.domain = domain;
+  entry = htab_find (symcache_hashtab, &request);
+  if (entry == HTAB_EMPTY_ENTRY)
+    return 0;
+
+  if (symtab != NULL)
+    {
+      if (entry->symtab == NULL)
+	/* symtab is requested, but cached entry does not have it */
+	return 0;
+      *symtab = entry->symtab;
+    }
+  *sym = entry->sym;
+
+  /* keep mru list sorted */
+  if (entry != symcache_newest)
+    {
+      symcache_list_del (entry);
+      symcache_list_add (entry);
+    }
+
+  return 1;
+}
+
  /* Check to see if the symbol is defined in one of the symtabs.
     BLOCK_INDEX should be either GLOBAL_BLOCK or STATIC_BLOCK,
     depending on whether or not we want to search global symbols or


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30  0:33 [PATCH] gdb script performance Jean-Marc Saffroy
@ 2006-11-30  1:04 ` Jim Blandy
  2006-11-30 13:19   ` Jean-Marc Saffroy
  2006-11-30  3:13 ` Daniel Jacobowitz
  1 sibling, 1 reply; 7+ messages in thread
From: Jim Blandy @ 2006-11-30  1:04 UTC (permalink / raw)
  To: Jean-Marc Saffroy; +Cc: gdb-patches


Jean-Marc Saffroy <saffroy@gmail.com> writes:
> Index: gdb-6.5-prof/gdb/symtab.c
> ===================================================================
> --- gdb-6.5-prof.orig/gdb/symtab.c	2006-11-29 03:27:18.000000000 +0100
> +++ gdb-6.5-prof/gdb/symtab.c	2006-11-29 03:27:20.000000000 +0100
> @@ -2022,8 +2022,8 @@
>  /* Find the symtab associated with PC and SECTION.  Look through the
>     psymtabs and read in another symtab if necessary. */
>
> -struct symtab *
> -find_pc_sect_symtab (CORE_ADDR pc, asection *section)
> +static struct symtab *
> +find_pc_sect_symtab_uncached (CORE_ADDR pc, asection *section)
>  {
>    struct block *b;
>    struct blockvector *bv;
> @@ -2123,6 +2123,24 @@
>    return (s);
>  }
>
> +struct symtab *
> +find_pc_sect_symtab (CORE_ADDR pc, asection *section)
> +{
> +  static CORE_ADDR cached_pc;
> +  static asection *cached_section;
> +  static struct symtab *cached_symtab;
> +  static int try_cache = 0;
> +
> +  if (try_cache && (cached_pc == pc) && (cached_section == section))
> +    return cached_symtab;
> +
> +  try_cache = 1;
> +  cached_pc = pc;
> +  cached_section = section;
> +  cached_symtab = find_pc_sect_symtab_uncached (pc, section);
> +  return cached_symtab;
> +}
> +
>  /* Find the symtab associated with PC.  Look through the psymtabs and
>     read in another symtab if necessary.  Backward compatibility, no section */

When the user changes the symbol file (or when GDB notices it has
changed and automatically re-reads it), the section and symtab objects
your static variables are pointing to will be freed.  If you then get
a spurious cache hit, you'll hand out a pointer to garbage.  So you'll
need to invalidate the cache whenever the section or symtab get freed.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30  0:33 [PATCH] gdb script performance Jean-Marc Saffroy
  2006-11-30  1:04 ` Jim Blandy
@ 2006-11-30  3:13 ` Daniel Jacobowitz
  2006-11-30 13:57   ` Jean-Marc Saffroy
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-11-30  3:13 UTC (permalink / raw)
  To: Jean-Marc Saffroy; +Cc: gdb-patches

On Thu, Nov 30, 2006 at 01:33:02AM +0100, Jean-Marc Saffroy wrote:
> It seems the patches I posted yesterday on the gdb list have gone 
> completely unnoticed, so I guess I should resend them here.

This is the right list for patches.  However, please be patient - there
is a chronic scarcity of reviewers, but we all do what we can.

> This is still unfinished (hooks for invalidating the caches are missing, 
> and I'm sure performance can still be enhanced significantly), but before 
> going further, I'd like to know how you feel about integrating such 
> changes.

I'm glad to see some of this stuff sped up.  However, I'm hopeful that
there's a better way to do it - shouldn't there be a more efficiently
searchable data structure for whatever you're caching in the first
place?

Maybe there isn't; just thinking out loud.


-- 
Daniel Jacobowitz
CodeSourcery


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30  1:04 ` Jim Blandy
@ 2006-11-30 13:19   ` Jean-Marc Saffroy
  2006-11-30 21:54     ` Michael Snyder
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Marc Saffroy @ 2006-11-30 13:19 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

On Wed, 29 Nov 2006, Jim Blandy wrote:

> When the user changes the symbol file (or when GDB notices it has 
> changed and automatically re-reads it), the section and symtab objects 
> your static variables are pointing to will be freed.  If you then get a 
> spurious cache hit, you'll hand out a pointer to garbage.  So you'll 
> need to invalidate the cache whenever the section or symtab get freed.

Yes, that's what I meant when I said that hooks for invalidating the 
caches are missing, but I haven't looked yet where they should be added.

BTW I don't know if the test suite would catch this; if not, a test should 
probably be added.

-- 
saffroy@gmail.com


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30  3:13 ` Daniel Jacobowitz
@ 2006-11-30 13:57   ` Jean-Marc Saffroy
  2006-12-05 21:02     ` Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Marc Saffroy @ 2006-11-30 13:57 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches

On Wed, 29 Nov 2006, Daniel Jacobowitz wrote:

> On Thu, Nov 30, 2006 at 01:33:02AM +0100, Jean-Marc Saffroy wrote:
>> It seems the patches I posted yesterday on the gdb list have gone
>> completely unnoticed, so I guess I should resend them here.
>
> This is the right list for patches.  However, please be patient - there
> is a chronic scarcity of reviewers, but we all do what we can.

Given your usual responsiveness on the lists, I thought my mail may have 
been filtered somehow. Thanks for bearing with me!

>> This is still unfinished (hooks for invalidating the caches are missing,
>> and I'm sure performance can still be enhanced significantly), but before
>> going further, I'd like to know how you feel about integrating such
>> changes.
>
> I'm glad to see some of this stuff sped up.  However, I'm hopeful that
> there's a better way to do it - shouldn't there be a more efficiently
> searchable data structure for whatever you're caching in the first
> place?
>
> Maybe there isn't; just thinking out loud.

I asked myself the same question, and hoped someone would have the answer 
here. ;-)

For the PC cache, I guess that just about any address can be passed, so I 
doubt there is much room for improvement (except maybe a cache with more 
than one entry, should it be useful in some cases, but now it's probably 
unneeded).

For the symbol cache, I think referencing all symbols in a global hash 
table could be a solution; actually I was surprised there was not one 
already. That's a more radical solution than my current patch, it could 
take a fair amount of memory, and also would take me much more time to 
implement, given that I'm still new to gdb internals. But I'd like this 
approach too, if it makes things cleaner and yields good performance.


-- 
saffroy@gmail.com


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30 13:19   ` Jean-Marc Saffroy
@ 2006-11-30 21:54     ` Michael Snyder
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Snyder @ 2006-11-30 21:54 UTC (permalink / raw)
  To: Jean-Marc Saffroy; +Cc: Jim Blandy, gdb-patches

On Thu, 2006-11-30 at 14:18 +0100, Jean-Marc Saffroy wrote:
> On Wed, 29 Nov 2006, Jim Blandy wrote:
> 
> > When the user changes the symbol file (or when GDB notices it has 
> > changed and automatically re-reads it), the section and symtab objects 
> > your static variables are pointing to will be freed.  If you then get a 
> > spurious cache hit, you'll hand out a pointer to garbage.  So you'll 
> > need to invalidate the cache whenever the section or symtab get freed.
> 
> Yes, that's what I meant when I said that hooks for invalidating the 
> caches are missing, but I haven't looked yet where they should be added.
> 
> BTW I don't know if the test suite would catch this; if not, a test should 
> probably be added.

A test would be a good idea, yes.
BTW, I think you have a worthwhile idea here, 
and I hope you continue to refine it.  There
are some serious performance issues in gdb.

Michael


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb script performance
  2006-11-30 13:57   ` Jean-Marc Saffroy
@ 2006-12-05 21:02     ` Daniel Jacobowitz
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-12-05 21:02 UTC (permalink / raw)
  To: Jean-Marc Saffroy; +Cc: gdb-patches

On Thu, Nov 30, 2006 at 02:56:54PM +0100, Jean-Marc Saffroy wrote:
> For the PC cache, I guess that just about any address can be passed, so I 
> doubt there is much room for improvement (except maybe a cache with more 
> than one entry, should it be useful in some cases, but now it's probably 
> unneeded).

We shouldn't need to cache at all.  I think the right representation
would allow us to look up a single PC and find the best psymtab based
on that PC without having to loop over psymtabs; this requires a pretty
big change to the way we store things, since unfortunately I don't
think we can assume the address ranges of psymtabs are even remotely
sane (multiple psymtabs may overlap, for instance, and it's past time
we started recording DW_AT_ranges discontiguous range information).

Not sure what a good data structure for that would look like.

-- 
Daniel Jacobowitz
CodeSourcery


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-12-05 21:02 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-30  0:33 [PATCH] gdb script performance Jean-Marc Saffroy
2006-11-30  1:04 ` Jim Blandy
2006-11-30 13:19   ` Jean-Marc Saffroy
2006-11-30 21:54     ` Michael Snyder
2006-11-30  3:13 ` Daniel Jacobowitz
2006-11-30 13:57   ` Jean-Marc Saffroy
2006-12-05 21:02     ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox