[PATCH 5/6] .gdb_index prod perf regression: Estimate size of psyms_seen

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Pedro Alves <palves@redhat.com>
To: gdb-patches@sourceware.org
Subject: [PATCH 5/6] .gdb_index prod perf regression: Estimate size of psyms_seen
Date: Mon, 12 Jun 2017 16:19:00 -0000	[thread overview]
Message-ID: <1497284051-13795-5-git-send-email-palves@redhat.com> (raw)
In-Reply-To: <1497284051-13795-1-git-send-email-palves@redhat.com>
In-Reply-To: <8efc0742-1014-4fe0-6948-f40a9c5c4975@redhat.com>

Using the same test as the previous patch, perf shows GDB spending
over 7% in "free".  A substantial number of those calls comes from
insertions in the psyms_seen unordered_set causing lots of rehashing
and recreating buckets.  Fix this by computing an estimate of the size
of the set upfront.

Using the same test as in the previous patch, against the same gdb
inferior, timing improves ~8% further:

  ~6.5s => ~6.0s (average of 5 runs).

gdb/ChangeLog:
2017-06-12  Pedro Alves  <palves@redhat.com>

	* dwarf2read.c (recursively_count_psymbols): New function.
	(write_psymtabs_to_index): Call it to compute number of psyms and
	pass estimate size of psyms_seen to unordered_set's ctor.
---
 gdb/ChangeLog    |  6 ++++++
 gdb/dwarf2read.c | 36 +++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 01b66a1..9dbc059 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,5 +1,11 @@
 2017-06-12  Pedro Alves  <palves@redhat.com>
 
+	* dwarf2read.c (recursively_count_psymbols): New function.
+	(write_psymtabs_to_index): Call it to compute number of psyms and
+	pass estimate size of psyms_seen to unordered_set's ctor.
+
+2017-06-12  Pedro Alves  <palves@redhat.com>
+
 	* dwarf2read.c (write_hash_table): Check if key already exists
 	before emplacing.
 
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 93fd275..bff2fcb 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -23691,6 +23691,22 @@ write_one_signatured_type (void **slot, void *d)
   return 1;
 }
 
+/* Recurse into all "included" dependencies and count their symbols as
+   if they appeared in this psymtab.  */
+
+static void
+recursively_count_psymbols (struct partial_symtab *psymtab,
+			    size_t &psyms_seen)
+{
+  for (int i = 0; i < psymtab->number_of_dependencies; ++i)
+    if (psymtab->dependencies[i]->user != NULL)
+      recursively_count_psymbols (psymtab->dependencies[i],
+				  psyms_seen);
+
+  psyms_seen += psymtab->n_global_syms;
+  psyms_seen += psymtab->n_static_syms;
+}
+
 /* Recurse into all "included" dependencies and write their symbols as
    if they appeared in this psymtab.  */
 
@@ -23764,7 +23780,6 @@ write_psymtabs_to_index (struct objfile *objfile, const char *dir)
 
   mapped_symtab symtab;
   data_buf cu_list;
-  std::unordered_set<partial_symbol *> psyms_seen;
 
   /* While we're scanning CU's create a table that maps a psymtab pointer
      (which is what addrmap records) to its index (which is what is recorded
@@ -23776,6 +23791,25 @@ write_psymtabs_to_index (struct objfile *objfile, const char *dir)
   /* The CU list is already sorted, so we don't need to do additional
      work here.  Also, the debug_types entries do not appear in
      all_comp_units, but only in their own hash table.  */
+
+  /* The psyms_seen set is potentially going to be largish (~40k
+     elements when indexing a -g3 build of GDB itself).  Estimate the
+     number of elements in order to avoid too many rehashes, which
+     require rebuilding buckets and thus many trips to
+     malloc/free.  */
+  size_t psyms_count = 0;
+  for (int i = 0; i < dwarf2_per_objfile->n_comp_units; ++i)
+    {
+      struct dwarf2_per_cu_data *per_cu
+	= dwarf2_per_objfile->all_comp_units[i];
+      struct partial_symtab *psymtab = per_cu->v.psymtab;
+
+      if (psymtab != NULL && psymtab->user == NULL)
+	recursively_count_psymbols (psymtab, psyms_count);
+    }
+  /* Generating an index for gdb itself shows a ratio of
+     TOTAL_SEEN_SYMS/UNIQUE_SYMS or ~5.  4 seems like a good bet.  */
+  std::unordered_set<partial_symbol *> psyms_seen (psyms_count / 4);
   for (int i = 0; i < dwarf2_per_objfile->n_comp_units; ++i)
     {
       struct dwarf2_per_cu_data *per_cu
-- 
2.5.5

next prev parent reply	other threads:[~2017-06-12 16:19 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-26 18:25 [PATCH 0/6] DWARF-5: .debug_names index Jan Kratochvil
2017-05-26 18:25 ` [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Jan Kratochvil
2017-06-12 16:08   ` [pushed] " Pedro Alves
2017-06-12 16:14     ` [PATCH 2/6] Code cleanup: dwarf2read.c: Eliminate ::file_write Pedro Alves
2017-06-17 17:35       ` Jan Kratochvil
2017-06-19  9:26         ` Pedro Alves
2017-06-18 18:36       ` Regression: " Jan Kratochvil
2017-06-19  9:27         ` Pedro Alves
2017-06-19  9:39           ` Jan Kratochvil
2017-06-19  9:47             ` Pedro Alves
2017-06-19 10:03               ` Jan Kratochvil
2017-06-19 10:35                 ` Pedro Alves
2017-06-19 12:06                 ` Yao Qi
2017-06-19 12:26                   ` Jan Kratochvil
2017-06-12 16:14     ` [PATCH 6/6] .gdb_index prod perf regression: mapped_symtab now vector of values Pedro Alves
2017-06-12 16:14     ` [PATCH 4/6] .gdb_index prod perf regression: find before insert in unordered_map Pedro Alves
2017-06-12 16:14     ` [PATCH 1/6] Code cleanup: dwarf2read.c:uniquify_cu_indices: Use std::unique Pedro Alves
2017-06-12 16:14     ` [PATCH 3/6] Code cleanup: dwarf2read.c: Add data_buf::append_uint Pedro Alves
2017-06-12 16:18     ` [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Pedro Alves
2017-06-12 16:19     ` Pedro Alves [this message]
2017-06-18 14:25     ` Jan Kratochvil
2017-06-18 15:12       ` Eli Zaretskii
2017-06-19 11:50         ` [pushed] .gdb_index writer: close the file before unlinking it (Re: [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer.) Pedro Alves
2017-06-18 16:50     ` [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Jan Kratochvil
2017-05-26 18:25 ` [PATCH 2/6] cc-with-tweaks.sh: Use gdb-add-index.sh Jan Kratochvil
2017-05-26 18:26 ` [PATCH 4/6] Code cleanup: dwarf2_initialize_objfile return value Jan Kratochvil
2017-05-26 18:26 ` [PATCH 3/6] DWARF-5: .debug_names index producer Jan Kratochvil
2017-06-09  5:58   ` [PATCH 3.1/6] " Jan Kratochvil
2017-05-26 18:26 ` [PATCH 5/6] Refactor: Move some generic code out of .gdb_index code Jan Kratochvil
2017-05-26 18:26 ` [PATCH 6/6] DWARF-5: .debug_names index consumer Jan Kratochvil
2017-06-18 19:37 ` obsolete: [PATCH 0/6] DWARF-5: .debug_names index Jan Kratochvil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1497284051-13795-5-git-send-email-palves@redhat.com \
    --to=palves@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox