From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-46176-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 103407 invoked by alias); 28 Nov 2017 12:00:43 -0000
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
Received: (qmail 103186 invoked by uid 89); 28 Nov 2017 12:00:30 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-25.7 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY,KB_WAM_FROM_NAME_SINGLEWORD,SPF_HELO_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=*linkage_name, cores, disappear, 2.40
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 28 Nov 2017 12:00:14 +0000
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id 3334C80469;	Tue, 28 Nov 2017 12:00:08 +0000 (UTC)
Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4])	by smtp.corp.redhat.com (Postfix) with ESMTP id 689935D6B7;	Tue, 28 Nov 2017 12:00:07 +0000 (UTC)
Subject: Re: Note on choosing string hash functions
To: Dmitry Antipov <dantipov@nvidia.com>
References: <33c45098-17a4-4c8a-fb14-137e70c7bb3f@nvidia.com> <4fc8cd33-a362-ddf5-9a7c-e69eab385587@redhat.com> <f85c22af-a467-e4b9-611f-8c50ed10a9f6@redhat.com> <ecac3e58-4bb0-e874-214e-6806ce4b0066@nvidia.com>
Cc: gdb@sourceware.org
From: Pedro Alves <palves@redhat.com>
Message-ID: <2ae7734e-3ea1-e017-4a87-65a3af25d0c0@redhat.com>
Date: Tue, 28 Nov 2017 12:00:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <ecac3e58-4bb0-e874-214e-6806ce4b0066@nvidia.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2017-11/txt/msg00029.txt.bz2

On 11/28/2017 10:35 AM, Dmitry Antipov wrote:
> On 11/22/2017 05:10 AM, Pedro Alves wrote:
> 
>> I observe between 3% / 10% time drop.
>>
>> (I used Joel's version of --readnever from here:
>>   https://sourceware.org/ml/gdb-patches/2016-07/msg00103.html)
>>
>> So in sum, I'm pretty convinced the patch is safe as is.
> 
> BTW, shouldn't we use safe-ctype.h macros through gdb/utils.c as well?
> With this, there is a substantial difference between:
> 
>      9.83%  gdb      gdb                       [.] find_pc_sect_psymtab
>      6.54%  gdb      gdb                       [.] bcache_full
>      5.38%  gdb      libc-2.25.so              [.]
> tolower                             ; hmm...
>      4.48%  gdb      gdb                       [.] htab_find_slot_with_hash
>      4.39%  gdb      gdb                       [.] htab_hash_string
>      3.88%  gdb      gdb                       [.] load_partial_dies
>      3.61%  gdb      gdb                       [.] strcmp_iw_ordered
>      3.45%  gdb      gdb                       [.]
> read_indirect_string_at_offset_from
>      3.18%  gdb      gdb                       [.] cpname_parse
>      3.07%  gdb      gdb                       [.]
> lookup_minimal_symbol_by_pc_name
>      2.70%  gdb      gdb                       [.] read_attribute_value
>      2.68%  gdb      libc-2.25.so              [.] __strcmp_sse2_unaligned
>      2.47%  gdb      gdb                       [.] cpname_lex
>      2.44%  gdb      gdb                       [.] peek_die_abbrev
>      2.32%  gdb      libc-2.25.so              [.] _int_malloc
>      2.15%  gdb      gdb                       [.] d_print_comp_inner
>      1.80%  gdb      libc-2.25.so              [.]
> isspace                             ; hmm...
>      1.47%  gdb      gdb                       [.]
> cp_canonicalize_string[abi:cxx11]
>      1.45%  gdb      gdb                       [.] eq_demangled_name_entry
>      1.39%  gdb      libc-2.25.so              [.]
> malloc_consolidate.part.1
> 
> and:
> 
>     10.97%  gdb      gdb                       [.] find_pc_sect_psymtab
>      7.25%  gdb      gdb                       [.] bcache_full
>      4.82%  gdb      gdb                       [.] htab_hash_string
>      4.65%  gdb      gdb                       [.] htab_find_slot_with_hash
>      4.28%  gdb      gdb                       [.] load_partial_dies
>      3.69%  gdb      gdb                       [.]
> read_indirect_string_at_offset_from
>      3.45%  gdb      gdb                       [.] cpname_parse
>      3.37%  gdb      gdb                       [.]
> lookup_minimal_symbol_by_pc_name
>      3.03%  gdb      gdb                       [.] read_attribute_value
>      3.01%  gdb      gdb                       [.] strcmp_iw_ordered
>      2.99%  gdb      libc-2.25.so              [.] __strcmp_sse2_unaligned
>      2.81%  gdb      gdb                       [.] peek_die_abbrev
>      2.74%  gdb      gdb                       [.] cpname_lex
>      2.57%  gdb      libc-2.25.so              [.] _int_malloc
>      2.40%  gdb      gdb                       [.] d_print_comp_inner
>      1.59%  gdb      gdb                       [.] eq_demangled_name_entry
>      1.56%  gdb      libc-2.25.so              [.]
> malloc_consolidate.part.1
>      1.56%  gdb      gdb                       [.]
> cp_canonicalize_string[abi:cxx11]
>      1.00%  gdb      libc-2.25.so              [.] strlen
> 
> IIUC this is because strcmp_iw_ordered() is quite critical when building
> partial symbol tables.

Yes, I think so.

BTW, with --readnever, I noticed htab_hash_string at the top,
so I experimented with unrolling it, as in the patch below.  This
seems to make htab_hash_string/my_htab_hash_string disappear from the
perf log, and using the same ff.cmd script I shown before, I get:

  0m20.0s -> 0m18.9

I haven't really investigated whether this increases memory
use significantly.  It may be that the extra cost for the string
length is masked by malloc's round-up overhead already,
resulting in no overhead in practice.

Another thing that might be worth it to look at is to see if
we could determine a better initial size for the hash table
depending on number of elf symbols, to avoid too many rehashes.

'ada_decode' seems to be an unfortunate bottleneck that shows up
on top (with --readnever), coming from ada_sniff_from_mangled_name, trying
to sniff each minsym language from the mangled name, which is unfortunate
because that's a lot of work when no symbol turns out to be an Ada symbol...
I'd be good to see about making ada_decode's not-an-ada-symbol path
be cheaper.

An idea I had would be to try to combine most the language_sniff_from_mangled_name
implementations in one call, by deferring to cplus_demangle which seems
like could take care of gnu v3, rust, java, gnat, dlang and old C++ mangling
schemes all at once.  This would avoid the overhead of gdb_demangle and
bfd_demangle for each language-specific demangling call, at least.  Not sure.

And then we just spend a lot of in the C++ demangler, which
is kind of expected since there are a lot of C++ symbols to demangle.
Not sure there's much we can do in the demangler itself.  Maybe
avoiding malloc, and seeing about avoiding multiple passes over the
input string would help.  But ultimately, I think that the major win
would probably come from parallelizing minsym demangling.  It's kind of
silly that we don't take advantage of multiple cores here...

Anyway, this is just a brain dump from a little investigation I did this past
weekend, and I think that this area has a lot of scope for improvements, but
I won't be able to follow up on any of this myself in the next following
weeks at least, with several gdb 8.1 issues on my plate.

>From 0567f3835cbe743ec9d294321ab1002039a3016f Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 28 Nov 2017 11:17:32 +0000
Subject: [PATCH] unroll htab_hash_string

---
 gdb/symtab.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 81 insertions(+), 11 deletions(-)

diff --git a/gdb/symtab.c b/gdb/symtab.c
index 3d59367..c66874f 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -667,12 +667,77 @@ symbol_set_language (struct general_symbol_info *gsymbol,
     }
 }
 
+/* Hash P/LEN.  Unrolled version of
+   libiberty.c:hashtab.c:htab_hash_string, which is a string hash
+   function good for identifiers.  Should probably be moved to
+   hashtab.c itself.  */
+
+hashval_t
+my_htab_hash_string (const PTR p, int len)
+{
+  const unsigned char *string = (const unsigned char *) p;
+  unsigned int hash = 0;
+
+  int i = 0;
+
+#if 0
+  /* step 1 */
+  for (; i + 3 < len; i += 4)
+    {
+      unsigned int h1 = (hash) * 67 + string[i + 0] - 113;
+      unsigned int h2 = (has1) * 67 + string[i + 1] - 113;
+      unsigned int h3 = (has2) * 67 + string[i + 2] - 113;
+		 hash = (has3) * 67 + string[i + 3] - 113;
+    }
+#endif
+
+#if 0
+  /* step 2 */
+  for (; i + 3 < len; i += 4)
+    {
+      unsigned int h2 = 67 * 67 * hash + string[i + 0] * 67 + string[i + 1] - 113 - 113 * 67;
+      unsigned int h3 = (has2) * 67 + string[i + 2] - 113;
+		 hash = (has3) * 67 + string[i + 3] - 113;
+    }
+#endif
+
+#if 0
+  /* step 3 */
+  for (; i + 3 < len; i += 4)
+    {
+      unsigned int h3 = 67 * 67 * 67 * hash + string[i + 0] * 67 * 67 + string[i + 1] * 67 + string[i + 2] - 113 - 113 * 67 - 113 * 67 * 67;
+		 hash = (h3) * 67 + string[i + 3] - 113;
+    }
+#endif
+
+  /* Unrolled a few times to compensate for multiplication latency due
+     to data dependencies.  */
+  for (; i + 3 < len; i += 4)
+    {
+      hash = (67 * 67 * 67 * 67 * hash
+	      + 67 * 67 * 67 * string[i + 0]
+	      + 67 * 67 * string[i + 1]
+	      + 67 * string[i + 2]
+	      + string[i + 3]
+	      - 113 * 67
+	      - 113 * 67 * 67
+	      - 113 * 67 * 67 * 67
+	      - 113);
+    }
+
+  for (; i < len; i++)
+    hash = hash * 67 + string[i] - 113;
+
+  return hash;
+}
+
 /* Functions to initialize a symbol's mangled name.  */
 
 /* Objects of this type are stored in the demangled name hash table.  */
 struct demangled_name_entry
 {
   const char *mangled;
+  uint32_t mangled_len;
   char demangled[1];
 };
 
@@ -684,7 +749,7 @@ hash_demangled_name_entry (const void *data)
   const struct demangled_name_entry *e
     = (const struct demangled_name_entry *) data;
 
-  return htab_hash_string (e->mangled);
+  return my_htab_hash_string (e->mangled, e->mangled_len);
 }
 
 /* Equality function for the demangled name hash.  */
@@ -697,6 +762,8 @@ eq_demangled_name_entry (const void *a, const void *b)
   const struct demangled_name_entry *db
     = (const struct demangled_name_entry *) b;
 
+  if (da->mangled_len != db->mangled_len)
+    return 0;
   return strcmp (da->mangled, db->mangled) == 0;
 }
 
@@ -770,8 +837,8 @@ symbol_find_demangled_name (struct general_symbol_info *gsymbol,
 
 void
 symbol_set_names (struct general_symbol_info *gsymbol,
-		  const char *linkage_name, int len, int copy_name,
-		  struct objfile *objfile)
+		  const char *linkage_name, const int linkage_name_len,
+		  int copy_name, struct objfile *objfile)
 {
   struct demangled_name_entry **slot;
   /* A 0-terminated copy of the linkage name.  */
@@ -788,10 +855,10 @@ symbol_set_names (struct general_symbol_info *gsymbol,
       else
 	{
 	  char *name = (char *) obstack_alloc (&per_bfd->storage_obstack,
-					       len + 1);
+					       linkage_name_len + 1);
 
-	  memcpy (name, linkage_name, len);
-	  name[len] = '\0';
+	  memcpy (name, linkage_name, linkage_name_len);
+	  name[linkage_name_len] = '\0';
 	  gsymbol->name = name;
 	}
       symbol_set_demangled_name (gsymbol, NULL, &per_bfd->storage_obstack);
@@ -802,19 +869,20 @@ symbol_set_names (struct general_symbol_info *gsymbol,
   if (per_bfd->demangled_names_hash == NULL)
     create_demangled_names_hash (objfile);
 
-  if (linkage_name[len] != '\0')
+  if (linkage_name[linkage_name_len] != '\0')
     {
       char *alloc_name;
 
-      alloc_name = (char *) alloca (len + 1);
-      memcpy (alloc_name, linkage_name, len);
-      alloc_name[len] = '\0';
+      alloc_name = (char *) alloca (linkage_name_len + 1);
+      memcpy (alloc_name, linkage_name, linkage_name_len);
+      alloc_name[linkage_name_len] = '\0';
 
       linkage_name_copy = alloc_name;
     }
   else
     linkage_name_copy = linkage_name;
 
+  entry.mangled_len = linkage_name_len;
   entry.mangled = linkage_name_copy;
   slot = ((struct demangled_name_entry **)
 	  htab_find_slot (per_bfd->demangled_names_hash,
@@ -847,6 +915,7 @@ symbol_set_names (struct general_symbol_info *gsymbol,
 	       obstack_alloc (&per_bfd->storage_obstack,
 			      offsetof (struct demangled_name_entry, demangled)
 			      + demangled_len + 1));
+	  (*slot)->mangled_len = linkage_name_len;
 	  (*slot)->mangled = linkage_name;
 	}
       else
@@ -860,10 +929,11 @@ symbol_set_names (struct general_symbol_info *gsymbol,
 	    = ((struct demangled_name_entry *)
 	       obstack_alloc (&per_bfd->storage_obstack,
 			      offsetof (struct demangled_name_entry, demangled)
-			      + len + demangled_len + 2));
+			      + linkage_name_len + demangled_len + 2));
 	  mangled_ptr = &((*slot)->demangled[demangled_len + 1]);
 	  strcpy (mangled_ptr, linkage_name_copy);
 	  (*slot)->mangled = mangled_ptr;
+	  (*slot)->mangled_len = linkage_name_len;
 	}
 
       if (demangled_name != NULL)
-- 
2.5.5