From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 103407 invoked by alias); 28 Nov 2017 12:00:43 -0000 Mailing-List: contact gdb-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sourceware.org Received: (qmail 103186 invoked by uid 89); 28 Nov 2017 12:00:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.7 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY,KB_WAM_FROM_NAME_SINGLEWORD,SPF_HELO_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=*linkage_name, cores, disappear, 2.40 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 28 Nov 2017 12:00:14 +0000 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3334C80469; Tue, 28 Nov 2017 12:00:08 +0000 (UTC) Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 689935D6B7; Tue, 28 Nov 2017 12:00:07 +0000 (UTC) Subject: Re: Note on choosing string hash functions To: Dmitry Antipov References: <33c45098-17a4-4c8a-fb14-137e70c7bb3f@nvidia.com> <4fc8cd33-a362-ddf5-9a7c-e69eab385587@redhat.com> Cc: gdb@sourceware.org From: Pedro Alves Message-ID: <2ae7734e-3ea1-e017-4a87-65a3af25d0c0@redhat.com> Date: Tue, 28 Nov 2017 12:00:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-SW-Source: 2017-11/txt/msg00029.txt.bz2 On 11/28/2017 10:35 AM, Dmitry Antipov wrote: > On 11/22/2017 05:10 AM, Pedro Alves wrote: > >> I observe between 3% / 10% time drop. >> >> (I used Joel's version of --readnever from here: >> https://sourceware.org/ml/gdb-patches/2016-07/msg00103.html) >> >> So in sum, I'm pretty convinced the patch is safe as is. > > BTW, shouldn't we use safe-ctype.h macros through gdb/utils.c as well? > With this, there is a substantial difference between: > > 9.83% gdb gdb [.] find_pc_sect_psymtab > 6.54% gdb gdb [.] bcache_full > 5.38% gdb libc-2.25.so [.] > tolower ; hmm... > 4.48% gdb gdb [.] htab_find_slot_with_hash > 4.39% gdb gdb [.] htab_hash_string > 3.88% gdb gdb [.] load_partial_dies > 3.61% gdb gdb [.] strcmp_iw_ordered > 3.45% gdb gdb [.] > read_indirect_string_at_offset_from > 3.18% gdb gdb [.] cpname_parse > 3.07% gdb gdb [.] > lookup_minimal_symbol_by_pc_name > 2.70% gdb gdb [.] read_attribute_value > 2.68% gdb libc-2.25.so [.] __strcmp_sse2_unaligned > 2.47% gdb gdb [.] cpname_lex > 2.44% gdb gdb [.] peek_die_abbrev > 2.32% gdb libc-2.25.so [.] _int_malloc > 2.15% gdb gdb [.] d_print_comp_inner > 1.80% gdb libc-2.25.so [.] > isspace ; hmm... > 1.47% gdb gdb [.] > cp_canonicalize_string[abi:cxx11] > 1.45% gdb gdb [.] eq_demangled_name_entry > 1.39% gdb libc-2.25.so [.] > malloc_consolidate.part.1 > > and: > > 10.97% gdb gdb [.] find_pc_sect_psymtab > 7.25% gdb gdb [.] bcache_full > 4.82% gdb gdb [.] htab_hash_string > 4.65% gdb gdb [.] htab_find_slot_with_hash > 4.28% gdb gdb [.] load_partial_dies > 3.69% gdb gdb [.] > read_indirect_string_at_offset_from > 3.45% gdb gdb [.] cpname_parse > 3.37% gdb gdb [.] > lookup_minimal_symbol_by_pc_name > 3.03% gdb gdb [.] read_attribute_value > 3.01% gdb gdb [.] strcmp_iw_ordered > 2.99% gdb libc-2.25.so [.] __strcmp_sse2_unaligned > 2.81% gdb gdb [.] peek_die_abbrev > 2.74% gdb gdb [.] cpname_lex > 2.57% gdb libc-2.25.so [.] _int_malloc > 2.40% gdb gdb [.] d_print_comp_inner > 1.59% gdb gdb [.] eq_demangled_name_entry > 1.56% gdb libc-2.25.so [.] > malloc_consolidate.part.1 > 1.56% gdb gdb [.] > cp_canonicalize_string[abi:cxx11] > 1.00% gdb libc-2.25.so [.] strlen > > IIUC this is because strcmp_iw_ordered() is quite critical when building > partial symbol tables. Yes, I think so. BTW, with --readnever, I noticed htab_hash_string at the top, so I experimented with unrolling it, as in the patch below. This seems to make htab_hash_string/my_htab_hash_string disappear from the perf log, and using the same ff.cmd script I shown before, I get: 0m20.0s -> 0m18.9 I haven't really investigated whether this increases memory use significantly. It may be that the extra cost for the string length is masked by malloc's round-up overhead already, resulting in no overhead in practice. Another thing that might be worth it to look at is to see if we could determine a better initial size for the hash table depending on number of elf symbols, to avoid too many rehashes. 'ada_decode' seems to be an unfortunate bottleneck that shows up on top (with --readnever), coming from ada_sniff_from_mangled_name, trying to sniff each minsym language from the mangled name, which is unfortunate because that's a lot of work when no symbol turns out to be an Ada symbol... I'd be good to see about making ada_decode's not-an-ada-symbol path be cheaper. An idea I had would be to try to combine most the language_sniff_from_mangled_name implementations in one call, by deferring to cplus_demangle which seems like could take care of gnu v3, rust, java, gnat, dlang and old C++ mangling schemes all at once. This would avoid the overhead of gdb_demangle and bfd_demangle for each language-specific demangling call, at least. Not sure. And then we just spend a lot of in the C++ demangler, which is kind of expected since there are a lot of C++ symbols to demangle. Not sure there's much we can do in the demangler itself. Maybe avoiding malloc, and seeing about avoiding multiple passes over the input string would help. But ultimately, I think that the major win would probably come from parallelizing minsym demangling. It's kind of silly that we don't take advantage of multiple cores here... Anyway, this is just a brain dump from a little investigation I did this past weekend, and I think that this area has a lot of scope for improvements, but I won't be able to follow up on any of this myself in the next following weeks at least, with several gdb 8.1 issues on my plate. >From 0567f3835cbe743ec9d294321ab1002039a3016f Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Tue, 28 Nov 2017 11:17:32 +0000 Subject: [PATCH] unroll htab_hash_string --- gdb/symtab.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 81 insertions(+), 11 deletions(-) diff --git a/gdb/symtab.c b/gdb/symtab.c index 3d59367..c66874f 100644 --- a/gdb/symtab.c +++ b/gdb/symtab.c @@ -667,12 +667,77 @@ symbol_set_language (struct general_symbol_info *gsymbol, } } +/* Hash P/LEN. Unrolled version of + libiberty.c:hashtab.c:htab_hash_string, which is a string hash + function good for identifiers. Should probably be moved to + hashtab.c itself. */ + +hashval_t +my_htab_hash_string (const PTR p, int len) +{ + const unsigned char *string = (const unsigned char *) p; + unsigned int hash = 0; + + int i = 0; + +#if 0 + /* step 1 */ + for (; i + 3 < len; i += 4) + { + unsigned int h1 = (hash) * 67 + string[i + 0] - 113; + unsigned int h2 = (has1) * 67 + string[i + 1] - 113; + unsigned int h3 = (has2) * 67 + string[i + 2] - 113; + hash = (has3) * 67 + string[i + 3] - 113; + } +#endif + +#if 0 + /* step 2 */ + for (; i + 3 < len; i += 4) + { + unsigned int h2 = 67 * 67 * hash + string[i + 0] * 67 + string[i + 1] - 113 - 113 * 67; + unsigned int h3 = (has2) * 67 + string[i + 2] - 113; + hash = (has3) * 67 + string[i + 3] - 113; + } +#endif + +#if 0 + /* step 3 */ + for (; i + 3 < len; i += 4) + { + unsigned int h3 = 67 * 67 * 67 * hash + string[i + 0] * 67 * 67 + string[i + 1] * 67 + string[i + 2] - 113 - 113 * 67 - 113 * 67 * 67; + hash = (h3) * 67 + string[i + 3] - 113; + } +#endif + + /* Unrolled a few times to compensate for multiplication latency due + to data dependencies. */ + for (; i + 3 < len; i += 4) + { + hash = (67 * 67 * 67 * 67 * hash + + 67 * 67 * 67 * string[i + 0] + + 67 * 67 * string[i + 1] + + 67 * string[i + 2] + + string[i + 3] + - 113 * 67 + - 113 * 67 * 67 + - 113 * 67 * 67 * 67 + - 113); + } + + for (; i < len; i++) + hash = hash * 67 + string[i] - 113; + + return hash; +} + /* Functions to initialize a symbol's mangled name. */ /* Objects of this type are stored in the demangled name hash table. */ struct demangled_name_entry { const char *mangled; + uint32_t mangled_len; char demangled[1]; }; @@ -684,7 +749,7 @@ hash_demangled_name_entry (const void *data) const struct demangled_name_entry *e = (const struct demangled_name_entry *) data; - return htab_hash_string (e->mangled); + return my_htab_hash_string (e->mangled, e->mangled_len); } /* Equality function for the demangled name hash. */ @@ -697,6 +762,8 @@ eq_demangled_name_entry (const void *a, const void *b) const struct demangled_name_entry *db = (const struct demangled_name_entry *) b; + if (da->mangled_len != db->mangled_len) + return 0; return strcmp (da->mangled, db->mangled) == 0; } @@ -770,8 +837,8 @@ symbol_find_demangled_name (struct general_symbol_info *gsymbol, void symbol_set_names (struct general_symbol_info *gsymbol, - const char *linkage_name, int len, int copy_name, - struct objfile *objfile) + const char *linkage_name, const int linkage_name_len, + int copy_name, struct objfile *objfile) { struct demangled_name_entry **slot; /* A 0-terminated copy of the linkage name. */ @@ -788,10 +855,10 @@ symbol_set_names (struct general_symbol_info *gsymbol, else { char *name = (char *) obstack_alloc (&per_bfd->storage_obstack, - len + 1); + linkage_name_len + 1); - memcpy (name, linkage_name, len); - name[len] = '\0'; + memcpy (name, linkage_name, linkage_name_len); + name[linkage_name_len] = '\0'; gsymbol->name = name; } symbol_set_demangled_name (gsymbol, NULL, &per_bfd->storage_obstack); @@ -802,19 +869,20 @@ symbol_set_names (struct general_symbol_info *gsymbol, if (per_bfd->demangled_names_hash == NULL) create_demangled_names_hash (objfile); - if (linkage_name[len] != '\0') + if (linkage_name[linkage_name_len] != '\0') { char *alloc_name; - alloc_name = (char *) alloca (len + 1); - memcpy (alloc_name, linkage_name, len); - alloc_name[len] = '\0'; + alloc_name = (char *) alloca (linkage_name_len + 1); + memcpy (alloc_name, linkage_name, linkage_name_len); + alloc_name[linkage_name_len] = '\0'; linkage_name_copy = alloc_name; } else linkage_name_copy = linkage_name; + entry.mangled_len = linkage_name_len; entry.mangled = linkage_name_copy; slot = ((struct demangled_name_entry **) htab_find_slot (per_bfd->demangled_names_hash, @@ -847,6 +915,7 @@ symbol_set_names (struct general_symbol_info *gsymbol, obstack_alloc (&per_bfd->storage_obstack, offsetof (struct demangled_name_entry, demangled) + demangled_len + 1)); + (*slot)->mangled_len = linkage_name_len; (*slot)->mangled = linkage_name; } else @@ -860,10 +929,11 @@ symbol_set_names (struct general_symbol_info *gsymbol, = ((struct demangled_name_entry *) obstack_alloc (&per_bfd->storage_obstack, offsetof (struct demangled_name_entry, demangled) - + len + demangled_len + 2)); + + linkage_name_len + demangled_len + 2)); mangled_ptr = &((*slot)->demangled[demangled_len + 1]); strcpy (mangled_ptr, linkage_name_copy); (*slot)->mangled = mangled_ptr; + (*slot)->mangled_len = linkage_name_len; } if (demangled_name != NULL) -- 2.5.5