From: Dmitry Antipov <dantipov@nvidia.com>
To: <gdb@sourceware.org>
Subject: Note on choosing string hash functions
Date: Fri, 17 Nov 2017 09:50:00 -0000
Message-ID: <33c45098-17a4-4c8a-fb14-137e70c7bb3f@nvidia.com>
I'm curious why 'htab_hash_string' (libiberty/hashtab.c) uses:
r = r * 67 + c - 113
but 'SYMBOL_HASH_NEXT' (gdb/minsyms.h) prefers an extra 'tolower' with:
((hash) * 67 + tolower ((unsigned char) (c)) - 113)
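To make the difference concrete, here is a minimal sketch of the two update steps written as standalone loops. The function names 'hash_plain' and 'hash_folded' are hypothetical and are not taken from the gdb or libiberty sources; only the arithmetic is copied from the expressions above.

```c
#include <ctype.h>

/* Hypothetical stand-in for a loop using the htab_hash_string step:
   r = r * 67 + c - 113.  Case-sensitive.  */
static unsigned int
hash_plain (const char *s)
{
  unsigned int r = 0;

  for (; *s != '\0'; s++)
    r = r * 67 + (unsigned char) *s - 113;
  return r;
}

/* Hypothetical stand-in for a loop using the SYMBOL_HASH_NEXT step.
   The only difference is the tolower call made for every character.  */
static unsigned int
hash_folded (const char *s)
{
  unsigned int hash = 0;

  for (; *s != '\0'; s++)
    hash = hash * 67 + tolower ((unsigned char) *s) - 113;
  return hash;
}
```

With these definitions, 'hash_folded' gives the same value for "Foo" and "foo", while 'hash_plain' does not. The price of that case folding is one 'tolower' call per input character.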
Everyone assumes that 'tolower' is simple, fast, and usually implemented
as a macro. But when more than 1M mangled C++ symbols have to be hashed, the
results may be somewhat surprising ('perf report'):
8.94% gdb libc-2.25.so [.] tolower
8.45% gdb gdb [.] d_print_comp_inner
5.29% gdb gdb [.] htab_hash_string
4.16% gdb gdb [.] minimal_symbol_reader::install
3.72% gdb libc-2.25.so [.] __memmove_sse2_unaligned_erms
3.63% gdb gdb [.] msymbol_hash_iw
3.24% gdb gdb [.] htab_find_slot
3.07% gdb gdb [.] rust_is_mangled
2.57% gdb gdb [.] eq_demangled_name_entry
2.42% gdb gdb [.] d_print_comp
2.36% gdb gdb [.] compare_minimal_symbols
2.32% gdb gdb [.] default_search_name_hash
2.20% gdb gdb [.] skip_spaces
2.13% gdb gdb [.] rust_demangle_sym
2.02% gdb gdb [.] hash_demangled_name_entry
2.00% gdb gdb [.] tolower@plt
1.90% gdb gdb [.] d_count_templates_scopes
1.89% gdb gdb [.] ada_decode
1.87% gdb libc-2.25.so [.] msort_with_tmp.part.0
1.48% gdb libc-2.25.so [.] strlen
1.46% gdb gdb [.] htab_expand
1.39% gdb [kernel.kallsyms] [.] native_irq_return_iret
1.34% gdb libc-2.25.so [.] __strchr_sse2
1.27% gdb gdb [.] d_name
1.17% gdb gdb [.] cplus_demangle_type
1.11% gdb libc-2.25.so [.] _int_malloc
...
Dropping the 'tolower' from SYMBOL_HASH_NEXT immediately shows:
9.26% gdb gdb [.] d_print_comp_inner
6.97% gdb gdb [.] minimal_symbol_reader::install
5.80% gdb gdb [.] htab_hash_string
3.59% gdb libc-2.25.so [.] __memmove_sse2_unaligned_erms
3.41% gdb gdb [.] htab_find_slot
3.37% gdb gdb [.] rust_is_mangled
3.13% gdb libc-2.25.so [.] isspace
2.91% gdb gdb [.] eq_demangled_name_entry
2.75% gdb gdb [.] d_print_comp
2.56% gdb gdb [.] compare_minimal_symbols
2.42% gdb gdb [.] msymbol_hash_iw
2.37% gdb gdb [.] skip_spaces
2.33% gdb gdb [.] default_search_name_hash
2.32% gdb gdb [.] rust_demangle_sym
2.14% gdb gdb [.] ada_decode
2.14% gdb gdb [.] d_count_templates_scopes
2.05% gdb gdb [.] hash_demangled_name_entry
1.97% gdb libc-2.25.so [.] msort_with_tmp.part.0
1.72% gdb libc-2.25.so [.] strlen
1.61% gdb [kernel.kallsyms] [.] native_irq_return_iret
1.56% gdb gdb [.] htab_expand
1.55% gdb libc-2.25.so [.] __strchr_sse2
1.45% gdb gdb [.] d_name
1.41% gdb libc-2.25.so [.] _int_malloc
1.28% gdb gdb [.] cplus_demangle_type
1.03% gdb gdb [.] minimal_symbol_reader::record_full
1.02% gdb gdb [.] internal_cplus_demangle
1.01% gdb gdb [.] elf_symtab_read
...
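If case folding has to stay, one possible way to avoid the out-of-line libc call is an inline ASCII-only fold. The sketch below is an illustration only: 'ascii_tolower' and 'hash_folded_inline' are hypothetical names, the helper assumes ASCII and ignores the current locale, and it has not been measured against the profiles above.

```c
/* Hypothetical helper: map 'A'..'Z' to 'a'..'z' and leave every other
   byte unchanged.  Assumes ASCII; does not consult the locale.  */
static inline unsigned char
ascii_tolower (unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? (unsigned char) (c + ('a' - 'A')) : c;
}

/* Same arithmetic as the SYMBOL_HASH_NEXT step, with the libc tolower
   replaced by the inline helper above.  */
static unsigned int
hash_folded_inline (const char *s)
{
  unsigned int hash = 0;

  for (; *s != '\0'; s++)
    hash = hash * 67 + ascii_tolower ((unsigned char) *s) - 113;
  return hash;
}
```

For plain ASCII input this yields the same values as the 'tolower' version in the "C" locale. Whether ignoring the locale is acceptable for symbol lookup is a separate question.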
Observed on current binutils-gdb trunk, glibc 2.25, gcc 7.2.1, -g -O3 -march=native -mtune=native,
while "remote" debugging (both gdb and gdbserver on the same system) a Firefox debug build. The reports were generated
with 'perf record -F 10000 gdb -batch --readnever -q -ex "set sysroot /" -ex "set pagination off" \
-ex "target extended-remote :5555" -ex "thread apply all bt" -ex "quit"'.
Dmitry