From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Berlin To: Elena Zannoni Cc: Daniel Jacobowitz , gdb-patches@sources.redhat.com Subject: Re: [rfa] symbol hashing, part 1/n - updates to hash functions Date: Thu, 11 Oct 2001 16:58:00 -0000 Message-id: References: <15302.12828.829882.614493@krustylu.cygnus.com> X-SW-Source: 2001-10/msg00154.html On Thursday, October 11, 2001, at 07:58 PM, Elena Zannoni wrote: > Daniel Jacobowitz writes: >> This patch still has two logical parts; if you strongly prefer I can >> break >> it up further, but they are somewhat intertwined and I think neither >> should >> be objectionable. They are: >> - Fix a looping bug in msymbol_hash_iw. It would not stop on '(' if >> there >> was whitespace before it. >> - Update to use the identifier hash function that libiberty uses, and >> more buckets. >> >> Is this OK? > > Looks ok to me in theory. Except that, why was the > > '% MINIMAL_SYMBOL_HASH_SIZE;' > > bit moved outside of the msymbol_hash and msymbol_hash_iw functions? So msymbol_hash and hash_iw could be used elsewhere. > You still do the same operation with the results returned by the two > functions anyway. > Except, now they are just hash functions, not hash functions that only work for the minsym hash tables. > Also, where are these 2 functions used besides mynsyms.c? In a further symbol hashing patch, unless he changed it. > I think we > should make them static and remove the extern from symtab.h. > > Can you give me an example where the '(' error comes up? (Just so I > understand it better). How did you come up with the number of > buckets? Averaging based on a large number of gnome, kde, and other real applications (ie emacs), compiled with debug info. 349 is way too small, we ended up with chains > length 100, all the time. > Is this also used in libiberty? Which, the hash function? > Can you fix it and resubmit? > > Thanks > Elena > >> >> -- >> Daniel Jacobowitz Carnegie Mellon University >> MontaVista Software Debian GNU/Linux Developer >> >> 2001-10-01 Daniel Jacobowitz >> >> * minsyms.c (msymbol_hash): Use better hash function. >> Return hash value without taking modulus. >> (msymbol_hash_iw): Likewise. Terminate loop at '(' properly. >> (add_minsym_to_hash_table): Take modulus of msymbol_hash's return >> value. >> (add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw. >> (lookup_minimal_symbol): Likewise for both. >> >> * objfiles.h: Increase MINIMAL_SYMBOL_HASH_SIZE to match modern >> binaries. >> >> Index: gdb/minsyms.c >> =================================================================== >> RCS file: /cvs/src/src/gdb/minsyms.c,v >> retrieving revision 1.17 >> diff -u -p -r1.17 minsyms.c >> --- minsyms.c 2001/05/29 10:45:10 1.17 >> +++ minsyms.c 2001/10/01 22:20:47 >> @@ -96,10 +96,12 @@ msymbol_hash_iw (const char *string) >> while (isspace (*string)) >> ++string; >> if (*string && *string != '(') >> - hash = (31 * hash) + *string; >> - ++string; >> + { >> + hash = hash * 67 + *string - 113; >> + ++string; >> + } >> } >> - return hash % MINIMAL_SYMBOL_HASH_SIZE; >> + return hash; >> } >> >> /* Compute a hash code for a string. */ >> @@ -109,8 +111,8 @@ msymbol_hash (const char *string) >> { >> unsigned int hash = 0; >> for (; *string; ++string) >> - hash = (31 * hash) + *string; >> - return hash % MINIMAL_SYMBOL_HASH_SIZE; >> + hash = hash * 67 + *string - 113; >> + return hash; >> } >> >> /* Add the minimal symbol SYM to an objfile's minsym hash table, >> TABLE. */ >> @@ -120,7 +122,7 @@ add_minsym_to_hash_table (struct minimal >> { >> if (sym->hash_next == NULL) >> { >> - unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)); >> + unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % >> MINIMAL_SYMBOL_HASH_SIZE; >> sym->hash_next = table[hash]; >> table[hash] = sym; >> } >> @@ -134,7 +136,7 @@ add_minsym_to_demangled_hash_table (stru >> { >> if (sym->demangled_hash_next == NULL) >> { >> - unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME >> (sym)); >> + unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME >> (sym)) % MINIMAL_SYMBOL_HASH_SIZE; >> sym->demangled_hash_next = table[hash]; >> table[hash] = sym; >> } >> @@ -162,8 +164,8 @@ lookup_minimal_symbol (register const ch >> struct minimal_symbol *found_file_symbol = NULL; >> struct minimal_symbol *trampoline_symbol = NULL; >> >> - unsigned int hash = msymbol_hash (name); >> - unsigned int dem_hash = msymbol_hash_iw (name); >> + unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE; >> + unsigned int dem_hash = msymbol_hash_iw (name) % >> MINIMAL_SYMBOL_HASH_SIZE; >> >> #ifdef SOFUN_ADDRESS_MAYBE_MISSING >> if (sfile != NULL) >> >> Index: gdb/objfiles.h >> =================================================================== >> RCS file: /cvs/src/src/gdb/objfiles.h,v >> retrieving revision 1.8 >> diff -u -p -r1.8 objfiles.h >> --- objfiles.h 2001/03/06 08:21:11 1.8 >> +++ objfiles.h 2001/10/01 22:20:53 >> @@ -202,7 +202,7 @@ extern void print_objfile_statistics (vo >> extern void print_symbol_bcache_statistics (void); >> >> /* Number of entries in the minimal symbol hash table. */ >> -#define MINIMAL_SYMBOL_HASH_SIZE 349 >> +#define MINIMAL_SYMBOL_HASH_SIZE 2039 >> >> /* Master structure for keeping track of each file from which >> gdb reads symbols. There are several ways these get allocated: 1.