* Re: [rfa] Symbol hashing (for the last time?)
2002-07-10 18:55 ` Daniel Jacobowitz
@ 2002-07-10 19:08 ` Elena Zannoni
2002-07-11 10:43 ` Jim Blandy
2002-07-11 13:12 ` Elena Zannoni
2 siblings, 0 replies; 7+ messages in thread
From: Elena Zannoni @ 2002-07-10 19:08 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: gdb-patches
Daniel Jacobowitz writes:
> Here's a patch from last October, dusted off and merged to the current
> sources. The only substantial changes were some fixes for ada-lang.c,
> merged after I wrote the original patch. I've verified no regressions
> on i386-linux for GCC (2.95,3.0.4,3.1)/(stabs,dwarf2).
>
On my plate as well. I think I reviewed a few precursor patches to
this. I have to reread the old threads.
Of course if anybody else has comments, please feel free.
Elena
> This converts the normal symbol table lookups into hash tables. A few
> sorts of symbol tables aren't hashed: those produced by mdebugread.c
> and dstread.c, because they build symbol tables in lots of ad-hoc code,
> and symbol tables which are actually the arguments to a function
> (because order matters, or at least comments suggest so). A next step
> will be to convert mdebugread.c, delete dstread.c (it's marked for an
> upcoming obsoletion, isn't it?), and then delete all the complicated
> binary search code since the only remaining unhashed symtabs will be
> argument lists, which are small.
>
> This should help performance a bit on large programs. Ok to commit?
> Anyone see any problems with it?
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [rfa] Symbol hashing (for the last time?)
2002-07-10 18:55 ` Daniel Jacobowitz
2002-07-10 19:08 ` Elena Zannoni
@ 2002-07-11 10:43 ` Jim Blandy
2002-07-11 12:25 ` Daniel Jacobowitz
2002-07-11 13:12 ` Elena Zannoni
2 siblings, 1 reply; 7+ messages in thread
From: Jim Blandy @ 2002-07-11 10:43 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: gdb-patches
The definition of `struct block' in symtab.h needs a nice, detailed
comment explaining exactly how the hash table works --- which hash
function is used, chaining, allocation invariants, mangled
vs. unmangled names, etc.
Daniel Jacobowitz <drow@mvista.com> writes:
> Here's a patch from last October, dusted off and merged to the current
> sources. The only substantial changes were some fixes for ada-lang.c,
> merged after I wrote the original patch. I've verified no regressions
> on i386-linux for GCC (2.95,3.0.4,3.1)/(stabs,dwarf2).
>
> This converts the normal symbol table lookups into hash tables. A few
> sorts of symbol tables aren't hashed: those produced by mdebugread.c
> and dstread.c, because they build symbol tables in lots of ad-hoc code,
> and symbol tables which are actually the arguments to a function
> (because order matters, or at least comments suggest so). A next step
> will be to convert mdebugread.c, delete dstread.c (it's marked for an
> upcoming obsoletion, isn't it?), and then delete all the complicated
> binary search code since the only remaining unhashed symtabs will be
> argument lists, which are small.
>
> This should help performance a bit on large programs. Ok to commit?
> Anyone see any problems with it?
>
> --
> Daniel Jacobowitz Carnegie Mellon University
> MontaVista Software Debian GNU/Linux Developer
>
> 2002-07-10 Daniel Jacobowitz <drow@mvista.com>
>
> Based on patch from Daniel Berlin <dberlin@dberlin.org>.
> * buildsym.c: Include "demangle.h" for SYMBOL_INIT_DEMANGLED_NAME.
> (finish_block) For non-function blocks, hash the symbol table. For
> function blocks, mark the symbol table as unhashed.
> * minsyms.c (msymbol_hash): Return hash value without taking modulus.
> (msymbol_hash_iw): Likewise.
> (add_minsym_to_hash_table): Take modulus of msymbol_hash's return
> value.
> (add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw.
> (lookup_minimal_symbol): Likewise for both.
> * symtab.h (struct block): Add `hashtable' flag.
> (BLOCK_HASHTABLE): New macro.
> (BLOCK_BUCKETS): New macro.
> (BLOCK_BUCKET): New macro.
> (ALL_BLOCK_SYMBOLS): New macro.
> (BLOCK_SHOULD_SORT): Do not sort hashed blocks.
> (struct symbol): Add `hash_next' pointer.
> * symtab.c (lookup_block_symbol): Search using the hash table when
> possible.
> (find_pc_sect_symtab): Use ALL_BLOCK_SYMBOLS.
> (search_symbols, find_addr_symbol): Likewise.
>
> * dstread.c (process_dst_block): Clear hashtable bit for new block.
> (read_dst_symtab): Likewise.
> * jv-lang.c (get_java_class_symtab): Likewise.
> * mdebugread.c: Include "gdb_assert.h".
> (shrink_block): Assert that the block being modified is not hashed.
> * coffread.c (patch_opaque_types): Use ALL_BLOCK_SYMBOLS.
> * symmisc.c (free_symtab_block): Walk the hash table when freeing
> symbols.
> (dump_symtab): Recognize hashed blocks.
> * printcmd.c (print_frame_args): Assert that function blocks do not
> have hashed symbol tables.
> * ada-lang.c (symtab_for_sym): Use ALL_BLOCK_SYMBOLS.
> (fill_in_ada_prototype, debug_print_block): Likewise.
> (ada_add_block_symbols): Use ALL_BLOCK_SYMBOLS. Handle hash tables.
>
> Index: gdb/ada-lang.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/ada-lang.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 ada-lang.c
> --- gdb/ada-lang.c 19 Jun 2002 16:55:28 -0000 1.2
> +++ gdb/ada-lang.c 11 Jul 2002 00:14:23 -0000
> @@ -3560,6 +3560,7 @@ symtab_for_sym (sym)
> struct symtab* s;
> struct objfile *objfile;
> struct block *b;
> + struct symbol *tmp_sym;
> int i, j;
>
> ALL_SYMTABS (objfile, s)
> @@ -3574,12 +3575,12 @@ symtab_for_sym (sym)
> case LOC_BLOCK:
> case LOC_CONST_BYTES:
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> break;
> default:
> @@ -3601,8 +3602,8 @@ symtab_for_sym (sym)
> j < BLOCKVECTOR_NBLOCKS (BLOCKVECTOR (s)); j += 1)
> {
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), j);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> }
> break;
> @@ -4094,14 +4095,14 @@ ada_add_block_symbols (block, name, name
> /* Set true when we find a matching non-argument symbol */
> int found_sym;
> int is_sorted = BLOCK_SHOULD_SORT (block);
> + struct symbol *sym;
>
> arg_sym = NULL; found_sym = 0;
> if (wild)
> {
> - for (i = 0; i < BLOCK_NSYMS (block); i += 1)
> + struct symbol *sym;
> + ALL_BLOCK_SYMBOLS (block, i, sym)
> {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> if (SYMBOL_NAMESPACE (sym) == namespace &&
> wild_match (name, name_len, SYMBOL_NAME (sym)))
> {
> @@ -4149,44 +4150,46 @@ ada_add_block_symbols (block, name, name
> else
> i = 0;
>
> - for (; i < BLOCK_NSYMS (block); i += 1)
> - {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> - if (SYMBOL_NAMESPACE (sym) == namespace)
> - {
> - int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
> -
> - if (cmp < 0)
> - {
> - if (is_sorted)
> - break;
> - }
> - else if (cmp == 0
> - && is_name_suffix (SYMBOL_NAME (sym) + name_len))
> - {
> - switch (SYMBOL_CLASS (sym))
> - {
> - case LOC_ARG:
> - case LOC_LOCAL_ARG:
> - case LOC_REF_ARG:
> - case LOC_REGPARM:
> - case LOC_REGPARM_ADDR:
> - case LOC_BASEREG_ARG:
> - arg_sym = sym;
> - break;
> - case LOC_UNRESOLVED:
> - break;
> - default:
> - found_sym = 1;
> - fill_in_ada_prototype (sym);
> - add_defn_to_vec (fixup_symbol_section (sym, objfile),
> - block);
> - break;
> - }
> - }
> - }
> - }
> + for (; i < BLOCK_BUCKETS (block); i += 1)
> + for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
> + {
> + if (SYMBOL_NAMESPACE (sym) == namespace)
> + {
> + int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
> +
> + if (cmp < 0)
> + {
> + if (is_sorted)
> + {
> + i = BLOCK_BUCKETS (block);
> + break;
> + }
> + }
> + else if (cmp == 0
> + && is_name_suffix (SYMBOL_NAME (sym) + name_len))
> + {
> + switch (SYMBOL_CLASS (sym))
> + {
> + case LOC_ARG:
> + case LOC_LOCAL_ARG:
> + case LOC_REF_ARG:
> + case LOC_REGPARM:
> + case LOC_REGPARM_ADDR:
> + case LOC_BASEREG_ARG:
> + arg_sym = sym;
> + break;
> + case LOC_UNRESOLVED:
> + break;
> + default:
> + found_sym = 1;
> + fill_in_ada_prototype (sym);
> + add_defn_to_vec (fixup_symbol_section (sym, objfile),
> + block);
> + break;
> + }
> + }
> + }
> + }
> }
>
> if (! found_sym && arg_sym != NULL)
> @@ -4219,53 +4222,57 @@ ada_add_block_symbols (block, name, name
> else
> i = 0;
>
> - for (; i < BLOCK_NSYMS (block); i += 1)
> - {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> - if (SYMBOL_NAMESPACE (sym) == namespace)
> - {
> - int cmp;
> -
> - cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
> - if (cmp == 0)
> - {
> - cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
> - if (cmp == 0)
> - cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
> - }
> -
> - if (cmp < 0)
> - {
> - if (is_sorted)
> - break;
> - }
> - else if (cmp == 0
> - && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
> - {
> - switch (SYMBOL_CLASS (sym))
> - {
> - case LOC_ARG:
> - case LOC_LOCAL_ARG:
> - case LOC_REF_ARG:
> - case LOC_REGPARM:
> - case LOC_REGPARM_ADDR:
> - case LOC_BASEREG_ARG:
> - arg_sym = sym;
> - break;
> - case LOC_UNRESOLVED:
> - break;
> - default:
> - found_sym = 1;
> - fill_in_ada_prototype (sym);
> - add_defn_to_vec (fixup_symbol_section (sym, objfile),
> - block);
> - break;
> - }
> - }
> - }
> - }
> -
> + for (; i < BLOCK_BUCKETS (block); i += 1)
> + for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
> + {
> + struct symbol *sym = BLOCK_SYM (block, i);
> +
> + if (SYMBOL_NAMESPACE (sym) == namespace)
> + {
> + int cmp;
> +
> + cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
> + if (cmp == 0)
> + {
> + cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
> + if (cmp == 0)
> + cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
> + }
> +
> + if (cmp < 0)
> + {
> + if (is_sorted)
> + {
> + i = BLOCK_BUCKETS (block);
> + break;
> + }
> + }
> + else if (cmp == 0
> + && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
> + {
> + switch (SYMBOL_CLASS (sym))
> + {
> + case LOC_ARG:
> + case LOC_LOCAL_ARG:
> + case LOC_REF_ARG:
> + case LOC_REGPARM:
> + case LOC_REGPARM_ADDR:
> + case LOC_BASEREG_ARG:
> + arg_sym = sym;
> + break;
> + case LOC_UNRESOLVED:
> + break;
> + default:
> + found_sym = 1;
> + fill_in_ada_prototype (sym);
> + add_defn_to_vec (fixup_symbol_section (sym, objfile),
> + block);
> + break;
> + }
> + }
> + }
> + }
> +
> /* NOTE: This really shouldn't be needed for _ada_ symbols.
> They aren't parameters, right? */
> if (! found_sym && arg_sym != NULL)
> @@ -4292,6 +4299,7 @@ fill_in_ada_prototype (func)
> struct type* ftype;
> struct type* rtype;
> size_t max_fields;
> + struct symbol *sym;
>
> if (func == NULL
> || TYPE_CODE (SYMBOL_TYPE (func)) != TYPE_CODE_FUNC
> @@ -4308,16 +4316,13 @@ fill_in_ada_prototype (func)
> SYMBOL_TYPE (func) = ftype;
>
> b = SYMBOL_BLOCK_VALUE (func);
> - nsyms = BLOCK_NSYMS (b);
>
> nargs = 0;
> max_fields = 8;
> TYPE_FIELDS (ftype) =
> (struct field*) xmalloc (sizeof (struct field) * max_fields);
> - for (i = 0; i < nsyms; i += 1)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> - struct symbol *sym = BLOCK_SYM (b, i);
> -
> GROW_VECT (TYPE_FIELDS (ftype), max_fields, nargs+1);
>
> switch (SYMBOL_CLASS (sym))
> @@ -4903,6 +4908,8 @@ debug_print_block (b)
> struct block* b;
> {
> int i;
> + struct symbol *i;
> +
> fprintf (stderr, "Block: %p; [0x%lx, 0x%lx]",
> b, BLOCK_START(b), BLOCK_END(b));
> if (BLOCK_FUNCTION(b) != NULL)
> @@ -4910,11 +4917,11 @@ debug_print_block (b)
> fprintf (stderr, "\n");
> fprintf (stderr, "\t Superblock: %p\n", BLOCK_SUPERBLOCK(b));
> fprintf (stderr, "\t Symbols:");
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> if (i > 0 && i % 4 == 0)
> fprintf (stderr, "\n\t\t ");
> - fprintf (stderr, " %s", SYMBOL_NAME (BLOCK_SYM (b, i)));
> + fprintf (stderr, " %s", SYMBOL_NAME (sym));
> }
> fprintf (stderr, "\n");
> }
> Index: gdb/buildsym.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/buildsym.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 buildsym.c
> --- gdb/buildsym.c 15 May 2002 21:19:18 -0000 1.16
> +++ gdb/buildsym.c 11 Jul 2002 00:14:23 -0000
> @@ -40,6 +40,7 @@
> #include "bcache.h"
> #include "filenames.h" /* For DOSish file names */
> #include "macrotab.h"
> +#include "demangle.h" /* Needed by SYMBOL_INIT_DEMANGLED_NAME. */
> /* Ask buildsym.h to define the vars it normally declares `extern'. */
> #define EXTERN
> /**/
> @@ -243,17 +244,45 @@ finish_block (struct symbol *symbol, str
> /* EMPTY */ ;
> }
>
> - block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
> - (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
> -
> /* Copy the symbols into the block. */
>
> - BLOCK_NSYMS (block) = i;
> - for (next = *listhead; next; next = next->next)
> + if (symbol)
> + {
> + block = (struct block *)
> + obstack_alloc (&objfile->symbol_obstack,
> + (sizeof (struct block) +
> + ((i - 1) * sizeof (struct symbol *))));
> + BLOCK_NSYMS (block) = i;
> + for (next = *listhead; next; next = next->next)
> + for (j = next->nsyms - 1; j >= 0; j--)
> + {
> + BLOCK_SYM (block, --i) = next->symbol[j];
> + }
> + }
> + else
> {
> - for (j = next->nsyms - 1; j >= 0; j--)
> + block = (struct block *)
> + obstack_alloc (&objfile->symbol_obstack,
> + (sizeof (struct block) +
> + ((i / 5) * sizeof (struct symbol *))));
> + for (j = 0; j < ((i / 5) + 1); j++)
> {
> - BLOCK_SYM (block, --i) = next->symbol[j];
> + BLOCK_BUCKET (block, j) = 0;
> + }
> + BLOCK_BUCKETS (block) = (i / 5) + 1;
> + for (next = *listhead; next; next = next->next)
> + {
> + for (j = next->nsyms - 1; j >= 0; j--)
> + {
> + struct symbol *sym;
> + unsigned int hash_index;
> + SYMBOL_INIT_DEMANGLED_NAME (next->symbol[j], &objfile->symbol_obstack);
> + hash_index = msymbol_hash_iw (SYMBOL_SOURCE_NAME (next->symbol[j]));
> + hash_index = hash_index % BLOCK_BUCKETS (block);
> + sym = BLOCK_BUCKET (block, hash_index);
> + BLOCK_BUCKET (block, hash_index) = next->symbol[j];
> + next->symbol[j]->hash_next = sym;
> + }
> }
> }
>
> @@ -271,6 +300,7 @@ finish_block (struct symbol *symbol, str
> struct type *ftype = SYMBOL_TYPE (symbol);
> SYMBOL_BLOCK_VALUE (symbol) = block;
> BLOCK_FUNCTION (block) = symbol;
> + BLOCK_HASHTABLE (block) = 0;
>
> if (TYPE_NFIELDS (ftype) <= 0)
> {
> @@ -352,6 +382,7 @@ finish_block (struct symbol *symbol, str
> else
> {
> BLOCK_FUNCTION (block) = NULL;
> + BLOCK_HASHTABLE (block) = 1;
> }
>
> /* Now "free" the links of the list, and empty the list. */
> Index: gdb/coffread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/coffread.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 coffread.c
> --- gdb/coffread.c 19 Mar 2002 19:00:03 -0000 1.26
> +++ gdb/coffread.c 11 Jul 2002 00:14:23 -0000
> @@ -1409,13 +1409,12 @@ patch_opaque_types (struct symtab *s)
>
> /* Go through the per-file symbols only */
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
> - for (i = BLOCK_NSYMS (b) - 1; i >= 0; i--)
> + ALL_BLOCK_SYMBOLS (b, i, real_sym)
> {
> /* Find completed typedefs to use to fix opaque ones.
> Remove syms from the chain when their types are stored,
> but search the whole chain, as there may be several syms
> from different files with the same name. */
> - real_sym = BLOCK_SYM (b, i);
> if (SYMBOL_CLASS (real_sym) == LOC_TYPEDEF &&
> SYMBOL_NAMESPACE (real_sym) == VAR_NAMESPACE &&
> TYPE_CODE (SYMBOL_TYPE (real_sym)) == TYPE_CODE_PTR &&
> Index: gdb/dstread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/dstread.c,v
> retrieving revision 1.8
> diff -u -p -r1.8 dstread.c
> --- gdb/dstread.c 13 May 2002 14:00:35 -0000 1.8
> +++ gdb/dstread.c 11 Jul 2002 00:14:24 -0000
> @@ -1396,6 +1396,7 @@ process_dst_block (struct objfile *objfi
> symnum++;
> }
> BLOCK_NSYMS (block) = total_symbols;
> + BLOCK_HASHTABLE (block) = 0;
> BLOCK_START (block) = address;
> BLOCK_END (block) = address + size;
> BLOCK_SUPERBLOCK (block) = 0;
> @@ -1480,6 +1481,7 @@ read_dst_symtab (struct objfile *objfile
> (total_globals - 1) *
> sizeof (struct symbol *));
> BLOCK_NSYMS (global_block) = total_globals;
> + BLOCK_HASHTABLE (global_block) = 0;
> for (symnum = 0; symnum < total_globals; symnum++)
> {
> nextsym = dst_global_symbols->next;
> Index: gdb/jv-lang.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/jv-lang.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 jv-lang.c
> --- gdb/jv-lang.c 2 Dec 2001 22:43:59 -0000 1.11
> +++ gdb/jv-lang.c 11 Jul 2002 00:14:24 -0000
> @@ -106,6 +106,7 @@ get_java_class_symtab (void)
> bl = (struct block *)
> obstack_alloc (&objfile->symbol_obstack, sizeof (struct block));
> BLOCK_NSYMS (bl) = 0;
> + BLOCK_HASHTABLE (bl) = 0;
> BLOCK_START (bl) = 0;
> BLOCK_END (bl) = 0;
> BLOCK_FUNCTION (bl) = NULL;
> Index: gdb/mdebugread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/mdebugread.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 mdebugread.c
> --- gdb/mdebugread.c 13 May 2002 14:00:36 -0000 1.26
> +++ gdb/mdebugread.c 11 Jul 2002 00:14:25 -0000
> @@ -52,6 +52,7 @@
> #include "stabsread.h"
> #include "complaints.h"
> #include "demangle.h"
> +#include "gdb_assert.h"
>
> /* These are needed if the tm.h file does not contain the necessary
> mips specific definitions. */
> @@ -4726,6 +4727,11 @@ shrink_block (struct block *b, struct sy
> (sizeof (struct block)
> + ((BLOCK_NSYMS (b) - 1)
> * sizeof (struct symbol *))));
> +
> + /* FIXME: Not worth hashing this block as it's built. */
> + /* All callers should have created the block with new_block (), which
> + would mean it was not previously hashed. Make sure. */
> + gdb_assert (BLOCK_HASHTABLE (new) == 0);
>
> /* Should chase pointers to old one. Fortunately, that`s just
> the block`s function and inferior blocks */
> Index: gdb/minsyms.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/minsyms.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 minsyms.c
> --- gdb/minsyms.c 24 Apr 2002 08:00:54 -0000 1.21
> +++ gdb/minsyms.c 11 Jul 2002 00:14:25 -0000
> @@ -91,7 +91,7 @@ msymbol_hash_iw (const char *string)
> ++string;
> }
> }
> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
> + return hash;
> }
>
> /* Compute a hash code for a string. */
> @@ -102,7 +102,7 @@ msymbol_hash (const char *string)
> unsigned int hash = 0;
> for (; *string; ++string)
> hash = hash * 67 + *string - 113;
> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
> + return hash;
> }
>
> /* Add the minimal symbol SYM to an objfile's minsym hash table, TABLE. */
> @@ -112,7 +112,7 @@ add_minsym_to_hash_table (struct minimal
> {
> if (sym->hash_next == NULL)
> {
> - unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
> + unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
> sym->hash_next = table[hash];
> table[hash] = sym;
> }
> @@ -126,7 +126,7 @@ add_minsym_to_demangled_hash_table (stru
> {
> if (sym->demangled_hash_next == NULL)
> {
> - unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym));
> + unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
> sym->demangled_hash_next = table[hash];
> table[hash] = sym;
> }
> @@ -154,8 +154,8 @@ lookup_minimal_symbol (register const ch
> struct minimal_symbol *found_file_symbol = NULL;
> struct minimal_symbol *trampoline_symbol = NULL;
>
> - unsigned int hash = msymbol_hash (name);
> - unsigned int dem_hash = msymbol_hash_iw (name);
> + unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
> + unsigned int dem_hash = msymbol_hash_iw (name) % MINIMAL_SYMBOL_HASH_SIZE;
>
> #ifdef SOFUN_ADDRESS_MAYBE_MISSING
> if (sfile != NULL)
> Index: gdb/printcmd.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/printcmd.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 printcmd.c
> --- gdb/printcmd.c 11 May 2002 23:48:23 -0000 1.39
> +++ gdb/printcmd.c 11 Jul 2002 00:14:26 -0000
> @@ -40,6 +40,7 @@
> #include "objfiles.h" /* ditto */
> #include "completer.h" /* for completion functions */
> #include "ui-out.h"
> +#include "gdb_assert.h"
>
> extern int asm_demangle; /* Whether to demangle syms in asm printouts */
> extern int addressprint; /* Whether to print hex addresses in HLL " */
> @@ -1785,6 +1786,10 @@ print_frame_args (struct symbol *func, s
> if (func)
> {
> b = SYMBOL_BLOCK_VALUE (func);
> + /* Function blocks are order sensitive, and thus should not be
> + hashed. */
> + gdb_assert (BLOCK_HASHTABLE (b) == 0);
> +
> ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> QUIT;
> Index: gdb/symmisc.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symmisc.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 symmisc.c
> --- gdb/symmisc.c 15 May 2002 21:19:21 -0000 1.9
> +++ gdb/symmisc.c 11 Jul 2002 00:14:26 -0000
> @@ -86,11 +86,17 @@ static void
> free_symtab_block (struct objfile *objfile, struct block *b)
> {
> register int i, n;
> - n = BLOCK_NSYMS (b);
> + struct symbol *sym, *next_sym;
> +
> + n = BLOCK_BUCKETS (b);
> for (i = 0; i < n; i++)
> {
> - xmfree (objfile->md, SYMBOL_NAME (BLOCK_SYM (b, i)));
> - xmfree (objfile->md, (PTR) BLOCK_SYM (b, i));
> + for (sym = BLOCK_BUCKET (b, i); sym; sym = next_sym)
> + {
> + next_sym = sym->hash_next;
> + xmfree (objfile->md, SYMBOL_NAME (sym));
> + xmfree (objfile->md, (PTR) sym);
> + }
> }
> xmfree (objfile->md, (PTR) b);
> }
> @@ -457,8 +463,14 @@ dump_symtab (struct objfile *objfile, st
> fprintf_filtered (outfile, " under ");
> gdb_print_host_address (BLOCK_SUPERBLOCK (b), outfile);
> }
> - blen = BLOCK_NSYMS (b);
> - fprintf_filtered (outfile, ", %d syms in ", blen);
> + /* drow/2002-07-10: We could save the total symbols count
> + even if we're using a hashtable, but nothing else but this message
> + wants it. */
> + blen = BLOCK_BUCKETS (b);
> + if (BLOCK_HASHTABLE (b))
> + fprintf_filtered (outfile, ", %d buckets in ", blen);
> + else
> + fprintf_filtered (outfile, ", %d syms in ", blen);
> print_address_numeric (BLOCK_START (b), 1, outfile);
> fprintf_filtered (outfile, "..");
> print_address_numeric (BLOCK_END (b), 1, outfile);
> @@ -474,8 +486,8 @@ dump_symtab (struct objfile *objfile, st
> if (BLOCK_GCC_COMPILED (b))
> fprintf_filtered (outfile, ", compiled with gcc%d", BLOCK_GCC_COMPILED (b));
> fprintf_filtered (outfile, "\n");
> - /* Now print each symbol in this block. */
> - /* FIXMED: Sort? */
> + /* Now print each symbol in this block (in no particular order, if
> + we're using a hashtable). */
> ALL_BLOCK_SYMBOLS (b, j, sym)
> {
> struct print_symbol_args s;
> Index: gdb/symtab.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 symtab.c
> --- gdb/symtab.c 4 Jul 2002 15:22:42 -0000 1.64
> +++ gdb/symtab.c 11 Jul 2002 00:14:27 -0000
> @@ -1328,6 +1328,22 @@ lookup_block_symbol (register const stru
> register struct symbol *sym_found = NULL;
> register int do_linear_search = 1;
>
> + if (BLOCK_HASHTABLE (block))
> + {
> + unsigned int hash_index;
> + hash_index = msymbol_hash_iw (name);
> + hash_index = hash_index % BLOCK_BUCKETS (block);
> + for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> + {
> + if (SYMBOL_NAMESPACE (sym) == namespace
> + && (mangled_name
> + ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
> + : SYMBOL_MATCHES_NAME (sym, name)))
> + return sym;
> + }
> + return NULL;
> + }
> +
> /* If the blocks's symbols were sorted, start with a binary search. */
>
> if (BLOCK_SHOULD_SORT (block))
> @@ -1582,14 +1598,15 @@ find_pc_sect_symtab (CORE_ADDR pc, asect
> if (section != 0)
> {
> int i;
> + struct symbol *sym = NULL;
>
> - for (i = 0; i < b->nsyms; i++)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> - fixup_symbol_section (b->sym[i], objfile);
> - if (section == SYMBOL_BFD_SECTION (b->sym[i]))
> + fixup_symbol_section (sym, objfile);
> + if (section == SYMBOL_BFD_SECTION (sym))
> break;
> }
> - if (i >= b->nsyms)
> + if ((i >= BLOCK_BUCKETS (b)) && (sym == NULL))
> continue; /* no symbol in this symtab matches section */
> }
> distance = BLOCK_END (b) - BLOCK_START (b);
> @@ -1661,10 +1678,8 @@ find_addr_symbol (CORE_ADDR addr, struct
> {
> QUIT;
> block = BLOCKVECTOR_BLOCK (BLOCKVECTOR (symtab), blocknum);
> - top = BLOCK_NSYMS (block);
> - for (bot = 0; bot < top; bot++)
> + ALL_BLOCK_SYMBOLS (block, bot, sym)
> {
> - sym = BLOCK_SYM (block, bot);
> switch (SYMBOL_CLASS (sym))
> {
> case LOC_STATIC:
> @@ -2795,10 +2810,9 @@ search_symbols (char *regexp, namespace_
> struct symbol_search *prevtail = tail;
> int nfound = 0;
> b = BLOCKVECTOR_BLOCK (bv, i);
> - for (j = 0; j < BLOCK_NSYMS (b); j++)
> + ALL_BLOCK_SYMBOLS (b, j, sym)
> {
> QUIT;
> - sym = BLOCK_SYM (b, j);
> if (file_matches (s->filename, files, nfiles)
> && ((regexp == NULL || SYMBOL_MATCHES_REGEXP (sym))
> && ((kind == VARIABLES_NAMESPACE && SYMBOL_CLASS (sym) != LOC_TYPEDEF
> Index: gdb/symtab.h
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.h,v
> retrieving revision 1.33
> diff -u -p -r1.33 symtab.h
> --- gdb/symtab.h 28 Jun 2002 22:09:11 -0000 1.33
> +++ gdb/symtab.h 11 Jul 2002 00:14:27 -0000
> @@ -386,6 +386,10 @@ struct block
>
> unsigned char gcc_compile_flag;
>
> + /* If this is really a hashtable of the symbols, this flag is 1. */
> +
> + unsigned char hashtable;
> +
> /* Number of local symbols. */
>
> int nsyms;
> @@ -398,30 +402,35 @@ struct block
>
> #define BLOCK_START(bl) (bl)->startaddr
> #define BLOCK_END(bl) (bl)->endaddr
> -#define BLOCK_NSYMS(bl) (bl)->nsyms
> -#define BLOCK_SYM(bl, n) (bl)->sym[n]
> #define BLOCK_FUNCTION(bl) (bl)->function
> #define BLOCK_SUPERBLOCK(bl) (bl)->superblock
> #define BLOCK_GCC_COMPILED(bl) (bl)->gcc_compile_flag
> +#define BLOCK_HASHTABLE(bl) (bl)->hashtable
>
> -/* Macro to loop through all symbols in a block BL.
> - i counts which symbol we are looking at, and sym points to the current
> - symbol.
> - The contortion at the end is to avoid reading past the last valid
> - BLOCK_SYM. */
> -#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
> - for ((i) = 0, (sym) = BLOCK_SYM ((bl), (i)); \
> - (i) < BLOCK_NSYMS ((bl)); \
> - ++(i), (sym) = ((i) < BLOCK_NSYMS ((bl))) \
> - ? BLOCK_SYM ((bl), (i)) \
> - : NULL)
> +/* For blocks without a hashtable (BLOCK_HASHTABLE (bl) == 0) only. */
> +#define BLOCK_NSYMS(bl) (bl)->nsyms
> +#define BLOCK_SYM(bl, n) (bl)->sym[n]
> +
> +/* For blocks with a hashtable, but these are valid for non-hashed blocks as
> + well - each symbol will appear to be one bucket by itself. */
> +#define BLOCK_BUCKETS(bl) (bl)->nsyms
> +#define BLOCK_BUCKET(bl, n) (bl)->sym[n]
> +
> +/* Macro to loop through all symbols in a block BL, in no particular order.
> + i counts which bucket we are in, and sym points to the current symbol. */
> +
> +#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
> + for ((i) = 0; (i) < BLOCK_BUCKETS ((bl)); (i)++) \
> + for ((sym) = BLOCK_BUCKET ((bl), (i)); (sym); \
> + (sym) = (sym)->hash_next)
>
> /* Nonzero if symbols of block BL should be sorted alphabetically.
> Don't sort a block which corresponds to a function. If we did the
> sorting would have to preserve the order of the symbols for the
> - arguments. */
> + arguments. Also don't sort any block that we chose to hash. */
>
> -#define BLOCK_SHOULD_SORT(bl) ((bl)->nsyms >= 40 && BLOCK_FUNCTION (bl) == NULL)
> +#define BLOCK_SHOULD_SORT(bl) (! BLOCK_HASHTABLE (bl) \
> + && BLOCK_FUNCTION (bl) == NULL)
> \f
>
> /* Represent one symbol name; a variable, constant, function or typedef. */
> @@ -671,6 +680,8 @@ struct symbol
> /* List of ranges where this symbol is active. This is only
> used by alias symbols at the current time. */
> struct range_list *ranges;
> +
> + struct symbol *hash_next;
> };
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [rfa] Symbol hashing (for the last time?)
2002-07-11 10:43 ` Jim Blandy
@ 2002-07-11 12:25 ` Daniel Jacobowitz
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2002-07-11 12:25 UTC (permalink / raw)
To: Jim Blandy; +Cc: gdb-patches
On Thu, Jul 11, 2002 at 12:37:02PM -0500, Jim Blandy wrote:
>
> The definition of `struct block' in symtab.h needs a nice, detailed
> comment explaining exactly how the hash table works --- which hash
> function is used, chaining, allocation invariants, mangled
> vs. unmangled names, etc.
Is this about what you had in mind?
/* The symbols for this block are either in a simple linear list or
in a simple hashtable. Blocks which correspond to a function
(which have a list of symbols corresponding to arguments) use
a linear list, as do some older symbol readers. Other blocks
are hashed.
The hashtable uses the same hash function as the minsym hashtables,
found in minsyms.c:minsym_hash_iw. Symbols are hashed based on
their demangled name if appropriate, and on their name otherwise.
The hash function ignores space, and stops at the beginning of the
argument list if any.
The table is laid out in NSYMS/5 buckets and symbols are chained via
their hash_next field. */
Comment added, and I fixed a potential bug (it was hashing
SYMBOL_SOURCE_NAME, which is sensitive to the user setting of
'set print demangled') and a memleak (the comment on
symbol_init_demangled_name suggests that it sets language to unknown if
demangling fails, which is untrue, and repeatedly calling it will also
leak memory). Resubmitting as attached with just those changes. No
test changes.
--
Daniel Jacobowitz Carnegie Mellon University
MontaVista Software Debian GNU/Linux Developer
2002-07-11 Daniel Jacobowitz <drow@mvista.com>
Based on patch from Daniel Berlin <dberlin@dberlin.org>.
* buildsym.c: Include "demangle.h" for SYMBOL_INIT_DEMANGLED_NAME.
(finish_block) For non-function blocks, hash the symbol table. For
function blocks, mark the symbol table as unhashed.
* minsyms.c (msymbol_hash): Return hash value without taking modulus.
(msymbol_hash_iw): Likewise.
(add_minsym_to_hash_table): Take modulus of msymbol_hash's return
value.
(add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw.
(lookup_minimal_symbol): Likewise for both.
* symtab.h (struct block): Add `hashtable' flag. Comment the
hashtable.
(BLOCK_HASHTABLE, BLOCK_BUCKETS, BLOCK_BUCKET): New macro.
(ALL_BLOCK_SYMBOLS): Update.
(BLOCK_SHOULD_SORT): Do not sort hashed blocks.
(struct symbol): Add `hash_next' pointer.
* symtab.c (lookup_block_symbol): Search using the hash table when
possible.
(find_pc_sect_symtab): Use ALL_BLOCK_SYMBOLS.
(search_symbols, find_addr_symbol): Likewise.
* dstread.c (process_dst_block): Clear hashtable bit for new block.
(read_dst_symtab): Likewise.
* jv-lang.c (get_java_class_symtab): Likewise.
* mdebugread.c: Include "gdb_assert.h".
(shrink_block): Assert that the block being modified is not hashed.
* coffread.c (patch_opaque_types): Use ALL_BLOCK_SYMBOLS.
* symmisc.c (free_symtab_block): Walk the hash table when freeing
symbols.
(dump_symtab): Recognize hashed blocks.
* printcmd.c (print_frame_args): Assert that function blocks do not
have hashed symbol tables.
* ada-lang.c (symtab_for_sym): Use ALL_BLOCK_SYMBOLS.
(fill_in_ada_prototype, debug_print_block): Likewise.
(ada_add_block_symbols): Use ALL_BLOCK_SYMBOLS. Handle hash tables.
Index: ada-lang.c
===================================================================
RCS file: /cvs/src/src/gdb/ada-lang.c,v
retrieving revision 1.2
diff -u -p -r1.2 ada-lang.c
--- ada-lang.c 19 Jun 2002 16:55:28 -0000 1.2
+++ ada-lang.c 11 Jul 2002 19:17:17 -0000
@@ -3560,6 +3560,7 @@ symtab_for_sym (sym)
struct symtab* s;
struct objfile *objfile;
struct block *b;
+ struct symbol *tmp_sym;
int i, j;
ALL_SYMTABS (objfile, s)
@@ -3574,12 +3575,12 @@ symtab_for_sym (sym)
case LOC_BLOCK:
case LOC_CONST_BYTES:
b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
- for (i = 0; i < BLOCK_NSYMS (b); i += 1)
- if (sym == BLOCK_SYM (b, i))
+ ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
+ if (sym == tmp_sym)
return s;
b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
- for (i = 0; i < BLOCK_NSYMS (b); i += 1)
- if (sym == BLOCK_SYM (b, i))
+ ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
+ if (sym == tmp_sym)
return s;
break;
default:
@@ -3601,8 +3602,8 @@ symtab_for_sym (sym)
j < BLOCKVECTOR_NBLOCKS (BLOCKVECTOR (s)); j += 1)
{
b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), j);
- for (i = 0; i < BLOCK_NSYMS (b); i += 1)
- if (sym == BLOCK_SYM (b, i))
+ ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
+ if (sym == tmp_sym)
return s;
}
break;
@@ -4094,14 +4095,14 @@ ada_add_block_symbols (block, name, name
/* Set true when we find a matching non-argument symbol */
int found_sym;
int is_sorted = BLOCK_SHOULD_SORT (block);
+ struct symbol *sym;
arg_sym = NULL; found_sym = 0;
if (wild)
{
- for (i = 0; i < BLOCK_NSYMS (block); i += 1)
+ struct symbol *sym;
+ ALL_BLOCK_SYMBOLS (block, i, sym)
{
- struct symbol *sym = BLOCK_SYM (block, i);
-
if (SYMBOL_NAMESPACE (sym) == namespace &&
wild_match (name, name_len, SYMBOL_NAME (sym)))
{
@@ -4149,44 +4150,46 @@ ada_add_block_symbols (block, name, name
else
i = 0;
- for (; i < BLOCK_NSYMS (block); i += 1)
- {
- struct symbol *sym = BLOCK_SYM (block, i);
-
- if (SYMBOL_NAMESPACE (sym) == namespace)
- {
- int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
-
- if (cmp < 0)
- {
- if (is_sorted)
- break;
- }
- else if (cmp == 0
- && is_name_suffix (SYMBOL_NAME (sym) + name_len))
- {
- switch (SYMBOL_CLASS (sym))
- {
- case LOC_ARG:
- case LOC_LOCAL_ARG:
- case LOC_REF_ARG:
- case LOC_REGPARM:
- case LOC_REGPARM_ADDR:
- case LOC_BASEREG_ARG:
- arg_sym = sym;
- break;
- case LOC_UNRESOLVED:
- break;
- default:
- found_sym = 1;
- fill_in_ada_prototype (sym);
- add_defn_to_vec (fixup_symbol_section (sym, objfile),
- block);
- break;
- }
- }
- }
- }
+ for (; i < BLOCK_BUCKETS (block); i += 1)
+ for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
+ {
+ if (SYMBOL_NAMESPACE (sym) == namespace)
+ {
+ int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
+
+ if (cmp < 0)
+ {
+ if (is_sorted)
+ {
+ i = BLOCK_BUCKETS (block);
+ break;
+ }
+ }
+ else if (cmp == 0
+ && is_name_suffix (SYMBOL_NAME (sym) + name_len))
+ {
+ switch (SYMBOL_CLASS (sym))
+ {
+ case LOC_ARG:
+ case LOC_LOCAL_ARG:
+ case LOC_REF_ARG:
+ case LOC_REGPARM:
+ case LOC_REGPARM_ADDR:
+ case LOC_BASEREG_ARG:
+ arg_sym = sym;
+ break;
+ case LOC_UNRESOLVED:
+ break;
+ default:
+ found_sym = 1;
+ fill_in_ada_prototype (sym);
+ add_defn_to_vec (fixup_symbol_section (sym, objfile),
+ block);
+ break;
+ }
+ }
+ }
+ }
}
if (! found_sym && arg_sym != NULL)
@@ -4219,53 +4222,57 @@ ada_add_block_symbols (block, name, name
else
i = 0;
- for (; i < BLOCK_NSYMS (block); i += 1)
- {
- struct symbol *sym = BLOCK_SYM (block, i);
-
- if (SYMBOL_NAMESPACE (sym) == namespace)
- {
- int cmp;
-
- cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
- if (cmp == 0)
- {
- cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
- if (cmp == 0)
- cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
- }
-
- if (cmp < 0)
- {
- if (is_sorted)
- break;
- }
- else if (cmp == 0
- && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
- {
- switch (SYMBOL_CLASS (sym))
- {
- case LOC_ARG:
- case LOC_LOCAL_ARG:
- case LOC_REF_ARG:
- case LOC_REGPARM:
- case LOC_REGPARM_ADDR:
- case LOC_BASEREG_ARG:
- arg_sym = sym;
- break;
- case LOC_UNRESOLVED:
- break;
- default:
- found_sym = 1;
- fill_in_ada_prototype (sym);
- add_defn_to_vec (fixup_symbol_section (sym, objfile),
- block);
- break;
- }
- }
- }
- }
-
+ for (; i < BLOCK_BUCKETS (block); i += 1)
+ for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
+ {
+ struct symbol *sym = BLOCK_SYM (block, i);
+
+ if (SYMBOL_NAMESPACE (sym) == namespace)
+ {
+ int cmp;
+
+ cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
+ if (cmp == 0)
+ {
+ cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
+ if (cmp == 0)
+ cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
+ }
+
+ if (cmp < 0)
+ {
+ if (is_sorted)
+ {
+ i = BLOCK_BUCKETS (block);
+ break;
+ }
+ }
+ else if (cmp == 0
+ && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
+ {
+ switch (SYMBOL_CLASS (sym))
+ {
+ case LOC_ARG:
+ case LOC_LOCAL_ARG:
+ case LOC_REF_ARG:
+ case LOC_REGPARM:
+ case LOC_REGPARM_ADDR:
+ case LOC_BASEREG_ARG:
+ arg_sym = sym;
+ break;
+ case LOC_UNRESOLVED:
+ break;
+ default:
+ found_sym = 1;
+ fill_in_ada_prototype (sym);
+ add_defn_to_vec (fixup_symbol_section (sym, objfile),
+ block);
+ break;
+ }
+ }
+ }
+ }
+
/* NOTE: This really shouldn't be needed for _ada_ symbols.
They aren't parameters, right? */
if (! found_sym && arg_sym != NULL)
@@ -4292,6 +4299,7 @@ fill_in_ada_prototype (func)
struct type* ftype;
struct type* rtype;
size_t max_fields;
+ struct symbol *sym;
if (func == NULL
|| TYPE_CODE (SYMBOL_TYPE (func)) != TYPE_CODE_FUNC
@@ -4308,16 +4316,13 @@ fill_in_ada_prototype (func)
SYMBOL_TYPE (func) = ftype;
b = SYMBOL_BLOCK_VALUE (func);
- nsyms = BLOCK_NSYMS (b);
nargs = 0;
max_fields = 8;
TYPE_FIELDS (ftype) =
(struct field*) xmalloc (sizeof (struct field) * max_fields);
- for (i = 0; i < nsyms; i += 1)
+ ALL_BLOCK_SYMBOLS (b, i, sym)
{
- struct symbol *sym = BLOCK_SYM (b, i);
-
GROW_VECT (TYPE_FIELDS (ftype), max_fields, nargs+1);
switch (SYMBOL_CLASS (sym))
@@ -4903,6 +4908,8 @@ debug_print_block (b)
struct block* b;
{
int i;
+ struct symbol *i;
+
fprintf (stderr, "Block: %p; [0x%lx, 0x%lx]",
b, BLOCK_START(b), BLOCK_END(b));
if (BLOCK_FUNCTION(b) != NULL)
@@ -4910,11 +4917,11 @@ debug_print_block (b)
fprintf (stderr, "\n");
fprintf (stderr, "\t Superblock: %p\n", BLOCK_SUPERBLOCK(b));
fprintf (stderr, "\t Symbols:");
- for (i = 0; i < BLOCK_NSYMS (b); i += 1)
+ ALL_BLOCK_SYMBOLS (b, i, sym)
{
if (i > 0 && i % 4 == 0)
fprintf (stderr, "\n\t\t ");
- fprintf (stderr, " %s", SYMBOL_NAME (BLOCK_SYM (b, i)));
+ fprintf (stderr, " %s", SYMBOL_NAME (sym));
}
fprintf (stderr, "\n");
}
Index: buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.16
diff -u -p -r1.16 buildsym.c
--- buildsym.c 15 May 2002 21:19:18 -0000 1.16
+++ buildsym.c 11 Jul 2002 19:17:18 -0000
@@ -40,6 +40,7 @@
#include "bcache.h"
#include "filenames.h" /* For DOSish file names */
#include "macrotab.h"
+#include "demangle.h" /* Needed by SYMBOL_INIT_DEMANGLED_NAME. */
/* Ask buildsym.h to define the vars it normally declares `extern'. */
#define EXTERN
/**/
@@ -243,17 +244,47 @@ finish_block (struct symbol *symbol, str
/* EMPTY */ ;
}
- block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
- (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
-
/* Copy the symbols into the block. */
- BLOCK_NSYMS (block) = i;
- for (next = *listhead; next; next = next->next)
+ if (symbol)
+ {
+ block = (struct block *)
+ obstack_alloc (&objfile->symbol_obstack,
+ (sizeof (struct block) +
+ ((i - 1) * sizeof (struct symbol *))));
+ BLOCK_NSYMS (block) = i;
+ for (next = *listhead; next; next = next->next)
+ for (j = next->nsyms - 1; j >= 0; j--)
+ {
+ BLOCK_SYM (block, --i) = next->symbol[j];
+ }
+ }
+ else
{
- for (j = next->nsyms - 1; j >= 0; j--)
+ block = (struct block *)
+ obstack_alloc (&objfile->symbol_obstack,
+ (sizeof (struct block) +
+ ((i / 5) * sizeof (struct symbol *))));
+ for (j = 0; j < ((i / 5) + 1); j++)
{
- BLOCK_SYM (block, --i) = next->symbol[j];
+ BLOCK_BUCKET (block, j) = 0;
+ }
+ BLOCK_BUCKETS (block) = (i / 5) + 1;
+ for (next = *listhead; next; next = next->next)
+ {
+ for (j = next->nsyms - 1; j >= 0; j--)
+ {
+ struct symbol *sym;
+ unsigned int hash_index;
+ const char *name = SYMBOL_DEMANGLED_NAME (next->symbol[j]);
+ if (name == NULL)
+ name = SYMBOL_NAME (next->symbol[j]);
+ hash_index = msymbol_hash_iw (name);
+ hash_index = hash_index % BLOCK_BUCKETS (block);
+ sym = BLOCK_BUCKET (block, hash_index);
+ BLOCK_BUCKET (block, hash_index) = next->symbol[j];
+ next->symbol[j]->hash_next = sym;
+ }
}
}
@@ -271,6 +302,7 @@ finish_block (struct symbol *symbol, str
struct type *ftype = SYMBOL_TYPE (symbol);
SYMBOL_BLOCK_VALUE (symbol) = block;
BLOCK_FUNCTION (block) = symbol;
+ BLOCK_HASHTABLE (block) = 0;
if (TYPE_NFIELDS (ftype) <= 0)
{
@@ -352,6 +384,7 @@ finish_block (struct symbol *symbol, str
else
{
BLOCK_FUNCTION (block) = NULL;
+ BLOCK_HASHTABLE (block) = 1;
}
/* Now "free" the links of the list, and empty the list. */
Index: coffread.c
===================================================================
RCS file: /cvs/src/src/gdb/coffread.c,v
retrieving revision 1.26
diff -u -p -r1.26 coffread.c
--- coffread.c 19 Mar 2002 19:00:03 -0000 1.26
+++ coffread.c 11 Jul 2002 19:17:18 -0000
@@ -1409,13 +1409,12 @@ patch_opaque_types (struct symtab *s)
/* Go through the per-file symbols only */
b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
- for (i = BLOCK_NSYMS (b) - 1; i >= 0; i--)
+ ALL_BLOCK_SYMBOLS (b, i, real_sym)
{
/* Find completed typedefs to use to fix opaque ones.
Remove syms from the chain when their types are stored,
but search the whole chain, as there may be several syms
from different files with the same name. */
- real_sym = BLOCK_SYM (b, i);
if (SYMBOL_CLASS (real_sym) == LOC_TYPEDEF &&
SYMBOL_NAMESPACE (real_sym) == VAR_NAMESPACE &&
TYPE_CODE (SYMBOL_TYPE (real_sym)) == TYPE_CODE_PTR &&
Index: dstread.c
===================================================================
RCS file: /cvs/src/src/gdb/dstread.c,v
retrieving revision 1.8
diff -u -p -r1.8 dstread.c
--- dstread.c 13 May 2002 14:00:35 -0000 1.8
+++ dstread.c 11 Jul 2002 19:17:18 -0000
@@ -1396,6 +1396,7 @@ process_dst_block (struct objfile *objfi
symnum++;
}
BLOCK_NSYMS (block) = total_symbols;
+ BLOCK_HASHTABLE (block) = 0;
BLOCK_START (block) = address;
BLOCK_END (block) = address + size;
BLOCK_SUPERBLOCK (block) = 0;
@@ -1480,6 +1481,7 @@ read_dst_symtab (struct objfile *objfile
(total_globals - 1) *
sizeof (struct symbol *));
BLOCK_NSYMS (global_block) = total_globals;
+ BLOCK_HASHTABLE (global_block) = 0;
for (symnum = 0; symnum < total_globals; symnum++)
{
nextsym = dst_global_symbols->next;
Index: jv-lang.c
===================================================================
RCS file: /cvs/src/src/gdb/jv-lang.c,v
retrieving revision 1.11
diff -u -p -r1.11 jv-lang.c
--- jv-lang.c 2 Dec 2001 22:43:59 -0000 1.11
+++ jv-lang.c 11 Jul 2002 19:17:19 -0000
@@ -106,6 +106,7 @@ get_java_class_symtab (void)
bl = (struct block *)
obstack_alloc (&objfile->symbol_obstack, sizeof (struct block));
BLOCK_NSYMS (bl) = 0;
+ BLOCK_HASHTABLE (bl) = 0;
BLOCK_START (bl) = 0;
BLOCK_END (bl) = 0;
BLOCK_FUNCTION (bl) = NULL;
Index: mdebugread.c
===================================================================
RCS file: /cvs/src/src/gdb/mdebugread.c,v
retrieving revision 1.26
diff -u -p -r1.26 mdebugread.c
--- mdebugread.c 13 May 2002 14:00:36 -0000 1.26
+++ mdebugread.c 11 Jul 2002 19:17:20 -0000
@@ -52,6 +52,7 @@
#include "stabsread.h"
#include "complaints.h"
#include "demangle.h"
+#include "gdb_assert.h"
/* These are needed if the tm.h file does not contain the necessary
mips specific definitions. */
@@ -4726,6 +4727,11 @@ shrink_block (struct block *b, struct sy
(sizeof (struct block)
+ ((BLOCK_NSYMS (b) - 1)
* sizeof (struct symbol *))));
+
+ /* FIXME: Not worth hashing this block as it's built. */
+ /* All callers should have created the block with new_block (), which
+ would mean it was not previously hashed. Make sure. */
+ gdb_assert (BLOCK_HASHTABLE (new) == 0);
/* Should chase pointers to old one. Fortunately, that`s just
the block`s function and inferior blocks */
Index: minsyms.c
===================================================================
RCS file: /cvs/src/src/gdb/minsyms.c,v
retrieving revision 1.21
diff -u -p -r1.21 minsyms.c
--- minsyms.c 24 Apr 2002 08:00:54 -0000 1.21
+++ minsyms.c 11 Jul 2002 19:17:20 -0000
@@ -91,7 +91,7 @@ msymbol_hash_iw (const char *string)
++string;
}
}
- return hash % MINIMAL_SYMBOL_HASH_SIZE;
+ return hash;
}
/* Compute a hash code for a string. */
@@ -102,7 +102,7 @@ msymbol_hash (const char *string)
unsigned int hash = 0;
for (; *string; ++string)
hash = hash * 67 + *string - 113;
- return hash % MINIMAL_SYMBOL_HASH_SIZE;
+ return hash;
}
/* Add the minimal symbol SYM to an objfile's minsym hash table, TABLE. */
@@ -112,7 +112,7 @@ add_minsym_to_hash_table (struct minimal
{
if (sym->hash_next == NULL)
{
- unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
+ unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
sym->hash_next = table[hash];
table[hash] = sym;
}
@@ -126,7 +126,7 @@ add_minsym_to_demangled_hash_table (stru
{
if (sym->demangled_hash_next == NULL)
{
- unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym));
+ unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
sym->demangled_hash_next = table[hash];
table[hash] = sym;
}
@@ -154,8 +154,8 @@ lookup_minimal_symbol (register const ch
struct minimal_symbol *found_file_symbol = NULL;
struct minimal_symbol *trampoline_symbol = NULL;
- unsigned int hash = msymbol_hash (name);
- unsigned int dem_hash = msymbol_hash_iw (name);
+ unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
+ unsigned int dem_hash = msymbol_hash_iw (name) % MINIMAL_SYMBOL_HASH_SIZE;
#ifdef SOFUN_ADDRESS_MAYBE_MISSING
if (sfile != NULL)
Index: printcmd.c
===================================================================
RCS file: /cvs/src/src/gdb/printcmd.c,v
retrieving revision 1.39
diff -u -p -r1.39 printcmd.c
--- printcmd.c 11 May 2002 23:48:23 -0000 1.39
+++ printcmd.c 11 Jul 2002 19:17:20 -0000
@@ -40,6 +40,7 @@
#include "objfiles.h" /* ditto */
#include "completer.h" /* for completion functions */
#include "ui-out.h"
+#include "gdb_assert.h"
extern int asm_demangle; /* Whether to demangle syms in asm printouts */
extern int addressprint; /* Whether to print hex addresses in HLL " */
@@ -1785,6 +1786,10 @@ print_frame_args (struct symbol *func, s
if (func)
{
b = SYMBOL_BLOCK_VALUE (func);
+ /* Function blocks are order sensitive, and thus should not be
+ hashed. */
+ gdb_assert (BLOCK_HASHTABLE (b) == 0);
+
ALL_BLOCK_SYMBOLS (b, i, sym)
{
QUIT;
Index: symmisc.c
===================================================================
RCS file: /cvs/src/src/gdb/symmisc.c,v
retrieving revision 1.9
diff -u -p -r1.9 symmisc.c
--- symmisc.c 15 May 2002 21:19:21 -0000 1.9
+++ symmisc.c 11 Jul 2002 19:17:21 -0000
@@ -86,11 +86,17 @@ static void
free_symtab_block (struct objfile *objfile, struct block *b)
{
register int i, n;
- n = BLOCK_NSYMS (b);
+ struct symbol *sym, *next_sym;
+
+ n = BLOCK_BUCKETS (b);
for (i = 0; i < n; i++)
{
- xmfree (objfile->md, SYMBOL_NAME (BLOCK_SYM (b, i)));
- xmfree (objfile->md, (PTR) BLOCK_SYM (b, i));
+ for (sym = BLOCK_BUCKET (b, i); sym; sym = next_sym)
+ {
+ next_sym = sym->hash_next;
+ xmfree (objfile->md, SYMBOL_NAME (sym));
+ xmfree (objfile->md, (PTR) sym);
+ }
}
xmfree (objfile->md, (PTR) b);
}
@@ -457,8 +463,14 @@ dump_symtab (struct objfile *objfile, st
fprintf_filtered (outfile, " under ");
gdb_print_host_address (BLOCK_SUPERBLOCK (b), outfile);
}
- blen = BLOCK_NSYMS (b);
- fprintf_filtered (outfile, ", %d syms in ", blen);
+ /* drow/2002-07-10: We could save the total symbols count
+ even if we're using a hashtable, but nothing else but this message
+ wants it. */
+ blen = BLOCK_BUCKETS (b);
+ if (BLOCK_HASHTABLE (b))
+ fprintf_filtered (outfile, ", %d buckets in ", blen);
+ else
+ fprintf_filtered (outfile, ", %d syms in ", blen);
print_address_numeric (BLOCK_START (b), 1, outfile);
fprintf_filtered (outfile, "..");
print_address_numeric (BLOCK_END (b), 1, outfile);
@@ -474,8 +486,8 @@ dump_symtab (struct objfile *objfile, st
if (BLOCK_GCC_COMPILED (b))
fprintf_filtered (outfile, ", compiled with gcc%d", BLOCK_GCC_COMPILED (b));
fprintf_filtered (outfile, "\n");
- /* Now print each symbol in this block. */
- /* FIXMED: Sort? */
+ /* Now print each symbol in this block (in no particular order, if
+ we're using a hashtable). */
ALL_BLOCK_SYMBOLS (b, j, sym)
{
struct print_symbol_args s;
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.64
diff -u -p -r1.64 symtab.c
--- symtab.c 4 Jul 2002 15:22:42 -0000 1.64
+++ symtab.c 11 Jul 2002 19:17:21 -0000
@@ -1328,6 +1328,22 @@ lookup_block_symbol (register const stru
register struct symbol *sym_found = NULL;
register int do_linear_search = 1;
+ if (BLOCK_HASHTABLE (block))
+ {
+ unsigned int hash_index;
+ hash_index = msymbol_hash_iw (name);
+ hash_index = hash_index % BLOCK_BUCKETS (block);
+ for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
+ {
+ if (SYMBOL_NAMESPACE (sym) == namespace
+ && (mangled_name
+ ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
+ : SYMBOL_MATCHES_NAME (sym, name)))
+ return sym;
+ }
+ return NULL;
+ }
+
/* If the blocks's symbols were sorted, start with a binary search. */
if (BLOCK_SHOULD_SORT (block))
@@ -1582,14 +1598,15 @@ find_pc_sect_symtab (CORE_ADDR pc, asect
if (section != 0)
{
int i;
+ struct symbol *sym = NULL;
- for (i = 0; i < b->nsyms; i++)
+ ALL_BLOCK_SYMBOLS (b, i, sym)
{
- fixup_symbol_section (b->sym[i], objfile);
- if (section == SYMBOL_BFD_SECTION (b->sym[i]))
+ fixup_symbol_section (sym, objfile);
+ if (section == SYMBOL_BFD_SECTION (sym))
break;
}
- if (i >= b->nsyms)
+ if ((i >= BLOCK_BUCKETS (b)) && (sym == NULL))
continue; /* no symbol in this symtab matches section */
}
distance = BLOCK_END (b) - BLOCK_START (b);
@@ -1661,10 +1678,8 @@ find_addr_symbol (CORE_ADDR addr, struct
{
QUIT;
block = BLOCKVECTOR_BLOCK (BLOCKVECTOR (symtab), blocknum);
- top = BLOCK_NSYMS (block);
- for (bot = 0; bot < top; bot++)
+ ALL_BLOCK_SYMBOLS (block, bot, sym)
{
- sym = BLOCK_SYM (block, bot);
switch (SYMBOL_CLASS (sym))
{
case LOC_STATIC:
@@ -2795,10 +2810,9 @@ search_symbols (char *regexp, namespace_
struct symbol_search *prevtail = tail;
int nfound = 0;
b = BLOCKVECTOR_BLOCK (bv, i);
- for (j = 0; j < BLOCK_NSYMS (b); j++)
+ ALL_BLOCK_SYMBOLS (b, j, sym)
{
QUIT;
- sym = BLOCK_SYM (b, j);
if (file_matches (s->filename, files, nfiles)
&& ((regexp == NULL || SYMBOL_MATCHES_REGEXP (sym))
&& ((kind == VARIABLES_NAMESPACE && SYMBOL_CLASS (sym) != LOC_TYPEDEF
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.34
diff -u -p -r1.34 symtab.h
--- symtab.h 11 Jul 2002 13:50:49 -0000 1.34
+++ symtab.h 11 Jul 2002 19:17:22 -0000
@@ -386,6 +386,25 @@ struct block
unsigned char gcc_compile_flag;
+ /* The symbols for this block are either in a simple linear list or
+ in a simple hashtable. Blocks which correspond to a function
+ (which have a list of symbols corresponding to arguments) use
+ a linear list, as do some older symbol readers. Other blocks
+ are hashed.
+
+ The hashtable uses the same hash function as the minsym hashtables,
+ found in minsyms.c:minsym_hash_iw. Symbols are hashed based on
+ their demangled name if appropriate, and on their name otherwise.
+ The hash function ignores space, and stops at the beginning of the
+ argument list if any.
+
+ The table is laid out in NSYMS/5 buckets and symbols are chained via
+ their hash_next field. */
+
+ /* If this is really a hashtable of the symbols, this flag is 1. */
+
+ unsigned char hashtable;
+
/* Number of local symbols. */
int nsyms;
@@ -398,30 +417,35 @@ struct block
#define BLOCK_START(bl) (bl)->startaddr
#define BLOCK_END(bl) (bl)->endaddr
-#define BLOCK_NSYMS(bl) (bl)->nsyms
-#define BLOCK_SYM(bl, n) (bl)->sym[n]
#define BLOCK_FUNCTION(bl) (bl)->function
#define BLOCK_SUPERBLOCK(bl) (bl)->superblock
#define BLOCK_GCC_COMPILED(bl) (bl)->gcc_compile_flag
+#define BLOCK_HASHTABLE(bl) (bl)->hashtable
-/* Macro to loop through all symbols in a block BL.
- i counts which symbol we are looking at, and sym points to the current
- symbol.
- The contortion at the end is to avoid reading past the last valid
- BLOCK_SYM. */
-#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
- for ((i) = 0, (sym) = BLOCK_SYM ((bl), (i)); \
- (i) < BLOCK_NSYMS ((bl)); \
- ++(i), (sym) = ((i) < BLOCK_NSYMS ((bl))) \
- ? BLOCK_SYM ((bl), (i)) \
- : NULL)
+/* For blocks without a hashtable (BLOCK_HASHTABLE (bl) == 0) only. */
+#define BLOCK_NSYMS(bl) (bl)->nsyms
+#define BLOCK_SYM(bl, n) (bl)->sym[n]
+
+/* For blocks with a hashtable, but these are valid for non-hashed blocks as
+ well - each symbol will appear to be one bucket by itself. */
+#define BLOCK_BUCKETS(bl) (bl)->nsyms
+#define BLOCK_BUCKET(bl, n) (bl)->sym[n]
+
+/* Macro to loop through all symbols in a block BL, in no particular order.
+ i counts which bucket we are in, and sym points to the current symbol. */
+
+#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
+ for ((i) = 0; (i) < BLOCK_BUCKETS ((bl)); (i)++) \
+ for ((sym) = BLOCK_BUCKET ((bl), (i)); (sym); \
+ (sym) = (sym)->hash_next)
/* Nonzero if symbols of block BL should be sorted alphabetically.
Don't sort a block which corresponds to a function. If we did the
sorting would have to preserve the order of the symbols for the
- arguments. */
+ arguments. Also don't sort any block that we chose to hash. */
-#define BLOCK_SHOULD_SORT(bl) ((bl)->nsyms >= 40 && BLOCK_FUNCTION (bl) == NULL)
+#define BLOCK_SHOULD_SORT(bl) (! BLOCK_HASHTABLE (bl) \
+ && BLOCK_FUNCTION (bl) == NULL)
\f
/* Represent one symbol name; a variable, constant, function or typedef. */
@@ -671,6 +695,8 @@ struct symbol
/* List of ranges where this symbol is active. This is only
used by alias symbols at the current time. */
struct range_list *ranges;
+
+ struct symbol *hash_next;
};
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [rfa] Symbol hashing (for the last time?)
2002-07-10 18:55 ` Daniel Jacobowitz
2002-07-10 19:08 ` Elena Zannoni
2002-07-11 10:43 ` Jim Blandy
@ 2002-07-11 13:12 ` Elena Zannoni
2002-07-11 13:51 ` Daniel Jacobowitz
2 siblings, 1 reply; 7+ messages in thread
From: Elena Zannoni @ 2002-07-11 13:12 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: gdb-patches
Daniel Jacobowitz writes:
> Here's a patch from last October, dusted off and merged to the current
> sources. The only substantial changes were some fixes for ada-lang.c,
> merged after I wrote the original patch. I've verified no regressions
> on i386-linux for GCC (2.95,3.0.4,3.1)/(stabs,dwarf2).
>
> This converts the normal symbol table lookups into hash tables. A few
> sorts of symbol tables aren't hashed: those produced by mdebugread.c
> and dstread.c, because they build symbol tables in lots of ad-hoc code,
> and symbol tables which are actually the arguments to a function
> (because order matters, or at least comments suggest so).
Would you mind adding the paragraph above, to a comment in the block
structure, maybe above the hashtable flag?
> A next step
> will be to convert mdebugread.c, delete dstread.c (it's marked for an
> upcoming obsoletion, isn't it?), and then delete all the complicated
> binary search code since the only remaining unhashed symtabs will be
> argument lists, which are small.
>
how about os9kread? Anything needs to be done there?
> This should help performance a bit on large programs. Ok to commit?
> Anyone see any problems with it?
>
> --
> Daniel Jacobowitz Carnegie Mellon University
> MontaVista Software Debian GNU/Linux Developer
>
> 2002-07-10 Daniel Jacobowitz <drow@mvista.com>
>
> Based on patch from Daniel Berlin <dberlin@dberlin.org>.
> * buildsym.c: Include "demangle.h" for SYMBOL_INIT_DEMANGLED_NAME.
> (finish_block) For non-function blocks, hash the symbol table. For
> function blocks, mark the symbol table as unhashed.
> * minsyms.c (msymbol_hash): Return hash value without taking modulus.
> (msymbol_hash_iw): Likewise.
> (add_minsym_to_hash_table): Take modulus of msymbol_hash's return
> value.
> (add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw.
> (lookup_minimal_symbol): Likewise for both.
> * symtab.h (struct block): Add `hashtable' flag.
> (BLOCK_HASHTABLE): New macro.
> (BLOCK_BUCKETS): New macro.
> (BLOCK_BUCKET): New macro.
> (ALL_BLOCK_SYMBOLS): New macro.
This is not really true. The macro was there already.
> (BLOCK_SHOULD_SORT): Do not sort hashed blocks.
> (struct symbol): Add `hash_next' pointer.
> * symtab.c (lookup_block_symbol): Search using the hash table when
> possible.
> (find_pc_sect_symtab): Use ALL_BLOCK_SYMBOLS.
> (search_symbols, find_addr_symbol): Likewise.
>
> * dstread.c (process_dst_block): Clear hashtable bit for new block.
> (read_dst_symtab): Likewise.
> * jv-lang.c (get_java_class_symtab): Likewise.
> * mdebugread.c: Include "gdb_assert.h".
> (shrink_block): Assert that the block being modified is not hashed.
> * coffread.c (patch_opaque_types): Use ALL_BLOCK_SYMBOLS.
> * symmisc.c (free_symtab_block): Walk the hash table when freeing
> symbols.
> (dump_symtab): Recognize hashed blocks.
> * printcmd.c (print_frame_args): Assert that function blocks do not
> have hashed symbol tables.
> * ada-lang.c (symtab_for_sym): Use ALL_BLOCK_SYMBOLS.
> (fill_in_ada_prototype, debug_print_block): Likewise.
> (ada_add_block_symbols): Use ALL_BLOCK_SYMBOLS. Handle hash tables.
>
I now wonder if it would be better to define another 'for' macro for this:
for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
it seems to reoccur a few times. Even though it's not that important.
I don't like too much the use of the hardcoded '5' maybe define a variable?
Otherwise it's ok.
(modulus the problem you found).
Elena
> Index: gdb/ada-lang.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/ada-lang.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 ada-lang.c
> --- gdb/ada-lang.c 19 Jun 2002 16:55:28 -0000 1.2
> +++ gdb/ada-lang.c 11 Jul 2002 00:14:23 -0000
> @@ -3560,6 +3560,7 @@ symtab_for_sym (sym)
> struct symtab* s;
> struct objfile *objfile;
> struct block *b;
> + struct symbol *tmp_sym;
> int i, j;
>
> ALL_SYMTABS (objfile, s)
> @@ -3574,12 +3575,12 @@ symtab_for_sym (sym)
> case LOC_BLOCK:
> case LOC_CONST_BYTES:
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> break;
> default:
> @@ -3601,8 +3602,8 @@ symtab_for_sym (sym)
> j < BLOCKVECTOR_NBLOCKS (BLOCKVECTOR (s)); j += 1)
> {
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), j);
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> - if (sym == BLOCK_SYM (b, i))
> + ALL_BLOCK_SYMBOLS (b, i, tmp_sym)
> + if (sym == tmp_sym)
> return s;
> }
> break;
> @@ -4094,14 +4095,14 @@ ada_add_block_symbols (block, name, name
> /* Set true when we find a matching non-argument symbol */
> int found_sym;
> int is_sorted = BLOCK_SHOULD_SORT (block);
> + struct symbol *sym;
>
> arg_sym = NULL; found_sym = 0;
> if (wild)
> {
> - for (i = 0; i < BLOCK_NSYMS (block); i += 1)
> + struct symbol *sym;
> + ALL_BLOCK_SYMBOLS (block, i, sym)
> {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> if (SYMBOL_NAMESPACE (sym) == namespace &&
> wild_match (name, name_len, SYMBOL_NAME (sym)))
> {
> @@ -4149,44 +4150,46 @@ ada_add_block_symbols (block, name, name
> else
> i = 0;
>
> - for (; i < BLOCK_NSYMS (block); i += 1)
> - {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> - if (SYMBOL_NAMESPACE (sym) == namespace)
> - {
> - int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
> -
> - if (cmp < 0)
> - {
> - if (is_sorted)
> - break;
> - }
> - else if (cmp == 0
> - && is_name_suffix (SYMBOL_NAME (sym) + name_len))
> - {
> - switch (SYMBOL_CLASS (sym))
> - {
> - case LOC_ARG:
> - case LOC_LOCAL_ARG:
> - case LOC_REF_ARG:
> - case LOC_REGPARM:
> - case LOC_REGPARM_ADDR:
> - case LOC_BASEREG_ARG:
> - arg_sym = sym;
> - break;
> - case LOC_UNRESOLVED:
> - break;
> - default:
> - found_sym = 1;
> - fill_in_ada_prototype (sym);
> - add_defn_to_vec (fixup_symbol_section (sym, objfile),
> - block);
> - break;
> - }
> - }
> - }
> - }
> + for (; i < BLOCK_BUCKETS (block); i += 1)
> + for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
> + {
> + if (SYMBOL_NAMESPACE (sym) == namespace)
> + {
> + int cmp = strncmp (name, SYMBOL_NAME (sym), name_len);
> +
> + if (cmp < 0)
> + {
> + if (is_sorted)
> + {
> + i = BLOCK_BUCKETS (block);
> + break;
> + }
> + }
> + else if (cmp == 0
> + && is_name_suffix (SYMBOL_NAME (sym) + name_len))
> + {
> + switch (SYMBOL_CLASS (sym))
> + {
> + case LOC_ARG:
> + case LOC_LOCAL_ARG:
> + case LOC_REF_ARG:
> + case LOC_REGPARM:
> + case LOC_REGPARM_ADDR:
> + case LOC_BASEREG_ARG:
> + arg_sym = sym;
> + break;
> + case LOC_UNRESOLVED:
> + break;
> + default:
> + found_sym = 1;
> + fill_in_ada_prototype (sym);
> + add_defn_to_vec (fixup_symbol_section (sym, objfile),
> + block);
> + break;
> + }
> + }
> + }
> + }
> }
>
> if (! found_sym && arg_sym != NULL)
> @@ -4219,53 +4222,57 @@ ada_add_block_symbols (block, name, name
> else
> i = 0;
>
> - for (; i < BLOCK_NSYMS (block); i += 1)
> - {
> - struct symbol *sym = BLOCK_SYM (block, i);
> -
> - if (SYMBOL_NAMESPACE (sym) == namespace)
> - {
> - int cmp;
> -
> - cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
> - if (cmp == 0)
> - {
> - cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
> - if (cmp == 0)
> - cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
> - }
> -
> - if (cmp < 0)
> - {
> - if (is_sorted)
> - break;
> - }
> - else if (cmp == 0
> - && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
> - {
> - switch (SYMBOL_CLASS (sym))
> - {
> - case LOC_ARG:
> - case LOC_LOCAL_ARG:
> - case LOC_REF_ARG:
> - case LOC_REGPARM:
> - case LOC_REGPARM_ADDR:
> - case LOC_BASEREG_ARG:
> - arg_sym = sym;
> - break;
> - case LOC_UNRESOLVED:
> - break;
> - default:
> - found_sym = 1;
> - fill_in_ada_prototype (sym);
> - add_defn_to_vec (fixup_symbol_section (sym, objfile),
> - block);
> - break;
> - }
> - }
> - }
> - }
> -
> + for (; i < BLOCK_BUCKETS (block); i += 1)
> + for (sym = BLOCK_BUCKET (block, i); sym != NULL; sym = sym->hash_next)
> + {
> + struct symbol *sym = BLOCK_SYM (block, i);
> +
> + if (SYMBOL_NAMESPACE (sym) == namespace)
> + {
> + int cmp;
> +
> + cmp = (int) '_' - (int) SYMBOL_NAME (sym)[0];
> + if (cmp == 0)
> + {
> + cmp = strncmp ("_ada_", SYMBOL_NAME (sym), 5);
> + if (cmp == 0)
> + cmp = strncmp (name, SYMBOL_NAME (sym) + 5, name_len);
> + }
> +
> + if (cmp < 0)
> + {
> + if (is_sorted)
> + {
> + i = BLOCK_BUCKETS (block);
> + break;
> + }
> + }
> + else if (cmp == 0
> + && is_name_suffix (SYMBOL_NAME (sym) + name_len + 5))
> + {
> + switch (SYMBOL_CLASS (sym))
> + {
> + case LOC_ARG:
> + case LOC_LOCAL_ARG:
> + case LOC_REF_ARG:
> + case LOC_REGPARM:
> + case LOC_REGPARM_ADDR:
> + case LOC_BASEREG_ARG:
> + arg_sym = sym;
> + break;
> + case LOC_UNRESOLVED:
> + break;
> + default:
> + found_sym = 1;
> + fill_in_ada_prototype (sym);
> + add_defn_to_vec (fixup_symbol_section (sym, objfile),
> + block);
> + break;
> + }
> + }
> + }
> + }
> +
> /* NOTE: This really shouldn't be needed for _ada_ symbols.
> They aren't parameters, right? */
> if (! found_sym && arg_sym != NULL)
> @@ -4292,6 +4299,7 @@ fill_in_ada_prototype (func)
> struct type* ftype;
> struct type* rtype;
> size_t max_fields;
> + struct symbol *sym;
>
> if (func == NULL
> || TYPE_CODE (SYMBOL_TYPE (func)) != TYPE_CODE_FUNC
> @@ -4308,16 +4316,13 @@ fill_in_ada_prototype (func)
> SYMBOL_TYPE (func) = ftype;
>
> b = SYMBOL_BLOCK_VALUE (func);
> - nsyms = BLOCK_NSYMS (b);
>
> nargs = 0;
> max_fields = 8;
> TYPE_FIELDS (ftype) =
> (struct field*) xmalloc (sizeof (struct field) * max_fields);
> - for (i = 0; i < nsyms; i += 1)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> - struct symbol *sym = BLOCK_SYM (b, i);
> -
> GROW_VECT (TYPE_FIELDS (ftype), max_fields, nargs+1);
>
> switch (SYMBOL_CLASS (sym))
> @@ -4903,6 +4908,8 @@ debug_print_block (b)
> struct block* b;
> {
> int i;
> + struct symbol *i;
> +
> fprintf (stderr, "Block: %p; [0x%lx, 0x%lx]",
> b, BLOCK_START(b), BLOCK_END(b));
> if (BLOCK_FUNCTION(b) != NULL)
> @@ -4910,11 +4917,11 @@ debug_print_block (b)
> fprintf (stderr, "\n");
> fprintf (stderr, "\t Superblock: %p\n", BLOCK_SUPERBLOCK(b));
> fprintf (stderr, "\t Symbols:");
> - for (i = 0; i < BLOCK_NSYMS (b); i += 1)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> if (i > 0 && i % 4 == 0)
> fprintf (stderr, "\n\t\t ");
> - fprintf (stderr, " %s", SYMBOL_NAME (BLOCK_SYM (b, i)));
> + fprintf (stderr, " %s", SYMBOL_NAME (sym));
> }
> fprintf (stderr, "\n");
> }
> Index: gdb/buildsym.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/buildsym.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 buildsym.c
> --- gdb/buildsym.c 15 May 2002 21:19:18 -0000 1.16
> +++ gdb/buildsym.c 11 Jul 2002 00:14:23 -0000
> @@ -40,6 +40,7 @@
> #include "bcache.h"
> #include "filenames.h" /* For DOSish file names */
> #include "macrotab.h"
> +#include "demangle.h" /* Needed by SYMBOL_INIT_DEMANGLED_NAME. */
> /* Ask buildsym.h to define the vars it normally declares `extern'. */
> #define EXTERN
> /**/
> @@ -243,17 +244,45 @@ finish_block (struct symbol *symbol, str
> /* EMPTY */ ;
> }
>
> - block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
> - (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
> -
> /* Copy the symbols into the block. */
>
> - BLOCK_NSYMS (block) = i;
> - for (next = *listhead; next; next = next->next)
> + if (symbol)
> + {
> + block = (struct block *)
> + obstack_alloc (&objfile->symbol_obstack,
> + (sizeof (struct block) +
> + ((i - 1) * sizeof (struct symbol *))));
> + BLOCK_NSYMS (block) = i;
> + for (next = *listhead; next; next = next->next)
> + for (j = next->nsyms - 1; j >= 0; j--)
> + {
> + BLOCK_SYM (block, --i) = next->symbol[j];
> + }
> + }
> + else
> {
> - for (j = next->nsyms - 1; j >= 0; j--)
> + block = (struct block *)
> + obstack_alloc (&objfile->symbol_obstack,
> + (sizeof (struct block) +
> + ((i / 5) * sizeof (struct symbol *))));
> + for (j = 0; j < ((i / 5) + 1); j++)
> {
> - BLOCK_SYM (block, --i) = next->symbol[j];
> + BLOCK_BUCKET (block, j) = 0;
> + }
> + BLOCK_BUCKETS (block) = (i / 5) + 1;
> + for (next = *listhead; next; next = next->next)
> + {
> + for (j = next->nsyms - 1; j >= 0; j--)
> + {
> + struct symbol *sym;
> + unsigned int hash_index;
> + SYMBOL_INIT_DEMANGLED_NAME (next->symbol[j], &objfile->symbol_obstack);
> + hash_index = msymbol_hash_iw (SYMBOL_SOURCE_NAME (next->symbol[j]));
> + hash_index = hash_index % BLOCK_BUCKETS (block);
> + sym = BLOCK_BUCKET (block, hash_index);
> + BLOCK_BUCKET (block, hash_index) = next->symbol[j];
> + next->symbol[j]->hash_next = sym;
> + }
> }
> }
>
> @@ -271,6 +300,7 @@ finish_block (struct symbol *symbol, str
> struct type *ftype = SYMBOL_TYPE (symbol);
> SYMBOL_BLOCK_VALUE (symbol) = block;
> BLOCK_FUNCTION (block) = symbol;
> + BLOCK_HASHTABLE (block) = 0;
>
> if (TYPE_NFIELDS (ftype) <= 0)
> {
> @@ -352,6 +382,7 @@ finish_block (struct symbol *symbol, str
> else
> {
> BLOCK_FUNCTION (block) = NULL;
> + BLOCK_HASHTABLE (block) = 1;
> }
>
> /* Now "free" the links of the list, and empty the list. */
> Index: gdb/coffread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/coffread.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 coffread.c
> --- gdb/coffread.c 19 Mar 2002 19:00:03 -0000 1.26
> +++ gdb/coffread.c 11 Jul 2002 00:14:23 -0000
> @@ -1409,13 +1409,12 @@ patch_opaque_types (struct symtab *s)
>
> /* Go through the per-file symbols only */
> b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
> - for (i = BLOCK_NSYMS (b) - 1; i >= 0; i--)
> + ALL_BLOCK_SYMBOLS (b, i, real_sym)
> {
> /* Find completed typedefs to use to fix opaque ones.
> Remove syms from the chain when their types are stored,
> but search the whole chain, as there may be several syms
> from different files with the same name. */
> - real_sym = BLOCK_SYM (b, i);
> if (SYMBOL_CLASS (real_sym) == LOC_TYPEDEF &&
> SYMBOL_NAMESPACE (real_sym) == VAR_NAMESPACE &&
> TYPE_CODE (SYMBOL_TYPE (real_sym)) == TYPE_CODE_PTR &&
> Index: gdb/dstread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/dstread.c,v
> retrieving revision 1.8
> diff -u -p -r1.8 dstread.c
> --- gdb/dstread.c 13 May 2002 14:00:35 -0000 1.8
> +++ gdb/dstread.c 11 Jul 2002 00:14:24 -0000
> @@ -1396,6 +1396,7 @@ process_dst_block (struct objfile *objfi
> symnum++;
> }
> BLOCK_NSYMS (block) = total_symbols;
> + BLOCK_HASHTABLE (block) = 0;
> BLOCK_START (block) = address;
> BLOCK_END (block) = address + size;
> BLOCK_SUPERBLOCK (block) = 0;
> @@ -1480,6 +1481,7 @@ read_dst_symtab (struct objfile *objfile
> (total_globals - 1) *
> sizeof (struct symbol *));
> BLOCK_NSYMS (global_block) = total_globals;
> + BLOCK_HASHTABLE (global_block) = 0;
> for (symnum = 0; symnum < total_globals; symnum++)
> {
> nextsym = dst_global_symbols->next;
> Index: gdb/jv-lang.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/jv-lang.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 jv-lang.c
> --- gdb/jv-lang.c 2 Dec 2001 22:43:59 -0000 1.11
> +++ gdb/jv-lang.c 11 Jul 2002 00:14:24 -0000
> @@ -106,6 +106,7 @@ get_java_class_symtab (void)
> bl = (struct block *)
> obstack_alloc (&objfile->symbol_obstack, sizeof (struct block));
> BLOCK_NSYMS (bl) = 0;
> + BLOCK_HASHTABLE (bl) = 0;
> BLOCK_START (bl) = 0;
> BLOCK_END (bl) = 0;
> BLOCK_FUNCTION (bl) = NULL;
> Index: gdb/mdebugread.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/mdebugread.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 mdebugread.c
> --- gdb/mdebugread.c 13 May 2002 14:00:36 -0000 1.26
> +++ gdb/mdebugread.c 11 Jul 2002 00:14:25 -0000
> @@ -52,6 +52,7 @@
> #include "stabsread.h"
> #include "complaints.h"
> #include "demangle.h"
> +#include "gdb_assert.h"
>
> /* These are needed if the tm.h file does not contain the necessary
> mips specific definitions. */
> @@ -4726,6 +4727,11 @@ shrink_block (struct block *b, struct sy
> (sizeof (struct block)
> + ((BLOCK_NSYMS (b) - 1)
> * sizeof (struct symbol *))));
> +
> + /* FIXME: Not worth hashing this block as it's built. */
> + /* All callers should have created the block with new_block (), which
> + would mean it was not previously hashed. Make sure. */
> + gdb_assert (BLOCK_HASHTABLE (new) == 0);
>
> /* Should chase pointers to old one. Fortunately, that`s just
> the block`s function and inferior blocks */
> Index: gdb/minsyms.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/minsyms.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 minsyms.c
> --- gdb/minsyms.c 24 Apr 2002 08:00:54 -0000 1.21
> +++ gdb/minsyms.c 11 Jul 2002 00:14:25 -0000
> @@ -91,7 +91,7 @@ msymbol_hash_iw (const char *string)
> ++string;
> }
> }
> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
> + return hash;
> }
>
> /* Compute a hash code for a string. */
> @@ -102,7 +102,7 @@ msymbol_hash (const char *string)
> unsigned int hash = 0;
> for (; *string; ++string)
> hash = hash * 67 + *string - 113;
> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
> + return hash;
> }
>
> /* Add the minimal symbol SYM to an objfile's minsym hash table, TABLE. */
> @@ -112,7 +112,7 @@ add_minsym_to_hash_table (struct minimal
> {
> if (sym->hash_next == NULL)
> {
> - unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
> + unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
> sym->hash_next = table[hash];
> table[hash] = sym;
> }
> @@ -126,7 +126,7 @@ add_minsym_to_demangled_hash_table (stru
> {
> if (sym->demangled_hash_next == NULL)
> {
> - unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym));
> + unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
> sym->demangled_hash_next = table[hash];
> table[hash] = sym;
> }
> @@ -154,8 +154,8 @@ lookup_minimal_symbol (register const ch
> struct minimal_symbol *found_file_symbol = NULL;
> struct minimal_symbol *trampoline_symbol = NULL;
>
> - unsigned int hash = msymbol_hash (name);
> - unsigned int dem_hash = msymbol_hash_iw (name);
> + unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
> + unsigned int dem_hash = msymbol_hash_iw (name) % MINIMAL_SYMBOL_HASH_SIZE;
>
> #ifdef SOFUN_ADDRESS_MAYBE_MISSING
> if (sfile != NULL)
> Index: gdb/printcmd.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/printcmd.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 printcmd.c
> --- gdb/printcmd.c 11 May 2002 23:48:23 -0000 1.39
> +++ gdb/printcmd.c 11 Jul 2002 00:14:26 -0000
> @@ -40,6 +40,7 @@
> #include "objfiles.h" /* ditto */
> #include "completer.h" /* for completion functions */
> #include "ui-out.h"
> +#include "gdb_assert.h"
>
> extern int asm_demangle; /* Whether to demangle syms in asm printouts */
> extern int addressprint; /* Whether to print hex addresses in HLL " */
> @@ -1785,6 +1786,10 @@ print_frame_args (struct symbol *func, s
> if (func)
> {
> b = SYMBOL_BLOCK_VALUE (func);
> + /* Function blocks are order sensitive, and thus should not be
> + hashed. */
> + gdb_assert (BLOCK_HASHTABLE (b) == 0);
> +
> ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> QUIT;
> Index: gdb/symmisc.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symmisc.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 symmisc.c
> --- gdb/symmisc.c 15 May 2002 21:19:21 -0000 1.9
> +++ gdb/symmisc.c 11 Jul 2002 00:14:26 -0000
> @@ -86,11 +86,17 @@ static void
> free_symtab_block (struct objfile *objfile, struct block *b)
> {
> register int i, n;
> - n = BLOCK_NSYMS (b);
> + struct symbol *sym, *next_sym;
> +
> + n = BLOCK_BUCKETS (b);
> for (i = 0; i < n; i++)
> {
> - xmfree (objfile->md, SYMBOL_NAME (BLOCK_SYM (b, i)));
> - xmfree (objfile->md, (PTR) BLOCK_SYM (b, i));
> + for (sym = BLOCK_BUCKET (b, i); sym; sym = next_sym)
> + {
> + next_sym = sym->hash_next;
> + xmfree (objfile->md, SYMBOL_NAME (sym));
> + xmfree (objfile->md, (PTR) sym);
> + }
> }
> xmfree (objfile->md, (PTR) b);
> }
> @@ -457,8 +463,14 @@ dump_symtab (struct objfile *objfile, st
> fprintf_filtered (outfile, " under ");
> gdb_print_host_address (BLOCK_SUPERBLOCK (b), outfile);
> }
> - blen = BLOCK_NSYMS (b);
> - fprintf_filtered (outfile, ", %d syms in ", blen);
> + /* drow/2002-07-10: We could save the total symbols count
> + even if we're using a hashtable, but nothing else but this message
> + wants it. */
> + blen = BLOCK_BUCKETS (b);
> + if (BLOCK_HASHTABLE (b))
> + fprintf_filtered (outfile, ", %d buckets in ", blen);
> + else
> + fprintf_filtered (outfile, ", %d syms in ", blen);
> print_address_numeric (BLOCK_START (b), 1, outfile);
> fprintf_filtered (outfile, "..");
> print_address_numeric (BLOCK_END (b), 1, outfile);
> @@ -474,8 +486,8 @@ dump_symtab (struct objfile *objfile, st
> if (BLOCK_GCC_COMPILED (b))
> fprintf_filtered (outfile, ", compiled with gcc%d", BLOCK_GCC_COMPILED (b));
> fprintf_filtered (outfile, "\n");
> - /* Now print each symbol in this block. */
> - /* FIXMED: Sort? */
> + /* Now print each symbol in this block (in no particular order, if
> + we're using a hashtable). */
> ALL_BLOCK_SYMBOLS (b, j, sym)
> {
> struct print_symbol_args s;
> Index: gdb/symtab.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 symtab.c
> --- gdb/symtab.c 4 Jul 2002 15:22:42 -0000 1.64
> +++ gdb/symtab.c 11 Jul 2002 00:14:27 -0000
> @@ -1328,6 +1328,22 @@ lookup_block_symbol (register const stru
> register struct symbol *sym_found = NULL;
> register int do_linear_search = 1;
>
> + if (BLOCK_HASHTABLE (block))
> + {
> + unsigned int hash_index;
> + hash_index = msymbol_hash_iw (name);
> + hash_index = hash_index % BLOCK_BUCKETS (block);
> + for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> + {
> + if (SYMBOL_NAMESPACE (sym) == namespace
> + && (mangled_name
> + ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
> + : SYMBOL_MATCHES_NAME (sym, name)))
> + return sym;
> + }
> + return NULL;
> + }
> +
> /* If the blocks's symbols were sorted, start with a binary search. */
>
> if (BLOCK_SHOULD_SORT (block))
> @@ -1582,14 +1598,15 @@ find_pc_sect_symtab (CORE_ADDR pc, asect
> if (section != 0)
> {
> int i;
> + struct symbol *sym = NULL;
>
> - for (i = 0; i < b->nsyms; i++)
> + ALL_BLOCK_SYMBOLS (b, i, sym)
> {
> - fixup_symbol_section (b->sym[i], objfile);
> - if (section == SYMBOL_BFD_SECTION (b->sym[i]))
> + fixup_symbol_section (sym, objfile);
> + if (section == SYMBOL_BFD_SECTION (sym))
> break;
> }
> - if (i >= b->nsyms)
> + if ((i >= BLOCK_BUCKETS (b)) && (sym == NULL))
> continue; /* no symbol in this symtab matches section */
> }
> distance = BLOCK_END (b) - BLOCK_START (b);
> @@ -1661,10 +1678,8 @@ find_addr_symbol (CORE_ADDR addr, struct
> {
> QUIT;
> block = BLOCKVECTOR_BLOCK (BLOCKVECTOR (symtab), blocknum);
> - top = BLOCK_NSYMS (block);
> - for (bot = 0; bot < top; bot++)
> + ALL_BLOCK_SYMBOLS (block, bot, sym)
> {
> - sym = BLOCK_SYM (block, bot);
> switch (SYMBOL_CLASS (sym))
> {
> case LOC_STATIC:
> @@ -2795,10 +2810,9 @@ search_symbols (char *regexp, namespace_
> struct symbol_search *prevtail = tail;
> int nfound = 0;
> b = BLOCKVECTOR_BLOCK (bv, i);
> - for (j = 0; j < BLOCK_NSYMS (b); j++)
> + ALL_BLOCK_SYMBOLS (b, j, sym)
> {
> QUIT;
> - sym = BLOCK_SYM (b, j);
> if (file_matches (s->filename, files, nfiles)
> && ((regexp == NULL || SYMBOL_MATCHES_REGEXP (sym))
> && ((kind == VARIABLES_NAMESPACE && SYMBOL_CLASS (sym) != LOC_TYPEDEF
> Index: gdb/symtab.h
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.h,v
> retrieving revision 1.33
> diff -u -p -r1.33 symtab.h
> --- gdb/symtab.h 28 Jun 2002 22:09:11 -0000 1.33
> +++ gdb/symtab.h 11 Jul 2002 00:14:27 -0000
> @@ -386,6 +386,10 @@ struct block
>
> unsigned char gcc_compile_flag;
>
> + /* If this is really a hashtable of the symbols, this flag is 1. */
> +
> + unsigned char hashtable;
> +
> /* Number of local symbols. */
>
> int nsyms;
> @@ -398,30 +402,35 @@ struct block
>
> #define BLOCK_START(bl) (bl)->startaddr
> #define BLOCK_END(bl) (bl)->endaddr
> -#define BLOCK_NSYMS(bl) (bl)->nsyms
> -#define BLOCK_SYM(bl, n) (bl)->sym[n]
> #define BLOCK_FUNCTION(bl) (bl)->function
> #define BLOCK_SUPERBLOCK(bl) (bl)->superblock
> #define BLOCK_GCC_COMPILED(bl) (bl)->gcc_compile_flag
> +#define BLOCK_HASHTABLE(bl) (bl)->hashtable
>
> -/* Macro to loop through all symbols in a block BL.
> - i counts which symbol we are looking at, and sym points to the current
> - symbol.
> - The contortion at the end is to avoid reading past the last valid
> - BLOCK_SYM. */
> -#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
> - for ((i) = 0, (sym) = BLOCK_SYM ((bl), (i)); \
> - (i) < BLOCK_NSYMS ((bl)); \
> - ++(i), (sym) = ((i) < BLOCK_NSYMS ((bl))) \
> - ? BLOCK_SYM ((bl), (i)) \
> - : NULL)
> +/* For blocks without a hashtable (BLOCK_HASHTABLE (bl) == 0) only. */
> +#define BLOCK_NSYMS(bl) (bl)->nsyms
> +#define BLOCK_SYM(bl, n) (bl)->sym[n]
> +
> +/* For blocks with a hashtable, but these are valid for non-hashed blocks as
> + well - each symbol will appear to be one bucket by itself. */
> +#define BLOCK_BUCKETS(bl) (bl)->nsyms
> +#define BLOCK_BUCKET(bl, n) (bl)->sym[n]
> +
> +/* Macro to loop through all symbols in a block BL, in no particular order.
> + i counts which bucket we are in, and sym points to the current symbol. */
> +
> +#define ALL_BLOCK_SYMBOLS(bl, i, sym) \
> + for ((i) = 0; (i) < BLOCK_BUCKETS ((bl)); (i)++) \
> + for ((sym) = BLOCK_BUCKET ((bl), (i)); (sym); \
> + (sym) = (sym)->hash_next)
>
> /* Nonzero if symbols of block BL should be sorted alphabetically.
> Don't sort a block which corresponds to a function. If we did the
> sorting would have to preserve the order of the symbols for the
> - arguments. */
> + arguments. Also don't sort any block that we chose to hash. */
>
> -#define BLOCK_SHOULD_SORT(bl) ((bl)->nsyms >= 40 && BLOCK_FUNCTION (bl) == NULL)
> +#define BLOCK_SHOULD_SORT(bl) (! BLOCK_HASHTABLE (bl) \
> + && BLOCK_FUNCTION (bl) == NULL)
> \f
>
> /* Represent one symbol name; a variable, constant, function or typedef. */
> @@ -671,6 +680,8 @@ struct symbol
> /* List of ranges where this symbol is active. This is only
> used by alias symbols at the current time. */
> struct range_list *ranges;
> +
> + struct symbol *hash_next;
> };
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [rfa] Symbol hashing (for the last time?)
2002-07-11 13:12 ` Elena Zannoni
@ 2002-07-11 13:51 ` Daniel Jacobowitz
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2002-07-11 13:51 UTC (permalink / raw)
To: Elena Zannoni; +Cc: gdb-patches
On Thu, Jul 11, 2002 at 04:00:02PM -0400, Elena Zannoni wrote:
> Daniel Jacobowitz writes:
> > Here's a patch from last October, dusted off and merged to the current
> > sources. The only substantial changes were some fixes for ada-lang.c,
> > merged after I wrote the original patch. I've verified no regressions
> > on i386-linux for GCC (2.95,3.0.4,3.1)/(stabs,dwarf2).
> >
> > This converts the normal symbol table lookups into hash tables. A few
> > sorts of symbol tables aren't hashed: those produced by mdebugread.c
> > and dstread.c, because they build symbol tables in lots of ad-hoc code,
> > and symbol tables which are actually the arguments to a function
> > (because order matters, or at least comments suggest so).
>
> Would you mind adding the paragraph above, to a comment in the block
> structure, maybe above the hashtable flag?
Sure.
> > A next step
> > will be to convert mdebugread.c, delete dstread.c (it's marked for an
> > upcoming obsoletion, isn't it?), and then delete all the complicated
> > binary search code since the only remaining unhashed symtabs will be
> > argument lists, which are small.
> >
>
> how about os9kread? Anything needs to be done there?
Nope (though that one's scheduled to go away too, I think?). Any
reader which uses finish_block is fine.
> > (ALL_BLOCK_SYMBOLS): New macro.
>
> This is not really true. The macro was there already.
Yup, just noticed that. Thanks. I added it in an earlier patch and
never updated my changelog.
> > (BLOCK_SHOULD_SORT): Do not sort hashed blocks.
> > (struct symbol): Add `hash_next' pointer.
> > * symtab.c (lookup_block_symbol): Search using the hash table when
> > possible.
> > (find_pc_sect_symtab): Use ALL_BLOCK_SYMBOLS.
> > (search_symbols, find_addr_symbol): Likewise.
> >
> > * dstread.c (process_dst_block): Clear hashtable bit for new block.
> > (read_dst_symtab): Likewise.
> > * jv-lang.c (get_java_class_symtab): Likewise.
> > * mdebugread.c: Include "gdb_assert.h".
> > (shrink_block): Assert that the block being modified is not hashed.
> > * coffread.c (patch_opaque_types): Use ALL_BLOCK_SYMBOLS.
> > * symmisc.c (free_symtab_block): Walk the hash table when freeing
> > symbols.
> > (dump_symtab): Recognize hashed blocks.
> > * printcmd.c (print_frame_args): Assert that function blocks do not
> > have hashed symbol tables.
> > * ada-lang.c (symtab_for_sym): Use ALL_BLOCK_SYMBOLS.
> > (fill_in_ada_prototype, debug_print_block): Likewise.
> > (ada_add_block_symbols): Use ALL_BLOCK_SYMBOLS. Handle hash tables.
> >
>
> I now wonder if it would be better to define another 'for' macro for this:
> for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> it seems to reoccur a few times. Even though it's not that important.
I think I'll avoid introducing any more of those if I possibly can....
> I don't like too much the use of the hardcoded '5' maybe define a variable?
I thought about this for a little bit and settled on:
/* Macro used to set the size of a hashtable for N symbols. */
#define BLOCK_HASHTABLE_SIZE(n) ((N)/5 + 1)
I don't really see the point in making it user tunable; if we ever find
a case that this one is bad for, I'll gladly change my mind, though.
> Otherwise it's ok.
> (modulus the problem you found).
Thanks! Committed.
--
Daniel Jacobowitz Carnegie Mellon University
MontaVista Software Debian GNU/Linux Developer
^ permalink raw reply [flat|nested] 7+ messages in thread