From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3525 invoked by alias); 6 Jan 2003 19:11:35 -0000 Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sources.redhat.com Received: (qmail 3494 invoked from network); 6 Jan 2003 19:11:26 -0000 Received: from unknown (HELO mx1.redhat.com) (66.187.233.31) by 209.249.29.67 with SMTP; 6 Jan 2003 19:11:26 -0000 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.11.6/8.11.6) with ESMTP id h06IhfB22447 for ; Mon, 6 Jan 2003 13:43:41 -0500 Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [172.16.52.156]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h06JBEa11729 for ; Mon, 6 Jan 2003 14:11:14 -0500 Received: from localhost.redhat.com (romulus-int.sfbay.redhat.com [172.16.27.46]) by pobox.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h06JBCn17985; Mon, 6 Jan 2003 14:11:12 -0500 Received: by localhost.redhat.com (Postfix, from userid 469) id AAC47FF79; Mon, 6 Jan 2003 14:15:34 -0500 (EST) From: Elena Zannoni MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15897.54742.251353.385541@localhost.redhat.com> Date: Mon, 06 Jan 2003 19:11:00 -0000 To: Daniel Jacobowitz Cc: gdb-patches@sources.redhat.com, jimb@redhat.com, ezannoni@redhat.com Subject: Re: [RFA] Kill some linear searches in minsyms.c In-Reply-To: <20030106042203.GA28848@nevyn.them.org> References: <20030106042203.GA28848@nevyn.them.org> X-SW-Source: 2003-01/txt/msg00221.txt.bz2 Daniel Jacobowitz writes: > So uh, yeah, and stuff. I found an interesting one today when I wasn't even > looking for it. Earlier today I posted an updated patch to make it easy, > very easy, to profile GDB. Here's a pretty good sample of why I think this > is important. I was actually looking for something completely different, > but I noticed this on the way. > > Consider this patch: > > 2000-03-06 Jim Blandy > > From Tom Tromey and Keith Seitz : > > * minsyms.c: #include , for msymbol_hash_iw. > (compact_minimal_symbols): Added `objfile' argument. > Put symbols in the objfile's hash table. > (install_minimal_symbols): Put symbols in the objfile's demangled > hash table. > (lookup_minimal_symbol): Use hash table to find symbol in > objfile. > (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): New > functions. > (prim_record_minimal_symbol_and_info): Initialize the > hash link fields of the new minimal symbol. > * symtab.h (struct minimal_symbol): New fields `hash_next', > `demangled_hash_next'. > (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): Declare. > * objfiles.h (MINIMAL_SYMBOL_HASH_SIZE): New define. > (struct objfile): New fields `msymbol_hash', > `msymbol_demangled_hash'. > > These replaced a linear search with a hash table. We'd better use it, too, > since a good-sized application has an awful lot of minimal symbols. > However, two other related functions (that's lookup_minimal_symbol_text and > lookup_minimal_symbol_solib_trampoline) were not converted. These get > called reasonably often, vis: > What command gdb executing at this point? > 0.75 0.43 62/310 create_overlay_event_breakpoint [16] > 3.01 1.72 248/310 create_longjmp_breakpoint [5] > [4] 48.1 3.76 2.16 310 lookup_minimal_symbol_text [4] > > That "3.76" is in _seconds_, by the way. Then they proceed to do: > 1.26 0.00 9567591/9567734 strcmp_iw [15] > 0.90 0.00 27551335/30742029 symbol_demangled_name [17] > > 27.5 million calls to a wrapper function. My first instinct was that a > wrapper function was a bad idea, then. My second instinct was to smack my > first instinct upside the head and find out why we were calling it so many > times, since I knew we hashed minsyms. Converting them to the hash table > once I found the problem was quite easy. The patch is attached; is it OK to > commit? No regressions on i386-pc-linux-gnu, for what that's worth, and > it's quite straightforward. > how do the numbers for those functions look after your patch? > > > Future things to examine: > > Before, GDB's CPU time to load symbols for all of mozilla, get through a > large number of threaded shared library events (ow, but I'm glad the pread64 > patch is in, or this measurement would have been just impossible to sort out > of the noise), and reach the first dialog box was 26.630000 seconds. After > the attached patch, 17.820000 seconds. Much better. The new hot spots are: > > 21.32 1.42 1.42 1594586 0.00 0.00 hash > 10.66 2.13 0.71 185486 0.00 0.00 find_corresponding_bincl_psymtab > 8.11 2.67 0.54 1128766 0.00 0.00 bcache > 6.76 3.12 0.45 41 10.98 91.28 read_dbx_symtab > 6.31 3.54 0.42 1519 0.28 0.30 end_psymtab > > I suspect we can cut find_corresponding_bincl_psymtab out of the way too; > and maybe the bcache needs some rethinking into a more specialized data > structure since we're calling it so often; but maybe there's nothing that > can be done about that (this test has a terrifyingly large number of symbols > in it). There's 3.3 million calls to mmalloc, too - a million of them are > from a call to save_inferior_ptid in thread_db_xfer_memory. And > symbol_demangled_name is still on the top ten, via compare_psymbols. That's > another good candidate for a clever data structure if it becomes worth the > time, which it isn't in this test. > > -- > Daniel Jacobowitz > MontaVista Software Debian GNU/Linux Developer > > 2003-01-05 Daniel Jacobowitz > > * minsyms.c (enum ms_search_which): New. > (lookup_minimal_symbol_internal): New function, broken out from > lookup_minimal_symbol. > (lookup_minimal_symbol, lookup_minimal_symbol_text) > (lookup_minimal_symbol_solib_trampoline): Use them. ^^^^ lookup_minimal_symbol_internal??? I don't like the _internal part, could you call it '_aux' instead? This seems more in line with the rest of gdb's names. Hmm, but maybe we already have a function by that name. I also winder how much work would it be to eventually remove these new functions, that now become wrappers. > > Index: minsyms.c > =================================================================== > RCS file: /cvs/src/src/gdb/minsyms.c,v > retrieving revision 1.22 > diff -u -p -r1.22 minsyms.c > --- minsyms.c 11 Jul 2002 20:46:19 -0000 1.22 > +++ minsyms.c 6 Jan 2003 04:19:16 -0000 > @@ -1,5 +1,6 @@ > /* GDB routines for manipulating the minimal symbol tables. > - Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001 > + Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, > + 2002, 2003 > Free Software Foundation, Inc. > Contributed by Cygnus Support, using pieces from other GDB modules. > > @@ -132,11 +133,18 @@ add_minsym_to_demangled_hash_table (stru > } > } > > +enum ms_search_which { > + ms_search_all, > + ms_search_text, > + ms_search_trampoline > +}; > > /* Look through all the current minimal symbol tables and find the > first minimal symbol that matches NAME. If OBJF is non-NULL, limit > - the search to that objfile. If SFILE is non-NULL, limit the search > - to that source file. Returns a pointer to the minimal symbol that > + the search to that objfile. If SFILE is non-NULL, only find file-scope > + symbols from that source file (global symbols are still preferred). > + WHICH indicates which minimal symbols to accept: all, text symbols only, > + or trampolines only. Returns a pointer to the minimal symbol that > matches, or NULL if no match is found. > > Note: One instance where there may be duplicate minimal symbols with > @@ -144,9 +152,10 @@ add_minsym_to_demangled_hash_table (stru > symbol tables for an executable contain global symbols with the same > names (the dynamic linker deals with the duplication). */ > > -struct minimal_symbol * > -lookup_minimal_symbol (register const char *name, const char *sfile, > - struct objfile *objf) > +static struct minimal_symbol * > +lookup_minimal_symbol_internal (register const char *name, const char *sfile, > + struct objfile *objf, > + enum ms_search_which which) > { > struct objfile *objfile; > struct minimal_symbol *msymbol; > @@ -170,68 +179,77 @@ lookup_minimal_symbol (register const ch > objfile != NULL && found_symbol == NULL; > objfile = objfile->next) > { > - if (objf == NULL || objf == objfile) > + int pass; > + if (objf && objf != objfile) > + continue; > + > + /* Do two passes: the first over the ordinary hash table, > + and the second over the demangled hash table. */ > + for (pass = 1; pass <= 2 && found_symbol == NULL; pass++) > { > - /* Do two passes: the first over the ordinary hash table, > - and the second over the demangled hash table. */ > - int pass; > - > - for (pass = 1; pass <= 2 && found_symbol == NULL; pass++) > + /* Select hash list according to pass. */ > + if (pass == 1) > + msymbol = objfile->msymbol_hash[hash]; > + else > + msymbol = objfile->msymbol_demangled_hash[dem_hash]; > + > + for (; msymbol != NULL && found_symbol == NULL; > + msymbol = (pass == 1) > + ? msymbol->hash_next > + : msymbol->demangled_hash_next) please, don't do this. This is much worse than the previous while, with the updates at the end. > { > - /* Select hash list according to pass. */ > - if (pass == 1) > - msymbol = objfile->msymbol_hash[hash]; > - else > - msymbol = objfile->msymbol_demangled_hash[dem_hash]; > + if (!SYMBOL_MATCHES_NAME (msymbol, name)) > + continue; > > - while (msymbol != NULL && found_symbol == NULL) > + if (which == ms_search_trampoline) > { > - if (SYMBOL_MATCHES_NAME (msymbol, name)) > - { > - switch (MSYMBOL_TYPE (msymbol)) > - { > - case mst_file_text: > - case mst_file_data: > - case mst_file_bss: > + if (MSYMBOL_TYPE (msymbol) == mst_solib_trampoline) > + return msymbol; > + continue; > + } > + > + if (which == ms_search_text > + && MSYMBOL_TYPE (msymbol) != mst_file_text > + && MSYMBOL_TYPE (msymbol) != mst_text) > + continue; > + I don't like the fact that the check for ms_search_all is buried inside the switch statement, while the other 2 are here. Could you have an outer switch for the ms_search_*? Also with your changes will the internal switch case mst_solib_trampoline ever be reached? It seems like we would find it earlier and return right away. The whole flow of control of this function seems a bit confused. > + switch (MSYMBOL_TYPE (msymbol)) > + { > + case mst_file_data: > + case mst_file_bss: > + case mst_file_text: > #ifdef SOFUN_ADDRESS_MAYBE_MISSING > - if (sfile == NULL || STREQ (msymbol->filename, sfile)) > - found_file_symbol = msymbol; > + if (sfile == NULL || STREQ (msymbol->filename, sfile)) > + found_file_symbol = msymbol; > #else > - /* We have neither the ability nor the need to > - deal with the SFILE parameter. If we find > - more than one symbol, just return the latest > - one (the user can't expect useful behavior in > - that case). */ > - found_file_symbol = msymbol; > + /* We have neither the ability nor the need to deal > + with the SFILE parameter. If we find more than > + one symbol, just return the latest one (the user > + can't expect useful behavior in that case). */ > + found_file_symbol = msymbol; > #endif > - break; > + break; > > - case mst_solib_trampoline: > + case mst_solib_trampoline: > > - /* If a trampoline symbol is found, we prefer to > - keep looking for the *real* symbol. If the > - actual symbol is not found, then we'll use the > - trampoline entry. */ > - if (trampoline_symbol == NULL) > - trampoline_symbol = msymbol; > - break; > - > - case mst_unknown: > - default: > - found_symbol = msymbol; > - break; > - } > - } > - > - /* Find the next symbol on the hash chain. */ > - if (pass == 1) > - msymbol = msymbol->hash_next; > - else > - msymbol = msymbol->demangled_hash_next; > + /* If a trampoline symbol is found, we prefer to > + keep looking for the *real* symbol. If the actual > + symbol is not found, then we'll use the > + trampoline entry. */ > + if (trampoline_symbol == NULL) > + trampoline_symbol = msymbol; > + break; > + > + case mst_unknown: > + default: > + if (which == ms_search_all) > + found_symbol = msymbol; > + break; > } > } > } > } > + > /* External symbols are best. */ > if (found_symbol) > return found_symbol; > @@ -247,127 +265,28 @@ lookup_minimal_symbol (register const ch > return NULL; > } > > -/* Look through all the current minimal symbol tables and find the > - first minimal symbol that matches NAME and of text type. > - If OBJF is non-NULL, limit > - the search to that objfile. If SFILE is non-NULL, limit the search > - to that source file. Returns a pointer to the minimal symbol that > - matches, or NULL if no match is found. > - */ Please, don't deleted the comments, they are still relevant, in a sense. I mean, the function still searches for the NAME symbol of text type. Do we have a comment for this function? > +struct minimal_symbol * > +lookup_minimal_symbol (register const char *name, const char *sfile, > + struct objfile *objf) > +{ > + return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_all); > +} > > struct minimal_symbol * > lookup_minimal_symbol_text (register const char *name, const char *sfile, > struct objfile *objf) > { > - struct objfile *objfile; > - struct minimal_symbol *msymbol; > - struct minimal_symbol *found_symbol = NULL; > - struct minimal_symbol *found_file_symbol = NULL; > - > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING > - if (sfile != NULL) > - { > - char *p = strrchr (sfile, '/'); > - if (p != NULL) > - sfile = p + 1; > - } > -#endif > - > - for (objfile = object_files; > - objfile != NULL && found_symbol == NULL; > - objfile = objfile->next) > - { > - if (objf == NULL || objf == objfile) > - { > - for (msymbol = objfile->msymbols; > - msymbol != NULL && SYMBOL_NAME (msymbol) != NULL && > - found_symbol == NULL; > - msymbol++) > - { > - if (SYMBOL_MATCHES_NAME (msymbol, name) && > - (MSYMBOL_TYPE (msymbol) == mst_text || > - MSYMBOL_TYPE (msymbol) == mst_file_text)) > - { > - switch (MSYMBOL_TYPE (msymbol)) > - { > - case mst_file_text: > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING > - if (sfile == NULL || STREQ (msymbol->filename, sfile)) > - found_file_symbol = msymbol; > -#else > - /* We have neither the ability nor the need to > - deal with the SFILE parameter. If we find > - more than one symbol, just return the latest > - one (the user can't expect useful behavior in > - that case). */ > - found_file_symbol = msymbol; > -#endif > - break; > - default: > - found_symbol = msymbol; > - break; > - } > - } > - } > - } > - } > - /* External symbols are best. */ > - if (found_symbol) > - return found_symbol; > - > - /* File-local symbols are next best. */ > - if (found_file_symbol) > - return found_file_symbol; > - > - return NULL; > + return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_text); > } > > -/* Look through all the current minimal symbol tables and find the > - first minimal symbol that matches NAME and of solib trampoline type. > - If OBJF is non-NULL, limit > - the search to that objfile. If SFILE is non-NULL, limit the search > - to that source file. Returns a pointer to the minimal symbol that > - matches, or NULL if no match is found. > - */ > - Same here, leave the comment (updated of course). > struct minimal_symbol * > lookup_minimal_symbol_solib_trampoline (register const char *name, > - const char *sfile, struct objfile *objf) > + const char *sfile, > + struct objfile *objf) > { > - struct objfile *objfile; > - struct minimal_symbol *msymbol; > - struct minimal_symbol *found_symbol = NULL; > - > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING > - if (sfile != NULL) > - { > - char *p = strrchr (sfile, '/'); > - if (p != NULL) > - sfile = p + 1; > - } > -#endif > - > - for (objfile = object_files; > - objfile != NULL && found_symbol == NULL; > - objfile = objfile->next) > - { > - if (objf == NULL || objf == objfile) > - { > - for (msymbol = objfile->msymbols; > - msymbol != NULL && SYMBOL_NAME (msymbol) != NULL && > - found_symbol == NULL; > - msymbol++) > - { > - if (SYMBOL_MATCHES_NAME (msymbol, name) && > - MSYMBOL_TYPE (msymbol) == mst_solib_trampoline) > - return msymbol; > - } > - } > - } > - > - return NULL; > + return lookup_minimal_symbol_internal (name, sfile, objf, > + ms_search_trampoline); > } > - > > /* Search through the minimal symbol table for each objfile and find > the symbol whose address is the largest address that is still less