From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14720 invoked by alias); 11 Feb 2008 16:20:16 -0000 Received: (qmail 14707 invoked by uid 22791); 11 Feb 2008 16:20:15 -0000 X-Spam-Check-By: sourceware.org Received: from qnxmail.qnx.com (HELO qnxmail.qnx.com) (209.226.137.76) by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 11 Feb 2008 16:19:49 +0000 Received: from smtp.ott.qnx.com (smtp.ott.qnx.com [10.42.96.5]) by hub.ott.qnx.com (8.9.3/8.9.3) with ESMTP id LAA18032 for ; Mon, 11 Feb 2008 11:05:14 -0500 Received: from [10.42.100.129] (dhcp-100-129 [10.42.100.129]) by smtp.ott.qnx.com (8.8.8/8.6.12) with ESMTP id LAA11002 for ; Mon, 11 Feb 2008 11:19:47 -0500 Message-ID: <47B075A2.5010207@qnx.com> Date: Mon, 11 Feb 2008 16:20:00 -0000 From: Aleksandar Ristovski User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: gdb-patches@sourceware.org Subject: [RFC/patch] Sort global symbols in psymtabs only if needed Content-Type: multipart/mixed; boundary="------------020401020602090306080402" X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2008-02/txt/msg00174.txt.bz2 This is a multi-part message in MIME format. --------------020401020602090306080402 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-length: 1934 Hello, In dwarf2read.c we sort global symbols automatically during the read. This, on large projects, causes long delay at startup. For example, a project our customer has is a C++ project and is fairly large. Here is some statistics to give you an idea: Statistics: Number of psymtabs : 16659 Number of global symbols : 9134742 Number of static symbols : 1021345 The attached patch sorts on first lookup instead of on load. Startup time (i.e. time until I get control when stopped in main) was before the patch: % cumulative self self total time seconds seconds calls s/call s/call name 65.93 21.38 21.38 16 1.34 1.34 strcmp_iw_ordered 5.70 23.23 1.85 10765999 0.00 0.00 bcache_data ... And after the patch: % cumulative self self total time seconds seconds calls s/call s/call name 20.43 1.80 1.80 10765999 0.00 0.00 bcache_data 11.80 2.84 1.04 11137416 0.00 0.00 htab_hash_string 8.74 3.61 0.77 14851934 0.00 0.00 htab_find_slot_with_hash ... Note that 'strcmp_iw_ordered' comes from sort_pst_symbols (and indirectly from qsort); the number '16' in 'calls' column is due to wrap-around. Number of calls is much higher. Testing on linux, no regressions. --- Aleksandar Ristovski QNX Software Systems ChangeLog 2008-02-11 Aleksandar Ristovski * dwarf2read.c (dwarf2_build_psymtabs_hard): Remove call to sort_pst_symbols. It will be done on first lookup. * symfile.c (compare_psymbols): Set globals_sorted flag. * symmisc.c (maintenance_info_psymtabs): Add some statistics to the output. * symtab.c (lookup_partial_symbol): Sort global symbols if not already sorted, then go on with the appropriate lookup. * symtab.h (struct partial_symtab): Add new field globals_sorted. --------------020401020602090306080402 Content-Type: text/plain; name="sortpsymtabsonfirstlookup.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="sortpsymtabsonfirstlookup.diff" Content-length: 7031 Index: gdb/dwarf2read.c =================================================================== RCS file: /cvs/src/src/gdb/dwarf2read.c,v retrieving revision 1.251 diff -u -5 -p -r1.251 dwarf2read.c --- gdb/dwarf2read.c 1 Feb 2008 22:45:13 -0000 1.251 +++ gdb/dwarf2read.c 11 Feb 2008 15:53:37 -0000 @@ -1550,11 +1550,10 @@ dwarf2_build_psymtabs_hard (struct objfi pst->n_global_syms = objfile->global_psymbols.next - (objfile->global_psymbols.list + pst->globals_offset); pst->n_static_syms = objfile->static_psymbols.next - (objfile->static_psymbols.list + pst->statics_offset); - sort_pst_symbols (pst); /* If there is already a psymtab or symtab for a file of this name, remove it. (If there is a symtab, more drastic things also happen.) This happens in VxWorks. */ free_named_symtabs (pst->filename); Index: gdb/symfile.c =================================================================== RCS file: /cvs/src/src/gdb/symfile.c,v retrieving revision 1.198 diff -u -5 -p -r1.198 symfile.c --- gdb/symfile.c 29 Jan 2008 22:47:20 -0000 1.198 +++ gdb/symfile.c 11 Feb 2008 15:53:39 -0000 @@ -210,14 +210,15 @@ compare_psymbols (const void *s1p, const void sort_pst_symbols (struct partial_symtab *pst) { /* Sort the global list; don't sort the static list */ - + gdb_assert (!pst->globals_sorted); qsort (pst->objfile->global_psymbols.list + pst->globals_offset, pst->n_global_syms, sizeof (struct partial_symbol *), compare_psymbols); + pst->globals_sorted = 1; } /* Make a null terminated copy of the string at PTR with SIZE characters in the obstack pointed to by OBSTACKP . Returns the address of the copy. Note that the string at PTR does not have to be null terminated, I.E. it Index: gdb/symmisc.c =================================================================== RCS file: /cvs/src/src/gdb/symmisc.c,v retrieving revision 1.47 diff -u -5 -p -r1.47 symmisc.c --- gdb/symmisc.c 11 Jan 2008 13:34:15 -0000 1.47 +++ gdb/symmisc.c 11 Feb 2008 15:53:39 -0000 @@ -1051,10 +1051,14 @@ maintenance_info_symtabs (char *regexp, /* List all the partial symbol tables whose names match REGEXP (optional). */ void maintenance_info_psymtabs (char *regexp, int from_tty) { struct objfile *objfile; + int sum_global_syms = 0; + int sum_static_syms = 0; + int sum_num_sorts = 0; + int sum_num_psymtabs = 0; if (regexp) re_comp (regexp); ALL_OBJFILES (objfile) @@ -1067,10 +1071,11 @@ maintenance_info_psymtabs (char *regexp, ALL_OBJFILE_PSYMTABS (objfile, psymtab) if (! regexp || re_exec (psymtab->filename)) { + sum_num_psymtabs++; if (! printed_objfile_start) { printf_filtered ("{ objfile %s ", objfile->name); wrap_here (" "); printf_filtered ("((struct objfile *) %p)\n", objfile); @@ -1094,20 +1099,26 @@ maintenance_info_psymtabs (char *regexp, { printf_filtered ("(* (struct partial_symbol **) %p @ %d)\n", (psymtab->objfile->global_psymbols.list + psymtab->globals_offset), psymtab->n_global_syms); + printf_filtered (" sorted %s\n", + psymtab->globals_sorted ? "yes" : "no"); + sum_global_syms += psymtab->n_global_syms; + if (psymtab->globals_sorted) + sum_num_sorts++; } else printf_filtered ("(none)\n"); printf_filtered (" statics "); if (psymtab->n_static_syms) { printf_filtered ("(* (struct partial_symbol **) %p @ %d)\n", (psymtab->objfile->static_psymbols.list + psymtab->statics_offset), psymtab->n_static_syms); + sum_static_syms += psymtab->n_static_syms; } else printf_filtered ("(none)\n"); printf_filtered (" dependencies "); if (psymtab->number_of_dependencies) @@ -1132,10 +1143,17 @@ maintenance_info_psymtabs (char *regexp, } if (printed_objfile_start) printf_filtered ("}\n"); } + + /* Print statistics. */ + printf_filtered ("Statistics:\n%s%9d\n%s%9d\n%s%9d\n%s%9d\n", + " Number of psymtabs :", sum_num_psymtabs, + " Number of global symbols :", sum_global_syms, + " Number of static symbols :", sum_static_syms, + " Number of glob. sym. sorts :", sum_num_sorts); } /* Check consistency of psymtabs and symtabs. */ Index: gdb/symtab.c =================================================================== RCS file: /cvs/src/src/gdb/symtab.c,v retrieving revision 1.173 diff -u -5 -p -r1.173 symtab.c --- gdb/symtab.c 5 Feb 2008 22:17:40 -0000 1.173 +++ gdb/symtab.c 11 Feb 2008 15:53:41 -0000 @@ -1594,19 +1594,20 @@ lookup_partial_symbol (struct partial_sy struct partial_symbol **top, **real_top, **bottom, **center; int length = (global ? pst->n_global_syms : pst->n_static_syms); int do_linear_search = 1; if (length == 0) - { return (NULL); - } start = (global ? pst->objfile->global_psymbols.list + pst->globals_offset : pst->objfile->static_psymbols.list + pst->statics_offset); + if (global && !pst->globals_sorted) + sort_pst_symbols (pst); - if (global) /* This means we can use a binary search. */ + if (global && pst->globals_sorted) { + /* This means we can use a binary search. */ do_linear_search = 0; /* Binary search. This search is guaranteed to end with center pointing at the earliest partial symbol whose name might be correct. At that point *all* partial symbols with an Index: gdb/symtab.h =================================================================== RCS file: /cvs/src/src/gdb/symtab.h,v retrieving revision 1.112 diff -u -5 -p -r1.112 symtab.h --- gdb/symtab.h 5 Feb 2008 22:17:40 -0000 1.112 +++ gdb/symtab.h 11 Feb 2008 15:53:41 -0000 @@ -921,17 +921,19 @@ struct partial_symtab struct partial_symtab **dependencies; int number_of_dependencies; - /* Global symbol list. This list will be sorted after readin to + /* Global symbol list. This list will be sorted on first lookup to improve access. Binary search will be the usual method of finding a symbol within it. globals_offset is an integer offset - within global_psymbols[]. */ + within global_psymbols[]. globals_sorted is set to non zero when + sorted. */ int globals_offset; int n_global_syms; + int globals_sorted; /* Static symbol list. This list will *not* be sorted after readin; to find a symbol in it, exhaustive search must be used. This is reasonable because searches through this list will eventually lead to either the read in of a files symbols for real (assumed --------------020401020602090306080402--