From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 59904 invoked by alias); 18 Nov 2017 05:23:09 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 58913 invoked by uid 89); 18 Nov 2017 05:23:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-6.7 required=5.0 tests=BAYES_00,GIT_PATCH_1,KB_WAM_FROM_NAME_SINGLEWORD,RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: simark.ca Received: from simark.ca (HELO simark.ca) (158.69.221.121) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 18 Nov 2017 05:23:06 +0000 Received: from [10.0.0.11] (192-222-251-162.qc.cable.ebox.net [192.222.251.162]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 908571E0A6; Sat, 18 Nov 2017 00:23:04 -0500 (EST) Subject: Re: [PATCH 26/40] Optimize .gdb_index symbol name searching To: Pedro Alves , gdb-patches@sourceware.org References: <1496406158-12663-1-git-send-email-palves@redhat.com> <1496406158-12663-27-git-send-email-palves@redhat.com> From: Simon Marchi Message-ID: <87b7a366-9b57-c169-6336-78b3326bba89@simark.ca> Date: Sat, 18 Nov 2017 05:23:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <1496406158-12663-27-git-send-email-palves@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-SW-Source: 2017-11/txt/msg00375.txt.bz2 On 2017-06-02 08:22 AM, Pedro Alves wrote: > @@ -4186,30 +4239,214 @@ dw2_expand_symtabs_matching > } > } > > - gdb_index_symbol_name_matcher lookup_name_matcher (lookup_name); > + mapped_index &index = *dwarf2_per_objfile->index_table; > > - for (iter = 0; iter < index->symbol_table_slots; ++iter) > + dw2_expand_symtabs_matching_symbol (index, lookup_name, > + symbol_matcher, > + kind, [&] (offset_type idx) > { > - offset_type idx = 2 * iter; > - const char *name; > - offset_type *vec, vec_len, vec_idx; > - int global_seen = 0; > + dw2_expand_marked_cus (index, idx, objfile, file_matcher, > + expansion_notify, kind); > + }); > +} > > - QUIT; > +/* Helper for dw2_expand_symtabs_matching that works with a > + mapped_index instead of the containing objfile. This is split to a > + separate function in order to be able to unit test the > + name_components matching using a mock mapped_index. For each > + symbol name that matches, calls MATCH_CALLBACK, passing it the > + symbol's index in the mapped_index symbol table. */ > > - if (index->symbol_table[idx] == 0 && index->symbol_table[idx + 1] == 0) > - continue; > +static void > +dw2_expand_symtabs_matching_symbol > + (mapped_index &index, > + const lookup_name_info &lookup_name, > + gdb::function_view symbol_matcher, > + enum search_domain kind, > + gdb::function_view match_callback) > +{ > + gdb_index_symbol_name_matcher lookup_name_matcher > + (lookup_name); > + > + auto *name_cmp = case_sensitivity == case_sensitive_on ? strcmp : strcasecmp; > + > + /* Build the symbol name component sorted vector, if we haven't yet. > + The code below only knows how to break apart components of C++ > + symbol names (and other languages that use '::' as > + namespace/module separator). If we add support for wild matching > + to some language that uses some other operator (E.g., Ada, Go and > + D use '.'), then we'll need to try splitting the symbol name > + according to that language too. Note that Ada does support wild > + matching, but doesn't currently support .gdb_index. */ > + if (index.name_components.empty ()) > + { > + for (size_t iter = 0; iter < index.symbol_table_slots; ++iter) > + { > + offset_type idx = 2 * iter; > + > + if (index.symbol_table[idx] == 0 > + && index.symbol_table[idx + 1] == 0) > + continue; > + > + const char *name = index.symbol_name_at (idx); > + > + /* Add each name component to the name component table. */ > + unsigned int previous_len = 0; > + for (unsigned int current_len = cp_find_first_component (name); > + name[current_len] != '\0'; > + current_len += cp_find_first_component (name + current_len)) > + { > + gdb_assert (name[current_len] == ':'); > + index.name_components.push_back ({previous_len, idx}); > + /* Skip the '::'. */ > + current_len += 2; > + previous_len = current_len; > + } > + index.name_components.push_back ({previous_len, idx}); > + } > + > + /* Sort name_comp elements by name. */ > + auto name_comp_compare = [&] (const name_component &left, > + const name_component &right) > + { > + const char *left_qualified = index.symbol_name_at (left.idx); > + const char *right_qualified = index.symbol_name_at (right.idx); > + > + const char *left_name = left_qualified + left.name_offset; > + const char *right_name = right_qualified + right.name_offset; > + > + return name_cmp (left_name, right_name) < 0; > + }; > + > + std::sort (index.name_components.begin (), > + index.name_components.end (), > + name_comp_compare); > + } > + > + const char *cplus > + = lookup_name.cplus ().lookup_name ().c_str (); > > - name = index->constant_pool + MAYBE_SWAP (index->symbol_table[idx]); > + /* Comparison function object for lower_bound that matches against a > + given symbol name. */ > + auto lookup_compare_lower = [&] (const name_component &elem, > + const char *name) > + { > + const char *elem_qualified = index.symbol_name_at (elem.idx); > + const char *elem_name = elem_qualified + elem.name_offset; > + return name_cmp (elem_name, name) < 0; > + }; > + > + /* Comparison function object for upper_bound that matches against a > + given symbol name. */ > + auto lookup_compare_upper = [&] (const char *name, > + const name_component &elem) > + { > + const char *elem_qualified = index.symbol_name_at (elem.idx); > + const char *elem_name = elem_qualified + elem.name_offset; > + return name_cmp (name, elem_name) < 0; > + }; > + > + auto begin = index.name_components.begin (); > + auto end = index.name_components.end (); > + > + /* Find the lower bound. */ > + auto lower = [&] () > + { > + if (lookup_name.completion_mode () && cplus[0] == '\0') > + return begin; > + else > + return std::lower_bound (begin, end, cplus, lookup_compare_lower); > + } (); > > - if (!lookup_name_matcher.matches (name) > - || (symbol_matcher != NULL && !symbol_matcher (name))) > + /* Find the upper bound. */ > + auto upper = [&] () > + { > + if (lookup_name.completion_mode ()) > + { > + /* The string frobbing below won't work if the string is > + empty. We don't need it then, anyway -- if we're > + completing an empty string, then we want to iterate over > + the whole range. */ > + if (cplus[0] == '\0') > + return end; > + > + /* In completion mode, increment the last character because > + we want UPPER to point past all symbols names that have > + the same prefix. */ > + std::string after = cplus; > + > + gdb_assert (after.back () != 0xff); Hi Pedro, With Clang, I get this warning: /home/simark/src/binutils-gdb/gdb/dwarf2read.c:4316:30: error: comparison of constant 255 with expression of type '__gnu_cxx::__alloc_traits >::value_type' (aka 'char') is always true [-Werror,-Wtautological-constant-out-of-range-compare] gdb_assert (after.back () != 0xff); ~~~~~~~~~~~~~ ^ ~~~~ /home/simark/src/binutils-gdb/gdb/common/gdb_assert.h:34:13: note: expanded from macro 'gdb_assert' ((void) ((expr) ? 0 : \ ^~~~ Simon