From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-143503-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 59904 invoked by alias); 18 Nov 2017 05:23:09 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 58913 invoked by uid 89); 18 Nov 2017 05:23:08 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-6.7 required=5.0 tests=BAYES_00,GIT_PATCH_1,KB_WAM_FROM_NAME_SINGLEWORD,RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: simark.ca
Received: from simark.ca (HELO simark.ca) (158.69.221.121) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 18 Nov 2017 05:23:06 +0000
Received: from [10.0.0.11] (192-222-251-162.qc.cable.ebox.net [192.222.251.162])	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))	(No client certificate requested)	by simark.ca (Postfix) with ESMTPSA id 908571E0A6;	Sat, 18 Nov 2017 00:23:04 -0500 (EST)
Subject: Re: [PATCH 26/40] Optimize .gdb_index symbol name searching
To: Pedro Alves <palves@redhat.com>, gdb-patches@sourceware.org
References: <1496406158-12663-1-git-send-email-palves@redhat.com> <1496406158-12663-27-git-send-email-palves@redhat.com>
From: Simon Marchi <simark@simark.ca>
Message-ID: <87b7a366-9b57-c169-6336-78b3326bba89@simark.ca>
Date: Sat, 18 Nov 2017 05:23:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <1496406158-12663-27-git-send-email-palves@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2017-11/txt/msg00375.txt.bz2

On 2017-06-02 08:22 AM, Pedro Alves wrote:
> @@ -4186,30 +4239,214 @@ dw2_expand_symtabs_matching
>  	}
>      }
>  
> -  gdb_index_symbol_name_matcher lookup_name_matcher (lookup_name);
> +  mapped_index &index = *dwarf2_per_objfile->index_table;
>  
> -  for (iter = 0; iter < index->symbol_table_slots; ++iter)
> +  dw2_expand_symtabs_matching_symbol (index, lookup_name,
> +				      symbol_matcher,
> +				      kind, [&] (offset_type idx)
>      {
> -      offset_type idx = 2 * iter;
> -      const char *name;
> -      offset_type *vec, vec_len, vec_idx;
> -      int global_seen = 0;
> +      dw2_expand_marked_cus (index, idx, objfile, file_matcher,
> +			     expansion_notify, kind);
> +    });
> +}
>  
> -      QUIT;
> +/* Helper for dw2_expand_symtabs_matching that works with a
> +   mapped_index instead of the containing objfile.  This is split to a
> +   separate function in order to be able to unit test the
> +   name_components matching using a mock mapped_index.  For each
> +   symbol name that matches, calls MATCH_CALLBACK, passing it the
> +   symbol's index in the mapped_index symbol table.  */
>  
> -      if (index->symbol_table[idx] == 0 && index->symbol_table[idx + 1] == 0)
> -	continue;
> +static void
> +dw2_expand_symtabs_matching_symbol
> +  (mapped_index &index,
> +   const lookup_name_info &lookup_name,
> +   gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +   enum search_domain kind,
> +   gdb::function_view<void (offset_type)> match_callback)
> +{
> +  gdb_index_symbol_name_matcher lookup_name_matcher
> +    (lookup_name);
> +
> +  auto *name_cmp = case_sensitivity == case_sensitive_on ? strcmp : strcasecmp;
> +
> +  /* Build the symbol name component sorted vector, if we haven't yet.
> +     The code below only knows how to break apart components of C++
> +     symbol names (and other languages that use '::' as
> +     namespace/module separator).  If we add support for wild matching
> +     to some language that uses some other operator (E.g., Ada, Go and
> +     D use '.'), then we'll need to try splitting the symbol name
> +     according to that language too.  Note that Ada does support wild
> +     matching, but doesn't currently support .gdb_index.  */
> +  if (index.name_components.empty ())
> +    {
> +      for (size_t iter = 0; iter < index.symbol_table_slots; ++iter)
> +	{
> +	  offset_type idx = 2 * iter;
> +
> +	  if (index.symbol_table[idx] == 0
> +	      && index.symbol_table[idx + 1] == 0)
> +	    continue;
> +
> +	  const char *name = index.symbol_name_at (idx);
> +
> +	  /* Add each name component to the name component table.  */
> +	  unsigned int previous_len = 0;
> +	  for (unsigned int current_len = cp_find_first_component (name);
> +	       name[current_len] != '\0';
> +	       current_len += cp_find_first_component (name + current_len))
> +	    {
> +	      gdb_assert (name[current_len] == ':');
> +	      index.name_components.push_back ({previous_len, idx});
> +	      /* Skip the '::'.  */
> +	      current_len += 2;
> +	      previous_len = current_len;
> +	    }
> +	  index.name_components.push_back ({previous_len, idx});
> +	}
> +
> +      /* Sort name_comp elements by name.   */
> +      auto name_comp_compare = [&] (const name_component &left,
> +				    const name_component &right)
> +	{
> +	  const char *left_qualified = index.symbol_name_at (left.idx);
> +	  const char *right_qualified = index.symbol_name_at (right.idx);
> +
> +	  const char *left_name = left_qualified + left.name_offset;
> +	  const char *right_name = right_qualified + right.name_offset;
> +
> +	  return name_cmp (left_name, right_name) < 0;
> +	};
> +
> +      std::sort (index.name_components.begin (),
> +		 index.name_components.end (),
> +		 name_comp_compare);
> +    }
> +
> +  const char *cplus
> +    = lookup_name.cplus ().lookup_name ().c_str ();
>  
> -      name = index->constant_pool + MAYBE_SWAP (index->symbol_table[idx]);
> +  /* Comparison function object for lower_bound that matches against a
> +     given symbol name.  */
> +  auto lookup_compare_lower = [&] (const name_component &elem,
> +				   const char *name)
> +    {
> +      const char *elem_qualified = index.symbol_name_at (elem.idx);
> +      const char *elem_name = elem_qualified + elem.name_offset;
> +      return name_cmp (elem_name, name) < 0;
> +    };
> +
> +  /* Comparison function object for upper_bound that matches against a
> +     given symbol name.  */
> +  auto lookup_compare_upper = [&] (const char *name,
> +				   const name_component &elem)
> +    {
> +      const char *elem_qualified = index.symbol_name_at (elem.idx);
> +      const char *elem_name = elem_qualified + elem.name_offset;
> +      return name_cmp (name, elem_name) < 0;
> +    };
> +
> +  auto begin = index.name_components.begin ();
> +  auto end = index.name_components.end ();
> +
> +  /* Find the lower bound.  */
> +  auto lower = [&] ()
> +    {
> +      if (lookup_name.completion_mode () && cplus[0] == '\0')
> +	return begin;
> +      else
> +	return std::lower_bound (begin, end, cplus, lookup_compare_lower);
> +    } ();
>  
> -      if (!lookup_name_matcher.matches (name)
> -	  || (symbol_matcher != NULL && !symbol_matcher (name)))
> +  /* Find the upper bound.  */
> +  auto upper = [&] ()
> +    {
> +      if (lookup_name.completion_mode ())
> +	{
> +	  /* The string frobbing below won't work if the string is
> +	     empty.  We don't need it then, anyway -- if we're
> +	     completing an empty string, then we want to iterate over
> +	     the whole range.  */
> +	  if (cplus[0] == '\0')
> +	    return end;
> +
> +	  /* In completion mode, increment the last character because
> +	     we want UPPER to point past all symbols names that have
> +	     the same prefix.  */
> +	  std::string after = cplus;
> +
> +	  gdb_assert (after.back () != 0xff);

Hi Pedro,

With Clang, I get this warning:

/home/simark/src/binutils-gdb/gdb/dwarf2read.c:4316:30: error: comparison of constant 255 with expression of type '__gnu_cxx::__alloc_traits<std::allocator<char> >::value_type' (aka 'char') is always true [-Werror,-Wtautological-constant-out-of-range-compare]
          gdb_assert (after.back () != 0xff);
                      ~~~~~~~~~~~~~ ^  ~~~~
/home/simark/src/binutils-gdb/gdb/common/gdb_assert.h:34:13: note: expanded from macro 'gdb_assert'
  ((void) ((expr) ? 0 :                                                       \
            ^~~~

Simon