From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-156441-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 21532 invoked by alias); 30 May 2019 00:08:55 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 21524 invoked by uid 89); 30 May 2019 00:08:55 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.4 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy=werent, weren't
X-HELO: mail-wm1-f68.google.com
Received: from mail-wm1-f68.google.com (HELO mail-wm1-f68.google.com) (209.85.128.68) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 30 May 2019 00:08:53 +0000
Received: by mail-wm1-f68.google.com with SMTP id t5so2666820wmh.3        for <gdb-patches@sourceware.org>; Wed, 29 May 2019 17:08:52 -0700 (PDT)
Return-Path: <palves@redhat.com>
Received: from ?IPv6:2001:8a0:f913:f700:4eeb:42ff:feef:f164? ([2001:8a0:f913:f700:4eeb:42ff:feef:f164])        by smtp.gmail.com with ESMTPSA id s10sm620488wrt.66.2019.05.29.17.08.49        (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128);        Wed, 29 May 2019 17:08:50 -0700 (PDT)
Subject: Re: [PATCH v3 2/8] Remove static buffer from ada_decode
To: Tom Tromey <tom@tromey.com>, gdb-patches@sourceware.org
References: <20190529212916.23721-1-tom@tromey.com> <20190529212916.23721-3-tom@tromey.com>
From: Pedro Alves <palves@redhat.com>
Message-ID: <a6a0036d-7976-9443-5cac-669941dd1119@redhat.com>
Date: Thu, 30 May 2019 00:08:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1
MIME-Version: 1.0
In-Reply-To: <20190529212916.23721-3-tom@tromey.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2019-05/txt/msg00673.txt.bz2

On 5/29/19 10:29 PM, Tom Tromey wrote:
> ada_decode has a static buffer, which means it is not thread-safe.
> This patch removes the buffer and adds a parameter so that the storage
> can be managed by the caller.

Is there a reason the buffer is a unique_ptr instead of a std::string?
Or a reason that a std::string could be better?  Wondering
out loud --  the small string optimization crossed my mind, though
probably Ada symbol names rarely fit in it.

I wonder whether an interface like the below be preferred?

 class ada_decoder_storage
 {
 public:
   ada_decoder_storage () = default;

   /* May return a pointer to m_storage, therefore the result
      is only guaranteed valid for as long as this
      ada_decoder_storage object is live, or until the next
      decode call.  */
   const char *decode (const char *encoded);

 private:
   gdb::unique_xmalloc_ptr<char> m_storage;
 };

That would allow hiding the storage type from clients, like:

  ada_sniff_from_mangled_name (const char *mangled, char **out)
  {
 -  const char *demangled = ada_decode (mangled);
 +  ada_decoder_storage decoder;
 +  const char *demangled = decoder.decode (mangled);

With that, sometimes you wouldn't need to name a temp, like in:

      error (_("Could not find renamed variable: %s"),
	     ada_decode_storage ().decode (name));

Note: while I was working on the C++ wildmatching stuff a couple years
back, I ran into pretty simple C/C++ use case that consistently showed
ada_decode on top of perf profiles, I think coming from sniffing the
minsym's language.  I wrote some patches optimizing it, but in the end since
they weren't required for the wildmatching stuff, I didn't include them
in the upstream submission back then and never found the time to submit
them since:

 https://github.com/palves/gdb/commits/palves/ada-decode-speedups

Unfortunately apparently I didn't write down anywhere that the usecase
was...  And worse, that branch is causing regressions now, but I'm quite
sure that it didn't back then...  Oh well.  Just a FYI anyway.

The reason I'm bringing this up, is that your patch will introduce
heap allocations for the storage, and that reminded me of the desire to
optimize ada_decode.  I would hope that the buffer/storage
can be reused when we need to decode many symbols.

Anyway, again I don't have a testcase handy, so take it as a FYI.
Might be the recent-ish glibc malloc improvements manage to reuse
the same heap block in such case, making the issue moot in practice.

Thanks,
Pedro Alves

> 
> gdb/ChangeLog
> 2019-05-29  Tom Tromey  <tom@tromey.com>
> 
> 	* ada-varobj.c (ada_varobj_describe_simple_array_child): Update.
> 	* ada-lang.h (ada_decode): Add "storage" parameter.
> 	* ada-lang.c (ada_decode): Add "storage" parameter.
> 	(ada_decode_symbol, ada_la_decode, ada_sniff_from_mangled_name)
> 	(is_valid_name_for_wild_match, name_matches_regex)
> 	(ada_add_global_exceptions): Update.
> 	* ada-exp.y (write_object_renaming): Update.
> ---
>  gdb/ChangeLog    | 10 +++++++++
>  gdb/ada-exp.y    |  6 +++++-
>  gdb/ada-lang.c   | 55 ++++++++++++++++++++++++++++++------------------
>  gdb/ada-lang.h   |  3 ++-
>  gdb/ada-varobj.c |  3 ++-
>  5 files changed, 54 insertions(+), 23 deletions(-)
> 
> diff --git a/gdb/ada-exp.y b/gdb/ada-exp.y
> index f7ce27aca35..4771ec0ed56 100644
> --- a/gdb/ada-exp.y
> +++ b/gdb/ada-exp.y
> @@ -816,7 +816,11 @@ write_object_renaming (struct parser_state *par_state,
>  				 renamed_entity_len);
>    ada_lookup_encoded_symbol (name, orig_left_context, VAR_DOMAIN, &sym_info);
>    if (sym_info.symbol == NULL)
> -    error (_("Could not find renamed variable: %s"), ada_decode (name));
> +    {
> +      gdb::unique_xmalloc_ptr<char> storage;
> +      error (_("Could not find renamed variable: %s"),
> +	     ada_decode (name, &storage));
> +    }
>    else if (SYMBOL_CLASS (sym_info.symbol) == LOC_TYPEDEF)
>      /* We have a renaming of an old-style renaming symbol.  Don't
>         trust the block information.  */
> diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
> index 99c099aa07d..75325100dc9 100644
> --- a/gdb/ada-lang.c
> +++ b/gdb/ada-lang.c
> @@ -1106,22 +1106,18 @@ ada_remove_po_subprogram_suffix (const char *encoded, int *len)
>  
>  /* If ENCODED follows the GNAT entity encoding conventions, then return
>     the decoded form of ENCODED.  Otherwise, return "<%s>" where "%s" is
> -   replaced by ENCODED.
> -
> -   The resulting string is valid until the next call of ada_decode.
> -   If the string is unchanged by decoding, the original string pointer
> -   is returned.  */
> +   replaced by ENCODED.  */
>  
>  const char *
> -ada_decode (const char *encoded)
> +ada_decode (const char *encoded, gdb::unique_xmalloc_ptr<char> *storage)
>  {
>    int i, j;
>    int len0;
>    const char *p;
>    char *decoded;
>    int at_start_name;
> -  static char *decoding_buffer = NULL;
> -  static size_t decoding_buffer_size = 0;
> +  char *decoding_buffer = NULL;
> +  size_t decoding_buffer_size = 0;
>  
>    /* With function descriptors on PPC64, the value of a symbol named
>       ".FN", if it exists, is the entry point of the function "FN".  */
> @@ -1349,9 +1345,15 @@ ada_decode (const char *encoded)
>        goto Suppress;
>  
>    if (strcmp (decoded, encoded) == 0)
> -    return encoded;
> +    {
> +      xfree (decoding_buffer);
> +      return encoded;
> +    }
>    else
> -    return decoded;
> +    {
> +      storage->reset (decoded);
> +      return decoded;
> +    }
>  
>  Suppress:
>    GROW_VECT (decoding_buffer, decoding_buffer_size, strlen (encoded) + 3);
> @@ -1360,8 +1362,8 @@ Suppress:
>      strcpy (decoded, encoded);
>    else
>      xsnprintf (decoded, decoding_buffer_size, "<%s>", encoded);
> +  storage->reset (decoded);
>    return decoded;
> -
>  }
>  
>  /* Table for keeping permanent unique copies of decoded names.  Once
> @@ -1390,7 +1392,8 @@ ada_decode_symbol (const struct general_symbol_info *arg)
>  
>    if (!gsymbol->ada_mangled)
>      {
> -      const char *decoded = ada_decode (gsymbol->name);
> +      gdb::unique_xmalloc_ptr<char> storage;
> +      const char *decoded = ada_decode (gsymbol->name, &storage);
>        struct obstack *obstack = gsymbol->language_specific.obstack;
>  
>        gsymbol->ada_mangled = 1;
> @@ -1420,7 +1423,8 @@ ada_decode_symbol (const struct general_symbol_info *arg)
>  static char *
>  ada_la_decode (const char *encoded, int options)
>  {
> -  return xstrdup (ada_decode (encoded));
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  return xstrdup (ada_decode (encoded, &storage));
>  }
>  
>  /* Implement la_sniff_from_mangled_name for Ada.  */
> @@ -1428,7 +1432,8 @@ ada_la_decode (const char *encoded, int options)
>  static int
>  ada_sniff_from_mangled_name (const char *mangled, char **out)
>  {
> -  const char *demangled = ada_decode (mangled);
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  const char *demangled = ada_decode (mangled, &storage);
>  
>    *out = NULL;
>  
> @@ -6001,7 +6006,8 @@ is_name_suffix (const char *str)
>  static int
>  is_valid_name_for_wild_match (const char *name0)
>  {
> -  const char *decoded_name = ada_decode (name0);
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  const char *decoded_name = ada_decode (name0, &storage);
>    int i;
>  
>    /* If the decoded name starts with an angle bracket, it means that
> @@ -6249,8 +6255,9 @@ ada_lookup_name_info::matches
>           is not a suitable completion.  */
>        const char *sym_name_copy = sym_name;
>        bool has_angle_bracket;
> +      gdb::unique_xmalloc_ptr<char> storage;
>  
> -      sym_name = ada_decode (sym_name);
> +      sym_name = ada_decode (sym_name, &storage);
>        has_angle_bracket = (sym_name[0] == '<');
>        match = (has_angle_bracket == m_verbatim_p);
>        sym_name = sym_name_copy;
> @@ -6272,12 +6279,14 @@ ada_lookup_name_info::matches
>  
>    /* Second: Try wild matching...  */
>  
> +  gdb::unique_xmalloc_ptr<char> sym_name_storage;
>    if (!match && m_wild_match_p)
>      {
>        /* Since we are doing wild matching, this means that TEXT
>           may represent an unqualified symbol name.  We therefore must
>           also compare TEXT against the unqualified name of the symbol.  */
> -      sym_name = ada_unqualified_name (ada_decode (sym_name));
> +      sym_name = ada_unqualified_name (ada_decode (sym_name,
> +						   &sym_name_storage));
>  
>        if (strncmp (sym_name, text, text_len) == 0)
>  	match = true;
> @@ -6293,7 +6302,10 @@ ada_lookup_name_info::matches
>        std::string &match_str = comp_match_res->match.storage ();
>  
>        if (!m_encoded_p)
> -	match_str = ada_decode (sym_name);
> +	{
> +	  gdb::unique_xmalloc_ptr<char> storage;
> +	  match_str = ada_decode (sym_name, &storage);
> +	}
>        else
>  	{
>  	  if (m_verbatim_p)
> @@ -13416,8 +13428,9 @@ ada_add_exceptions_from_frame (compiled_regex *preg,
>  static bool
>  name_matches_regex (const char *name, compiled_regex *preg)
>  {
> +  gdb::unique_xmalloc_ptr<char> storage;
>    return (preg == NULL
> -	  || preg->exec (ada_decode (name), 0, NULL, 0) == 0);
> +	  || preg->exec (ada_decode (name, &storage), 0, NULL, 0) == 0);
>  }
>  
>  /* Add all exceptions defined globally whose name name match
> @@ -13450,7 +13463,9 @@ ada_add_global_exceptions (compiled_regex *preg,
>  			   lookup_name_info::match_any (),
>  			   [&] (const char *search_name)
>  			   {
> -			     const char *decoded = ada_decode (search_name);
> +			     gdb::unique_xmalloc_ptr<char> storage;
> +			     const char *decoded = ada_decode (search_name,
> +							       &storage);
>  			     return name_matches_regex (decoded, preg);
>  			   },
>  			   NULL,
> diff --git a/gdb/ada-lang.h b/gdb/ada-lang.h
> index ff6c3399eaf..d0fc87c23a9 100644
> --- a/gdb/ada-lang.h
> +++ b/gdb/ada-lang.h
> @@ -227,7 +227,8 @@ extern struct type *ada_get_decoded_type (struct type *type);
>  
>  extern const char *ada_decode_symbol (const struct general_symbol_info *);
>  
> -extern const char *ada_decode (const char*);
> +extern const char *ada_decode (const char *,
> +			       gdb::unique_xmalloc_ptr<char> *storage);
>  
>  extern enum language ada_update_initial_language (enum language);
>  
> diff --git a/gdb/ada-varobj.c b/gdb/ada-varobj.c
> index a4d553d3786..29a4c86cc5f 100644
> --- a/gdb/ada-varobj.c
> +++ b/gdb/ada-varobj.c
> @@ -624,6 +624,7 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
>  	 of the array index type when such type qualification is
>  	 needed.  */
>        const char *index_type_name = NULL;
> +      gdb::unique_xmalloc_ptr<char> storage;
>  
>        /* If the index type is a range type, find the base type.  */
>        while (TYPE_CODE (index_type) == TYPE_CODE_RANGE)
> @@ -634,7 +635,7 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
>  	{
>  	  index_type_name = ada_type_name (index_type);
>  	  if (index_type_name)
> -	    index_type_name = ada_decode (index_type_name);
> +	    index_type_name = ada_decode (index_type_name, &storage);
>  	}
>  
>        if (index_type_name != NULL)
>