From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-bounces@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id l01kJWqIzF8nbwAAWB0awg
	(envelope-from <gdb-patches-bounces@sourceware.org>)
	for <public-inbox@simark.ca>; Sun, 06 Dec 2020 02:29:46 -0500
Received: by simark.ca (Postfix, from userid 112)
	id 8C9A41F0B8; Sun,  6 Dec 2020 02:29:46 -0500 (EST)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI
	autolearn=ham autolearn_force=no version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id 69E481E99A
	for <public-inbox@simark.ca>; Sun,  6 Dec 2020 02:29:45 -0500 (EST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 08B6E3851C2B;
	Sun,  6 Dec 2020 07:29:44 +0000 (GMT)
Received: from rock.gnat.com (rock.gnat.com
 [IPv6:2620:20:4000:0:a9e:1ff:fe9b:1d1])
 by sourceware.org (Postfix) with ESMTP id 1B5103851C21
 for <gdb-patches@sourceware.org>; Sun,  6 Dec 2020 07:29:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1B5103851C21
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=adacore.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=brobecker@adacore.com
Received: from localhost (localhost.localdomain [127.0.0.1])
 by filtered-rock.gnat.com (Postfix) with ESMTP id D4B11116C98;
 Sun,  6 Dec 2020 02:29:40 -0500 (EST)
X-Virus-Scanned: Debian amavisd-new at gnat.com
Received: from rock.gnat.com ([127.0.0.1])
 by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id Sjef-wcg+n9b; Sun,  6 Dec 2020 02:29:40 -0500 (EST)
Received: from float.home (localhost.localdomain [127.0.0.1])
 (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by rock.gnat.com (Postfix) with ESMTPS id 744B1116BEE;
 Sun,  6 Dec 2020 02:29:40 -0500 (EST)
Received: by float.home (Postfix, from userid 1000)
 id 05346A1880; Sun,  6 Dec 2020 11:29:34 +0400 (+04)
Date: Sun, 6 Dec 2020 11:29:34 +0400
From: Joel Brobecker <brobecker@adacore.com>
To: Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH 3/4] gdb: split get_discrete_bounds in two
Message-ID: <20201206072934.GB327270@adacore.com>
References: <20201123162120.3778679-1-simon.marchi@efficios.com>
 <20201123162120.3778679-4-simon.marchi@efficios.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20201123162120.3778679-4-simon.marchi@efficios.com>
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Cc: Simon Marchi <simon.marchi@efficios.com>
Errors-To: gdb-patches-bounces@sourceware.org
Sender: "Gdb-patches" <gdb-patches-bounces@sourceware.org>

Hi Simon,

On Mon, Nov 23, 2020 at 11:21:19AM -0500, Simon Marchi via Gdb-patches wrote:
> get_discrete_bounds is not flexible for ranges (TYPE_CODE_RANGE), in the
> sense that it returns true (success) only if both bounds are present and
> constant values.
> 
> This is a problem for code that only needs to know the low bound and
> fails unnecessarily if the high bound is unknown.
> 
> Split the function in two, get_discrete_low_bound and
> get_discrete_high_bound, that both return an optional.  Provide a new
> implementation of get_discrete_bounds based on the two others, so the
> callers don't have to be changed.
> 
> gdb/ChangeLog:
> 
> 	* gdbtypes.c (get_discrete_bounds): Implement with
> 	get_discrete_low_bound and get_discrete_high_bound.
> 	(get_discrete_low_bound): New.
> 	(get_discrete_high_bound): New.
> 
> Change-Id: I986b5e9c0dd969800e3fb9546af9c827d52e80d0

Small reminder to remove the Change-Id in the ChangeLog.

I agree with the problem, and modulo the minor question I had below
(which might not actually be an issue), the patch looks good to me.

I do have one minor concern. On the one hand, the code seems to be
slightly simpler and clearer. On the other hand, I worry a bit about
keeping the two functions in sync. I'm wondering if it might be worth
investigating ways to have achieve the same end result without
duplicating the skeleton of the code. I tried thinking about it
for a while, and the only option that might be viable that I could
think of was to allow users of get_discrete_bounds to pass nullptr
for either LOWP or HIGHP, and then only handle the bounds whose
parameter is not null. Then we can have the low/high functions
if we wanted. In trying to visualize how this would look like,
I found that two scenarios:

  - I would be bracketing small pieces of code with lots of
    "if <param> != nullptr" (just for TYPE_CODE_RANGE, I counted
    6 instances of an addional condition); FTR, this is what it
    would look like for TYPE_CODE_RANGE:

        | case TYPE_CODE_RANGE:
        |   if ((lowp != nullptr && type->bounds ()->low.kind () != PROP_CONST)
        |       || (highp != nullptr && type->bounds ()->high.kind () != PROP_CONST))
        |     return false;

        |   if (lowp != nullptr)
        |     *lowp = type->bounds ()->low.const_val ();
        |   if (highp != nullptr)
        |     *highp = type->bounds ()->high.const_val ();

        |   if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
        |     {
        |       if (lowp != nullptr)
        |         {
        |           gdb::optional<LONGEST> low_pos
        |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);

        |           if (low_pos.has_value ())
        |             *lowp = *low_pos;
        |         }

        |       if (highp != nullptr)
        |         {
        |           gdb::optional<LONGEST> high_pos
        |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);

        |           if (high_pos.has_value ())
        |             *highp = *high_pos;
        |         }
        |     }
        |   return true;


  - I would be splitting the lowp and highp code in two inside
    each case; in this case, when one looks at how things look like,
    it takes away half of the reasons why we might want to keep
    the code together. E.g. for TYPE_CODE_RANGE, it woudl look like this:

        | case TYPE_CODE_RANGE:
        |   if (lowp != nullptr)
        |     {
        |       if (type->bounds ()->low.kind () != PROP_CONST)
        |         return false;

        |       *lowp = type->bounds ()->low.const_val ();

        |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
        |         {
        |           gdb::optional<LONGEST> low_pos
        |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);

        |           if (low_pos.has_value ())
        |             *lowp = *low_pos;
        |         }
        |     }
        |   if (lowp != nullptr)
        |     {
        |       if (type->bounds ()->high.kind () != PROP_CONST)
        |         return false;

        |       *highp = type->bounds ()->high.const_val ();

        |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
        |         {
        |           gdb::optional<LONGEST> high_pos
        |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);

        |           if (high_pos.has_value ())
        |             *highp = *high_pos;
        |         }
        |     }
        |   return true;

For the record, I thought about perhaps using templates, which would
work, but I think at the cost of the templated code being more complex.

So, in the end, unless you have some better ideas, I am find with
either your solution (splitting), or the solution above which has
the property of better keeping the code together, but is far from
perfect because it is at the cost of complexifying the implementation
a bit (one has to always check that LOWP/HIGHP is not nullptr).

See comments below, though.

> ---
>  gdb/gdbtypes.c | 187 ++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 130 insertions(+), 57 deletions(-)
> 
> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
> index b47bd28945d..4df23cfe0fe 100644
> --- a/gdb/gdbtypes.c
> +++ b/gdb/gdbtypes.c
> @@ -1036,71 +1036,127 @@ has_static_range (const struct range_bounds *bounds)
>  	  && bounds->stride.kind () == PROP_CONST);
>  }
>  
> -/* See gdbtypes.h.  */
> +/* If TYPE's low bound is a known constant, return it, else return nullopt.  */
>  
> -bool
> -get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
> +static gdb::optional<LONGEST>
> +get_discrete_low_bound (struct type *type)
>  {
>    type = check_typedef (type);
>    switch (type->code ())
>      {
>      case TYPE_CODE_RANGE:
> -      /* This function currently only works for ranges with two defined,
> -	 constant bounds.  */
> -      if (type->bounds ()->low.kind () != PROP_CONST
> -	  || type->bounds ()->high.kind () != PROP_CONST)
> +      {
> +	/* This function only works for ranges with a constant low bound.  */
> +	if (type->bounds ()->low.kind () != PROP_CONST)
> +	  return {};
> +
> +	LONGEST low = type->bounds ()->low.const_val ();
> +
> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
> +	  {
> +	    gdb::optional<LONGEST> low_pos
> +	      = discrete_position (TYPE_TARGET_TYPE (type), low);
> +
> +	    if (low_pos.has_value ())
> +	      low = *low_pos;
> +	  }
> +
> +	return low;
> +      }
> +
> +    case TYPE_CODE_ENUM:
> +      {
> +	if (type->num_fields () > 0)
> +	  {
> +	    /* The enums may not be sorted by value, so search all
> +	       entries.  */
> +	    LONGEST low = TYPE_FIELD_ENUMVAL (type, 0);
> +
> +	    for (int i = 0; i < type->num_fields (); i++)
> +	      {
> +		if (TYPE_FIELD_ENUMVAL (type, i) < low)
> +		  low = TYPE_FIELD_ENUMVAL (type, i);
> +	      }
> +
> +	    /* Set unsigned indicator if warranted.  */
> +	    if (low >= 0)
> +	      type->set_is_unsigned (true);
> +
> +	    return low;
> +	  }
> +	else
> +	  return 0;
> +      }
> +
> +    case TYPE_CODE_BOOL:
> +      return 0;
> +
> +    case TYPE_CODE_INT:
> +      if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */
>  	return false;

Should this be "return {};" here?

>  
> -      *lowp = type->bounds ()->low.const_val ();
> -      *highp = type->bounds ()->high.const_val ();
> +      if (!type->is_unsigned ())
> +	return -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>  
> -      if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
> -	{
> -	  gdb::optional<LONGEST> low_pos
> -	    = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
> +      /* fall through */
> +    case TYPE_CODE_CHAR:
> +      return 0;
>  
> -	  if (low_pos.has_value ())
> -	    *lowp = *low_pos;
> +    default:
> +      return false;

Same as above.

> +    }
> +}
>  
> -	  gdb::optional<LONGEST> high_pos
> -	    = discrete_position (TYPE_TARGET_TYPE (type), *highp);
> +/* If TYPE's high bound is a known constant, return it, else return nullopt.  */
>  
> -	  if (high_pos.has_value ())
> -	    *highp = *high_pos;
> -	}
> -      return true;
> +static gdb::optional<LONGEST>
> +get_discrete_high_bound (struct type *type)
> +{
> +  type = check_typedef (type);
> +  switch (type->code ())
> +    {
> +    case TYPE_CODE_RANGE:
> +      {
> +	/* This function only works for ranges with a constant high bound.  */
> +	if (type->bounds ()->high.kind () != PROP_CONST)
> +	  return {};
> +
> +	LONGEST high = type->bounds ()->high.const_val ();
> +
> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
> +	  {
> +	    gdb::optional<LONGEST> high_pos
> +	      = discrete_position (TYPE_TARGET_TYPE (type), high);
> +
> +	    if (high_pos.has_value ())
> +	      high = *high_pos;
> +	  }
> +
> +	return high;
> +      }
>  
>      case TYPE_CODE_ENUM:
> -      if (type->num_fields () > 0)
> -	{
> -	  /* The enums may not be sorted by value, so search all
> -	     entries.  */
> -	  int i;
> +      {
> +	if (type->num_fields () > 0)
> +	  {
> +	    /* The enums may not be sorted by value, so search all
> +	       entries.  */
> +	    LONGEST high = TYPE_FIELD_ENUMVAL (type, 0);
>  
> -	  *lowp = *highp = TYPE_FIELD_ENUMVAL (type, 0);
> -	  for (i = 0; i < type->num_fields (); i++)
> -	    {
> -	      if (TYPE_FIELD_ENUMVAL (type, i) < *lowp)
> -		*lowp = TYPE_FIELD_ENUMVAL (type, i);
> -	      if (TYPE_FIELD_ENUMVAL (type, i) > *highp)
> -		*highp = TYPE_FIELD_ENUMVAL (type, i);
> -	    }
> +	    for (int i = 0; i < type->num_fields (); i++)
> +	      {
> +		if (TYPE_FIELD_ENUMVAL (type, i) > high)
> +		  high = TYPE_FIELD_ENUMVAL (type, i);
> +	      }
>  
> -	  /* Set unsigned indicator if warranted.  */
> -	  if (*lowp >= 0)
> -	    type->set_is_unsigned (true);
> -	}
> -      else
> -	{
> -	  *lowp = 0;
> -	  *highp = -1;
> -	}
> -      return true;
> +	    return high;
> +	  }
> +	else
> +	  return -1;
> +      }
>  
>      case TYPE_CODE_BOOL:
> -      *lowp = 0;
> -      *highp = 1;
> -      return true;
> +      return 1;
>  
>      case TYPE_CODE_INT:
>        if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */

In that section, there is a "return false" that I'm wondering
needs to be changed to "return {}" as well...

> @@ -1108,25 +1164,42 @@ get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>  
>        if (!type->is_unsigned ())
>  	{
> -	  *lowp = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
> -	  *highp = -*lowp - 1;
> -	  return true;
> +	  LONGEST low = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
> +	  return -low - 1;
>  	}
> +
>        /* fall through */
>      case TYPE_CODE_CHAR:
> -      *lowp = 0;
> -      /* This round-about calculation is to avoid shifting by
> -	 TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
> -	 if TYPE_LENGTH (type) == sizeof (LONGEST).  */
> -      *highp = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
> -      *highp = (*highp - 1) | *highp;
> -      return true;
> +      {
> +	/* This round-about calculation is to avoid shifting by
> +	   TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
> +	   if TYPE_LENGTH (type) == sizeof (LONGEST).  */
> +	LONGEST high = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
> +	return (high - 1) | high;
> +      }
>  
>      default:
>        return false;

Same as above.

>      }
>  }
>  
> +/* See gdbtypes.h.  */
> +
> +bool
> +get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
> +{
> +  gdb::optional<LONGEST> low = get_discrete_low_bound (type);
> +  gdb::optional<LONGEST> high = get_discrete_high_bound (type);
> +
> +  if (!low.has_value () || !high.has_value ())
> +    return false;

What do you think of computing high only after having verified
that "low" has a value? It would only be a small optimization,
but since it doesn't really complexify the code, and only adds
a couple of extra lines of code, it seems worth having?

One thing also that this patch makes me realize is that it answers
one question I had in the back of my mind: LOWP and HIGHP are only
set if both bounds can be computed -- otherwise, they are both
left untouched. Perhaps we can in a followup patch update the function's
doc to mention that?

> +  *lowp = *low;
> +  *highp = *high;
> +
> +  return true;
> +}
> +
>  /* See gdbtypes.h  */

-- 
Joel