From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-bounces@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id p6MiBAZPzl8hSAAAWB0awg
	(envelope-from <gdb-patches-bounces@sourceware.org>)
	for <public-inbox@simark.ca>; Mon, 07 Dec 2020 10:49:26 -0500
Received: by simark.ca (Postfix, from userid 112)
	id 0557A1F096; Mon,  7 Dec 2020 10:49:26 -0500 (EST)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
	DKIM_VALID_AU,MAILING_LIST_MULTI,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id 3FB021E552
	for <public-inbox@simark.ca>; Mon,  7 Dec 2020 10:49:24 -0500 (EST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 8FFA7389682C;
	Mon,  7 Dec 2020 15:49:23 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8FFA7389682C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org;
	s=default; t=1607356163;
	bh=ulvUzsyzK23nIeKZFKJTG1jbFAnSiVrKN97jIW7SdWg=;
	h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
	 From;
	b=wWaUv+f6CEsbDuDWbe0Un55fX+ZULkie1v1xXN/wkvgZCMu3H/VLXHIL/WWoVJsDM
	 Dk3rcSKuzKGgB84KPls1tpaR9quKL12Wu8l3sszqECY1LDql5+neolxB1Sg2t58Cnf
	 kxKC6irWxCxtNQIMMLS/L1PX5kgNINwVD6ic1cqU=
Received: from mail.efficios.com (mail.efficios.com [167.114.26.124])
 by sourceware.org (Postfix) with ESMTPS id 459EB389680B
 for <gdb-patches@sourceware.org>; Mon,  7 Dec 2020 15:49:21 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 459EB389680B
Received: from localhost (localhost [127.0.0.1])
 by mail.efficios.com (Postfix) with ESMTP id 118FC300827;
 Mon,  7 Dec 2020 10:49:21 -0500 (EST)
Received: from mail.efficios.com ([127.0.0.1])
 by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id W9Z_Ytw3H_zf; Mon,  7 Dec 2020 10:49:20 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
 by mail.efficios.com (Postfix) with ESMTP id 65995300A04;
 Mon,  7 Dec 2020 10:49:20 -0500 (EST)
DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 65995300A04
X-Virus-Scanned: amavisd-new at efficios.com
Received: from mail.efficios.com ([127.0.0.1])
 by localhost (mail03.efficios.com [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id 6XKP0PcZWQtW; Mon,  7 Dec 2020 10:49:20 -0500 (EST)
Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90])
 by mail.efficios.com (Postfix) with ESMTPSA id 433AC300A83;
 Mon,  7 Dec 2020 10:49:20 -0500 (EST)
Subject: Re: [PATCH 3/4] gdb: split get_discrete_bounds in two
To: Joel Brobecker <brobecker@adacore.com>,
 Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
References: <20201123162120.3778679-1-simon.marchi@efficios.com>
 <20201123162120.3778679-4-simon.marchi@efficios.com>
 <20201206072934.GB327270@adacore.com>
Message-ID: <277023b8-dff6-54df-5346-ed28bfb12673@efficios.com>
Date: Mon, 7 Dec 2020 10:49:19 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.4.3
MIME-Version: 1.0
In-Reply-To: <20201206072934.GB327270@adacore.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
From: Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
Reply-To: Simon Marchi <simon.marchi@efficios.com>
Errors-To: gdb-patches-bounces@sourceware.org
Sender: "Gdb-patches" <gdb-patches-bounces@sourceware.org>


On 2020-12-06 2:29 a.m., Joel Brobecker wrote:
> Hi Simon,
> 
> On Mon, Nov 23, 2020 at 11:21:19AM -0500, Simon Marchi via Gdb-patches wrote:
>> get_discrete_bounds is not flexible for ranges (TYPE_CODE_RANGE), in the
>> sense that it returns true (success) only if both bounds are present and
>> constant values.
>>
>> This is a problem for code that only needs to know the low bound and
>> fails unnecessarily if the high bound is unknown.
>>
>> Split the function in two, get_discrete_low_bound and
>> get_discrete_high_bound, that both return an optional.  Provide a new
>> implementation of get_discrete_bounds based on the two others, so the
>> callers don't have to be changed.
>>
>> gdb/ChangeLog:
>>
>> 	* gdbtypes.c (get_discrete_bounds): Implement with
>> 	get_discrete_low_bound and get_discrete_high_bound.
>> 	(get_discrete_low_bound): New.
>> 	(get_discrete_high_bound): New.
>>
>> Change-Id: I986b5e9c0dd969800e3fb9546af9c827d52e80d0
> 
> Small reminder to remove the Change-Id in the ChangeLog.
> 
> I agree with the problem, and modulo the minor question I had below
> (which might not actually be an issue), the patch looks good to me.
> 
> I do have one minor concern. On the one hand, the code seems to be
> slightly simpler and clearer. On the other hand, I worry a bit about
> keeping the two functions in sync. I'm wondering if it might be worth
> investigating ways to have achieve the same end result without
> duplicating the skeleton of the code. I tried thinking about it
> for a while, and the only option that might be viable that I could
> think of was to allow users of get_discrete_bounds to pass nullptr
> for either LOWP or HIGHP, and then only handle the bounds whose
> parameter is not null. Then we can have the low/high functions
> if we wanted. In trying to visualize how this would look like,
> I found that two scenarios:
> 
>    - I would be bracketing small pieces of code with lots of
>      "if <param> != nullptr" (just for TYPE_CODE_RANGE, I counted
>      6 instances of an addional condition); FTR, this is what it
>      would look like for TYPE_CODE_RANGE:
> 
>          | case TYPE_CODE_RANGE:
>          |   if ((lowp != nullptr && type->bounds ()->low.kind () != PROP_CONST)
>          |       || (highp != nullptr && type->bounds ()->high.kind () != PROP_CONST))
>          |     return false;
> 
>          |   if (lowp != nullptr)
>          |     *lowp = type->bounds ()->low.const_val ();
>          |   if (highp != nullptr)
>          |     *highp = type->bounds ()->high.const_val ();
> 
>          |   if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |     {
>          |       if (lowp != nullptr)
>          |         {
>          |           gdb::optional<LONGEST> low_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
> 
>          |           if (low_pos.has_value ())
>          |             *lowp = *low_pos;
>          |         }
> 
>          |       if (highp != nullptr)
>          |         {
>          |           gdb::optional<LONGEST> high_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);
> 
>          |           if (high_pos.has_value ())
>          |             *highp = *high_pos;
>          |         }
>          |     }
>          |   return true;
> 
> 
>    - I would be splitting the lowp and highp code in two inside
>      each case; in this case, when one looks at how things look like,
>      it takes away half of the reasons why we might want to keep
>      the code together. E.g. for TYPE_CODE_RANGE, it woudl look like this:
> 
>          | case TYPE_CODE_RANGE:
>          |   if (lowp != nullptr)
>          |     {
>          |       if (type->bounds ()->low.kind () != PROP_CONST)
>          |         return false;
> 
>          |       *lowp = type->bounds ()->low.const_val ();
> 
>          |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |         {
>          |           gdb::optional<LONGEST> low_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
> 
>          |           if (low_pos.has_value ())
>          |             *lowp = *low_pos;
>          |         }
>          |     }
>          |   if (lowp != nullptr)
>          |     {
>          |       if (type->bounds ()->high.kind () != PROP_CONST)
>          |         return false;
> 
>          |       *highp = type->bounds ()->high.const_val ();
> 
>          |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |         {
>          |           gdb::optional<LONGEST> high_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);
> 
>          |           if (high_pos.has_value ())
>          |             *highp = *high_pos;
>          |         }
>          |     }
>          |   return true;
> 
> For the record, I thought about perhaps using templates, which would
> work, but I think at the cost of the templated code being more complex.
> 
> So, in the end, unless you have some better ideas, I am find with
> either your solution (splitting), or the solution above which has
> the property of better keeping the code together, but is far from
> perfect because it is at the cost of complexifying the implementation
> a bit (one has to always check that LOWP/HIGHP is not nullptr).

Yeah, I tried several solutions and splitting was the one I preferred, 
as I preferred having two simpler functions than one more complex one.

> 
> See comments below, though.
> 
>> ---
>>   gdb/gdbtypes.c | 187 ++++++++++++++++++++++++++++++++++---------------
>>   1 file changed, 130 insertions(+), 57 deletions(-)
>>
>> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
>> index b47bd28945d..4df23cfe0fe 100644
>> --- a/gdb/gdbtypes.c
>> +++ b/gdb/gdbtypes.c
>> @@ -1036,71 +1036,127 @@ has_static_range (const struct range_bounds *bounds)
>>   	  && bounds->stride.kind () == PROP_CONST);
>>   }
>>   
>> -/* See gdbtypes.h.  */
>> +/* If TYPE's low bound is a known constant, return it, else return nullopt.  */
>>   
>> -bool
>> -get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>> +static gdb::optional<LONGEST>
>> +get_discrete_low_bound (struct type *type)
>>   {
>>     type = check_typedef (type);
>>     switch (type->code ())
>>       {
>>       case TYPE_CODE_RANGE:
>> -      /* This function currently only works for ranges with two defined,
>> -	 constant bounds.  */
>> -      if (type->bounds ()->low.kind () != PROP_CONST
>> -	  || type->bounds ()->high.kind () != PROP_CONST)
>> +      {
>> +	/* This function only works for ranges with a constant low bound.  */
>> +	if (type->bounds ()->low.kind () != PROP_CONST)
>> +	  return {};
>> +
>> +	LONGEST low = type->bounds ()->low.const_val ();
>> +
>> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> +	  {
>> +	    gdb::optional<LONGEST> low_pos
>> +	      = discrete_position (TYPE_TARGET_TYPE (type), low);
>> +
>> +	    if (low_pos.has_value ())
>> +	      low = *low_pos;
>> +	  }
>> +
>> +	return low;
>> +      }
>> +
>> +    case TYPE_CODE_ENUM:
>> +      {
>> +	if (type->num_fields () > 0)
>> +	  {
>> +	    /* The enums may not be sorted by value, so search all
>> +	       entries.  */
>> +	    LONGEST low = TYPE_FIELD_ENUMVAL (type, 0);
>> +
>> +	    for (int i = 0; i < type->num_fields (); i++)
>> +	      {
>> +		if (TYPE_FIELD_ENUMVAL (type, i) < low)
>> +		  low = TYPE_FIELD_ENUMVAL (type, i);
>> +	      }
>> +
>> +	    /* Set unsigned indicator if warranted.  */
>> +	    if (low >= 0)
>> +	      type->set_is_unsigned (true);
>> +
>> +	    return low;
>> +	  }
>> +	else
>> +	  return 0;
>> +      }
>> +
>> +    case TYPE_CODE_BOOL:
>> +      return 0;
>> +
>> +    case TYPE_CODE_INT:
>> +      if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */
>>   	return false;
> 
> Should this be "return {};" here?

Hmm, yes, nice catch!

>>   
>> -      *lowp = type->bounds ()->low.const_val ();
>> -      *highp = type->bounds ()->high.const_val ();
>> +      if (!type->is_unsigned ())
>> +	return -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>>   
>> -      if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> -	{
>> -	  gdb::optional<LONGEST> low_pos
>> -	    = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
>> +      /* fall through */
>> +    case TYPE_CODE_CHAR:
>> +      return 0;
>>   
>> -	  if (low_pos.has_value ())
>> -	    *lowp = *low_pos;
>> +    default:
>> +      return false;
> 
> Same as above.

Yes.

> 
>> +    }
>> +}
>>   
>> -	  gdb::optional<LONGEST> high_pos
>> -	    = discrete_position (TYPE_TARGET_TYPE (type), *highp);
>> +/* If TYPE's high bound is a known constant, return it, else return nullopt.  */
>>   
>> -	  if (high_pos.has_value ())
>> -	    *highp = *high_pos;
>> -	}
>> -      return true;
>> +static gdb::optional<LONGEST>
>> +get_discrete_high_bound (struct type *type)
>> +{
>> +  type = check_typedef (type);
>> +  switch (type->code ())
>> +    {
>> +    case TYPE_CODE_RANGE:
>> +      {
>> +	/* This function only works for ranges with a constant high bound.  */
>> +	if (type->bounds ()->high.kind () != PROP_CONST)
>> +	  return {};
>> +
>> +	LONGEST high = type->bounds ()->high.const_val ();
>> +
>> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> +	  {
>> +	    gdb::optional<LONGEST> high_pos
>> +	      = discrete_position (TYPE_TARGET_TYPE (type), high);
>> +
>> +	    if (high_pos.has_value ())
>> +	      high = *high_pos;
>> +	  }
>> +
>> +	return high;
>> +      }
>>   
>>       case TYPE_CODE_ENUM:
>> -      if (type->num_fields () > 0)
>> -	{
>> -	  /* The enums may not be sorted by value, so search all
>> -	     entries.  */
>> -	  int i;
>> +      {
>> +	if (type->num_fields () > 0)
>> +	  {
>> +	    /* The enums may not be sorted by value, so search all
>> +	       entries.  */
>> +	    LONGEST high = TYPE_FIELD_ENUMVAL (type, 0);
>>   
>> -	  *lowp = *highp = TYPE_FIELD_ENUMVAL (type, 0);
>> -	  for (i = 0; i < type->num_fields (); i++)
>> -	    {
>> -	      if (TYPE_FIELD_ENUMVAL (type, i) < *lowp)
>> -		*lowp = TYPE_FIELD_ENUMVAL (type, i);
>> -	      if (TYPE_FIELD_ENUMVAL (type, i) > *highp)
>> -		*highp = TYPE_FIELD_ENUMVAL (type, i);
>> -	    }
>> +	    for (int i = 0; i < type->num_fields (); i++)
>> +	      {
>> +		if (TYPE_FIELD_ENUMVAL (type, i) > high)
>> +		  high = TYPE_FIELD_ENUMVAL (type, i);
>> +	      }
>>   
>> -	  /* Set unsigned indicator if warranted.  */
>> -	  if (*lowp >= 0)
>> -	    type->set_is_unsigned (true);
>> -	}
>> -      else
>> -	{
>> -	  *lowp = 0;
>> -	  *highp = -1;
>> -	}
>> -      return true;
>> +	    return high;
>> +	  }
>> +	else
>> +	  return -1;
>> +      }
>>   
>>       case TYPE_CODE_BOOL:
>> -      *lowp = 0;
>> -      *highp = 1;
>> -      return true;
>> +      return 1;
>>   
>>       case TYPE_CODE_INT:
>>         if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */
> 
> In that section, there is a "return false" that I'm wondering
> needs to be changed to "return {}" as well...

There were 2 in get_discrete_low_bound and 2 in get_discrete_high_bound, 
which I have now fixed locally, I think that's it.

The TYPE_CODE_ENUM case also looks suspicious.  When the enum type has 
no enumerators, the old code returns 0/-1, which looks like some kind of 
error value, but also returns true, which means "success".  I kept the 
existing behavior for now, but I am wondering if this should be turned 
into an error (returning nullopt) - maybe as a follow-up patch.  I would 
need to investigate all callers that might handle enums.

> 
>> @@ -1108,25 +1164,42 @@ get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>>   
>>         if (!type->is_unsigned ())
>>   	{
>> -	  *lowp = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>> -	  *highp = -*lowp - 1;
>> -	  return true;
>> +	  LONGEST low = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>> +	  return -low - 1;
>>   	}
>> +
>>         /* fall through */
>>       case TYPE_CODE_CHAR:
>> -      *lowp = 0;
>> -      /* This round-about calculation is to avoid shifting by
>> -	 TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
>> -	 if TYPE_LENGTH (type) == sizeof (LONGEST).  */
>> -      *highp = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
>> -      *highp = (*highp - 1) | *highp;
>> -      return true;
>> +      {
>> +	/* This round-about calculation is to avoid shifting by
>> +	   TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
>> +	   if TYPE_LENGTH (type) == sizeof (LONGEST).  */
>> +	LONGEST high = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
>> +	return (high - 1) | high;
>> +      }
>>   
>>       default:
>>         return false;
> 
> Same as above.

Fixed.

> 
>>       }
>>   }
>>   
>> +/* See gdbtypes.h.  */
>> +
>> +bool
>> +get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>> +{
>> +  gdb::optional<LONGEST> low = get_discrete_low_bound (type);
>> +  gdb::optional<LONGEST> high = get_discrete_high_bound (type);
>> +
>> +  if (!low.has_value () || !high.has_value ())
>> +    return false;
> 
> What do you think of computing high only after having verified
> that "low" has a value? It would only be a small optimization,
> but since it doesn't really complexify the code, and only adds
> a couple of extra lines of code, it seems worth having?

Ok, I'll do that.

> 
> One thing also that this patch makes me realize is that it answers
> one question I had in the back of my mind: LOWP and HIGHP are only
> set if both bounds can be computed -- otherwise, they are both
> left untouched. Perhaps we can in a followup patch update the function's
> doc to mention that?

I'll just add it in this patch, since I am changing this function in 
this patch.

Thanks,

Simon