Re: [PATCH 3/4] gdb: split get_discrete_bounds in two

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
To: Joel Brobecker <brobecker@adacore.com>,
	Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH 3/4] gdb: split get_discrete_bounds in two
Date: Mon, 7 Dec 2020 10:49:19 -0500	[thread overview]
Message-ID: <277023b8-dff6-54df-5346-ed28bfb12673@efficios.com> (raw)
In-Reply-To: <20201206072934.GB327270@adacore.com>



On 2020-12-06 2:29 a.m., Joel Brobecker wrote:
> Hi Simon,
> 
> On Mon, Nov 23, 2020 at 11:21:19AM -0500, Simon Marchi via Gdb-patches wrote:
>> get_discrete_bounds is not flexible for ranges (TYPE_CODE_RANGE), in the
>> sense that it returns true (success) only if both bounds are present and
>> constant values.
>>
>> This is a problem for code that only needs to know the low bound and
>> fails unnecessarily if the high bound is unknown.
>>
>> Split the function in two, get_discrete_low_bound and
>> get_discrete_high_bound, that both return an optional.  Provide a new
>> implementation of get_discrete_bounds based on the two others, so the
>> callers don't have to be changed.
>>
>> gdb/ChangeLog:
>>
>> 	* gdbtypes.c (get_discrete_bounds): Implement with
>> 	get_discrete_low_bound and get_discrete_high_bound.
>> 	(get_discrete_low_bound): New.
>> 	(get_discrete_high_bound): New.
>>
>> Change-Id: I986b5e9c0dd969800e3fb9546af9c827d52e80d0
> 
> Small reminder to remove the Change-Id in the ChangeLog.
> 
> I agree with the problem, and modulo the minor question I had below
> (which might not actually be an issue), the patch looks good to me.
> 
> I do have one minor concern. On the one hand, the code seems to be
> slightly simpler and clearer. On the other hand, I worry a bit about
> keeping the two functions in sync. I'm wondering if it might be worth
> investigating ways to have achieve the same end result without
> duplicating the skeleton of the code. I tried thinking about it
> for a while, and the only option that might be viable that I could
> think of was to allow users of get_discrete_bounds to pass nullptr
> for either LOWP or HIGHP, and then only handle the bounds whose
> parameter is not null. Then we can have the low/high functions
> if we wanted. In trying to visualize how this would look like,
> I found that two scenarios:
> 
>    - I would be bracketing small pieces of code with lots of
>      "if <param> != nullptr" (just for TYPE_CODE_RANGE, I counted
>      6 instances of an addional condition); FTR, this is what it
>      would look like for TYPE_CODE_RANGE:
> 
>          | case TYPE_CODE_RANGE:
>          |   if ((lowp != nullptr && type->bounds ()->low.kind () != PROP_CONST)
>          |       || (highp != nullptr && type->bounds ()->high.kind () != PROP_CONST))
>          |     return false;
> 
>          |   if (lowp != nullptr)
>          |     *lowp = type->bounds ()->low.const_val ();
>          |   if (highp != nullptr)
>          |     *highp = type->bounds ()->high.const_val ();
> 
>          |   if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |     {
>          |       if (lowp != nullptr)
>          |         {
>          |           gdb::optional<LONGEST> low_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
> 
>          |           if (low_pos.has_value ())
>          |             *lowp = *low_pos;
>          |         }
> 
>          |       if (highp != nullptr)
>          |         {
>          |           gdb::optional<LONGEST> high_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);
> 
>          |           if (high_pos.has_value ())
>          |             *highp = *high_pos;
>          |         }
>          |     }
>          |   return true;
> 
> 
>    - I would be splitting the lowp and highp code in two inside
>      each case; in this case, when one looks at how things look like,
>      it takes away half of the reasons why we might want to keep
>      the code together. E.g. for TYPE_CODE_RANGE, it woudl look like this:
> 
>          | case TYPE_CODE_RANGE:
>          |   if (lowp != nullptr)
>          |     {
>          |       if (type->bounds ()->low.kind () != PROP_CONST)
>          |         return false;
> 
>          |       *lowp = type->bounds ()->low.const_val ();
> 
>          |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |         {
>          |           gdb::optional<LONGEST> low_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
> 
>          |           if (low_pos.has_value ())
>          |             *lowp = *low_pos;
>          |         }
>          |     }
>          |   if (lowp != nullptr)
>          |     {
>          |       if (type->bounds ()->high.kind () != PROP_CONST)
>          |         return false;
> 
>          |       *highp = type->bounds ()->high.const_val ();
> 
>          |       if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>          |         {
>          |           gdb::optional<LONGEST> high_pos
>          |             = discrete_position (TYPE_TARGET_TYPE (type), *highp);
> 
>          |           if (high_pos.has_value ())
>          |             *highp = *high_pos;
>          |         }
>          |     }
>          |   return true;
> 
> For the record, I thought about perhaps using templates, which would
> work, but I think at the cost of the templated code being more complex.
> 
> So, in the end, unless you have some better ideas, I am find with
> either your solution (splitting), or the solution above which has
> the property of better keeping the code together, but is far from
> perfect because it is at the cost of complexifying the implementation
> a bit (one has to always check that LOWP/HIGHP is not nullptr).

Yeah, I tried several solutions and splitting was the one I preferred, 
as I preferred having two simpler functions than one more complex one.

> 
> See comments below, though.
> 
>> ---
>>   gdb/gdbtypes.c | 187 ++++++++++++++++++++++++++++++++++---------------
>>   1 file changed, 130 insertions(+), 57 deletions(-)
>>
>> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
>> index b47bd28945d..4df23cfe0fe 100644
>> --- a/gdb/gdbtypes.c
>> +++ b/gdb/gdbtypes.c
>> @@ -1036,71 +1036,127 @@ has_static_range (const struct range_bounds *bounds)
>>   	  && bounds->stride.kind () == PROP_CONST);
>>   }
>>   
>> -/* See gdbtypes.h.  */
>> +/* If TYPE's low bound is a known constant, return it, else return nullopt.  */
>>   
>> -bool
>> -get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>> +static gdb::optional<LONGEST>
>> +get_discrete_low_bound (struct type *type)
>>   {
>>     type = check_typedef (type);
>>     switch (type->code ())
>>       {
>>       case TYPE_CODE_RANGE:
>> -      /* This function currently only works for ranges with two defined,
>> -	 constant bounds.  */
>> -      if (type->bounds ()->low.kind () != PROP_CONST
>> -	  || type->bounds ()->high.kind () != PROP_CONST)
>> +      {
>> +	/* This function only works for ranges with a constant low bound.  */
>> +	if (type->bounds ()->low.kind () != PROP_CONST)
>> +	  return {};
>> +
>> +	LONGEST low = type->bounds ()->low.const_val ();
>> +
>> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> +	  {
>> +	    gdb::optional<LONGEST> low_pos
>> +	      = discrete_position (TYPE_TARGET_TYPE (type), low);
>> +
>> +	    if (low_pos.has_value ())
>> +	      low = *low_pos;
>> +	  }
>> +
>> +	return low;
>> +      }
>> +
>> +    case TYPE_CODE_ENUM:
>> +      {
>> +	if (type->num_fields () > 0)
>> +	  {
>> +	    /* The enums may not be sorted by value, so search all
>> +	       entries.  */
>> +	    LONGEST low = TYPE_FIELD_ENUMVAL (type, 0);
>> +
>> +	    for (int i = 0; i < type->num_fields (); i++)
>> +	      {
>> +		if (TYPE_FIELD_ENUMVAL (type, i) < low)
>> +		  low = TYPE_FIELD_ENUMVAL (type, i);
>> +	      }
>> +
>> +	    /* Set unsigned indicator if warranted.  */
>> +	    if (low >= 0)
>> +	      type->set_is_unsigned (true);
>> +
>> +	    return low;
>> +	  }
>> +	else
>> +	  return 0;
>> +      }
>> +
>> +    case TYPE_CODE_BOOL:
>> +      return 0;
>> +
>> +    case TYPE_CODE_INT:
>> +      if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */
>>   	return false;
> 
> Should this be "return {};" here?

Hmm, yes, nice catch!

>>   
>> -      *lowp = type->bounds ()->low.const_val ();
>> -      *highp = type->bounds ()->high.const_val ();
>> +      if (!type->is_unsigned ())
>> +	return -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>>   
>> -      if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> -	{
>> -	  gdb::optional<LONGEST> low_pos
>> -	    = discrete_position (TYPE_TARGET_TYPE (type), *lowp);
>> +      /* fall through */
>> +    case TYPE_CODE_CHAR:
>> +      return 0;
>>   
>> -	  if (low_pos.has_value ())
>> -	    *lowp = *low_pos;
>> +    default:
>> +      return false;
> 
> Same as above.

Yes.

> 
>> +    }
>> +}
>>   
>> -	  gdb::optional<LONGEST> high_pos
>> -	    = discrete_position (TYPE_TARGET_TYPE (type), *highp);
>> +/* If TYPE's high bound is a known constant, return it, else return nullopt.  */
>>   
>> -	  if (high_pos.has_value ())
>> -	    *highp = *high_pos;
>> -	}
>> -      return true;
>> +static gdb::optional<LONGEST>
>> +get_discrete_high_bound (struct type *type)
>> +{
>> +  type = check_typedef (type);
>> +  switch (type->code ())
>> +    {
>> +    case TYPE_CODE_RANGE:
>> +      {
>> +	/* This function only works for ranges with a constant high bound.  */
>> +	if (type->bounds ()->high.kind () != PROP_CONST)
>> +	  return {};
>> +
>> +	LONGEST high = type->bounds ()->high.const_val ();
>> +
>> +	if (TYPE_TARGET_TYPE (type)->code () == TYPE_CODE_ENUM)
>> +	  {
>> +	    gdb::optional<LONGEST> high_pos
>> +	      = discrete_position (TYPE_TARGET_TYPE (type), high);
>> +
>> +	    if (high_pos.has_value ())
>> +	      high = *high_pos;
>> +	  }
>> +
>> +	return high;
>> +      }
>>   
>>       case TYPE_CODE_ENUM:
>> -      if (type->num_fields () > 0)
>> -	{
>> -	  /* The enums may not be sorted by value, so search all
>> -	     entries.  */
>> -	  int i;
>> +      {
>> +	if (type->num_fields () > 0)
>> +	  {
>> +	    /* The enums may not be sorted by value, so search all
>> +	       entries.  */
>> +	    LONGEST high = TYPE_FIELD_ENUMVAL (type, 0);
>>   
>> -	  *lowp = *highp = TYPE_FIELD_ENUMVAL (type, 0);
>> -	  for (i = 0; i < type->num_fields (); i++)
>> -	    {
>> -	      if (TYPE_FIELD_ENUMVAL (type, i) < *lowp)
>> -		*lowp = TYPE_FIELD_ENUMVAL (type, i);
>> -	      if (TYPE_FIELD_ENUMVAL (type, i) > *highp)
>> -		*highp = TYPE_FIELD_ENUMVAL (type, i);
>> -	    }
>> +	    for (int i = 0; i < type->num_fields (); i++)
>> +	      {
>> +		if (TYPE_FIELD_ENUMVAL (type, i) > high)
>> +		  high = TYPE_FIELD_ENUMVAL (type, i);
>> +	      }
>>   
>> -	  /* Set unsigned indicator if warranted.  */
>> -	  if (*lowp >= 0)
>> -	    type->set_is_unsigned (true);
>> -	}
>> -      else
>> -	{
>> -	  *lowp = 0;
>> -	  *highp = -1;
>> -	}
>> -      return true;
>> +	    return high;
>> +	  }
>> +	else
>> +	  return -1;
>> +      }
>>   
>>       case TYPE_CODE_BOOL:
>> -      *lowp = 0;
>> -      *highp = 1;
>> -      return true;
>> +      return 1;
>>   
>>       case TYPE_CODE_INT:
>>         if (TYPE_LENGTH (type) > sizeof (LONGEST))	/* Too big */
> 
> In that section, there is a "return false" that I'm wondering
> needs to be changed to "return {}" as well...

There were 2 in get_discrete_low_bound and 2 in get_discrete_high_bound, 
which I have now fixed locally, I think that's it.

The TYPE_CODE_ENUM case also looks suspicious.  When the enum type has 
no enumerators, the old code returns 0/-1, which looks like some kind of 
error value, but also returns true, which means "success".  I kept the 
existing behavior for now, but I am wondering if this should be turned 
into an error (returning nullopt) - maybe as a follow-up patch.  I would 
need to investigate all callers that might handle enums.

> 
>> @@ -1108,25 +1164,42 @@ get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>>   
>>         if (!type->is_unsigned ())
>>   	{
>> -	  *lowp = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>> -	  *highp = -*lowp - 1;
>> -	  return true;
>> +	  LONGEST low = -(1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1));
>> +	  return -low - 1;
>>   	}
>> +
>>         /* fall through */
>>       case TYPE_CODE_CHAR:
>> -      *lowp = 0;
>> -      /* This round-about calculation is to avoid shifting by
>> -	 TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
>> -	 if TYPE_LENGTH (type) == sizeof (LONGEST).  */
>> -      *highp = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
>> -      *highp = (*highp - 1) | *highp;
>> -      return true;
>> +      {
>> +	/* This round-about calculation is to avoid shifting by
>> +	   TYPE_LENGTH (type) * TARGET_CHAR_BIT, which will not work
>> +	   if TYPE_LENGTH (type) == sizeof (LONGEST).  */
>> +	LONGEST high = 1 << (TYPE_LENGTH (type) * TARGET_CHAR_BIT - 1);
>> +	return (high - 1) | high;
>> +      }
>>   
>>       default:
>>         return false;
> 
> Same as above.

Fixed.

> 
>>       }
>>   }
>>   
>> +/* See gdbtypes.h.  */
>> +
>> +bool
>> +get_discrete_bounds (struct type *type, LONGEST *lowp, LONGEST *highp)
>> +{
>> +  gdb::optional<LONGEST> low = get_discrete_low_bound (type);
>> +  gdb::optional<LONGEST> high = get_discrete_high_bound (type);
>> +
>> +  if (!low.has_value () || !high.has_value ())
>> +    return false;
> 
> What do you think of computing high only after having verified
> that "low" has a value? It would only be a small optimization,
> but since it doesn't really complexify the code, and only adds
> a couple of extra lines of code, it seems worth having?

Ok, I'll do that.

> 
> One thing also that this patch makes me realize is that it answers
> one question I had in the back of my mind: LOWP and HIGHP are only
> set if both bounds can be computed -- otherwise, they are both
> left untouched. Perhaps we can in a followup patch update the function's
> doc to mention that?

I'll just add it in this patch, since I am changing this function in 
this patch.

Thanks,

Simon

next prev parent reply	other threads:[~2020-12-07 15:49 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-23 16:21 [PATCH 0/4] Fix bug in value_subscript when range's high bound is not known Simon Marchi via Gdb-patches
2020-11-23 16:21 ` [PATCH 1/4] gdb: make discrete_position return optional Simon Marchi via Gdb-patches
2020-12-06  5:25   ` Joel Brobecker
2020-12-06  5:38   ` Joel Brobecker
2020-12-07 14:58     ` Simon Marchi via Gdb-patches
2020-12-08  3:06       ` Joel Brobecker
2020-12-08 11:41         ` Maciej W. Rozycki
2020-12-09 19:29           ` Simon Marchi via Gdb-patches
2020-12-09 19:53             ` Maciej W. Rozycki
2020-11-23 16:21 ` [PATCH 2/4] gdb: make get_discrete_bounds return bool Simon Marchi via Gdb-patches
2020-12-06  6:03   ` Joel Brobecker
2020-12-07 15:19     ` Simon Marchi via Gdb-patches
2020-11-23 16:21 ` [PATCH 3/4] gdb: split get_discrete_bounds in two Simon Marchi via Gdb-patches
2020-12-06  7:29   ` Joel Brobecker
2020-12-07 15:49     ` Simon Marchi via Gdb-patches [this message]
2020-11-23 16:21 ` [PATCH 4/4] gdb: fix value_subscript when array upper bound is not known Simon Marchi via Gdb-patches
2020-12-06  7:54   ` Joel Brobecker
2020-12-07 16:06     ` Simon Marchi via Gdb-patches
2020-12-08  3:14       ` Joel Brobecker
2020-12-09 18:50         ` Simon Marchi via Gdb-patches
2020-12-09 19:57     ` Simon Marchi via Gdb-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=277023b8-dff6-54df-5346-ed28bfb12673@efficios.com \
    --to=gdb-patches@sourceware.org \
    --cc=brobecker@adacore.com \
    --cc=simon.marchi@efficios.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox