From: Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
To: Joel Brobecker <brobecker@adacore.com>,
Simon Marchi via Gdb-patches <gdb-patches@sourceware.org>
Cc: Simon Marchi <simon.marchi@efficios.com>
Subject: Re: [PATCH 4/4] gdb: fix value_subscript when array upper bound is not known
Date: Mon, 7 Dec 2020 11:06:39 -0500 [thread overview]
Message-ID: <8a48f527-bd8f-5aa9-f232-7523359515fa@polymtl.ca> (raw)
In-Reply-To: <20201206075437.GC327270@adacore.com>
On 2020-12-06 2:54 a.m., Joel Brobecker wrote:
> On Mon, Nov 23, 2020 at 11:21:20AM -0500, Simon Marchi via Gdb-patches wrote:
>> Since commit 7c6f27129631 ("gdb: make get_discrete_bounds check for
>> non-constant range bounds"), subscripting flexible array member fails:
>>
>> struct no_size
>> {
>> int n;
>> int items[];
>> };
>>
>> (gdb) p *ns
>> $1 = {n = 3, items = 0x5555555592a4}
>> (gdb) p ns->items[0]
>> Cannot access memory at address 0xfffe555b733a0164
>> (gdb) p *((int *) 0x5555555592a4)
>> $2 = 101 <--- we would expect that
>> (gdb) p &ns->items[0]
>> $3 = (int *) 0xfffe5559ee829a24 <--- wrong address
>>
>> Since the flexible array member (items) has an unspecified size, the array type
>> created for it in the DWARF doesn't have dimensions (this is with gcc 9.3.0,
>> Ubuntu 20.04):
>>
>> 0x000000a4: DW_TAG_array_type
>> DW_AT_type [DW_FORM_ref4] (0x00000038 "int")
>> DW_AT_sibling [DW_FORM_ref4] (0x000000b3)
>>
>> 0x000000ad: DW_TAG_subrange_type
>> DW_AT_type [DW_FORM_ref4] (0x00000031 "long unsigned int")
>>
>> This causes GDB to create a range type (TYPE_CODE_RANGE) with a defined
>> constant low bound (dynamic_prop with kind PROP_CONST) and an undefined
>> high bound (dynamic_prop with kind PROP_UNDEFINED).
>>
>> value_subscript gets both bounds of that range using
>> get_discrete_bounds. Before commit 7c6f27129631, get_discrete_bounds
>> didn't check the kind of the dynamic_props and would just blindly read
>> them as if they were PROP_CONST. It would return 0 for the high bound,
>> because we zero-initialize the range_bounds structure. And it didn't
>> really matter in this case, because the returned high bound wasn't used
>> in the end.
>>
>> Commit 7c6f27129631 changed get_discrete_bounds to return a failure if
>> either the low or high bound is not a constant, to make sure we don't
>> read a dynamic prop that isn't a PROP_CONST as a PROP_CONST. This
>> change made get_discrete_bounds start to return a failure for that
>> range, and as a result would not set *lowp and *highp. And since
>> value_subscript doesn't check get_discrete_bounds' return value, it just
>> carries on an uses an uninitialized value for the low bound. If
> ^^
>
> "an" -> "and"
>
>> value_subscript did check the return value of get_discrete_bounds, we
>> would get an error message instead of a bogus value. But it would still
>> be a bug, as we wouldn't be able to print the flexible array member's
>> elements.
>>
>> Looking at value_subscript, we see that the low bound is always needed,
>> but the high bound is only needed if !c_style. So, change
>> value_subscript to use get_discrete_low_bound and
>> get_discrete_high_bound separately. This fixes the case described
>> above, where the low bound is known but the high bound isn't (and is not
>> needed). This restores the original behavior without accessing a
>> dynamic_prop in a wrong way.
>>
>> A test is added. In addition to the case described above, a case with
>> an array member of size 0 is added, which is a GNU C extension that
>> existed before flexible array members were introduced. That case
>> currently fails when compiled with gcc <= 8. gcc <= 8 produces DWARF
>> similar to the one shown above, while gcc 9 adds a DW_AT_count of 0 in
>> there, which makes the high bound known. A case where an array member
>> of size 0 is the only member of the struct is also added, as that was
>> how PR 28675 was originally reported, and it's an interesting corner
>> case that I think could trigger other funny bugs.
>>
>> Question about the implementation: in value_subscript, I made it such
>> that if the low or high bound is unknown, we fall back to zero. That
>> effectively makes it the same as it was before 7c6f27129631. But should
>> we instead error() out?
>
> Good question!
>
> In thinking about it, I think there is a bigger question. But before
> asking the bigger question, why would we treat an unknown high bound
> differently from an unknown high bound?
Do you mean "unknown low bound differently from an unknown high bound"?
I don't really have a solid
> answer to that question. I almost want to say that we want to treat
> both the same, which argues in favor of the approach you took in
> this patch, and absent a good answer to that question, because that's
> what we were already doing in the past, it's another justification
> for the status quo.
>
> Now, the bigger question is: Why are we even getting to this point?
> Shouldn't the bounds simply be resolved before we do the subscripting?
If the array is c-style, there's no concept of high bound, so I don't
think there's a point resolving the high bound then.
> Thinking about this questions brings me back to the difficulties
> that AdaCore had with dealing with this kind of situation where
> resolving the array bounds can lead to values whose size if ginormous,
> and thus the (impossible) need to be careful to never actually fetch
> the value -- something I believe we side-stepped by replacing the value
> by a pointer to the array itself. It was tricky, but the question of
> whether we should only subscript arrays remains open, especially when
> the actual low bound after bound resolution ends up being nonzero,
> because it then affects which element of the array you display.
I don't have much experience with languages others than C/C++ that have
smarter array ranges, so I don't have any meaningful opinion on this.
> Another smaller question is about what we should be doing if we detect
> a situation when the user tries to subscript an array at an index
> which is outside the array's range. I believe we should let the user
> try it, but should we warn, for instance?
Again, I've never really been a user of such a language. Intuitively,
I'd like the debugger to work as close to the source language as
possible, so I'd like it to validate the bounds in my expression. But
it might be that people who use these languages sometimes need to
side-step the language's protections.
>
> While those are interesting discussions, given the impact of this
> regressions, and the fact that this patch is restoring the status quo,
> I'm in favor of discussing the above separately from this patch.
Ok, thanks.
>
>> gdb/ChangeLog:
>>
>> PR 26875, PR 26901
>> * gdbtypes.c (get_discrete_low_bound): Make non-static.
>> (get_discrete_high_bound): Make non-static.
>> * gdbtypes.h (get_discrete_low_bound): New declaration.
>> (get_discrete_high_bound): New declaration.
>> * valarith.c (value_subscript): Only fetch high bound if
>> necessary.
>>
>> gdb/testsuite/ChangeLog:
>>
>> PR 26875, PR 26901
>> * gdb.base/flexible-array-member.c: New test.
>> * gdb.base/flexible-array-member.exp: New test.
>>
>> Change-Id: I832056f80e6c56f621f398b4780d55a3a1e299d7
>
> Small reminder to remove this Change-Id.
As mentioned in my reply on patch 1, I'd prefer to keep those if it
doesn't bother you too much, as they are useful for bookkeeping on my end.
>
> Other than that, the change looks OK to me.
>
> I believe, between the impact of the fix, the fact that it is
> a recent regression, and the relatively straightforward nature
> of your patch series, I believe it will be safe to backport to
> the gdb-10-branch.
Thanks for the review!
Simon
next prev parent reply other threads:[~2020-12-07 16:06 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-23 16:21 [PATCH 0/4] Fix bug in value_subscript when range's high " Simon Marchi via Gdb-patches
2020-11-23 16:21 ` [PATCH 1/4] gdb: make discrete_position return optional Simon Marchi via Gdb-patches
2020-12-06 5:25 ` Joel Brobecker
2020-12-06 5:38 ` Joel Brobecker
2020-12-07 14:58 ` Simon Marchi via Gdb-patches
2020-12-08 3:06 ` Joel Brobecker
2020-12-08 11:41 ` Maciej W. Rozycki
2020-12-09 19:29 ` Simon Marchi via Gdb-patches
2020-12-09 19:53 ` Maciej W. Rozycki
2020-11-23 16:21 ` [PATCH 2/4] gdb: make get_discrete_bounds return bool Simon Marchi via Gdb-patches
2020-12-06 6:03 ` Joel Brobecker
2020-12-07 15:19 ` Simon Marchi via Gdb-patches
2020-11-23 16:21 ` [PATCH 3/4] gdb: split get_discrete_bounds in two Simon Marchi via Gdb-patches
2020-12-06 7:29 ` Joel Brobecker
2020-12-07 15:49 ` Simon Marchi via Gdb-patches
2020-11-23 16:21 ` [PATCH 4/4] gdb: fix value_subscript when array upper bound is not known Simon Marchi via Gdb-patches
2020-12-06 7:54 ` Joel Brobecker
2020-12-07 16:06 ` Simon Marchi via Gdb-patches [this message]
2020-12-08 3:14 ` Joel Brobecker
2020-12-09 18:50 ` Simon Marchi via Gdb-patches
2020-12-09 19:57 ` Simon Marchi via Gdb-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8a48f527-bd8f-5aa9-f232-7523359515fa@polymtl.ca \
--to=gdb-patches@sourceware.org \
--cc=brobecker@adacore.com \
--cc=simon.marchi@efficios.com \
--cc=simon.marchi@polymtl.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox