Re: [PATCH] gdb: Don't hard code 0 as end marker in GDB's line table

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Simon Marchi <simark@simark.ca>
To: Andrew Burgess <andrew.burgess@embecosm.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb: Don't hard code 0 as end marker in GDB's line table
Date: Tue, 21 Jul 2020 13:36:42 -0400	[thread overview]
Message-ID: <0ff85fdb-821d-e704-7b37-bfd95361b9a4@simark.ca> (raw)
In-Reply-To: <20200721130600.GC853475@embecosm.com>

On 2020-07-21 9:06 a.m., Andrew Burgess wrote:
> This was never meant to be a "provide full support for line == 0"
> patch.  Rather, I just happened to have this patch mostly sitting
> around already and when I saw it discussed in bug 26243 I figured I
> should get it merged.
> 
> However, your feedback has helped me understand the mistake I made
> here.  My incorrect thinking went:
> 
>   + Currently end-marker == 0 which conflicts with DWARF line == 0,
>   therefore DWARF line == 0 debug will not work.
> 
>   + If we want to support DWARF line == 0 then we need to stop using
>   end-marker == 0, so
> 
>   + Make end-marker == -1, the DWARF line == 0 will continue to not
>   work just as it didn't work before.
> 
>   + In the future we can now make DWARF line == 0 work without the
>   end-marker being in the way.
> 
> The problem with this logic is that yes, things are broken now, but
> actually, they are broken in a way where things actually mostly kind
> of work, while, after my change, they are still broken, but now they
> mostly don't work.
>
> So, right now we see DWARF line == 0, and create a GDB end-marker,
> this is wrong, but, for example, end-markers don't have source code
> associated, which, it turns out, is exactly what we want for DWARF
> line == 0.  If we remove the conflict then DWARF line == 0 is no
> longer special, and we start trying to access line 0 from the source
> files.

I agree with all of the above.

> So, when you say:
> 
>> So, I think we just need to be careful and look at the possible behavior changes
>> of all these spots, not just blindly change `line == 0` for `line == end_marker`.
> 
> I both agree, and disagree with this statement.  I agree that you are
> correct, we (I) need to think more about where special handling for
> line == 0 needs to be added to GDB.
> 
> But, I disagree that there were any mistakes made in the original
> patch, GDB doesn't currently support line == 0, so any place we are
> currently checking line == 0 (I claim) we do mean 'line ==
> end_marker'.

You are right that all the `line == 0` do become `line == end_marker`.  I
meant that in some places, we also need to handle the new meaning of
`line == 0`.

> My proposal for now then is that we could (should?) make the change in
> the patch below.  Unlike the previous patch, the _value_ for
> end_marker is now left as 0, rather than changing it to -1.  Then
> means that the "it just sort-of to works" behaviour for DWARF line == 0,
> will continue to "just sort-of work" as before, however, I think GDB
> is now saying what it means.
> 
> I already have a patch that adds in special handling for line == 0
> that fixes the disassembler.  The problem is (as you pointed out)
> there are likely many places in GDB where line == 0 will cause
> problems, if 0 != end_marker.  This close to a release branch I don't
> think we should change the end_marker value to anything but 0.
> 
> After the release we can change end_marker to -1, fix the
> disassembler, and then try to find other cases where line == 0 might
> cause issues.

Agreed.

> 
> So, how would you feel to merging the patch below?

I'm ambivalent.  I don't really see the point of merging this one its
own right now, versus when we do the complete work after branching.

In the mean time, I think something like what Tom de Vries proposed [1],
that filters out the line 0 entries, would be good for GDB 10.  It brings
us back to GDB 8.3 behavior, w.r.t. these line 0 entries, while keeping
the "is_stmt == false" entries that you added in 8c95582da858a.

[1] https://sourceware.org/pipermail/gdb-patches/2020-July/170506.html

> Thanks,
> Andrew
> 
> ---
> 
> commit 5da1cb2497b06371aa4f1a3091d3888db35c3348
> Author: Andrew Burgess <andrew.burgess@embecosm.com>
> Date:   Mon Jul 20 13:10:00 2020 +0100
> 
>     gdb: Don't hard code 0 as end marker in GDB's line table
>     
>     Currently GDB hard-codes the use of 0 as the end of sequence marker in
>     its line tables.  After this commit GDB now uses a named
>     constant (linetable_entry::end_marker) as the end of sequence marker.
>     
>     The motivation for this change was discussion on PR gdb/26243.  This
>     bug discusses the possibility of supporting DWARF's use of line number
>     0 to indicate an address in the program with no associated source
>     code.
>     
>     If we want to have line number 0 and end-of-sequence marker within
>     GDB's line tables then it seems clear that the end-of-sequence marker
>     can't take the value 0.  This change is a first step towards changing
>     the value of the end-of-sequence marker.
>     
>     Initially I did change the value of the marker in this commit from 0
>     to -1, however, it was pointed out during review that this causes
>     problems in GDB when presented with some DWARF that includes use of
>     line number 0.
>     
>     The problem is that currently and DWARF that uses line number 0 will
>     conflict with the end-of-sequence marker, those addresses marked as
>     line number 0 will (to GDB) appear as end-of-sequence markers.  It
>     just so happens that in many cases treating an address (with line
>     number 0) as an end-of-sequence marker is good enough, for example,
>     end of sequence markers don't have associated source code.
>     
>     The minute we change the end-of-sequence marker value to -1 the line
>     number 0 stops being special and we start to run into problems, for
>     example GDB will try to access line 0 from source files.
>     
>     It is my proposal that we merge this commit when introduces the
>     end_marker constant, and makes use of this throughout GDB, but leave
>     the value of this constant as 0 so that GDB will continue to treat any
>     DWARF with line number 0 as end-of-sequence markers.
>     
>     This commit can then be followed up with a series of work to extend
>     GDB to correctly handle line number 0 in its own right.  Once we
>     believe that all, or most cases needed to support line number 0 are
>     being handled then the value of end_marker can be changed from 0 to -1
>     in a future commit.
>     
>     For testing this commit I did change the value of end_marker from 0 to
>     -1 and ran the regression testsuite which passed with no regressions.
>     This indicates (to me) that I have correctly updated all (or the
>     majority of) of the hard coded checks for 0 and replaced these with a
>     check for end_marker instead.

I don't understand this last part.  If you change the end_marker value to -1,
then we expect things to break with programs built with clang, like the
disassembly example I gave.  If you ran the testsuite with gcc, which presumably
never assigns an address to line 0, then you won't see a difference.  And we
probably don't have any DWARF tests right now that specifically check situations
with an address mapped to line 0.

>     This commit does mention PR gdb/26243 in its ChangeLog, but does not
>     fix this issue.  I placed this bug reference just so people can see
>     the connection between this change and that bug in the future.
>     
>     gdb/ChangeLog:
>     
>             PR gdb/26243
>             * buildsym.c (buildsym_compunit::record_line): Add an assert for
>             the incoming line number.  Update comments to not mention 0
>             specifically.  Update to check for linetable_entry::end_marker
>             rather than 0.
>             (buildsym_compunit::end_symtab_with_blockvector): Check for
>             linetable_entry::end_marker, not 0.
>             * disasm.c (line_is_less_than): Likewise.
>             (do_mixed_source_and_assembly_deprecated): Likewise.
>             (do_mixed_source_and_assembly): Likewise.
>             * dwarf2/read.c (dwarf_finish_line): Pass
>             linetable_entry::end_marker not 0.
>             * mdebugread.c (add_line): Set is_stmt field.
>             * python/py-linetable.c (ltpy_get_all_source_lines): Check for
>             linetable_entry::end_marker not 0, update comments.
>             (ltpy_iternext): Likewise.
>             * record-btrace.c (btrace_find_line_range): Likewise.
>             * symmisc.c (maintenance_print_one_line_table): Likewise.
>             * symtab.c (find_pc_sect_line): Likewise.
>             (find_line_symtab): Likewise.
>             (skip_prologue_using_sal): Likewise.
>             * symtab.h (linetable_entry::end_marker): New const static member
>             variable.  Add a static assert for this field.
>             * xcoffread.c (arrange_linetable): Check for
>             linetable_entry::end_marker not 0.
> 
> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> index bd0ca491401..e6d4dc117c1 100644
> --- a/gdb/buildsym.c
> +++ b/gdb/buildsym.c
> @@ -671,6 +671,10 @@ buildsym_compunit::record_line (struct subfile *subfile, int line,
>  {
>    struct linetable_entry *e;
>  
> +  /* Is this an asset?  Or is this processing user input and so should we
> +     be handling, or throwing an error for invalid data?  */
> +  gdb_assert (line == linetable_entry::end_marker || line >= 0);

I think it's fine to assert here.  Any invalid input should be rejected earlier,
in the debug info reader.

Simon

     prev parent reply	other threads:[~2020-07-21 17:36 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-20 12:55 Andrew Burgess
2020-07-20 16:01 ` Simon Marchi
2020-07-20 16:57   ` Andrew Burgess
2020-07-21 13:06   ` Andrew Burgess
2020-07-21 17:36     ` Simon Marchi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0ff85fdb-821d-e704-7b37-bfd95361b9a4@simark.ca \
    --to=simark@simark.ca \
    --cc=andrew.burgess@embecosm.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox