From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 3EC913857C78 for ; Tue, 21 Jul 2020 17:36:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3EC913857C78 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [172.16.0.95] (192-222-181-218.qc.cable.ebox.net [192.222.181.218]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 06CC61E794; Tue, 21 Jul 2020 13:36:44 -0400 (EDT) Subject: Re: [PATCH] gdb: Don't hard code 0 as end marker in GDB's line table To: Andrew Burgess Cc: gdb-patches@sourceware.org References: <20200720125505.1506140-1-andrew.burgess@embecosm.com> <81619b06-4380-7543-1c0c-aa504bb0a0ea@simark.ca> <20200721130600.GC853475@embecosm.com> From: Simon Marchi Message-ID: <0ff85fdb-821d-e704-7b37-bfd95361b9a4@simark.ca> Date: Tue, 21 Jul 2020 13:36:42 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20200721130600.GC853475@embecosm.com> Content-Type: text/plain; charset=utf-8 Content-Language: tl Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Jul 2020 17:36:48 -0000 On 2020-07-21 9:06 a.m., Andrew Burgess wrote: > This was never meant to be a "provide full support for line == 0" > patch. Rather, I just happened to have this patch mostly sitting > around already and when I saw it discussed in bug 26243 I figured I > should get it merged. > > However, your feedback has helped me understand the mistake I made > here. My incorrect thinking went: > > + Currently end-marker == 0 which conflicts with DWARF line == 0, > therefore DWARF line == 0 debug will not work. > > + If we want to support DWARF line == 0 then we need to stop using > end-marker == 0, so > > + Make end-marker == -1, the DWARF line == 0 will continue to not > work just as it didn't work before. > > + In the future we can now make DWARF line == 0 work without the > end-marker being in the way. > > The problem with this logic is that yes, things are broken now, but > actually, they are broken in a way where things actually mostly kind > of work, while, after my change, they are still broken, but now they > mostly don't work. > > So, right now we see DWARF line == 0, and create a GDB end-marker, > this is wrong, but, for example, end-markers don't have source code > associated, which, it turns out, is exactly what we want for DWARF > line == 0. If we remove the conflict then DWARF line == 0 is no > longer special, and we start trying to access line 0 from the source > files. I agree with all of the above. > So, when you say: > >> So, I think we just need to be careful and look at the possible behavior changes >> of all these spots, not just blindly change `line == 0` for `line == end_marker`. > > I both agree, and disagree with this statement. I agree that you are > correct, we (I) need to think more about where special handling for > line == 0 needs to be added to GDB. > > But, I disagree that there were any mistakes made in the original > patch, GDB doesn't currently support line == 0, so any place we are > currently checking line == 0 (I claim) we do mean 'line == > end_marker'. You are right that all the `line == 0` do become `line == end_marker`. I meant that in some places, we also need to handle the new meaning of `line == 0`. > My proposal for now then is that we could (should?) make the change in > the patch below. Unlike the previous patch, the _value_ for > end_marker is now left as 0, rather than changing it to -1. Then > means that the "it just sort-of to works" behaviour for DWARF line == 0, > will continue to "just sort-of work" as before, however, I think GDB > is now saying what it means. > > I already have a patch that adds in special handling for line == 0 > that fixes the disassembler. The problem is (as you pointed out) > there are likely many places in GDB where line == 0 will cause > problems, if 0 != end_marker. This close to a release branch I don't > think we should change the end_marker value to anything but 0. > > After the release we can change end_marker to -1, fix the > disassembler, and then try to find other cases where line == 0 might > cause issues. Agreed. > > So, how would you feel to merging the patch below? I'm ambivalent. I don't really see the point of merging this one its own right now, versus when we do the complete work after branching. In the mean time, I think something like what Tom de Vries proposed [1], that filters out the line 0 entries, would be good for GDB 10. It brings us back to GDB 8.3 behavior, w.r.t. these line 0 entries, while keeping the "is_stmt == false" entries that you added in 8c95582da858a. [1] https://sourceware.org/pipermail/gdb-patches/2020-July/170506.html > Thanks, > Andrew > > --- > > commit 5da1cb2497b06371aa4f1a3091d3888db35c3348 > Author: Andrew Burgess > Date: Mon Jul 20 13:10:00 2020 +0100 > > gdb: Don't hard code 0 as end marker in GDB's line table > > Currently GDB hard-codes the use of 0 as the end of sequence marker in > its line tables. After this commit GDB now uses a named > constant (linetable_entry::end_marker) as the end of sequence marker. > > The motivation for this change was discussion on PR gdb/26243. This > bug discusses the possibility of supporting DWARF's use of line number > 0 to indicate an address in the program with no associated source > code. > > If we want to have line number 0 and end-of-sequence marker within > GDB's line tables then it seems clear that the end-of-sequence marker > can't take the value 0. This change is a first step towards changing > the value of the end-of-sequence marker. > > Initially I did change the value of the marker in this commit from 0 > to -1, however, it was pointed out during review that this causes > problems in GDB when presented with some DWARF that includes use of > line number 0. > > The problem is that currently and DWARF that uses line number 0 will > conflict with the end-of-sequence marker, those addresses marked as > line number 0 will (to GDB) appear as end-of-sequence markers. It > just so happens that in many cases treating an address (with line > number 0) as an end-of-sequence marker is good enough, for example, > end of sequence markers don't have associated source code. > > The minute we change the end-of-sequence marker value to -1 the line > number 0 stops being special and we start to run into problems, for > example GDB will try to access line 0 from source files. > > It is my proposal that we merge this commit when introduces the > end_marker constant, and makes use of this throughout GDB, but leave > the value of this constant as 0 so that GDB will continue to treat any > DWARF with line number 0 as end-of-sequence markers. > > This commit can then be followed up with a series of work to extend > GDB to correctly handle line number 0 in its own right. Once we > believe that all, or most cases needed to support line number 0 are > being handled then the value of end_marker can be changed from 0 to -1 > in a future commit. > > For testing this commit I did change the value of end_marker from 0 to > -1 and ran the regression testsuite which passed with no regressions. > This indicates (to me) that I have correctly updated all (or the > majority of) of the hard coded checks for 0 and replaced these with a > check for end_marker instead. I don't understand this last part. If you change the end_marker value to -1, then we expect things to break with programs built with clang, like the disassembly example I gave. If you ran the testsuite with gcc, which presumably never assigns an address to line 0, then you won't see a difference. And we probably don't have any DWARF tests right now that specifically check situations with an address mapped to line 0. > This commit does mention PR gdb/26243 in its ChangeLog, but does not > fix this issue. I placed this bug reference just so people can see > the connection between this change and that bug in the future. > > gdb/ChangeLog: > > PR gdb/26243 > * buildsym.c (buildsym_compunit::record_line): Add an assert for > the incoming line number. Update comments to not mention 0 > specifically. Update to check for linetable_entry::end_marker > rather than 0. > (buildsym_compunit::end_symtab_with_blockvector): Check for > linetable_entry::end_marker, not 0. > * disasm.c (line_is_less_than): Likewise. > (do_mixed_source_and_assembly_deprecated): Likewise. > (do_mixed_source_and_assembly): Likewise. > * dwarf2/read.c (dwarf_finish_line): Pass > linetable_entry::end_marker not 0. > * mdebugread.c (add_line): Set is_stmt field. > * python/py-linetable.c (ltpy_get_all_source_lines): Check for > linetable_entry::end_marker not 0, update comments. > (ltpy_iternext): Likewise. > * record-btrace.c (btrace_find_line_range): Likewise. > * symmisc.c (maintenance_print_one_line_table): Likewise. > * symtab.c (find_pc_sect_line): Likewise. > (find_line_symtab): Likewise. > (skip_prologue_using_sal): Likewise. > * symtab.h (linetable_entry::end_marker): New const static member > variable. Add a static assert for this field. > * xcoffread.c (arrange_linetable): Check for > linetable_entry::end_marker not 0. > > diff --git a/gdb/buildsym.c b/gdb/buildsym.c > index bd0ca491401..e6d4dc117c1 100644 > --- a/gdb/buildsym.c > +++ b/gdb/buildsym.c > @@ -671,6 +671,10 @@ buildsym_compunit::record_line (struct subfile *subfile, int line, > { > struct linetable_entry *e; > > + /* Is this an asset? Or is this processing user input and so should we > + be handling, or throwing an error for invalid data? */ > + gdb_assert (line == linetable_entry::end_marker || line >= 0); I think it's fine to assert here. Any invalid input should be rejected earlier, in the debug info reader. Simon