From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by sourceware.org (Postfix) with ESMTPS id C53F6386F442 for ; Sat, 6 Jun 2020 06:51:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C53F6386F442 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=embecosm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=andrew.burgess@embecosm.com Received: by mail-wr1-x442.google.com with SMTP id l11so11888604wru.0 for ; Fri, 05 Jun 2020 23:51:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=QX+UudO87CJrssv6cW4D0u0DrkNP6D0ZPOoYm99CCqI=; b=c8IoVzlGVqceIzCGg1ESpATTPAjEJZ1uryEo4XQdJPO0PmIM8mf3AF8jCvt+eqerIr hmkbm02z9e/tk+kdHo+iaqsDEXJzhZwuGdTcpQJoPdghOJ+3y9LXhcNrHHFTdT/b5OI3 SYTIY0BaHOdY2V/CUauPwmrBtmnoWvS/9NbAdCFK600k0JmgoZxF7tZl9QGiSt+QwGpl tLatJeYFaoiTDqecMu61oRqGVH9oamVZ+e6VHX4Qr0w5WWY0Wfbx3v0yuPVSaGErjyUU xghB0S1z1vKGSryPrGpSA3hBd2RNmrOY5TU/puKq7VVkC7fWg+7xwo/YxCs7zkScaSS0 554A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=QX+UudO87CJrssv6cW4D0u0DrkNP6D0ZPOoYm99CCqI=; b=cC23VIAWgKczQyFL98k2y/VjvV+JItvDP4syfsmrTPhbaa+cMefChO1Ww76+77H0FO LNdb7RS4H1cMurWNFYPJcH85zgf+dDVGNH160X40cGLrA/Gi4Feh+TmAu8GQiruKez02 y1PuImHxIegGWVA1f7TxzPdjxpeSZ++NmnoLYfqPBSDZ837rGi90nJUR8lYWHIm+nPW+ SKnRVWSwkcz/QF83ZppiK6sMM8DAW5NW98PqVv04KZlqT4sU81fDmZTyZkM9JJiJnzf9 v9djAW4JeZJGTc0AAZpPtNnyXdPeuknOrVZGiztSwbup2yy63YZUnniESF+EYUahw5yl KMGg== X-Gm-Message-State: AOAM531JoihWe7Hjsvr5zzA2bwrcJO++m3xJJXb3LvVBuoOy2z68cEDv yQYbBF4Rej2V6oFu8hDFiKEzWg== X-Google-Smtp-Source: ABdhPJz5kz8HJo5oUB2ltn+EoYe+HPa4MFO8VVwTMF3Z8f8sbnDAdZBuH0FI/c0Sz8cnyr+C5BRKxg== X-Received: by 2002:adf:e908:: with SMTP id f8mr13039506wrm.184.1591426314719; Fri, 05 Jun 2020 23:51:54 -0700 (PDT) Received: from localhost (host86-128-12-16.range86-128.btcentralplus.com. [86.128.12.16]) by smtp.gmail.com with ESMTPSA id v27sm15704722wrv.81.2020.06.05.23.51.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 05 Jun 2020 23:51:54 -0700 (PDT) Date: Sat, 6 Jun 2020 07:51:52 +0100 From: Andrew Burgess To: Tom de Vries Cc: gdb-patches@sourceware.org Subject: Re: [PATCH][gdb/symtab] Fix line-table end-of-sequence sorting Message-ID: <20200606065152.GF3522@embecosm.com> References: <51b43dba-2213-a428-d4dd-10154f8d1f52@suse.de> <446082ca-c3d4-1a90-2a35-2669c8407095@suse.de> <18b1ee90-2ece-a5b4-787b-2507b081da81@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: <18b1ee90-2ece-a5b4-787b-2507b081da81@suse.de> X-Operating-System: Linux/5.5.17-200.fc31.x86_64 (x86_64) X-Uptime: 07:50:02 up 46 days, 22:24, X-Editor: GNU Emacs [ http://www.gnu.org/software/emacs ] X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Jun 2020 06:51:58 -0000 * Tom de Vries [2020-06-06 01:44:42 +0200]: > [ was: Re: [PATCH 2/3] gdb: Don't reorder line table entries too much > when sorting. ] >=20 > On 05-06-2020 18:00, Tom de Vries wrote: > > On 05-06-2020 16:49, Tom de Vries wrote: > >> On 23-12-2019 02:51, Andrew Burgess wrote: > >>> I had to make a small adjustment in find_pc_sect_line in order to > >>> correctly find the previous line in the line table. In some line > >>> tables I was seeing an actual line entry and an end of sequence marker > >>> at the same address, before this commit these would reorder to move > >>> the end of sequence marker before the line entry (end of sequence has > >>> line number 0). Now the end of sequence marker remains in its correct > >>> location, and in order to find a previous line we should step backward > >>> over any end of sequence markers. > >>> > >>> As an example, the binary: > >>> gdb/testsuite/outputs/gdb.dwarf2/dw2-ranges-func/dw2-ranges-func-lo= -cold > >>> > >>> Has this line table before the patch: > >>> > >>> INDEX LINE ADDRESS > >>> 0 48 0x0000000000400487 > >>> 1 END 0x000000000040048e > >>> 2 52 0x000000000040048e > >>> 3 54 0x0000000000400492 > >>> 4 56 0x0000000000400497 > >>> 5 END 0x000000000040049a > >>> 6 62 0x000000000040049a > >>> 7 END 0x00000000004004a1 > >>> 8 66 0x00000000004004a1 > >>> 9 68 0x00000000004004a5 > >>> 10 70 0x00000000004004aa > >>> 11 72 0x00000000004004b9 > >>> 12 END 0x00000000004004bc > >>> 13 76 0x00000000004004bc > >>> 14 78 0x00000000004004c0 > >>> 15 80 0x00000000004004c5 > >>> 16 END 0x00000000004004cc > >>> > >>> And after this patch: > >>> > >>> INDEX LINE ADDRESS > >>> 0 48 0x0000000000400487 > >>> 1 52 0x000000000040048e > >>> 2 END 0x000000000040048e > >>> 3 54 0x0000000000400492 > >>> 4 56 0x0000000000400497 > >>> 5 END 0x000000000040049a > >>> 6 62 0x000000000040049a > >>> 7 66 0x00000000004004a1 > >>> 8 END 0x00000000004004a1 > >>> 9 68 0x00000000004004a5 > >>> 10 70 0x00000000004004aa > >>> 11 72 0x00000000004004b9 > >>> 12 END 0x00000000004004bc > >>> 13 76 0x00000000004004bc > >>> 14 78 0x00000000004004c0 > >>> 15 80 0x00000000004004c5 > >>> 16 END 0x00000000004004cc > >>> > >>> When calling find_pc_sect_line with the address 0x000000000040048e, in > >>> both cases we find entry #3, we then try to find the previous entry, > >>> which originally found this entry '2 52 0x000000000040048e', > >>> after the patch it finds '2 END 0x000000000040048e', which > >>> cases the lookup to fail. > >>> > >>> By skipping the END marker after this patch we get back to the correct > >>> entry, which is now #1: '1 52 0x000000000040048e', and > >>> everything works again. > >> > >> I start to suspect that you have been working around an incorrect line > >> table. > >> > >> Consider this bit: > >> ... > >> 0 48 0x0000000000400487 > >> 1 52 0x000000000040048e > >> 2 END 0x000000000040048e > >> ... > >> > >> The end marker marks the address one past the end of the sequence. > >> Therefore, it makes no sense to have an entry in the sequence with the > >> same address as the end marker. > >> > >> [ dwarf doc: > >> > >> end_sequence: > >> > >> A boolean indicating that the current address is that of the first byte > >> after the end of a sequence of target machine instructions. end_sequen= ce > >> terminates a sequence of lines; therefore other information in the same > >> row is not meaningful. > >> > >> DW_LNE_end_sequence: > >> > >> The DW_LNE_end_sequence opcode takes no operands. It sets the > >> end_sequence register of the state machine to =E2=80=9Ctrue=E2=80=9D a= nd appends a row > >> to the matrix using the current values of the state-machine registers. > >> Then it resets the registers to the initial values specified above (see > >> Section 6.2.2). Every line number program sequence must end with a > >> DW_LNE_end_sequence instruction which creates a row whose address is > >> that of the byte after the last target machine instruction of the sequ= ence. > >> > >> ] > >> > >> The incorrect entry is generated by this dwarf assembler sequence: > >> ... > >> {DW_LNS_copy} > >> {DW_LNE_end_sequence} > >> ... > >> > >> I think we should probably fix the dwarf assembly test-cases. > >> > >> If we want to handle this in gdb, the thing that seems most logical to > >> me is to ignore this kind of entries. > >=20 > > Hmm, that seems to be done already, in buildsym_compunit::record_line. > >=20 > > Anyway, I was looking at the line table for > > gdb.dwarf2/dw2-ranges-base.exp, and got a line table with subsequent end > > markers: > > ... > > INDEX LINE ADDRESS IS-STMT > > 0 31 0x00000000004004a7 Y > > 1 21 0x00000000004004ae Y > > 2 END 0x00000000004004ae Y > > 3 11 0x00000000004004ba Y > > 4 END 0x00000000004004ba Y > > 5 END 0x00000000004004c6 Y > > ... > >=20 > > By using this patch: > > ... > > diff --git a/gdb/buildsym.c b/gdb/buildsym.c > > index 33bf6523e9..76f0b54ff6 100644 > > --- a/gdb/buildsym.c > > +++ b/gdb/buildsym.c > > @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector > > (struct block *static_block, > > =3D [] (const linetable_entry &ln1, > > const linetable_entry &ln2) -> bool > > { > > + if (ln1.pc =3D=3D ln2.pc > > + && ((ln1.line =3D=3D 0) !=3D (ln2.line =3D=3D 0))) > > + return ln1.line =3D=3D 0 ? true : false; I will take a look at this patch properly as soon as I can, but just spotted this pet peeve of mine, please just write: return ln1.line =3D=3D 0; Thanks, Andrew > > + > > return (ln1.pc < ln2.pc); > > }; > >=20 > > ... > > I get the expected: > > ... > > INDEX LINE ADDRESS IS-STMT > > 0 31 0x00000000004004a7 Y > > 1 END 0x00000000004004ae Y > > 2 21 0x00000000004004ae Y > > 3 END 0x00000000004004ba Y > > 4 11 0x00000000004004ba Y > > 5 END 0x00000000004004c6 Y > > ... >=20 > Any comments on patch below? >=20 > Thanks, > - Tom >=20 > [gdb/symtab] Fix line-table end-of-sequence sorting >=20 > Consider test-case gdb.dwarf2/dw2-ranges-base.exp. It has a line-table f= or > dw2-ranges-base.c like this: > ... > Line Number Statements: > [0x0000014e] Extended opcode 2: set Address to 0x4004ba > [0x00000159] Advance Line by 10 to 11 > [0x0000015b] Copy > [0x0000015c] Advance PC by 12 to 0x4004c6 > [0x0000015e] Advance Line by 19 to 30 > [0x00000160] Copy > [0x00000161] Extended opcode 1: End of Sequence >=20 > [0x00000164] Extended opcode 2: set Address to 0x4004ae > [0x0000016f] Advance Line by 20 to 21 > [0x00000171] Copy > [0x00000172] Advance PC by 12 to 0x4004ba > [0x00000174] Advance Line by 29 to 50 > [0x00000176] Copy > [0x00000177] Extended opcode 1: End of Sequence >=20 > [0x0000017a] Extended opcode 2: set Address to 0x4004a7 > [0x00000185] Advance Line by 30 to 31 > [0x00000187] Copy > [0x00000188] Advance PC by 7 to 0x4004ae > [0x0000018a] Advance Line by 39 to 70 > [0x0000018c] Copy > [0x0000018d] Extended opcode 1: End of Sequence > ... >=20 > The Copy followed by End-of-Sequence is as specified in the dwarf assembl= y, > but incorrect. F.i., consider: > ... > [0x0000015c] Advance PC by 12 to 0x4004c6 > [0x0000015e] Advance Line by 19 to 30 > [0x00000160] Copy > [0x00000161] Extended opcode 1: End of Sequence > ... >=20 > Both the Copy and the End-of-Sequence append a row to the matrix using the > same addres: 0x4004c6. The Copy declares a target instruction at that > address. The End-of-Sequence declares that the sequence ends before that > address. It's a contradiction that the target instruction is both part o= f the > sequence (according to Copy) and not part of the sequence (according to > End-of-Sequence). >=20 > The offending Copy is skipped though by buildsym_compunit::record_line for > unrelated reasons. So, if we disable the sorting in > buildsym_compunit::end_symtab_with_blockvector, we have: > ... > INDEX LINE ADDRESS IS-STMT > 0 11 0x00000000004004ba Y > 1 END 0x00000000004004c6 Y > 2 21 0x00000000004004ae Y > 3 END 0x00000000004004ba Y > 4 31 0x00000000004004a7 Y > 5 END 0x00000000004004ae Y > ... > but if we re-enable the sorting, we have: > ... > INDEX LINE ADDRESS IS-STMT > 0 31 0x00000000004004a7 Y > 1 21 0x00000000004004ae Y > 2 END 0x00000000004004ae Y > 3 11 0x00000000004004ba Y > 4 END 0x00000000004004ba Y > 5 END 0x00000000004004c6 Y > ... > which has both: > - the contradictory order for the same-address pairs 1/2 and 3/4, as well= as > - a non-sensical pair of ENDs, > while we'd like: > ... > INDEX LINE ADDRESS IS-STMT > 0 31 0x00000000004004a7 Y > 1 END 0x00000000004004ae Y > 2 21 0x00000000004004ae Y > 3 END 0x00000000004004ba Y > 4 11 0x00000000004004ba Y > 5 END 0x00000000004004c6 Y > ... >=20 > This is a regression since commit 3d92a3e313 "gdb: Don't reorder line tab= le > entries too much when sorting", that introduced sorting on address while > keeping entries with the same address in pre-sort order, which leads to > incorrect results if one of the entries is an End-Of-Sequence. >=20 > Fix this by handling End-Of-Sequence entries in the sorting function. >=20 > Tested on x86_64-linux. >=20 > gdb/ChangeLog: >=20 > 2020-06-06 Tom de Vries >=20 > * buildsym.c (buildsym_compunit::end_symtab_with_blockvector): Handle > End-Of-Sequence in lte_is_less_than. >=20 > gdb/testsuite/ChangeLog: >=20 > 2020-06-06 Tom de Vries >=20 > * gdb.dwarf2/dw2-ranges-base.exp: Test line-table order. >=20 > --- > gdb/buildsym.c | 4 ++++ > gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp | 14 ++++++++++++++ > 2 files changed, 18 insertions(+) >=20 > diff --git a/gdb/buildsym.c b/gdb/buildsym.c > index 33bf6523e9..76f0b54ff6 100644 > --- a/gdb/buildsym.c > +++ b/gdb/buildsym.c > @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector (stru= ct block *static_block, > =3D [] (const linetable_entry &ln1, > const linetable_entry &ln2) -> bool > { > + if (ln1.pc =3D=3D ln2.pc > + && ((ln1.line =3D=3D 0) !=3D (ln2.line =3D=3D 0))) > + return ln1.line =3D=3D 0 ? true : false; > + > return (ln1.pc < ln2.pc); > }; > =20 > diff --git a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp b/gdb/testsuite= /gdb.dwarf2/dw2-ranges-base.exp > index 92f8f6cecb..39281a8857 100644 > --- a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp > +++ b/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp > @@ -144,12 +144,26 @@ gdb_test "info line frame3" \ > =20 > # Ensure that the line table correctly tracks the end of sequence marker= s. > set end_seq_count 0 > +set prev -1 > +set seq_count 0 > gdb_test_multiple "maint info line-table gdb.dwarf2/dw2-ranges-base.c" \ > "count END markers in line table" { > -re "^$decimal\[ \t\]+$decimal\[ \t\]+$hex\(\[ \t\]+Y\)? *\r\n" { > + if { $prev !=3D -1 } { > + gdb_assert "$prev =3D=3D 1" \ > + "prev of normal entry at $seq_count is end marker" > + } > + set prev 0 > + incr seq_count > exp_continue > } > -re "^$decimal\[ \t\]+END\[ \t\]+$hex\(\[ \t\]+Y\)? *\r\n" { > + if { $prev !=3D -1 } { > + gdb_assert "$prev =3D=3D 0" \ > + "prev of end marker at $seq_count is normal entry" > + } > + set prev 1 > + incr seq_count > incr end_seq_count > exp_continue > }