From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH][gdb] Fix missing symtab includes
To: Simon Marchi , gdb-patches@sourceware.org
Cc: Tom Tromey
References: <20200327204948.GA23365@delia> <44300932-0e15-c3b6-9ce8-44e08dd5d8ef@suse.de>
From: Tom de Vries
Date: Sun, 29 Mar 2020 17:56:34 +0200
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------338363131132F5BED4941A8E"
Content-Language: en-US
List-Id: Gdb-patches mailing list

This is a multi-part message in MIME format.
--------------338363131132F5BED4941A8E
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 29-03-2020 17:39, Simon Marchi wrote:
> On 2020-03-28 1:24 p.m., Tom de Vries wrote:
>>> I am not able to reproduce this. In both cases, I don't get the `includes`.
>>>
>>> What transformation is dwz expected to do on the binary? Here, it looks like
>>> it just compressed the debug info a little bit, changing the addresses, but the
>>> general structure was untouched.
>>>
>>> I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).
>>
>> Hi Simon,
>>
>> thanks for trying this out.
>>
>> I've attached the original a.out here (
>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c3 ) and the
>> dwz-ed a.out here (
>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c4 ).
>>
>> I'm hoping you might be able to reproduce using the latter file (and
>> FWIW, I'm using the same dwz version).
>>
>> I think the reason for the difference in what we are seeing is due to me
>> using openSUSE, which has debug info on various linked in objects like
>> glibc's init.c and elf-init.c. Looking at the readelf -wi output of the
>> dwz-ed executable, dwz exploits commonality between those objects and
>> hello.c, so it's not surprising dwz does not create partial units for
>> platforms that do not have debug info for those objects.
>>
>> Anyway, I'll need to construct a better test-case that reproduces the
>> problem on other platforms.
>>
>> Thanks,
>> - Tom
>>
>
> Thanks, your explanations make more sense with the DWARF in those files,
> I can reproduce the problem now.
>
> I have a bit of trouble understanding the current code. In particular, I am
> trying to put in words what's the difference between the `read_symtab` and
> `expand_symtab` methods of `partial_symtab`, but I can't quite do it. I think
> this would help to determine what's the right solution.

Hi,

I gave that a try, but got stalled working on the test-case.

Anyway here's the current state.

Thanks,
- Tom

--------------338363131132F5BED4941A8E
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-gdb-Fix-missing-symtab-includes.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="0001-gdb-Fix-missing-symtab-includes.patch"

[gdb] Fix missing symtab includes

Consider hello.h:
...
inline static const char*
foo (void)
{
  return "foo";
}
...
and hello.c:
...
#include <stdio.h>
#include "hello.h"

int
main (void)
{
  printf ("hello: %s\n", foo ());
  return 0;
}
...
compiled with -g, and dwz-compressed:
...
$ gcc -g hello.c
$ dwz a.out
...

When breaking on foo and printing the symbol tables, we have two includes for
the hello.c compunit_symtab, representing two imported partial units:
...
$ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x38fa890)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x39af9f0)
    user ((struct compunit_symtab *) (null))
    ( includes
      ((struct compunit_symtab *) 0x39afb10)
      ((struct compunit_symtab *) 0x39b00c0)
    )
    { symtab hello.c ((struct symtab *) 0x38fa940)
      fullname (null)
      linetable ((struct linetable *) 0x39afa80)
    }
...

But if we instead break on the same location using a linespec, the includes
are gone:
...
$ gdb -iex "set language c" -batch a.out -ex "b hello.h:4" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x2283500)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x23086c0)
    user ((struct compunit_symtab *) (null))
    { symtab hello.c ((struct symtab *) 0x22835b0)
      fullname (null)
      linetable ((struct linetable *) 0x2308750)
    }
...

The includes are calculated by process_cu_includes in gdb/dwarf2/read.c.

In the case of "b foo":
- the hello.c partial symtab is found to contain foo
- the hello.c partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_psymtab::read_symtab
- dwarf2_psymtab::read_symtab calls dwarf2_psymtab::expand_psymtab
- dwarf2_psymtab::read_symtab calls process_cu_includes, and we have the
  includes

In the case of "b hello.h:4":
- the hello.h partial symtab is found to represent hello.h
- the hello.h partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_include_psymtab::read_symtab
- dwarf2_include_psymtab::read_symtab calls
  dwarf2_include_psymtab::expand_psymtab
- dwarf2_include_psymtab::expand_psymtab calls
  partial_symtab::read_dependencies
- partial_symtab::read_dependencies calls dwarf2_psymtab::expand_psymtab
  for partial symtab hello.c
- the hello.c partial symtab is expanded using dwarf2_psymtab::expand_psymtab
- process_cu_includes is never called

Fix this by making sure in dwarf2_include_psymtab::read_symtab that
read_symtab is called for the hello.c partial symtab.

Tested on x86_64-linux, with native, and target board cc-with-dwz and
cc-with-dwz-m.

gdb/ChangeLog:

2020-03-27  Tom de Vries

	PR symtab/25718
	* psympriv.h (struct partial_symtab::read_symtab)
	(struct partial_symtab::expand_psymtab)
	(struct partial_symtab::read_dependencies): Update comments.
	* dwarf2/read.c (dwarf2_include_psymtab::read_symtab): Directly
	call read_symtab for includer.
	(dwarf2_include_psymtab::expand_psymtab): Assert false.

---
 gdb/dwarf2/read.c | 28 +++++++++++++++++++++-------
 gdb/psympriv.h    | 22 ++++++++++++++++------
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 8c5046ef41..9d366449f7 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5830,20 +5830,34 @@ struct dwarf2_include_psymtab : public partial_symtab
   }
 
   void read_symtab (struct objfile *objfile) override
-  {
-    expand_psymtab (objfile);
-  }
-
-  void expand_psymtab (struct objfile *objfile) override
   {
     if (m_readin)
       return;
+
     /* It's an include file, no symbols to read for it.
-       Everything is in the parent symtab.  */
-    read_dependencies (objfile);
+       Everything is in the includer symtab.  */
+
+    /* The expansion of a dwarf2_include_psymtab is just a trigger for
+       expansion of the includer psymtab.  We use the dependencies[0] field to
+       model the includer.  But if we go the regular route of calling
+       expand_psymtab here, and having expand_psymtab call read_dependencies to
+       expand the includer, we'll only use expand_psymtab on the includer
+       (making it a non-toplevel psymtab), while if we expand the includer via
+       another path, we'll use read_symtab (making it a toplevel psymtab).
+       So, don't pretend a dwarf2_include_psymtab is an actual toplevel
+       psymtab, and trigger read_symtab on the includer here directly.  */
+    if (!dependencies[0]->readin_p ())
+      dependencies[0]->read_symtab (objfile);
     m_readin = true;
   }
 
+  void expand_psymtab (struct objfile *objfile) override
+  {
+    /* This is not called by read_symtab, and should not be called by any
+       read_dependencies.  */
+    gdb_assert (false);
+  }
+
   bool readin_p () const override
   {
     return m_readin;
diff --git a/gdb/psympriv.h b/gdb/psympriv.h
index 0effedc4ec..4317392149 100644
--- a/gdb/psympriv.h
+++ b/gdb/psympriv.h
@@ -124,16 +124,26 @@ struct partial_symtab
   {
   }
 
-  /* Read the full symbol table corresponding to this partial symbol
-     table.  */
+  /* Psymtab expansion is done in two steps:
+     - a call to read_symtab
+     - while that call is in progress, calls to expand_psymtab can be made,
+       both for this psymtab, and its dependencies.
+     This makes a distinction between a toplevel psymtab (for which both
+     read_symtab and expand_psymtab will be called) and a non-toplevel
+     psymtab (for which only expand_psymtab will be called).  The
+     distinction can be used f.i. to do things before and after all
+     dependencies of a top-level psymtab have been expanded.
+
+     Read the full symbol table corresponding to this partial symbol
+     table.  Typically calls expand_psymtab.  */
   virtual void read_symtab (struct objfile *) = 0;
 
-  /* Psymtab expansion is done in two steps.  The first step is a call
-     to read_symtab; but while that is in progress, calls to
-     expand_psymtab can be made.  */
+  /* Expand the full symbol table for this partial symbol table.  Typically
+     calls read_dependencies.  */
   virtual void expand_psymtab (struct objfile *) = 0;
 
-  /* Ensure that all the dependencies are read in.  */
+  /* Ensure that all the dependencies are read in.  Typically calls
+     expand_psymtab for each dependency.  */
   void read_dependencies (struct objfile *);
 
   /* Return true if the symtab corresponding to this psymtab has been

--------------338363131132F5BED4941A8E--
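
As an aside, the toplevel versus non-toplevel distinction that the patch
relies on can be reproduced in a small standalone model. The sketch below is
not gdb code. The names toy_psymtab, old_include_psymtab,
fixed_include_psymtab, includes_computed and
includes_after_linespec_expansion are all hypothetical and exist only for
illustration. The includes_computed flag stands in for the effect of
process_cu_includes, and the model assumes that this post-processing step is
the only relevant difference between read_symtab and expand_psymtab.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a partial symtab (hypothetical, not the gdb class).
struct toy_psymtab
{
  std::string name;
  std::vector<toy_psymtab *> dependencies;
  bool readin = false;
  // Stands in for "process_cu_includes has run for this psymtab".
  bool includes_computed = false;

  explicit toy_psymtab (std::string n) : name (std::move (n)) {}
  virtual ~toy_psymtab () = default;

  // Toplevel entry point: expand, then do the toplevel-only post-processing.
  virtual void read_symtab ()
  {
    if (readin)
      return;
    expand_psymtab ();
    includes_computed = true;
  }

  // Non-toplevel step: expand dependencies, no post-processing.
  virtual void expand_psymtab ()
  {
    if (readin)
      return;
    read_dependencies ();
    readin = true;
  }

  void read_dependencies ()
  {
    for (toy_psymtab *dep : dependencies)
      if (!dep->readin)
        dep->expand_psymtab ();
  }
};

// Models the behaviour before the patch: the include psymtab goes through
// expand_psymtab, so the includer is only ever expanded as a dependency.
struct old_include_psymtab : toy_psymtab
{
  using toy_psymtab::toy_psymtab;
  void read_symtab () override { expand_psymtab (); }
};

// Models the behaviour after the patch: read_symtab on the include psymtab
// triggers read_symtab on the includer (dependencies[0]) directly.
struct fixed_include_psymtab : toy_psymtab
{
  using toy_psymtab::toy_psymtab;
  void read_symtab () override
  {
    if (readin)
      return;
    if (!dependencies[0]->readin)
      dependencies[0]->read_symtab ();
    readin = true;
  }
  void expand_psymtab () override { assert (false); }
};

// Models "b hello.h:4": expand the include psymtab and report whether the
// includer ended up with its includes computed.
template <typename IncludeT>
bool includes_after_linespec_expansion ()
{
  toy_psymtab includer ("hello.c");
  IncludeT include ("hello.h");
  include.dependencies.push_back (&includer);
  include.read_symtab ();
  return includer.includes_computed;
}
```

In this model, includes_after_linespec_expansion<old_include_psymtab> ()
returns false and includes_after_linespec_expansion<fixed_include_psymtab> ()
returns true, which mirrors the missing and restored includes in the two
"maint info symtabs" outputs.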