From: Bruce Dawson
To: 'Gerhard Gappmeier', "gdb-patches@sourceware.org"
Subject: RE: New feature "source-id"
Date: Tue, 18 Mar 2014 17:56:00 -0000
Message-ID: <2AC155A009400B4C8B05D518E4819AEF1A347F70@exchange10.valvesoftware.com>
References: <7365721.BnaR1nHazz@lt-gergap> <1905500.YOUlx3S3mT@ws-gergap> <1395154991.27876.65.camel@bordewijk.wildebeest.org> <1825450.WDlHRVxHcI@ws-gergap>
In-Reply-To: <1825450.WDlHRVxHcI@ws-gergap>

I understand that some Linux distributions already make source packages for each package that they distribute, and this technique offers some unique advantages.

However, this is orthogonal to the source-id proposal. Source-ids offer a different, complementary value.

Our build system spits out dozens of builds a day. Some of these are run by developers, others by testers, and others by customers. Any one of them might crash. I might end up debugging (live debugging or a core file) any one of these builds, perhaps weeks after it was created. Because we have the source-id system set up, I know that I can walk up and down the stack and have the source files automatically show up, with *zero* effort on my part. I don't have to install source packages, and I can have multiple core files from multiple versions loaded simultaneously. Only the source files that I need are downloaded, so it is *extremely* efficient. Retrieving the needed source files is essentially instantaneous and requires zero developer effort.

A source package to go with every package has some advantages, such as getting all of the generated files. However, it is a very heavyweight solution because it requires retrieving thousands of files from dozens of shared objects in order to look at a crash. In comparison, the source-id solution requires retrieving exactly the set of files that are needed to view the functions on the backtrace, so it is extremely lightweight.

Another advantage of the source-id solution is that it actually tells you the versions of the files. When I get a crash I can look at the source-id information and see that, for instance, it was built with foo.cpp#17 (version 17 of foo.cpp). That information is literally embedded in the source-id section.
I can then look at that file in my VCS client and see if there is a newer version, perhaps with a fix. Doing this with a source package is more cumbersome because (if I understand correctly) you are getting a copy of the source file rather than a reference to a particular version.

A source package is like copying the files, whereas source-id is like having a symbolic link to the files.

I'm not suggesting that the source-id solution is better than a source package, I'm just saying that they are orthogonal. They support different workflows. They can co-exist perfectly.

-----Original Message-----
From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-owner@sourceware.org] On Behalf Of Gerhard Gappmeier
Sent: Tuesday, March 18, 2014 9:41 AM
To: gdb-patches@sourceware.org
Subject: Re: New feature "source-id"

On Tuesday, March 18, 2014 04:03:11 PM you wrote:
> On Tue, 2014-03-18 at 15:00 +0100, Gerhard Gappmeier wrote:
> > On Tuesday, March 18, 2014 02:22:04 PM you wrote:
> > > On Sat, 2014-03-15 at 11:49 +0100, Gerhard Gappmeier wrote:
> > > That way you can use the build-id from the ELF note section to
> > > retrieve both the separate .debug files and the corresponding
> > > source files. And on my distro gdb even helpfully suggests how
> > > to do this:
> > >
> > >   Missing separate debuginfos, use: debuginfo-install at-3.1.13-14.fc20.x86_64
> > >
> > > Which will then fetch the debuginfo package and all dependencies
> > > so gdb can find the .debug files and the corresponding source
> > > code those .debug files refer to. I don't know if the
> > > debuginfo-install suggestion is upstream or only in the distro
> > > package of gdb.
> >
> > If I understood this right, this means whenever a piece of software
> > is built the sources get archived with the debug symbols in a
> > debuginfo RPM file.
> > This way the build-id is all you need to get the correct sources and
> > debug symbols.
>
> Indeed.
> Just turn the build-id into the package, either through
> something like https://darkserver.fedoraproject.org/ or through
> yum install /usr/lib/debug/.build-id/b7/07011ecdbd5bcb1fad73cdc9b4433c791d8328.debug
> or just through debuginfo-install and you get both the .debug files
> and all source files that .debug file refers to.
>
> > However my idea is somewhat different and a little bit smarter IMO:
> > * The SHA1 id of a git repo gets stored in the source-id meta info
> >   when building.
> > * There is no need to archive the source files in RPM, deb, tar.gz
> >   or zip files. We have them already in the version control system
> >   and we don't want to duplicate the data.
> > * This solution is independent of any package format.
> > * You can analyze coredumps of executables that you don't have on
> >   your system. There is no need to install any RPM package for that.
> >   This way you can analyze e.g. a crash within an Ubuntu package on
> >   a Fedora system.
> > * The fetch-script fetches only the sources required by GDB, not the
> >   complete project.
>
> Some of those features are already possible with the way distros
> package the debuginfo files. But your way might indeed be more
> flexible. I am mostly wondering how to take advantage of the way
> distros do it currently in your scheme. How do you describe the
> default distro setup and how do you make sure not to duplicate the
> storage of source files?
>
> One difference with your scheme is that the distros package the
> post-processed source files. That means they are the actual files,
> however generated, that the compiler compiled to object code. Not
> necessarily the pristine source files. That is so in a debugger you
> can step through the source file as seen by the compiler (e.g. it will
> include source files generated by configure or the lex and yacc
> generated files that the compiler builds).

Having generated files (that are not in a VCS) available is indeed an advantage of this concept.
However, this is very focused on sources that get packaged by a Linux distribution.
But there is also the use case of proprietary software that does not get bundled with your distribution. Vendors create their own installers or simply a tar.gz file which gets installed in /opt/somewhere.
So there is no debuginfo package available in the package manager.
Companies could recreate this concept of creating debug packages, but I really prefer to just fetch the sources from git.

That's the way I work today:
* Get a problem report from a customer
* Hope that the customer reports the correct version
* Search for a version tag in git which matches the reported version
* Check out that version using git

The source-id simplifies that process. Just open the crashdump -> fetch separate debug info using build-id -> fetch the source file that should be displayed in GDB using the fetch-script from an internal cgit web interface.
Our build server copies the separate debug info to an NFS share which I have mounted via /etc/fstab. So fetching symbols just works.
The sources are already in git and available via the cgit web interface.
The missing part is just this "source-id" feature, then everything works out of the box.

>
> > > > * We need to make the new section ".note.gnu.source-id"
> > > >   official. I don't know who maintains this and this needs to be
> > > >   registered somewhere.
> > > > [...]
> > > > * adding file hashes (SHA1) for each source file to the debug
> > > >   info. This way we can completely remove the mtime check and
> > > >   replace it with a check of the SHA1 sum. Then we can replace
> > > >   the existing warning with a message like "The source file does
> > > >   not match the executable."
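
A minimal sketch of the hash check proposed in the quoted list, written in Python for illustration only. It is not GDB code: the function names sha1_of_file and source_matches are hypothetical, and the "recorded" digest simply stands in for whatever value the debug info would carry for each source file.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_matches(path, recorded_sha1):
    """True if the file on disk has the digest recorded at build time."""
    return sha1_of_file(path) == recorded_sha1.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "foo.cpp"
        source.write_bytes(b"int main() { return 0; }\n")
        # Stands in for the digest that would be stored in the debug info.
        recorded = sha1_of_file(source)
        # Simulate the source file being edited after the build.
        source.write_bytes(b"int main() { return 1; }\n")
        if not source_matches(source, recorded):
            print("The source file does not match the executable.")
```

Unlike an mtime comparison, this check depends only on file contents, so a fresh checkout of the right revision passes even though its timestamps are newer than the binary.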
> > >
> > > For DWARF5 there is a proposal to add the MD5 digest to the
> > > debug-line file table:
> > > http://dwarfstd.org/ShowIssue.php?issue=130701.1
> > >
> > > Would that be a good alternative location to store the hash of the
> > > source file?
> >
> > That's exactly what I proposed. Only that I proposed SHA1 instead of
> > MD5, but this doesn't matter.
> > If this is already in the DWARF standard we should use this feature
> > and not reinvent the wheel.
>
> It is currently just a proposal for DWARF5. The proposal deadline is
> the end of this month. I just reviewed that proposal and saw that it
> is not very extensible, so I suggested some additions. See the
> discussion here:
> http://thread.gmane.org/gmane.comp.standards.dwarf/100

This feature really makes sense independently of the source-id feature.
I'm really looking forward to seeing that accepted.

Cheers,
Gerhard

>
> Cheers,
>
> Mark
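
For reference, the two lookups discussed in this thread (build-id to separate debug file, and commit plus file path to a single source file served by cgit) can be sketched as below. This is an illustration under stated assumptions, not the proposed fetch-script: the function names, the cgit host cgit.example.com, the repository name project.git and the commit value are all hypothetical. The .build-id/xx/rest.debug layout matches the path quoted earlier in the thread, and the URL follows cgit's "plain" view with an id query parameter.

```python
from urllib.parse import quote, urlencode


def debug_file_path(build_id, debug_root="/usr/lib/debug"):
    """Map a hex build-id to the path of its separate .debug file."""
    build_id = build_id.lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("build-id must be a hex string of at least 3 digits")
    # The first two hex digits name the directory, the rest the file.
    return f"{debug_root}/.build-id/{build_id[:2]}/{build_id[2:]}.debug"


def cgit_plain_url(base_url, repo, commit, file_path):
    """Build a cgit 'plain' URL for one file at one commit (hypothetical host)."""
    path = quote(file_path.lstrip("/"))
    query = urlencode({"id": commit})
    return f"{base_url.rstrip('/')}/{repo}/plain/{path}?{query}"


if __name__ == "__main__":
    # Build-id taken from the example path quoted in this thread.
    print(debug_file_path("b707011ecdbd5bcb1fad73cdc9b4433c791d8328"))
    # Host, repository and commit are placeholders, not real values.
    print(cgit_plain_url("https://cgit.example.com", "project.git",
                         "0123abc", "src/foo.cpp"))
```

Passing the mount point of an NFS share as debug_root would cover the setup described above, where the build server publishes separate debug info outside any package manager.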