From: Doug Evans
To: Bruce Dawson
Cc: Gerhard Gappmeier, gdb-patches
Subject: Re: New feature "source-id"
Date: Tue, 18 Mar 2014 01:39:00 -0000
In-Reply-To: <2AC155A009400B4C8B05D518E4819AEF1A347889@exchange10.valvesoftware.com>
References: <7365721.BnaR1nHazz@lt-gergap> <10414444.6eOdp1cvY6@ws-gergap> <2739108.yJ4Vgng9gv@ws-gergap> <2AC155A009400B4C8B05D518E4819AEF1A347889@exchange10.valvesoftware.com>

On Mon, Mar 17, 2014 at 5:48 PM, Bruce Dawson wrote:
> Hi, I thought I'd chime in since I'm the one who suggested this idea (at Steam Dev Days), and I have a (cruder, not worth sharing) version of this which we have been using for about a year, so I speak from experience.
>
> The size issue is not really a code-size issue but a file-size issue. Traditionally the debug information has been in strippable sections so that non-developer users don't need to pay any price (download bandwidth, storage) for debug information which they don't care about. I think that the source-id information falls into the same category. A simple hello world program could end up with many KB of source-id information and it would be a shame to have that in the data segment, loaded or not. Stripped ELF files don't have debug information, and they shouldn't have source-id information, philosophically and practically.
>
> Getting something into the .note section is done in our build system by using objcopy --add-section. For some reason we do this in our build pipeline *after* we've stripped the symbols so we actually add the section to our .dbg file rather than the original .so file. We then have to remove and recreate the .gnu_debuglink.
> Adding the .note section to the original .so file would be cleaner since then the normal stripping steps would just handle it like other debug information. I think I did it wrong because it was simpler, but it might have been through ignorance.

It might be serendipitous that you do it afterwards.
Note that strip doesn't remove .note.gnu.build-id.
I'm guessing there's a way to mark a .note section as strippable, but I only skimmed binutils.

> The way that our workflow works is that after we have done a build we use objdump -Wl (slightly customized to reduce the volume of information) to get a list of all of the source files used to create a shared object. Then we query our source-control system for the version numbers associated with the files (we use Perforce). We then inject this data (a mapping from each local file system path to a Perforce path/version information) into the custom section. We have a hacky system for getting gdb to find the files, and we are looking forward to Gerhard's superior system.
>
> I don't see how this would work with a source file. You can't easily know the set of input files (which includes header files, and source files from archive files) until the build has completed. If you put the mappings in a source file then you would need to compile that after the initial link and then relink. That would be inefficient. So, you would need a way to find the list of source files (including header files) prior to building. That sounds messy.

I can imagine generating a .c file (say, could even be .S) from the post-link binary that contains source information, compile it, and then using objcopy to add the resultant section with the source information into the resultant binary.
The key here being that source file information gets put in a specific section so it's easy to do this.
If one is splitting the debug information into a separate file, one could objcopy --add-section the source-file-section there instead.
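
A minimal sketch of the "generate a .c file" step, to make the idea concrete. Everything named here is an assumption for illustration and not part of the proposed patch: the section name `.source_id`, the tab-separated "local path, VCS path/version" payload, and the helper names. The generated variable carries GCC's `used` and `section` attributes so the compiler keeps it and places it in its own section.

```python
from typing import Mapping


def c_string_literal(text: str) -> str:
    """Return text as a C string literal, escaping anything not plain ASCII."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in '"\\?':
            # Quote, backslash, and '?' (to avoid trigraphs) get a backslash.
            out.append("\\" + ch)
        elif 0x20 <= byte < 0x7F:
            out.append(ch)
        else:
            # Three-digit octal escapes never swallow a following digit.
            out.append(f"\\{byte:03o}")
    return '"' + "".join(out) + '"'


def render_source_id_c(mapping: Mapping[str, str],
                       section: str = ".source_id") -> str:
    """Render a C file holding one string placed in `section`.

    `mapping` maps a local path to a VCS path/version string.
    The section name is hypothetical; pick whatever the gdb script expects.
    """
    lines = [
        "/* Generated file, do not edit. */",
        "static const char source_id[]"
        f' __attribute__((used, section("{section}"))) =',
    ]
    # Sorted so that unchanged inputs always give byte-identical output.
    for local_path in sorted(mapping):
        entry = f"{local_path}\t{mapping[local_path]}\n"
        lines.append("    " + c_string_literal(entry))
    if not mapping:
        lines.append('    ""')
    lines.append("    ;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_source_id_c({"/home/me/src/main.c": "//depot/src/main.c#12"}))
```

After compiling the generated file, something along the lines of `objcopy -O binary --only-section=.source_id vcsinfo.o source_id.bin` followed by `objcopy --add-section .source_id=source_id.bin app.dbg` would move the payload into the binary or the separate debug file (the file names are placeholders).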
There's no real "relinking" here.
I don't see the difference with what you're doing now.

> BTW, one thing to think about is that this system should produce reproducible results. The gcc way is that if your input files have not changed then you should get an identical output. With Perforce this automatically works -- the file version numbers only change if new versions of the files have been submitted. For VCS systems with a global version number (git, IIRC) some care must be taken. If the global version number is used to specify which versions of the files to retrieve then a check-in of an unrelated file will change the source-id information and make the binaries not match. I believe that would be bad. This doesn't affect Gerhard's code at all, but it does affect how a sample script to create the source-id information should work.
>
> Finally, I think that a new command-line option to objdump to dump *just* source paths would be helpful. objdump -Wl is too slow -- something better is needed and I could not find anything. The patch to objdump would be trivial. Or a stand-alone tool could be created, although that is less tempting.
>
> Thanks to Gerhard for pushing this through. I dream of a future where I can step through the code in (almost) any package and have symbols and source code magically show up on-demand.
>
> Bruce Dawson, Valve
>
> -----Original Message-----
> From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-owner@sourceware.org] On Behalf Of Doug Evans
> Sent: Monday, March 17, 2014 5:26 PM
> To: Gerhard Gappmeier
> Cc: gdb-patches
> Subject: Re: New feature "source-id"
>
> On Mon, Mar 17, 2014 at 12:01 PM, Gerhard Gappmeier wrote:
>>> > example vcsinfo.c:
>>> > /* this file was generated, bla bla, don't modify */
>>> > static const char vcs_type[] = "git";
>>> > static const char vcs_url[] = "git@github.com:gergap/source-id.git";
>>> > static const char vcs_version[] = "c2ec66e6a36451ba47422d186fd97311989ef278";
>>> I think it's weird to store this in .rodata instead of somewhere it
>>> can be easily stripped, especially if you plan on adding the sha1
>>> file hashes through this same mechanism, since that is a less
>>> constant size, though you did mention adding that to the debug info
>>> specifically.
>> I agree. That's a good point. I think we should stay with the original
>> idea of having a .note section. It is also more consistent with the build-id feature.
>
> I agree the consistency of .note is nice, but I wouldn't preclude people wanting something different.
> Getting something into a .note section may involve more build changes than some group may want to take on.
>
>> Another argument against adding this to the source might be code size.
>> For small programs on embedded devices memory matters, so saving these
>> strings would be a benefit. The .note section can be stripped and the
>> feature would still work with the "separate-debug-info" approach.
>
> Technically, even if the info was added to the source (so to speak), it needn't affect code size.
> I can imagine all of these (so called) global variables being put in a specific section which is put in a non-loadable segment.
>
> The solution in gdb needn't preclude any implementation, that is up to the script.
> So, assuming the community wants this feature, let's separate out how the source information is obtained from how gdb uses it.
>
> btw, If Python doesn't have a library for reading ELF files, it should.
> Thus we needn't hardcode anything about where the data lives into gdb
> - leave it to the externally supplied script.
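
To illustrate the last quoted point, leaving the lookup to an external script: a rough sketch of pulling one named section out of an ELF image using only the Python standard library. It assumes a 64-bit little-endian file and a hypothetical section name chosen by the caller; it ignores extended section numbering and compressed or NOBITS sections, so treat it as an outline and not a general ELF reader.

```python
import struct
from typing import Optional

# Elf64_Ehdr and Elf64_Shdr layouts, little-endian (both are 64 bytes).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
SHDR = struct.Struct("<IIQQQQIIQQ")


def read_elf_section(data: bytes, wanted: str) -> Optional[bytes]:
    """Return the raw contents of section `wanted`, or None if absent."""
    if len(data) < EHDR.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    fields = EHDR.unpack_from(data, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = (
        fields[6], fields[11], fields[12], fields[13])
    if e_shoff == 0 or e_shnum == 0:
        return None  # no section header table (e.g. fully stripped image)

    def section_header(index: int):
        return SHDR.unpack_from(data, e_shoff + index * e_shentsize)

    # The section-name string table is itself a section.
    strtab = section_header(e_shstrndx)
    names = data[strtab[4]:strtab[4] + strtab[5]]

    for index in range(e_shnum):
        header = section_header(index)
        name_start = header[0]
        name_end = names.index(b"\0", name_start)
        if names[name_start:name_end].decode("ascii", "replace") == wanted:
            offset, size = header[4], header[5]
            return data[offset:offset + size]
    return None


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as elf_file:
        payload = read_elf_section(elf_file.read(), sys.argv[2])
    print(payload)
```

Because the section name is just an argument, the same script works whether the data lives in the binary or in a separate debug file, which keeps that decision out of gdb itself.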