From mboxrd@z Thu Jan 1 00:00:00 1970
In-Reply-To: <8ac60eac0901260851o2a93a13di8a6b8c9cd4f8c15f@mail.gmail.com>
References: <74fef6df0901260724p188c5507x2cfa3a4283f6fd41@mail.gmail.com>
 <20090126154138.GA14406@caradoc.them.org>
 <8ac60eac0901260851o2a93a13di8a6b8c9cd4f8c15f@mail.gmail.com>
Date: Tue, 27 Jan 2009 13:09:00 -0000
Message-ID: <74fef6df0901270509s3d6d2075rb08989ea7e886823@mail.gmail.com>
Subject: Re: baffling assembly-level weirdness
From: Mathieu Lacage
To: Paul Pluzhnikov
Cc: gdb@sourceware.org
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2009-01/txt/msg00181.txt.bz2

On Mon, Jan 26, 2009 at 5:51 PM, Paul Pluzhnikov wrote:
>>> The following gdb session baffles me completely: %edx is reset to zero
>>> by the mov at address 0x0804ad62 instead of being set to the constant
>>> 0x804ad62. Of course, this code segfaults at $pc = 0x804ad68 when zero
>>> is dereferenced...
>>>
>>> Version: GNU gdb 6.8
>>>
>>> (gdb) disas $pc $pc+10
>>> Dump of assembler code from 0x804ad62 to 0x804ad6c:
>>> 0x0804ad62 : mov 0x805e3c0,%edx
>>
>> This is a load from memory at address 0x805e3c0, in x86 syntax.
>
> Additional clues:
>
> (gdb) p/a 0x805e3c0
>
> will likely print "stdout". If you break in main, and do
>
> (gdb) x/a 0x805e3c0
>
> it will likely print something like:
>
> 0x8053ac0 : 0x4dcdb5e0 <_IO_2_1_stdout_>
>
> It sounds like your program is corrupting stdout somewhere.
> The fastest way to find out where that happens:
>
> (gdb) watch *(int **)0x8053ac0

You did put me on the right track, so in case this is useful to
someone else, here is what was going on:

1) I am writing an ELF loader, so the bug was not my program corrupting
the variable: my loader was failing to initialize the variable correctly
during relocation.

2) libc has a global stdout variable. libc's own GOT slot for it is
filled by an R_386_GLOB_DAT relocation, and the variable itself is
initialized by an R_386_32 relocation against _IO_2_1_stdout_:

mathieu@mathieu-boulot:~/code/elf-loader$ readelf -s /lib/libc.so.6|grep stdout@@
  1063: 00135860     4 OBJECT  GLOBAL DEFAULT   32 stdout@@GLIBC_2.0
mathieu@mathieu-boulot:~/code/elf-loader$ readelf -r /lib/libc.so.6|grep 00135860
00134f34  00042706 R_386_GLOB_DAT    00135860   stdout
00135860  00032d01 R_386_32          001354e0   _IO_2_1_stdout_

3) the main executable has a global stdout variable of its own, which is
referenced by the code in the main binary and initialized by an
R_386_COPY relocation:

mathieu@mathieu-boulot:~/code/elf-loader$ readelf -s /bin/ls|grep stdout@
    93: 0805e3c0     4 OBJECT  GLOBAL DEFAULT   25 stdout@GLIBC_2.0 (2)
mathieu@mathieu-boulot:~/code/elf-loader$ readelf -r /bin/ls|grep 0805e3c0
0805e3c0  00005d05 R_386_COPY        0805e3c0   stdout

This relocation is expected to copy the value of the stdout symbol from
libc.so.6 into the executable's variable.

4) it turns out that the symbol lookup associated with an R_*_COPY
relocation is supposed to skip the main executable, which is something
I had no idea about. Adding an extra if statement to ignore the main
executable during symbol resolution for an R_*_COPY reloc fixed the
problem.

As a side note, I really wonder about (3): none of the executables I
link myself contain a stdout variable, so I am somewhat curious where
this one comes from (I would expect each access to stdout from the main
binary to reference the libc symbol directly through the GOT). But,
well, next time.

Anyway, thanks a lot to the very helpful souls here,
Mathieu
-- 
Mathieu Lacage
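
For readers who hit the same problem, the fix described in point 4 can be sketched roughly as below. This is a toy model, not the code of the loader discussed in this message: struct sym, struct object, lookup(), apply_copy_reloc() and the skip_main flag are all hypothetical names, and the symbol tables are plain arrays rather than real ELF .dynsym sections. It assumes the main executable is the first entry in the lookup scope. Without skip_main, the lookup would find the executable's own stdout first, so the copy would read from its own destination, which would explain the zero seen in %edx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one defined symbol. */
struct sym {
    const char *name;
    void *addr;      /* where the symbol's data lives once loaded */
    size_t size;
};

/* Hypothetical, simplified view of one loaded object. */
struct object {
    const char *path;
    const struct sym *syms;
    size_t nsyms;
};

/* Search the scope in load order. scope[0] is assumed to be the main
 * executable; skip_main makes the search start at the first library. */
const struct sym *lookup(const struct object *scope, size_t n,
                         const char *name, int skip_main)
{
    for (size_t i = skip_main ? 1 : 0; i < n; i++)
        for (size_t j = 0; j < scope[i].nsyms; j++)
            if (strcmp(scope[i].syms[j].name, name) == 0)
                return &scope[i].syms[j];
    return NULL;
}

/* Apply a COPY-style relocation: find the definition outside the main
 * executable and copy its current contents into dst.
 * Returns 0 on success, -1 if no library defines the symbol. */
int apply_copy_reloc(const struct object *scope, size_t n,
                     const char *name, void *dst, size_t size)
{
    assert(n > 0); /* the scope always contains the main executable */
    const struct sym *s = lookup(scope, n, name, 1 /* skip_main */);
    if (s == NULL)
        return -1;
    memcpy(dst, s->addr, size < s->size ? size : s->size);
    return 0;
}
```

Note that the library's own copy of the variable has to be relocated before the copy is made, otherwise the copy picks up an uninitialized value; the sketch leaves that ordering to the caller.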