Message-Id: <200804282253.m3SMrwQF005602@d12av02.megacenter.de.ibm.com>
Subject: Shared library call problems on PowerPC with current binutils/gdb
To: gdb-patches@sourceware.org
Date: Tue, 29 Apr 2008 02:38:00 -0000
From: "Ulrich Weigand"
Cc: bauerman@br.ibm.com, amodra@bigpond.net.au

Hello,

using current binutils and gdb
head on powerpc-linux, you see the following quite annoying
effect(s) with shared library calls.

I'm starting out with a simple "hello world" program (compiled with
-ffreestanding to avoid the compiler optimizing the printf to puts),
and the following debugging session:

>GNU gdb 6.8.50.20080408
>Copyright (C) 2008 Free Software Foundation, Inc.
>License GPLv3+: GNU GPL version 3 or later
>This is free software: you are free to change and redistribute it.
>There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>and "show warranty" for details.
>This GDB was configured as "powerpc64-linux"...
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n

Hmmm.  Ok, maybe because libc is not loaded yet ...

>(gdb) start
>Breakpoint 1 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6         printf ("Hello, world!\n");
>(gdb) info sharedlibrary
>From        To          Syms Read   Shared Object Library
>0x0ffc1960  0x0ffda700  Yes         /lib/ld.so.1
>0x0fe3db20  0x0ff5e230  Yes         /lib/libc.so.6
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n

Well, libc is definitely loaded now, but I still cannot set a
breakpoint ...  Maybe stepping in works?

>(gdb) s
>0x10000800 in call___do_global_ctors_aux ()

Not really.

>(gdb) start
>The program being debugged has been started already.
>Start it from the beginning? (y or n) y
>Breakpoint 2 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6         printf ("Hello, world!\n");
>(gdb) n
>0x10000800 in call___do_global_ctors_aux ()

Huh.  Not even stepping over works ...

>(gdb) bt
>#0  0x10000800 in call___do_global_ctors_aux ()
>#1  0x0fe3de0c in generic_start_main () from /lib/libc.so.6
>#2  0x0fe3e060 in __libc_start_main () from /lib/libc.so.6
>#3  0x00000000 in ?? ()

... and I guess that's the reason why.
>(gdb) n
>Single stepping until exit from function call___do_global_ctors_aux,
>which has no line number information.
>0x0ffd544c in _dl_runtime_resolve () from /lib/ld.so.1
>(gdb)
>Single stepping until exit from function _dl_runtime_resolve,
>which has no line number information.
>0x0fe75820 in printf@@GLIBC_2.4 () from /lib/libc.so.6
>(gdb)
>Single stepping until exit from function printf@@GLIBC_2.4,
>which has no line number information.
>Hello, world!
>main () at hello.c:7
>7       }

Also, it's quite tedious to get back.

So, what's going on here?  It looks like a combination of multiple
different problems.

1) Setting a breakpoint on a shared library function (that is called
by the main program) before libraries are loaded used to work because
of the so-called "solib trampoline" minimal symbols.  These are
generated by GDB when it finds an *undefined* symbol in the dynamic
symbol table with a *non-zero* value.  Those are a special "hack" used
by the linker to implement function pointer comparison correctly;
their "value" points to the PLT call stub in the main executable used
to call the shared library function.

However, over time BFD has been optimized to only use this hack when
it is actually necessary, i.e. when the symbol is in fact used for
purposes of function pointer comparison.  In simple cases like this
one, where the function is just called, the value of the undefined
symbol is now always 0 when using current binutils.

On the other hand, BFD now provides "synthetic symbols" that point to
those same PLT call stubs (on many targets).  In fact, the synthetic
symbol "printf@plt" is actually defined.  However, elfread.c does not
consider this to be a "solib trampoline" symbol for printf.

Even if it did, there is an additional complication on PowerPC: when
using the new-style "secure" PLTs, the "printf@plt" entry point
actually points to a *data* variable holding a pointer to the "glink"
stub, not the original PLT call stub itself.
This could still work, as ppc-linux-tdep.c actually contains code to
treat this as a case of "function descriptors" and would resolve to
the real target.  However, "break printf" actually still wouldn't work
because linespec.c:minsym_found does not handle function
descriptors ...

2) What about when libc is already loaded?  Why is printf still not
found?  This is because the (static) symbol table of libc.so on
current powerpc systems does not contain a symbol "printf", only
"printf@@GLIBC_2.4" and "printf@GLIBC_2.0".  The two versions exist
because symbol versioning is needed to handle both 128-bit and 64-bit
long double types.

Now, the *dynamic* table *does* contain "printf" (twice, with
different version information), but elfread.c ignores the dynamic
table "as the dynamic symbol table is usually a subset of the main
symbol table."

Note that even if full debug information for libc.so is available, we
do not get a debug symbol "printf" either -- the two entry points are
called __printf and __nldbl_printf in the original source, and that's
what debug symbols show.

3) As to stepping in and/or over the printf call, the primary reason
why this doesn't work is that unwinding breaks.  The immediate target
of the call instruction is a PLT call stub; these stubs are part of
the .text section (when using the secure PLT scheme) and have no
symbols.

The immediately preceding symbol happens to be
call___do_global_ctors_aux from GCC's crtend.o, which is compiled with
-finhibit-size-directive, so that function is considered by GDB to
span until the end of .text.  Thus prolog parsing of
call___do_global_ctors_aux detects building of a stack frame, and GDB
assumes that the PLT call stubs have one -- which they really don't.

4) Even if that worked, stepping *into* the call would require support
for gdbarch_skip_trampoline_code, and the current ppc32 implementation
of that (for secure PLTs) simply calls find_solib_trampoline_target --
which requires the old-style "solib trampoline" symbols to work.
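
As a rough illustration of points 1) and 2), here is a small Python
sketch.  It is not GDB or BFD code: symbol tables are modelled as plain
lists of the names quoted above, and the helpers base_name,
lookup_exact and lookup_by_base are made up for this example.  It only
shows why an exact lookup of "printf" comes back empty while a lookup
that ignores the version or "@plt" suffix finds candidates.

```python
# Illustrative model only: symbol tables are plain lists of names.
# The names come from the discussion above; the helpers are hypothetical
# and do not correspond to functions in GDB or BFD.

LIBC_STATIC_SYMBOLS = ["printf@@GLIBC_2.4", "printf@GLIBC_2.0"]
EXECUTABLE_SYNTHETIC_SYMBOLS = ["printf@plt"]


def base_name(symbol):
    """Return the name with any '@VERSION', '@@VERSION' or '@plt' suffix removed."""
    return symbol.split("@", 1)[0]


def lookup_exact(table, name):
    """Match only symbols whose full name equals `name`."""
    return [sym for sym in table if sym == name]


def lookup_by_base(table, name):
    """Match symbols whose base name equals `name`, ignoring any suffix."""
    return [sym for sym in table if base_name(sym) == name]


if __name__ == "__main__":
    all_symbols = LIBC_STATIC_SYMBOLS + EXECUTABLE_SYNTHETIC_SYMBOLS
    print(lookup_exact(all_symbols, "printf"))
    print(lookup_by_base(all_symbols, "printf"))
```

Matching on the base name returns both versioned libc definitions, so
a breakpoint on the plain name would have to cover more than one
location.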

To fix this, I'd propose the following implementation steps:

- Extend elf_symtab_read to treat a synthetic symbol XXX@plt as a
  mst_solib_trampoline symbol for XXX.

- Change elf_symtab_read to register symbols that carry a version name
  simply under their base name.

- Include something along the lines of Markus' "multiply-defined
  symbol" patch so that "break printf" will break on *both*
  definitions (created by the two symbols with different version
  names) after libc has been loaded.

- Teach minsym_found about function descriptors.

- Add extra unwinders to ppc-linux-tdep that recognize the various PLT
  call and glink stubs, and properly treat them as frameless.  (This
  will probably require code reading ...)

- Extend ppc_skip_trampoline_code to likewise handle those stubs.

Does this look reasonable?  Am I overlooking anything?

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com