From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 9199 invoked by alias); 27 Aug 2008 15:22:00 -0000
Received: (qmail 9181 invoked by uid 22791); 27 Aug 2008 15:21:58 -0000
X-Spam-Check-By: sourceware.org
Received: from pas38-1-82-67-71-117.fbx.proxad.net (HELO siegfried.gbfo.org) (82.67.71.117) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 27 Aug 2008 15:21:26 +0000
Received: from erda.mds (localhost [127.0.0.1]) by siegfried.gbfo.org (8.13.6/8.13.6) with ESMTP id m7RFLMnc005508 for ; Wed, 27 Aug 2008 17:21:22 +0200
Received: from localhost (saffroy@localhost) by erda.mds (8.13.6/8.13.6/Submit) with ESMTP id m7RFLMKF005505 for ; Wed, 27 Aug 2008 17:21:22 +0200
Date: Thu, 28 Aug 2008 13:55:00 -0000
From: Jean-Marc Saffroy
To: gdb@sourceware.org
Subject: how could gdb handle truncated core files?
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-owner@sourceware.org
X-SW-Source: 2008-08/txt/msg00281.txt.bz2

Hi,

For now, gdb does not seem to be able to do anything useful with a truncated core file on Linux (i.e. what you get when your process dies and the core size limit is not 0 but less than the size of the process).

In a number of cases, I think it would be nice to be able to at least get a stack trace and examine local variables. That would only require a limited amount of data to be dumped by the kernel.

I'm curious what could be done to improve this situation, because I see two potential use cases:

- embedded systems developers: sometimes it's hard to find enough space to write your core file (e.g. the application uses 80% of your RAM, and your only writable filesystem is a tiny temporary RAM disk)
- parallel application developers on large clusters: sometimes you use a huge amount of RAM in a bunch of processes (e.g.
  an MPI parallel program), and dumping all that to your home directory will fill your disk quota and/or keep your file server busy for a very long time

In search of a solution, I patched my Linux kernel so that dumping a core starts with the segments that hold a stack (assuming the user stack pointers are valid): these segments then have a chance of being dumped before the core limit is reached.

This approach gives interesting results with a (very simple) single-threaded process. However, my attempts with a multithreaded process failed, like this:

    $ gdb
    GNU gdb 6.8
    This GDB was configured as "x86_64-unknown-linux-gnu"...
    Cannot access memory at address 0x2aaaaabc29c8
    (gdb) bt
    #0 0x00002aaaaabc9345 in ?? ()
    #1 0x00000000400179f0 in ?? ()
    #2 0x0000000000000000 in ?? ()

That is:

- gdb does not load symbols from binaries
- as a result, gdb does not detect threads (because IIRC libthread_db would be loaded when some libpthread.so symbols are detected in the process)
- the backtrace seems incorrect: with a "full" core dump, gdb shows the following stack trace:

    (gdb) bt
    #0 0x00002aaaaabc9345 in pthread_create@@GLIBC_2.2.5 () from /lib/libpthread.so.0
    #1 0x00000000004005c8 in main (argc=, argv=) at thrcore.c:24

So, I have the following questions for the community:

- what can I do (e.g. in my kernel patch) to have gdb load symbols from binaries?
- do you have any comments on my approach? (e.g. I *think* I've seen AIX produce small dumps, but I have no idea how they do it, whether it's a special file format, etc.)

Thanks for your comments!

Cheers,
Jean-Marc

-- 
saffroy@gmail.com
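
For illustration only (this is neither the kernel patch described above nor anything gdb does): a minimal sketch of how one might list which segments of a core file were cut off by the size limit. It assumes a 64-bit little-endian ELF core, as on x86_64 Linux, and only compares each program header's file range with the actual file size. The function name truncated_segments is hypothetical.

```python
import os
import struct
import sys

EHDR_SIZE = 64   # sizeof(Elf64_Ehdr)
PHDR_SIZE = 56   # sizeof(Elf64_Phdr)


def truncated_segments(path):
    """Return (index, p_type, p_vaddr, missing_bytes) for every program
    header whose file-backed data extends past the end of the file."""
    size = os.path.getsize(path)
    result = []
    with open(path, "rb") as f:
        ehdr = f.read(EHDR_SIZE)
        if len(ehdr) < EHDR_SIZE or ehdr[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        # e_ident[4] == 2 means ELFCLASS64, e_ident[5] == 1 means little-endian
        if ehdr[4] != 2 or ehdr[5] != 1:
            raise ValueError("only 64-bit little-endian ELF is handled")
        (e_phoff,) = struct.unpack_from("<Q", ehdr, 32)
        e_phentsize, e_phnum = struct.unpack_from("<HH", ehdr, 54)
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            raw = f.read(PHDR_SIZE)
            if len(raw) < PHDR_SIZE:
                break  # the program header table itself is cut off
            (p_type, _flags, p_offset, p_vaddr,
             _paddr, p_filesz, _memsz, _align) = struct.unpack("<IIQQQQQQ", raw)
            end = p_offset + p_filesz
            if end > size:
                # Bytes of this segment that never made it to disk.
                missing = min(p_filesz, end - size)
                result.append((i, p_type, p_vaddr, missing))
    return result


if __name__ == "__main__" and len(sys.argv) == 2:
    for idx, p_type, vaddr, missing in truncated_segments(sys.argv[1]):
        print("segment %d (type %d) at 0x%x: %d bytes missing"
              % (idx, p_type, vaddr, missing))
```

Run with the core file as the only argument, this prints one line per segment that is incomplete. It says nothing about which segments hold stacks or shared library data, so it only helps to see how much of the dump survived the limit.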