From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Partial cores using Linux "pipe" core_pattern
From: Paul Smith <psmith@gnu.org>
Reply-To: psmith@gnu.org
To: gdb@sourceware.org
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Date: Mon, 18 May 2009 01:23:00 -0000
Message-Id: <1242609756.2800.135.camel@homebase.localnet>
Mime-Version: 1.0
X-SW-Source: 2009-05/txt/msg00109.txt.bz2

I'm not sure this is the best list for this question; if anyone can
suggest a better place to ask, please let me know.

I'm having problems debugging some cores being generated on a
distributed system.  The "client" (where the cores are being dumped) is
running on a cut-down GNU/Linux system, running out of a ramdisk (no
local disk).  To preserve cores I have set up NFS and automount, and I'm
dumping cores over the network to a host.  To make this as efficient as
possible I'm using the Linux kernel's (I'm running 2.6.27) pipe
capability in core_pattern to pipe each core to my own program, which
writes compressed output using gzopen()/etc.  I also have some locking,
etc. to do myself, which is why I use my own program instead of just
piping to gzip.
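
As a rough illustration of the setup described above, the core of such a
handler is a loop that copies stdin into a gzip stream until EOF.  This
is only a minimal sketch and not the actual program: it uses Python's
gzip module rather than gzopen(), it leaves out the locking entirely,
and the names compress_stream and main, the chunk size, and taking the
output path from the first argument are all assumptions made for the
example.

```python
import gzip
import sys

CHUNK = 64 * 1024  # arbitrary read size, chosen only for illustration


def compress_stream(src, dst_path, chunk=CHUNK):
    """Copy a binary stream into a gzip file until EOF.

    Returns the number of uncompressed bytes read from src, so the
    caller can log it and compare it with the size it expected.
    """
    total = 0
    with gzip.open(dst_path, "wb") as out:
        while True:
            buf = src.read(chunk)
            if not buf:  # an empty read means EOF on the pipe
                break
            out.write(buf)
            total += len(buf)
    return total


def main(argv):
    # Hypothetical invocation: <handler> <output-path>.  When run from
    # core_pattern, the kernel supplies the core image on stdin.
    import syslog  # Unix-only standard library module

    total = compress_stream(sys.stdin.buffer, argv[1])
    syslog.syslog(syslog.LOG_INFO, "core handler read %d bytes" % total)
    return 0
```

A handler like this is registered by writing a line that starts with
"|" followed by the program's absolute path to
/proc/sys/kernel/core_pattern.  Logging the byte count on every dump
gives a number to compare against the size the core should have had.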

Most of the time this works great; the core appears on the host and I
can decompress it and debug it and it's very nice.

But sometimes the core is truncated and can't be debugged.  The first
part of the core file is intact (I've seen truncated sizes of both
64K(!) and about 65M), but with the whole last part of the core missing
you can't even get a backtrace.  The result is still a valid compressed
file (it decompresses just fine), so it's not a network error.  After
some experimentation I've determined that the generated core file does
contain all the data my program read from the kernel; it appears that in
these cases the kernel simply doesn't give me all the data needed to
construct the core.
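
One way to quantify the truncation is to compare the size of a
decompressed core with the size its own ELF program headers imply: each
header records a file offset and length, so the largest offset plus
length is the minimum size of a complete file.  The sketch below is
hedged accordingly: the function names expected_core_size and
missing_bytes are made up for the example, it assumes the program
header table itself survived the truncation, and it does not handle the
extended-numbering case where e_phnum is 0xffff.

```python
import os
import struct


def expected_core_size(path):
    """Return the minimum file size implied by the ELF program headers."""
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        is64 = ident[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        end = "<" if ident[5] == 1 else ">"   # EI_DATA: 1 = little-endian
        hdr_fmt = end + ("HHIQQQIHHHHHH" if is64 else "HHIIIIIHHHHHH")
        raw = f.read(struct.calcsize(hdr_fmt))
        if len(raw) < struct.calcsize(hdr_fmt):
            raise ValueError("truncated inside the ELF header")
        hdr = struct.unpack(hdr_fmt, raw)
        e_phoff, e_phentsize, e_phnum = hdr[4], hdr[8], hdr[9]

        # Field order differs between the 32-bit and 64-bit layouts.
        ph_fmt = end + ("IIQQQQQQ" if is64 else "IIIIIIII")
        off_i, size_i = (2, 5) if is64 else (1, 4)  # p_offset, p_filesz

        need = e_phoff + e_phnum * e_phentsize  # the header table itself
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            raw = f.read(struct.calcsize(ph_fmt))
            if len(raw) < struct.calcsize(ph_fmt):
                raise ValueError("truncated inside the program header table")
            ph = struct.unpack(ph_fmt, raw)
            need = max(need, ph[off_i] + ph[size_i])
    return need


def missing_bytes(path):
    """Return how many bytes short the file is (0 if it looks complete)."""
    return max(0, expected_core_size(path) - os.path.getsize(path))
```

Running missing_bytes() on each decompressed core shows how much data
never arrived, which can then be lined up against what the handler
logged at dump time.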

I've instrumented every single function call with error checks that log
any problems to syslog (including informational messages so I know the
logging works) and no errors are logged.  The amount of core data I get
from read(2)'ing stdin is simply short, yet read(2) never fails or
reports any error!

Does anyone have any thoughts about where I can look next to try to
figure out what's going on?  Does anyone know of limitations in the
kernel's core_pattern pipe capability, such as timing issues, that
might be leaving me with short cores?

I'm pretty stumped here!

-- 
-------------------------------------------------------------------------------
 Paul D. Smith <psmith@gnu.org>          Find some GNU make tips at:
 http://www.gnu.org                      http://make.mad-scientist.us
 "Please remain calm...I may be mad, but I am a professional." --Mad Scientist