From: Paul Smith <psmith@gnu.org>
To: gdb@sourceware.org
Subject: Partial cores using Linux "pipe" core_pattern
Date: Mon, 18 May 2009 01:23:00 -0000
Message-ID: <1242609756.2800.135.camel@homebase.localnet>

I'm not sure this is the best list for this question; if anyone has
thoughts on a better place to ask, please let me know.

I'm having problems debugging some cores generated on a distributed
system. The "client" (where the cores are being dumped) runs a cut-down
GNU/Linux system out of a ramdisk (no local disk). To preserve cores I
have set up NFS and automount, and I'm dumping cores over the network to
a host. To make this as efficient as possible I am using the Linux
kernel's (I'm running 2.6.27) pipe capability in core_pattern, piping
the core to my own program, which writes compressed output using
gzopen() etc. I also have some locking and other work to do myself,
which is why I have my own program instead of just piping to gzip.
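
For illustration, here is a minimal sketch of the kind of handler described above. It is written in Python rather than the C/zlib program the message implies, and it is not the author's code: the name `save_core`, the chunk size, and the output path argument are all assumptions. It reads the pipe until EOF, tolerates short reads, and returns the number of uncompressed bytes it saw so that the count can be logged.

```python
import gzip
import os

CHUNK = 64 * 1024  # assumed read size; any positive value works


def save_core(in_fd: int, out_path: str) -> int:
    """Copy everything readable from in_fd into a gzip file at out_path.

    Returns the number of uncompressed bytes read before EOF.
    """
    total = 0
    with gzip.open(out_path, "wb") as out:
        while True:
            # os.read() may return fewer bytes than requested; that is
            # normal for a pipe and is not an error or an EOF.
            chunk = os.read(in_fd, CHUNK)
            if not chunk:
                # An empty result means EOF: the writing side closed the pipe.
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

A handler built on this would call `save_core(0, path)` to read the core from stdin and then log the returned count. Note that EOF only tells the reader that the writer closed the pipe; it does not say whether the writer finished or gave up early.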

Most of the time this works great; the core appears on the host and I
can decompress it and debug it, and it's very nice.

But sometimes the core is truncated and can't be debugged. It contains
the first part of the core file without error (I've seen sizes of both
64K(!) and about 65M), but with the whole last part of the core missing
you obviously can't even get a backtrace. However, it's still a valid
compressed file (it decompresses just fine), so it's not a network
error. After some experimentation I've determined that the generated
core file does contain all the data that my program read from the
kernel... in this situation, it appears, the kernel simply doesn't give
me all the data needed to construct the core.
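
One way to confirm truncation without loading the core into gdb is to compare the file's size against what its ELF program headers promise. The following is a sketch under stated assumptions: it relies on the standard ELF32/ELF64 header layouts, the function name `expected_core_size` is hypothetical, and it does not handle the extended program header count (e_phnum of 0xffff).

```python
import struct


def expected_core_size(header: bytes) -> int:
    """Return the minimum file size implied by the ELF program headers.

    `header` must hold the start of the file, including the whole
    program header table; struct.error is raised if it is too short.
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = header[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", header, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", header, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", header, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", header, 0x2A)

    # The file must at least contain the program header table itself.
    end = e_phoff + e_phentsize * e_phnum
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        if is_64:
            (p_offset,) = struct.unpack_from(endian + "Q", header, base + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", header, base + 32)
        else:
            (p_offset,) = struct.unpack_from(endian + "I", header, base + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", header, base + 16)
        # Each segment occupies [p_offset, p_offset + p_filesz) in the file.
        end = max(end, p_offset + p_filesz)
    return end
```

If the decompressed core is smaller than the value returned for its own leading bytes, the dump was cut short before all segments were written.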

I've instrumented every single function with error checking and with
logging to syslog (including informational messages, so I know the
logging works), and no errors are reported. The amount of core data I
get from read(2)'ing stdin is simply short, yet read(2) never fails or
reports any error!

Does anyone have any thoughts about where I can look next to figure out
what's going on? Any ideas about, or knowledge of, limitations in the
kernel's core_pattern pipe capability (timing issues, for example) that
might be leaving me with short cores?

I'm pretty stumped here!

--
-------------------------------------------------------------------------------
 Paul D. Smith <psmith@gnu.org>          Find some GNU make tips at:
 http://www.gnu.org                      http://make.mad-scientist.us
"Please remain calm...I may be mad, but I am a professional." --Mad Scientist