From: Paul Marquess <Paul.Marquess@owmobility.com>
To: Dmitry Samersoff <dms@samersoff.net>,
"gdb@sourceware.org" <gdb@sourceware.org>
Subject: RE: collecting data from a coring process
Date: Thu, 08 Sep 2016 17:12:00 -0000
Message-ID: <CY1PR0501MB1178644749C83A5EF1A6D1E295FB0@CY1PR0501MB1178.namprd05.prod.outlook.com>
In-Reply-To: <d327752e-89f6-c5a3-6d72-4789b106e1f6@samersoff.net>
From: Dmitry Samersoff [mailto:dms@samersoff.net]
> Paul,
>
> > Thanks, will take a look at that. When you say "more or less safely",
> > I'm reading that as saying there will be issues with it. :-)
>
> I don't know of a way to do anything with a crashing process with 100% reliability. Even a core dump.
Agree.
> Custom code in signal handler doesn't make the situation worse.
I'd rephrase that to say that if you are careful and know all the limitations of what is possible in a signal handler, you won't make the situation worse.
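
A minimal sketch of the kind of handler being discussed, to make the limitations concrete. It assumes a POSIX system; the state variables and function names (g_phase_id, g_requests_served, crash_state_set, format_decimal, install_crash_handler) are hypothetical and not taken from any real application. The handler only reads state that normal code recorded earlier, formats numbers by hand, and calls nothing beyond write(), signal-safe by POSIX, plus raise().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical application state. Normal code updates it; the handler
 * only reads it, so the handler needs no locks and no allocation. */
static volatile sig_atomic_t g_phase_id = 0;
static volatile sig_atomic_t g_requests_served = 0;

void crash_state_set(int phase_id, int requests_served)
{
    assert(phase_id >= 0 && requests_served >= 0);
    g_phase_id = phase_id;
    g_requests_served = requests_served;
}

/* Format value as decimal without printf (not async-signal-safe).
 * Writes at most cap characters, no NUL terminator, and returns the
 * number of characters written. */
size_t format_decimal(char *buf, size_t cap, unsigned long value)
{
    char tmp[3 * sizeof value];
    size_t n = 0, out = 0;

    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0 && n < sizeof tmp);

    while (n > 0 && out < cap)
        buf[out++] = tmp[--n];
    return out;
}

/* write() is async-signal-safe; loop to cope with short writes. */
static void safe_write(const char *s, size_t len)
{
    while (len > 0) {
        ssize_t n = write(STDERR_FILENO, s, len);
        if (n <= 0)
            return;
        s += n;
        len -= (size_t)n;
    }
}

static void crash_handler(int signo)
{
    char num[32];

    safe_write("fatal signal ", 13);
    safe_write(num, format_decimal(num, sizeof num, (unsigned long)signo));
    safe_write(", phase ", 8);
    safe_write(num, format_decimal(num, sizeof num, (unsigned long)g_phase_id));
    safe_write(", requests served ", 18);
    safe_write(num, format_decimal(num, sizeof num,
                                   (unsigned long)g_requests_served));
    safe_write("\n", 1);

    /* SA_RESETHAND has already restored the default disposition, so
     * re-raising lets the default action (terminate, possibly dump
     * core) take place once the handler returns. */
    raise(signo);
}

/* Returns 0 on success, -1 if a handler could not be installed. */
int install_crash_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGABRT, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

The application would call install_crash_handler() once at startup and crash_state_set() at points worth recording. If the default action must be avoided entirely, ending the handler with _exit() instead of raise() is one option, at the cost of losing the usual termination status.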
> For complicated apps, the crash is quite often the result of something that happened long before the crash point. E.g. when you see memory corruption, you are typically interested in where the memory was corrupted, not where the app hit the corrupted memory.
Tell me about it. Memory corruption errors can be impossible to track down.
> So signal handlers that know the application's data structures and can print meaningful information are quite useful and save a lot of debugging time.
>
> Also, it might be necessary to free some resources before the process starts dumping core, to allow a faster restart.
>
> > Trouble is I soon will not allow a core file to be written -- the
> > process is reaching a size where I cannot allow it to be out of action
> > for the amount of time it takes to write that to disk.
>
> One possible solution is to add a keep-alive protocol between child and parent (e.g. the child keeps touching a file on disk or sending UDP packets). If the keep-alive doesn't arrive in time, the parent considers the child dead, sends it an abort, and starts a new process.
>
> This solution also covers the situation where a child process hangs or deadlocks.
Luckily I already have a health check probe that does that.
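
For reference, a rough sketch of the keep-alive idea described above, assuming a POSIX system. It uses a pipe between parent and child as a stand-in for the file-touching or UDP variants mentioned; the function names (send_heartbeat, wait_for_heartbeat, check_child) are hypothetical, and this is not the health check probe referred to here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Child side: send one heartbeat byte on fd.
 * Returns 0 on success, -1 on error. */
int send_heartbeat(int fd)
{
    const char beat = 'k';
    ssize_t n;

    do {
        n = write(fd, &beat, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

/* Parent side: wait up to timeout_ms for a heartbeat on fd.
 * Returns 1 if one arrived, 0 on timeout, and -1 on error or if the
 * child closed its end. An interrupted poll() is restarted with the
 * full timeout, which is good enough for a sketch. */
int wait_for_heartbeat(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[64];
    ssize_t n;
    int rc;

    assert(timeout_ms >= 0);
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc <= 0)
        return rc;              /* 0: timeout, -1: poll error */

    n = read(fd, buf, sizeof buf);  /* drain any queued heartbeats */
    return n > 0 ? 1 : -1;
}

/* One supervision step: if no heartbeat arrives in time, treat the
 * child as dead or hung and send it SIGABRT.
 * Returns 1 if the child is alive, 0 if it was signalled. */
int check_child(pid_t child, int fd, int timeout_ms)
{
    if (wait_for_heartbeat(fd, timeout_ms) == 1)
        return 1;
    kill(child, SIGABRT);
    return 0;
}
```

The parent would call check_child() in a loop and, when it returns 0, reap the child with waitpid() and fork a replacement. Sending SIGKILL instead of SIGABRT is an alternative when a core dump is not wanted.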
> -Dmitry