From: Pedro Alves
Date: Wed, 28 Oct 2015 15:43:00 -0000
To: Oleg Nesterov
CC: Denys Vlasenko, Denys Vlasenko, Andrew Morton, Dmitry Vyukov,
 Alexander Potapenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
 Kees Cook, Kostya Serebryany, Linus Torvalds,
 "Michael Kerrisk (man-pages)", Robert Swiecki, Roland McGrath,
 syzkaller@googlegroups.com, Linux Kernel Mailing List,
 "gdb@sourceware.org"
Subject: Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
Message-ID: <5630ED16.50900@redhat.com>
References: <20151020171740.GA29290@redhat.com> <20151020171754.GA29304@redhat.com>
 <20151020153155.e03f4219da4014efe6f810b0@linux-foundation.org> <5627EE9E.8040600@redhat.com>
 <5627F607.4050506@redhat.com> <20151021214703.GA1810@redhat.com>
 <20151025155440.GB2043@redhat.com> <562E17D8.4000108@redhat.com>
 <20151028161152.GA24042@redhat.com>
In-Reply-To: <20151028161152.GA24042@redhat.com>

On 10/28/2015 04:11 PM, Oleg Nesterov wrote:
> On 10/26, Pedro Alves wrote:
>>
>> On 10/25/2015 03:54 PM, Oleg Nesterov wrote:
>>>
>>> In any case, the real question is whether we should change the kernel to
>>> fix the problem, or ask the distros to fix their init's. In the former
>>> case 1/2 looks simpler/safer to me than the change in ptrace_traceme(),
>>> and you seem to agree that 1/2 is not that bad.
>>
>> A risk here seems to be that waitpid will start returning unexpected
>> (thread) PIDs to parent processes,
>
> I don't see how this change can make the things worse,
>
>> and it's not unreasonable to assume
>> that e.g., a program asserts that waitpid either returns error or a
>> known (process) PID.
>
> Well. /sbin/init can never assume this, obviously.

Right.  I was actually thinking of !init processes -- basically
code that spawns helper processes, keeps a data structure
indexed by pid, then discards the structure when the child exits.
Something like:

    pid = waitpid(-1, &status, 0);
    if (pid > 0) {
        struct child_process *child = find_process(pid);
        assert (child != NULL);
    }

As in, before your change, the child could get stuck forever, but
after your change, the parent could die/assert instead.  But ...

>
>> That's not an init-only issue,
>
> Yes. Because we have CLONE_PARENT. So "waitpid either returns error or a
> known (process) PID" is only true if you trust your children.

... OK, that's indeed a good point.
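
For illustration, here is a minimal sketch of a parent that tolerates an
unknown PID from waitpid instead of asserting on it. It is not code from
the patch or from any real program: the table layout, MAX_CHILDREN,
track_process() and reap_one() are hypothetical names made up for this
example, and find_process() only mirrors the name used in the snippet
above.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16   /* hypothetical fixed-size table, for brevity */

/* Hypothetical bookkeeping record; pid == 0 marks a free slot. */
struct child_process {
    pid_t pid;
};

struct child_process children[MAX_CHILDREN];

/* Look up a tracked child by pid; returns NULL if it is not in the table. */
struct child_process *find_process(pid_t pid)
{
    if (pid <= 0)
        return NULL;
    for (size_t i = 0; i < MAX_CHILDREN; i++)
        if (children[i].pid == pid)
            return &children[i];
    return NULL;
}

/* Record a freshly spawned child; returns 0 on success, -1 if the table is full. */
int track_process(pid_t pid)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].pid == 0) {
            children[i].pid = pid;
            return 0;
        }
    }
    return -1;
}

/*
 * Reap one terminated child.
 * Returns  1 if a tracked child was reaped and its slot released,
 *          0 if waitpid returned a PID we never tracked (ignored),
 *         -1 if waitpid failed (for example ECHILD: nothing left to wait for).
 */
int reap_one(void)
{
    int status;
    pid_t pid = waitpid(-1, &status, 0);

    if (pid < 0)
        return -1;

    struct child_process *child = find_process(pid);
    if (child == NULL)
        return 0;       /* unexpected PID: tolerate it instead of asserting */

    child->pid = 0;     /* discard the bookkeeping entry */
    return 1;
}
```

The only difference from the asserting version is the NULL branch: an
unexpected PID is reaped and dropped, so a child that uses CLONE_PARENT
cannot take the parent down.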
>> (Also, in the original test case, if the child gets/raises a signal or execs
>> before exiting, the bash/init/whatever process won't be issuing PTRACE_CONT,
>> and the child will thus end up stuck (though should be SIGKILLable,
>
> Oh, but if it is killable everything is fine. How does this differ from the
> case when, say, you just reparent to init and do kill(getpid(), SIGSTOP) ?

The difference is that if the child called PTRACE_TRACEME, then it goes
to ptrace-stop instead and no amount of SIGCONT unsticks it -- the only
way out is force killing.  I agree it's not a major issue as there's a
way out (which is why I put it in parentheses), but I wouldn't call it
nice either.

>> All this because PTRACE_TRACEME is broken by design
>
> Heh. I agree. But we can't fix it now.

Perhaps the man page could document it as deprecated, suggesting
PTRACE_ATTACH/PTRACE_SEIZE instead?

Thanks,
Pedro Alves
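
The ptrace-stop behaviour described above can be observed with a small
Linux-only sketch along these lines. The helper name
traceme_child_ignores_sigcont() is hypothetical, and the 100 ms grace
period is an arbitrary assumption, so a positive result is an indication
and not a proof. It also assumes the environment permits PTRACE_TRACEME;
where it does not, the helper reports -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* expose kill() and usleep() under strict -std modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the thread. Returns:
 *   1  if the PTRACE_TRACEME child stayed stopped after SIGCONT and was
 *      only released by SIGKILL,
 *   0  if the child behaved differently,
 *  -1  if fork() or PTRACE_TRACEME is unavailable in this environment.
 */
int traceme_child_ignores_sigcont(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced by the parent, then raise a stop signal. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);   /* traced, so this ends in ptrace-stop */
        _exit(0);         /* reached only if something resumes the child */
    }

    /* Parent: a traced child's stop is reported even without WUNTRACED. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;        /* child exited or died: tracing was not set up */

    /* The parent never issues PTRACE_CONT; try SIGCONT instead. */
    kill(pid, SIGCONT);
    usleep(100 * 1000);   /* arbitrary grace period for the child to react */
    int still_stuck = (waitpid(pid, &status, WNOHANG) == 0);

    /* Without a cooperating tracer, SIGKILL is the way out. */
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid)
        return 0;

    return still_stuck && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

A parent that does act as a tracer would call ptrace(PTRACE_CONT, pid,
NULL, NULL) at the point where this sketch sends SIGCONT.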