Message-ID: <55A3F626.7050409@redhat.com>
Date: Mon, 13 Jul 2015 17:32:00 -0000
From: Pedro Alves
To: Yao Qi, gdb-patches ml
Subject: Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
In-Reply-To: <55A3E23C.8020101@gmail.com>

On 07/13/2015 05:07 PM, Yao Qi wrote:
> Hi Pedro,
> do you still remember why did you add this assert?
> It wasn't mentioned in the mail
> https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html
>

Simply because getting here was supposed to indicate something went
wrong elsewhere, but at the time I didn't consider that the child
could die while ptrace-stopped.

> I am looking at a GDBserver internal error on x86_64 when I run
> gdb.threads/thread-unwindonsignal.exp with GDBserver,
>
> continue^M
> Continuing.^M
> warning: Remote failure reply: E.No unwaited-for children left.^M
> PC register is not available^M
> (gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
> Remote debugging from host 127.0.0.1^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> monitor exit^M
> Killing process(es): 30694^M
> (gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A
> problem internal to GDBserver has been detected.^M
> kill_wait_lwp: Assertion `res > 0' failed.
>
> After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
> GDBserver starts to swallows errors if the LWP is gone.  Then, when
> GDBservers kills non-exist LWP, the assert will be triggered.
>

Looks like I forgot to push the rest of that series:

  https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html

What do you think of that one?

> Why don't we implement kill_wait_lwp like its counterpart in GDB
> linux-nat.c:kill_wait_callback?  we can loop and assert like this
> patch below, (note that this patch fixes the internal error, and
> the FAIL is still there).
>

Seems to me it's not 100% correct to waitpid the pid one more time
after we've already reaped it, because there's a minuscule chance
another process that we're debugging could clone a new lwp that
reuses the PID of the one we've just killed/reaped, and then another
iteration could collect the initial SIGSTOP of the wrong LWP and
we'd kill it:

 -> kill (pid1, SIGKILL);
 <- waitpid (pid1) returns pid1/WSIGNALLED
 -> on iteration1: new pid1 clone lwp is spawned
 -> ret==pid1, continue iterating
 -> kill (pid1, SIGKILL); // killing wrong process
 <- waitpid (pid1) returns either SIGSTOP or WSIGNALLED
 ...

Thanks,
Pedro Alves
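
P.S. To make the PID-reuse concern concrete, here is a rough sketch of a
kill-and-wait helper that stops as soon as the child has been reaped once,
instead of looping until waitpid fails. This is not GDBserver code:
kill_and_reap_once is a hypothetical name, and the sketch assumes a plain
forked child rather than a ptraced clone LWP (which on Linux would also
need a flag such as __WALL in the waitpid call).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not GDBserver's kill_wait_lwp.
   Kill PID and reap it exactly once.  Returns PID once the child has
   been reaped, 0 if there was nothing left to wait for (ECHILD), and
   -1 on any other error.  */

static pid_t
kill_and_reap_once (pid_t pid)
{
  int status;

  /* kill with pid 0 or -1 would signal many processes, not one.  */
  assert (pid > 0);

  kill (pid, SIGKILL);

  for (;;)
    {
      pid_t res = waitpid (pid, &status, 0);

      if (res == -1 && errno == EINTR)
        continue;               /* Interrupted by a signal; retry.  */
      if (res == -1)
        return errno == ECHILD ? 0 : -1;

      if (WIFEXITED (status) || WIFSIGNALED (status))
        return res;  /* Reaped.  Stop: the kernel may reuse PID now.  */

      /* Any other event means the child is still alive; wait again
         without sending another signal.  */
    }
}
```

The point of the design is that the loop never calls kill or waitpid on
the PID again once an exit status has been collected, so a new LWP that
happens to get the same PID is never touched.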