Date: Mon, 04 Aug 2008 13:40:00 -0000
From: Daniel Jacobowitz
To: Ulrich Weigand
Cc: gdb-patches@sourceware.org
Subject: Re: [gdbserver] Problems trying to resume dead threads
Message-ID: <20080804134000.GA17925@caradoc.them.org>
In-Reply-To: <200807191716.m6JHGmsA013958@d12av02.megacenter.de.ibm.com>
References: <200807191716.m6JHGmsA013958@d12av02.megacenter.de.ibm.com>

Sorry - as you can see, I am once again behind on gdb-patches.

On Sat, Jul 19, 2008 at 07:16:48PM +0200, Ulrich Weigand wrote:
> Hello,
>
> gdbserver on Linux seems to have difficulties handling
> the case where a thread dies while it is stopped.
> This can happen during the loop over all threads in linux_resume:

I can reproduce this problem by using the binary from killed.exp and
running strace on gdbserver.  I can also reproduce it on an embedded
ARM target by running killed.exp.  I can't reproduce it on my desktop
running killed.exp, which suggests this is normally hidden by
scheduler decisions - you need a long enough gap between the two
PTRACE_CONT's.

What do you think of this change?  Ideally, we could wait with WNOHANG
at this point to check for the exit case, but we'd have to restructure
a bit of the event loop to handle pending status == exited.

-- 
Daniel Jacobowitz
CodeSourcery

2008-08-04  Daniel Jacobowitz

        * linux-low.c (linux_resume_one_process): Ignore ESRCH.

Index: linux-low.c
===================================================================
RCS file: /cvs/src/src/gdb/gdbserver/linux-low.c,v
retrieving revision 1.79
diff -u -p -r1.79 linux-low.c
--- linux-low.c 28 Jul 2008 18:28:56 -0000      1.79
+++ linux-low.c 4 Aug 2008 13:38:24 -0000
@@ -1193,7 +1193,19 @@ linux_resume_one_process (struct inferio
 
   current_inferior = saved_inferior;
   if (errno)
-    perror_with_name ("ptrace");
+    {
+      /* ESRCH from ptrace either means that the thread was already
+         running (an error) or that it is gone (a race condition).  If
+         it's gone, we will get a notification the next time we wait,
+         so we can ignore the error.  We could differentiate these
+         two, but it's tricky without waiting; the thread still exists
+         as a zombie, so sending it signal 0 would succeed.  So just
+         ignore ESRCH.  */
+      if (errno == ESRCH)
+        return;
+
+      perror_with_name ("ptrace");
+    }
 }
 
 static struct thread_resume *resume_ptr;
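
For anyone who wants to see the zombie behaviour described in the patch comment outside of gdbserver, here is a small standalone sketch.  It is not part of the patch and uses a forked child process rather than a traced thread, so it only approximates the situation in linux_resume_one_process.  The function name zombie_probe and its two out-parameters are made up for the illustration.  It shows that signal 0 still succeeds against an exited-but-unreaped child, and that a WNOHANG wait is what actually reports the exit.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for waitid and WNOWAIT under strict -std= modes */
#endif

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.  Fork a child that exits
   at once, then probe it while it is still a zombie.

   *KILL_RESULT receives the return value of kill (pid, 0).
   *REAPED is set to 1 if a WNOHANG waitpid then reports a normal exit.

   Returns 0 on success, -1 if fork or waitid failed.  */
int
zombie_probe (int *kill_result, int *reaped)
{
  siginfo_t info;
  int status = 0;
  pid_t pid, ret;

  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (0);

  /* Block until the child has exited, but leave it unreaped (WNOWAIT),
     so it is guaranteed to be a zombie from here on.  */
  if (waitid (P_PID, (id_t) pid, &info, WEXITED | WNOWAIT) != 0)
    return -1;

  /* The zombie still occupies its pid, so the existence check passes
     and cannot tell us that the child is gone.  */
  *kill_result = kill (pid, 0);

  /* A non-blocking wait does see the exit, and reaps the child.  */
  ret = waitpid (pid, &status, WNOHANG);
  *reaped = (ret == pid && WIFEXITED (status)) ? 1 : 0;

  return 0;
}
```

On a POSIX system this should leave *kill_result at 0 and *reaped at 1, which is why the patch simply ignores ESRCH instead of trying to probe the thread with a signal.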