From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21834 invoked by alias); 13 Jan 2017 01:53:47 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 21817 invoked by uid 89); 13 Jan 2017 01:53:46 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_SOFTFAIL autolearn=no version=3.3.2 spammy=H*f:sk:75fc00f, H*MI:sk:75fc00f, H*i:sk:75fc00f, Thursday X-HELO: mail.baldwin.cx Received: from bigwig.baldwin.cx (HELO mail.baldwin.cx) (96.47.65.170) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 13 Jan 2017 01:53:44 +0000 Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by mail.baldwin.cx (Postfix) with ESMTPSA id C0A8710B4FF; Thu, 12 Jan 2017 20:53:42 -0500 (EST) From: John Baldwin To: Luis Machado Cc: gdb-patches@sourceware.org, vd@freebsd.org Subject: Re: [PATCH] PR threads/20743: Don't attempt to suspend or resume exited threads. Date: Fri, 13 Jan 2017 01:53:00 -0000 Message-ID: <4617495.WEzsQTeE5q@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.0-STABLE; KDE/4.14.10; amd64; ; ) In-Reply-To: <75fc00ff-6472-d5aa-9f74-3495bcf7fb83@codesourcery.com> References: <20161223212842.42715-1-jhb@FreeBSD.org> <1723055.CyypAqrLYR@ralph.baldwin.cx> <75fc00ff-6472-d5aa-9f74-3495bcf7fb83@codesourcery.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-IsSubscribed: yes X-SW-Source: 2017-01/txt/msg00259.txt.bz2 On Thursday, January 12, 2017 07:27:43 PM Luis Machado wrote: > On 01/12/2017 01:16 PM, John Baldwin wrote: > > On Thursday, January 12, 2017 10:29:00 AM Luis Machado wrote: > >> On 12/28/2016 11:37 AM, John Baldwin wrote: > >>> On Wednesday, December 28, 2016 09:07:07 AM Vasil Dimov wrote: > >>>> On Tue, Dec 27, 2016 at 13:03:27 -0800, John Baldwin wrote: > >>>> [...] > >>>>> I have tried changing fbsd_wait() to return a TARGET_WAITKIND_SPURIOUS > >>>>> instead of explicitly continuing the process, but that doesn't help, and it > >>>>> means that the ptid being returned is still T1 in that case. > >>>>> > >>>>> I'm not sure if I should explicitly be calling delete_exited_threads() in > >>>>> fbsd_resume() before calling iterate_threads()? Alternatively, fbsd_resume() > >>>>> could use ALL_NONEXITED_THREADS() instead of iterate_threads() (it isn't > >>>>> clear to me which of these is preferred since both are in use). > >>>>> > >>>>> I added the assertion for my own sanity. I suspect gdb should never try to > >>>>> invoke target_resume() with a ptid of an exited thread, but if for some > >>>>> reason it did the effect on FreeBSD would be a hang since we would suspend > >>>>> all the other threads and when the process was continued via PT_CONTINUE it > >>>>> would have nothing to do and would never return from wait(). I'd rather have > >>>>> gdb fail an assertion in that case rather than hang. > >>>> [...] > >>>> > >>>> Hi, > >>>> > >>>> I am not sure if this is related, but since I get a hang I would rather > >>>> mention it: with the John's patch (including the assert) gdb does not > >>>> emit the "ptrace: No such process" error, but when I attempt to quit, > >>>> it hangs: > >>> > >>> No, this is a separate bug in the kernel whereby a process doesn't > >>> treat PT_KILL as a detach-like event but incorrectly expects to keep > >>> getting PT_CONTINUE events for a while until it finally exits. I'm > >>> working on writing up regression/unit tests for PT_KILL and then > >>> fixing the bug. > >>> > >> > >> I think the patch is mainly papering over a bigger problem. My guess is > >> that the native fbsd backend is not doing something it should. > >> > >> I'd check how linux-nat.c is doing things and then try to confirm the > >> fbsd behavior is sane. > >> > >> For example, i noticed linux-nat.c has exit_lwp (...) that handles > >> deletion of both thread information and the thread itself (lwp). Even if > >> it is the currently-selected thread, we *will* get the lwp removed from > >> the list of existing lwp's. > > > > FreeBSD's backend doesn't maintain a separate lwp list, it just uses > > the existing GDB thread list. For FreeBSD's backend the two lists would > > simply mirror each other so it seems a bit of a waste to maintain a > > duplicate list. exit_lwp() calls delete_thread() which is the same thing > > the FreeBSD backend is doing, so if that is the current thread in > > inferior_ptid, the Linux backend will also being leaving the exited > > thread around in GDB's list until some future call to delete_exited_threads(). > > > > I think the thing that makes Linux work is that it doesn't use GDB's > > thread list. Meaning, it doesn't walk over GDB's thread list, but instead > > iterates over its private LWP list via iterate_over_lwps(). It would seem > > that GDB's thread list is designed so that backends shouldn't need their > > own thread list (you can add target-specific data with a custom destructor > > that gets invoked when freeing a thread for example), but the Linux backend > > doesn't choose to use it that way? > > > > Looking at some other threaded backends: > > > > - aix-thread.c relies on custom ptrace ops that resume a single thread > > - darwin-nat.c uses its own thread list (stored in the inferior's > > private data) instead of GDB's thread list. > > - gnu-nat.c uses its own thread list instead of GDB's thread list. > > - obsd-nat.c uses GDB's thread list but doesn't seem to support resuming > > individual threads (only entire processes). > > - procfs.c maintains its own thread list, but it doesn't seem to use it > > for resume but relies on the associated kernel resuming either an > > entire process or a single thread in a process via different ioctls. > > - remote.c:remote_resume() uses ALL_NON_EXITED_THREADS > > - windows_nat.cwindows_resume() calls windows_continue() which uses a > > target-internal thread list rather than GDB's thread list. > > > >> It doesn't make sense to keep a thread that has already exitted in the > >> list of threads we are manipulating. > > > > FreeBSD's backend isn't making that choice. delete_thread() in threads.c > > is the one making that choice. If FreeBSD's backend were to define its > > own thread list, the contents would be identical except it would not > > include any exited threads, so skipping exited threads gives the same > > result as walking a hypothetical private list. > > > > So i take it using ALL_NON_EXITED_THREADS is something that would seem > reasonable to use in this case and not iterate through all threads (even > ones marked exitting)? Yes, that would work for me. -- John Baldwin