From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 655 invoked by alias); 24 Mar 2004 15:51:33 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 630 invoked from network); 24 Mar 2004 15:51:31 -0000
Received: from unknown (HELO nevyn.them.org) (66.93.172.17)
  by sources.redhat.com with SMTP; 24 Mar 2004 15:51:31 -0000
Received: from drow by nevyn.them.org with local (Exim 4.30 #1 (Debian))
  id 1B6AfC-00088D-59; Wed, 24 Mar 2004 10:51:30 -0500
Date: Wed, 24 Mar 2004 15:51:00 -0000
From: Daniel Jacobowitz
To: Jeff Johnston
Cc: gdb-patches@sources.redhat.com
Subject: Re: [RFC]: fix for recycled thread ids
Message-ID: <20040324155130.GA27748@nevyn.them.org>
Mail-Followup-To: Jeff Johnston, gdb-patches@sources.redhat.com
References: <405A4089.1080605@redhat.com>
  <20040319015351.GA28443@nevyn.them.org>
  <405B3F83.4030503@redhat.com>
  <20040319190126.GA16950@nevyn.them.org>
  <405B4B8C.2060801@redhat.com>
  <20040319194011.GA18776@nevyn.them.org>
  <405B66F4.4090101@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <405B66F4.4090101@redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2004-03/txt/msg00560.txt.bz2

On Fri, Mar 19, 2004 at 04:32:36PM -0500, Jeff Johnston wrote:
>
>
> Daniel Jacobowitz wrote:
> >On Fri, Mar 19, 2004 at 02:35:40PM -0500, Jeff Johnston wrote:
> >
> >>>Conceptually, we attach to LWPs, not to threads. That suggests to me
> >>>that the correct fix is to ask the LWP layer if the LWP is attached
> >>>rather than looking it up in the thread list in the first place.
> >>>We've already got an appropriate list of LWPs though we might need a
> >>>new accessor.
> >>>
> >>
> >>I like that idea. We still have to deal with the bogus thread list
> >>entry.
> >>The routine prune_threads calls thread_db_alive and it won't
> >>realize the thread info it has is bogus because it will find the tid is
> >>valid.
> >
> >
> >Do you think this will be a problem? My hope is that it will just look
> >as if the thread has 'migrated' to a new LWP.
> >
>
> It will have invalid state associated with it. For example, the thread
> info has a prev_pc field. As to all the havoc that the state may or may
> not cause, I think it would be a very good idea to clean it up now. Who's
> to say what state will be added to thread_info in the future.

It turns out that this is not only a problem, but the whole problem.

Could you run this test under GDB, on RHEL, using strace? Tell me
whether or not you see WIFEXITED results for every thread as it exits.
I was assuming you did not, but I can reproduce the misbehavior here
even though I do get them.

The problem is that we get the LWP death events, but we treat the
threads and LWPs as completely independent sets. We never find out
that the threads have died. We don't enable thread death event
reporting, because in glibc 2.1.3 there was a bug in the death
reporting which would cause the debugged program to segfault:

#if 0
  /* FIXME: kettenis/2000-04-23: The event reporting facility is
     broken for TD_DEATH events in glibc 2.1.3, so don't enable it for
     now.  */
  td_event_addset (&events, TD_DEATH);
#endif

Fortunately, there is a function to return the runtime version of
glibc. We should be able to use that to enable TD_DEATH when it is
safe to do so. This relies on the assumption, not 100% valid but
generally valid and already made by thread_db, that a native GDB, when
used to debug native programs, is debugging the same version of the C
library that it is itself running against. Enabling TD_DEATH will let
us detach the threads when they die.

That has its own risks, since the thread continues to run for a short
while after the death event is reported.
For instance, in NPTL the thread reports the event and then cleans up
after itself; in LinuxThreads I don't remember whether the manager or
the thread does this, but I think it's the same. I already wrote
limited code to handle this (search for "zombie" in thread-db.c), so it
should be OK. The gist is that we remove the dying thread from the
thread list right away, but do not detach it. We resist attaching to
zombies.

At least, all that is how it looks to me. I'll experiment with
TD_DEATH before I speculate further.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer
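
The runtime version check proposed in the message could be sketched
roughly as below. This is only an illustration, not code from GDB: the
helper names td_death_reporting_safe and
runtime_td_death_reporting_safe are hypothetical, and treating every
release newer than 2.1.3 as fixed is an assumption (the message only
says that 2.1.3 is broken). gnu_get_libc_version is the glibc function
declared in <gnu/libc-version.h>.

```c
#include <stdio.h>

#ifdef __GLIBC__
#include <gnu/libc-version.h>	/* gnu_get_libc_version () */
#endif

/* Hypothetical helper: return 1 if VERSION ("major.minor[.patch]") is
   strictly newer than 2.1.3, the release in which TD_DEATH reporting
   is known to be broken, and 0 otherwise.  Assuming that every later
   release is fixed is a guess made for this sketch.  */
int
td_death_reporting_safe (const char *version)
{
  int major = 0, minor = 0, patch = 0;

  /* Require at least "major.minor"; a missing patch level counts as 0.
     Unparseable input is treated as "not safe".  */
  if (version == NULL
      || sscanf (version, "%d.%d.%d", &major, &minor, &patch) < 2)
    return 0;

  if (major != 2)
    return major > 2;
  if (minor != 1)
    return minor > 1;
  return patch > 3;
}

/* Hypothetical wrapper: check the C library this process is running
   against.  When not built against glibc, stay conservative and
   report "not safe".  */
int
runtime_td_death_reporting_safe (void)
{
#ifdef __GLIBC__
  return td_death_reporting_safe (gnu_get_libc_version ());
#else
  return 0;
#endif
}
```

With something like this, the td_event_addset (&events, TD_DEATH) call
quoted above could be guarded by runtime_td_death_reporting_safe ()
instead of #if 0. The check inspects the library GDB runs against, so
it is only as good as the "same C library" assumption discussed in the
message.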