From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Received: (qmail 1087 invoked from network); 19 Mar 2004 01:53:52 -0000
Received: from unknown (HELO nevyn.them.org) (66.93.172.17)
  by sources.redhat.com with SMTP; 19 Mar 2004 01:53:52 -0000
Received: from drow by nevyn.them.org with local (Exim 4.30 #1 (Debian))
  id 1B49Cp-0007RE-KR; Thu, 18 Mar 2004 20:53:51 -0500
Date: Fri, 19 Mar 2004 01:53:00 -0000
From: Daniel Jacobowitz
To: Jeff Johnston
Cc: gdb-patches@sources.redhat.com
Subject: Re: [RFC]: fix for recycled thread ids
Message-ID: <20040319015351.GA28443@nevyn.them.org>
Mail-Followup-To: Jeff Johnston , gdb-patches@sources.redhat.com
References: <405A4089.1080605@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <405A4089.1080605@redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2004-03/txt/msg00447.txt.bz2

On Thu, Mar 18, 2004 at 07:36:25PM -0500, Jeff Johnston wrote:
> The following patch fixes a problem when a user application creates a
> thread shortly after another thread has completed.  For nptl, thread ids
> are addresses.  If a thread completes/dies, the tid is available for reuse
> by a new thread.

Does NPTL re-use the TID quickly, or cycle around the way LT did so that we
only see this under high thread pressure?

> On RH9 and RHEL3, nptl threads do not have exit events associated with
> them.  I have already discussed this with Daniel J. who feels that the
> kernels are not doing the right thing, but regardless, the current and
> previous RH nptl kernels are behaving this way and gdb needs to handle it.
> As such, when a new thread is created, if it is reusing the tid of a
> previous thread that gdb hasn't figured out isn't around any more, gdb
> ignores the create event and the new thread is not added.
> Ignoring the event is done because it is possible for gdb to find out
> about the thread before its creation event is reported and so the create
> event can be redundant information.

What I haven't seen a good explanation of is what problem this causes.  If
a thread goes away, and then a new thread using the same ID is created, and
then we stop, what do we lose besides the cosmetic fact that there is no
[New Thread] message?  Does anything go wrong?

Also, I would like the issue of whether or not it is a kernel bug resolved
before we discuss working around it in GDB.

> The following fix removes the problem by also checking if the original lwp
> given to the old thread in the list matches the lwp for the new thread.
> IIRC, there is at least one platform that allows threads to change their
> lwps dynamically.  I do not believe that platform uses the thread-db layer,
> but to be safe, I did not use a direct compare between the original lwp and
> the new lwp.  Instead, I added a target vector routine that by default
> never returns an unequal comparison.  For linux, the comparison routine
> points to a new routine in lin-lwp.c which does the expected comparison.
> The comparison is based on C compare routines whereby 0 means equal and
> non-zero means unequal.

At least one such "platform" was NGPT, which runs on Linux (and worked OK
with thread-db.c, too).  It's obsolete nowadays.

> Now, if a thread id for a new thread is found to already be on the list,
> the target comparison is made between the original lwp stored for the old
> thread and the lwp for the new thread.  If the comparison is unequal, the
> old thread is deleted and then the new thread is added.  Otherwise, the new
> thread event is ignored as before.

If we aren't going to support migration, then why not just store the LWP
ID in the thread PTID structure?  That's why it has three fields, I'd guess.
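
A minimal sketch of that suggestion, assuming a three-field identifier
(process id, LWP id, thread-library id).  Every name below (sketch_ptid,
sketch_ptid_equal, sketch_handle_create, thread_list) is hypothetical and
chosen for illustration; this is not GDB's actual code.  It shows how a
create event whose TID matches a known thread but whose LWP differs can be
treated as a recycled TID rather than ignored as redundant:

```c
#include <assert.h>

/* Hypothetical three-field thread identifier (illustrative only).  */
struct sketch_ptid
{
  int pid;    /* process id */
  long lwp;   /* kernel LWP id */
  long tid;   /* thread-library id; an address under NPTL */
};

/* Two identifiers name the same thread only if all fields match.  */
static int
sketch_ptid_equal (struct sketch_ptid a, struct sketch_ptid b)
{
  return a.pid == b.pid && a.lwp == b.lwp && a.tid == b.tid;
}

#define MAX_THREADS 16
static struct sketch_ptid thread_list[MAX_THREADS];
static int thread_count;

/* Handle a thread-create event.  Returns 0 if the thread is already
   known (redundant event), 1 if it was added, possibly replacing a
   stale entry whose TID has been recycled.  */
static int
sketch_handle_create (struct sketch_ptid new_ptid)
{
  int i;

  for (i = 0; i < thread_count; i++)
    if (thread_list[i].pid == new_ptid.pid
        && thread_list[i].tid == new_ptid.tid)
      {
        if (sketch_ptid_equal (thread_list[i], new_ptid))
          return 0;                 /* same LWP: redundant event */
        thread_list[i] = new_ptid;  /* different LWP: TID was reused */
        return 1;
      }

  assert (thread_count < MAX_THREADS);
  thread_list[thread_count++] = new_ptid;
  return 1;
}
```

With the LWP carried in the identifier, the stale-versus-new decision
falls out of an ordinary equality test, with no separate comparison hook.
This only holds under the no-migration assumption stated above.
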
I don't think that the target hook makes much sense, since thread-db.c is
basically Linux-specific at this point and since the contents being passed
to the hook are thread-db-specific.

--
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer