From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 30986 invoked by alias); 19 Mar 2004 18:44:21 -0000 Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sources.redhat.com Received: (qmail 30968 invoked from network); 19 Mar 2004 18:44:19 -0000 Received: from unknown (HELO touchme.toronto.redhat.com) (216.129.200.20) by sources.redhat.com with SMTP; 19 Mar 2004 18:44:19 -0000 Received: from redhat.com (toocool.toronto.redhat.com [172.16.14.72]) by touchme.toronto.redhat.com (Postfix) with ESMTP id 31C328000DA; Fri, 19 Mar 2004 13:44:19 -0500 (EST) Message-ID: <405B3F83.4030503@redhat.com> Date: Fri, 19 Mar 2004 18:44:00 -0000 From: Jeff Johnston User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 MIME-Version: 1.0 To: Daniel Jacobowitz Cc: gdb-patches@sources.redhat.com Subject: Re: [RFC]: fix for recycled thread ids References: <405A4089.1080605@redhat.com> <20040319015351.GA28443@nevyn.them.org> In-Reply-To: <20040319015351.GA28443@nevyn.them.org> Content-Type: multipart/mixed; boundary="------------070802030008040408040803" X-SW-Source: 2004-03/txt/msg00464.txt.bz2 This is a multi-part message in MIME format. --------------070802030008040408040803 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-length: 4389 Daniel Jacobowitz wrote: > On Thu, Mar 18, 2004 at 07:36:25PM -0500, Jeff Johnston wrote: > >>The following patch fixes a problem when a user application creates a >>thread shortly after another thread has completed. For nptl, thread ids >>are addresses. If a thread completes/dies, the tid is available for reuse >> by a new thread. > > > Does NPTL re-use the TID quickly, or cycle around the way LT did so > that we only see this under high thread pressure? > I can't say for sure as I don't maintain libthread_db. The test case in question does create high thread pressure, but I think it would be a mistake to generalize and think that this couldn't happen in an existing application. > >>On RH9 and RHEL3, nptl threads do not have exit events associated with >>them. I have already discussed this with Daniel J. who feels that the >>kernels are not doing the right thing, but regardless, the current and >>previous RH nptl kernels are behaving this way and gdb needs to handle it. >>As such, when a new thread is created, if it is reusing the tid of a >>previous thread that gdb hasn't figured out isn't around any more, gdb >>ignores the create event and the new thread is not added. Ignoring the >>event is done because it is possible for gdb to find out about the thread >>before it's creation event is reported and so the create event can be >>redundant information. > > > What I haven't seen a good explanation of is what problem this causes. > If a thread goes away, and then a new thread using the same ID is > created, and then we stop, what do we lose besides the cosmetic fact > that there is no [New Thread] message? Does anything go wrong? > > Also, I would like the issue of whether or not it is a kernel bug > resolved before we discuss working around it in GDB. > The problem is if a global signal is passed on to the inferior program when there are threads we have not attached to, the process terminates. A Ctrl-C is such a signal. In the example program, we only attach to the first 100 threads and when the Ctrl-C is issued, we get: ptrace: No such process. thread_db_get_info: cannot get thread info: generic error The end-user is cooked. Regarding resolving this issue as a kernel error, any fix for RHEL3 won't get shipped until Update 3. I know of no scheduled update for RH9 and this would not qualify as a security update. If I read your subsequent comments right, you wouldn't have a problem with me just making the lwp compare directly in the thread-db.c code. This would mean the change would be all of 6 lines of code. Does it make a lot of sense to withhold this change in light of the time-frame a kernel fix could be issued and the simplicity of the change itself? > >>The following fix removes the problem by also checking if the original lwp >>given to the old thread in the list matches the lwp for the new thread. >>IIRC, there is at least one platform that allows threads to change their >>lwps dynamically. I do not believe that platform uses the thread-db layer, >>but to be safe, I did not use a direct compare between the original lwp and >>the new lwp. Instead, I added a target vector routine that by default >>never returns an unequal comparison. For linux, the comparison routine >>points to a new routine in lin-lwp.c which does the expected comparison. >>The comparison is based on C compare routines whereby 0 means equal and >>non-zero means unequal. > > > At least one such "platform" was NGPT, which runs on Linux (and worked > OK with thread-db.c, too). It's obsolete nowadays. > > >>Now, if a thread id for a new thread is found to already be on the list, >>the target comparison is made between the original lwp stored for the old >>thread and the lwp for the new thread. If the comparison is unequal, the >>old thread is deleted and then the new thread is added. Otherwise, the new >>thread event is ignored as before. > > > If we aren't going to support migration, then why not just store the > LWP ID in the thread PTID structure? That's why it has three fields, > I'd guess. > > I don't think that the target hook makes much sense, since thread-db.c > is basically Linux-specific at this point and since the contents being > passed to the hook are thread-db-specific. > Would it make sense to rename thread-db.c to lin-thread-db.c? -- Jeff J. --------------070802030008040408040803 Content-Type: text/plain; name="lphello.c" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="lphello.c" Content-length: 2973 #include #include #include #include #include #include #include /** * Compile with: * gcc -g -Wall -lpthread -o lphello lphello.c * * Author Magnus Ihse, ihse@bea.com */ #define THREAD_COUNT 100 #define ITER_COUNT 500000 static volatile int finishedArray[THREAD_COUNT]; static int pKey; void* setup_altstack(void) { stack_t ss; ss.ss_sp = malloc(20*1024); if (ss.ss_sp == 0) { return NULL; } ss.ss_size = 20*1024; ss.ss_flags = 0; if (sigaltstack(&ss, NULL) == -1) { //perror("sigaltstack"); return NULL; } return ss.ss_sp; } void takedown_altstack(void* stack) { struct sigaltstack ss; int result; if (stack == NULL) { return; } ss.ss_flags = SS_DISABLE; ss.ss_sp = (void*)47; // This value should be ignored when ss_flags is SS_DISABLE ss.ss_size = 29; // This value should be ignored when ss_flags is SS_DISABLE { result = sigaltstack(&ss, NULL); free(stack); } } void *threadfunc(void *arg) { int mypos = (int)(size_t)arg; int i; long square = 1; void* altstack = setup_altstack(); pthread_setspecific(pKey, arg); for (i=0; i < 1000; i++) { square = i*i + square*mypos; } finishedArray[mypos] = 1; takedown_altstack(altstack); return NULL; } int main(int argc, char ** argv) { pthread_t threads[THREAD_COUNT]; pthread_attr_t attr; int result; int i; int iteration; int finished; pthread_attr_init(&attr); pthread_attr_setstacksize(&attr, 128*1024); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); pthread_key_create(&pKey, NULL); for (iteration = 0; iteration < ITER_COUNT; iteration++) { if ((iteration % 100) == 0) { printf("\nStarting run series %i: ", iteration); } if ((iteration % 10) == 0) { printf("."); fflush(stdout); } // Clear array for (i = 0; i< THREAD_COUNT; i++) { finishedArray[i] = 0; } // Start threads for (i = 0; i< THREAD_COUNT; i++) { result = pthread_create(&threads[i], &attr, threadfunc, (void*)(size_t)i); if (result != 0) { perror("pthread_create"); exit(1); } } // Join threads for (i = 0; i< THREAD_COUNT; i++) { result = pthread_join(threads[i], NULL); if (result != 0) { perror("pthread_join"); exit(EXIT_FAILURE); } } // Spin waiting for results finished = 1; do { struct timespec req, rem; req.tv_sec = 0; req.tv_nsec = 10 * 1000 * 1000; nanosleep(&req, &rem); for (i = 0; i< THREAD_COUNT; i++) { if (finishedArray[i] != 1) { finished = 0; break; } } } while (!finished); } return EXIT_SUCCESS; } --------------070802030008040408040803--