The following is the last part of my revised nptl patch, which has been broken up per Daniel J.'s suggestion. There are no generated files included in the patch.

A bit of background is needed. First of all, in the nptl model, lwps and pids are distinct entities. If you run ps on a multithreaded application, you will not see its child threads show up. This differs from the linuxthreads model, in which an lwp was actually a pid. In the nptl model, you cannot issue a kill to a child thread via its lwp. If you want to do this, you need to use the tkill syscall instead.

Sending a signal to a specific lwp is done commonly in lin-lwp.c. The first part of the nptl change is to determine at configuration time whether we have the tkill syscall and then, at run-time, whether we can use it. A new function kill_lwp has been added which calls either the tkill syscall or the regular kill function as appropriate.

Another key difference in behavior between nptl and linuxthreads is what happens when a thread exits. In the linuxthreads model, each thread that exits causes an exit event to occur. The main thread can exit before all of its children. In the nptl model, only the main thread generates the exit event and it only does so after all child threads have exited. So, to determine when an lwp has exited, we have to check its status whenever we are able. When we get an exit event, we have to determine whether we are exiting the program or just one of the threads. Some additional logic has been added to the exit checking code such that if we have (lwp == pid), then we stop all active threads and check if they have already exited. If they have not exited, we resume these threads; otherwise, we delete them.

Daniel J. brought up a good point regarding my previous attempt at this logic, in which I stopped all threads and resumed all threads. That old logic is wrong when scheduler_locking is set on.
The new logic addresses this because it only stops/resumes threads which are not already stopped.

Currently, if we lock onto a linuxthreads thread and run it until exit, we will cause a gdb_assert() to trigger because we will field the exit event and end up with no running threads. Under nptl, we never get notified of the thread exiting, so in this scenario we end up hung. In this instance, there is nothing we can do without a kernel change. Daniel J. is looking into a kernel change. For user threads, we could activate the death event in thread-db.c and make a low-level call to notify the lwp layer, but this does not handle lwps created directly by the end-user.

This issue needs to be discussed further outside of this patch because I want to propose changing the assertion to allow the user to resume the other threads. I have tested the change with such a scenario (locking a thread and running until exit). For linuxthreads we trigger the assertion as before. For nptl, we hang as expected because we never get the exit event. A bug can be opened for this if the patch is accepted.

The last piece of logic has to do with a new interface needed by the nptl libthread_db library to get the thread area. I have grouped this together with the lin-lwp changes because the lin-lwp changes cannot work without this crucial change.

This patch has been tested with both linuxthreads and an nptl kernel.

Ok to commit?

-- Jeff J.

2003-04-24  Jeff Johnston

	* acconfig.h: Add HAVE_TKILL_SYSCALL definition check.
	* config.in: Regenerated.
	* configure.in: Add test for syscall function and check for __NR_tkill macro in to set HAVE_TKILL_SYSCALL.
	* configure: Regenerated.
	* thread-db.c (check_event): For create/death event breakpoints, loop through all messages to ensure that we read the message corresponding to the breakpoint we are at.
	* lin-lwp.c [HAVE_TKILL_SYSCALL]: Include and .
	(kill_lwp): New function that uses tkill syscall or uses kill, depending on whether threading model is nptl or not. All callers of kill() changed to use kill_lwp().
	(lin_lwp_wait): Make special check when WIFEXITED occurs to see if all threads have already exited in the nptl model.
	(stop_and_resume_callback): New callback function used by the lin_lwp_wait thread exit handling code.
	(stop_wait_callback): Check for threads already having exited and delete such threads from the lwp list when discovered.
	(stop_callback): Don't assert retcode of kill call.

Roland McGrath

	* i386-linux-nat.c (ps_get_thread_area): New function needed by nptl libthread_db.
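
For anyone reading without the diff at hand, the kill_lwp behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the name kill_lwp_sketch is hypothetical, SYS_tkill from <sys/syscall.h> stands in for the configure-time HAVE_TKILL_SYSCALL check, and treating ENOSYS as "this kernel has no tkill" is my assumed way of making the run-time decision.

```c
#define _GNU_SOURCE             /* for the syscall() declaration on glibc */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: deliver SIGNO to the single lwp LWP.
   Try tkill first; if the running kernel reports ENOSYS, remember
   that and use plain kill() from then on, which is sufficient under
   linuxthreads where an lwp is a pid.  */
static int
kill_lwp_sketch (pid_t lwp, int signo)
{
  assert (signo >= 0);

#ifdef SYS_tkill                /* assumption: stands in for HAVE_TKILL_SYSCALL */
  static int tkill_failed = 0;  /* set once the kernel rejects tkill */

  if (!tkill_failed)
    {
      errno = 0;
      int ret = (int) syscall (SYS_tkill, lwp, signo);

      /* Any outcome other than ENOSYS means the syscall exists,
         so its result (success or a real error) is final.  */
      if (ret == 0 || errno != ENOSYS)
        return ret;

      tkill_failed = 1;
    }
#endif

  return kill (lwp, signo);
}
```

A caller would use kill_lwp_sketch (lwp, SIGSTOP) wherever kill (lwp, SIGSTOP) was used before; passing signal 0 only checks that the lwp exists and delivers nothing.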