From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-24009-listarch-gdb-patches=sources.redhat.com@sources.redhat.com>
Received: (qmail 22442 invoked by alias); 22 Apr 2003 20:10:31 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 22435 invoked from network); 22 Apr 2003 20:10:28 -0000
Received: from unknown (HELO touchme.toronto.redhat.com) (207.219.125.105)
  by sources.redhat.com with SMTP; 22 Apr 2003 20:10:28 -0000
Received: from redhat.com (toocool.toronto.redhat.com [172.16.14.72])
	by touchme.toronto.redhat.com (Postfix) with ESMTP
	id E03E48000FA; Tue, 22 Apr 2003 16:10:27 -0400 (EDT)
Message-ID: <3EA5A1B3.6030001@redhat.com>
Date: Tue, 22 Apr 2003 20:10:00 -0000
From: "J. Johnston" <jjohnstn@redhat.com>
Organization: Red Hat Inc.
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Daniel Jacobowitz <drow@mvista.com>
Cc: gdb-patches@sources.redhat.com
Subject: Re: RFA: nptl threading support patches
References: <3EA55B50.809@redhat.com> <20030422173055.GB17043@nevyn.them.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2003-04/txt/msg00410.txt.bz2

Daniel Jacobowitz wrote:
> I can't approve any of this, but I have some comments.  First of all,
> this should be multiple patches, in at least a few trivial places. 
> Roughly one per paragraph below.
>

> Please don't post changes to generated files.  However, also please use
> the right versions of the tools to generate them!  You can not use
> autoconf 2.57 to generate the GDB configure script; the right version
> of autoconf 2.13 is available on sources.redhat.com.
> 

Ok, I will resubmit in smaller patches and remove generated files.  I have to rethink
the scenario you mention whereby the main thread is the only unlocked thread
and exits under the Linuxthreads model.

> You did not mention testing.  I assume you tested with NPTL.  Did you
> test with LinuxThreads also?
> 

Yes.  No additional Linuxthreads failures in gdb.threads, although killed.exp fails a
little uglier than it did before.  After looking at the exit problem, I will include the
testsuite results for gdb.threads for both nptl and linuxthreads in my resubmit.

> On Tue, Apr 22, 2003 at 11:10:08AM -0400, J. Johnston wrote:
> 
>>The attached patch adds nptl threading support for gdb.
>>
>>There are a couple of key design changes to note with the nptl-enabled
>>kernel.  First of all, kill semantics have changed.  It is no longer
>>possible to send a signal to an lwp via kill.  Only a process id can be
>>used, rather than a thread lwp.  To keep the lin-lwp logic whereby SIGSTOP
>>is applied to a particular thread and some signals are pushed back, the
>>tkill syscall must be used.  The tkill syscall allows us to send a signal
>>to a particular thread.  A configuration check is made to see if we can
>>call syscall(__NR_tkill,..) and a runtime check is made to ensure that we 
>>can call
>>it successfully without getting ENOSYS.
> 
> 
> This is a separate patch, for instance.  It looks reasonable to me.
> 
> 
>>Another key behavior change regards thread exit.  For linuxthreads, exiting
>>causes a WIFEXITED event to occur at which point we can delete the current
>>thread and determine if any threads still exist.  In the new nptl model,
>>only the main thread generates this event and does so only once all the
>>threads have exited.  There is no event for the individual threads exiting
>>so we must check at various times to see if a thread has gone away.  For
>>example, when we stop all of the threads, this is a good time to see if
>>any have gone away.  When we get the exit signal, we check if we have the
>>main thread or not.  If it is the main thread, we have to stop all threads
>>(which will check whether they are alive or not), and then see how many
>>threads are left.  This logic works for either model.  If there are still
>>threads alive, we resume them and continue on without reporting the
>>exit.
> 
> 
> What was the verdict on whether this would break if I turned on exit
> events for every thread?  Sorry for not getting back to that kernel
> patch sooner, I'm way behind.  It looks like it would work.
> 

Your patch won't affect this code as it was designed to handle either model, however,
a Red Hat Linux kernel has already shipped that has the behavior listed above.

> There are some subtleties.  I think that in your patch, the main thread
> reporting a WIFEXITED status will cause other threads to be resumed -
> even if they weren't running before, due to schedule locking.  I
> haven't tested that though.
> 

As mentioned earlier, you raise a valid problem.  I have to look at this.

> 
>>A change was made in thread-db.c to handle a FIXME that has been
>>there for a while.  When we get a create breakpoint in check_event, we
>>do not know what lwp has been created so we call td_ta_event_getmsg() to
>>get "a message" off the queue.  This message might not actually be the
>>message corresponding to the breakpoint that stopped us.  I have changed
>>the logic to perform a loop and read off all messages on the queue.  This
>>enables death event reporting to work if desired.  Without it, you can
>>get scenarios such as a death event preceding its create event, etc..
> 
> 
> This change looks reasonable to me.
> 
> 
>>The testsuite needed some changes because there is no longer a manager
>>thread with nptl so the testcase must account for either model when
>>doing info threads.  The schedlock testcase was changed because
>>nptl scheduling differs from linuxthreads.  One can't assume that
>>all threads will run equally in a small time slice, most notably when
>>stepping a particular thread.
> 
> 
> The changes to linux-dp.exp look reasonable to me.  That can be a patch
> all by itself.
> 
> The changes to schedlock.exp look reasonable, and may fix the false
> failures it produces on LinuxThreads already depending on your luck
> with the scheduler.  It weakens the test - ideally it was supposed to
> make sure that GDB resumed every thread - but at this point I'll settle
> for resuming at least one other thread.  So that's a separate good
> patch too.
> 
> 
>>It is a lot to digest.  Let me know if/when it is ok to commit.
>>
>>-- Jeff J.
>>
>>2003-04-17  Jeff Johnston  <jjohnstn@redhat.com>
>>
>>	* acconfig.h: Add HAVE_TKILL_SYSCALL definition check.
>>	* config.in: HAVE_TKILL_SYSCALL and HAVE_SYSCALL checks added.
>>	* configure.in: Add test for syscall function and check for
>>	__NR_tkill macro in <syscall.h> to set HAVE_TKILL_SYSCALL.
>>	* configure: Regenerated.
>>	* thread-db.c (check_event): For create/death event breakpoints,
>>	loop through all messages to ensure that we read the message
>>	corresponding to the breakpoint we are at.
>>	* lin-lwp.c [HAVE_TKILL_SYSCALL]: Include <unistd.h> and 
>>	<sys/syscall.h>.
>>	(kill_lwp): New function that uses tkill syscall or
>>	uses kill, depending on whether threading model is nptl or not.
>>	All callers of kill() changed to use kill_lwp().
>>	(lin_lwp_wait): Make special check when WIFEXITED occurs to
>>	see if all threads have already exited in the nptl model.
>>	(stop_wait_callback): Check for threads already having exited and
>>	delete such threads fromt the lwp list when discovered.
>>	(stop_callback): Don't assert retcode of kill call.
>>	* i386-linux-nat.c (ps_get_thread_area): New function needed by
>>	nptl libthread_db.
> 
>