From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 4566 invoked by alias); 25 Jun 2008 21:17:48 -0000 Received: (qmail 4556 invoked by uid 22791); 25 Jun 2008 21:17:47 -0000 X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.4) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 25 Jun 2008 21:17:29 +0000 Received: (qmail 26759 invoked from network); 25 Jun 2008 21:17:27 -0000 Received: from unknown (HELO orlando.local) (pedro@127.0.0.2) by mail.codesourcery.com with ESMTPA; 25 Jun 2008 21:17:27 -0000 From: Pedro Alves To: Daniel Jacobowitz Subject: Re: [non-stop] 08/10 linux native support Date: Wed, 25 Jun 2008 22:03:00 -0000 User-Agent: KMail/1.9.9 Cc: gdb-patches@sourceware.org References: <200806152205.49241.pedro@codesourcery.com> <20080625201946.GH25575@caradoc.them.org> In-Reply-To: <20080625201946.GH25575@caradoc.them.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200806252217.25796.pedro@codesourcery.com> X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2008-06/txt/msg00456.txt.bz2 A Wednesday 25 June 2008 21:19:46, Daniel Jacobowitz wrote: > On Sun, Jun 15, 2008 at 10:05:49PM +0100, Pedro Alves wrote: > > @@ -920,7 +929,7 @@ delete_lwp (ptid_t ptid) > > /* Return a pointer to the structure describing the LWP corresponding > > to PID. If no corresponding LWP could be found, return NULL. */ > > > > -static struct lwp_info * > > +struct lwp_info * > > find_lwp_pid (ptid_t ptid) > > { > > struct lwp_info *lp; > > If you need this function global, please rename it first. > Ack, will do. > > @@ -1306,16 +1315,76 @@ get_pending_status (struct lwp_info *lp, > > events are always cached in waitpid_queue. */ > > > > *status = 0; > > - if (GET_LWP (lp->ptid) == GET_LWP (last_ptid)) > > + > > + if (non_stop) > > { > > - if (stop_signal != TARGET_SIGNAL_0 > > - && signal_pass_state (stop_signal)) > > - *status = W_STOPCODE (target_signal_to_host (stop_signal)); > > + enum target_signal signo = TARGET_SIGNAL_0; > > + > > + if (is_executing (lp->ptid)) > > + { > > + /* If the core thought this lwp was executing, we can only > > + have pending events in the local queue. */ > > + if (queued_waitpid (GET_LWP (lp->ptid), status, __WALL) != -1) > > + { > > + if (WIFSTOPPED (status)) > > + signo = target_signal_from_host (WSTOPSIG (status)); > > + > > + /* If not stopped, then the lwp is gone, no use in > > + resending a signal. */ > > + } > > How do we get here if the core thinks the thread is executing? Is it > when linux-nat.c resumes the thread without telling the core it > stopped? A little more detail here would be helpful. Nope, this function is used while detaching, as you know. Due to the fact that PTRACE_DETACH needs the threads to be stopped to work, there's a stop_callback/stop_wait_callback sequence over all threads just before detaching. In non-stop mode, for threads that *were* running when linux_nat_detach is reached, the core will still believe they were executing, as the executing state is managed in handle_inferior_event. I'll more comments there, unless you think I should do things differently. > > > + else > > + { > > + /* If the core knows the thread is not executing, then we > > + have then last signal recorded in > > + thread_info->stop_signal, unless this is inferior_ptid, > > + in which case, it's in the global stop_signal, due to > > + context switching. */ > > I wish we could keep this stuff in the thread struct all the time... > Working on it... That pesky context switching... > > @@ -1489,6 +1580,9 @@ linux_nat_resume (ptid_t ptid, int step, > > /* Mark this LWP as resumed. */ > > lp->resumed = 1; > > > > + /* Remove the SIGINT mark. Used in non-stop mode. */ > > + lp->sigint = 0; > > + > > Confused. Why does resuming the thread affect whether we have sent it > a SIGINT, but not received it back yet? > Hmm, I was under the impression that it was possible to push more than one SIGINT into a thread's signal queue, but I just tried it, and it doesn't seem like it is. This check was meant to prevent that happening. > > @@ -1650,20 +1746,43 @@ linux_handle_extended_wait (struct lwp_i > > else > > status = 0; > > > > + /* Make thread_db aware of this thread. We do this this > > + early, so in non-stop mode, threads show up as they're > > + created, instead of on next stop. thread_db needs a > > + stopped inferior_ptid --- since we know LP is stopped, > > + use it this time. */ > > + old_chain = save_inferior_ptid (); > > + inferior_ptid = lp->ptid; > > + lp->stopped = 1; > > + target_find_new_threads (); > > + do_cleanups (old_chain); > > + if (!in_thread_list (new_lp->ptid)) > > + { > > + /* We're not using thread_db. Attach and add it to > > + GDB's list. */ > > + lin_lwp_attach_lwp (new_lp->ptid); > > + target_post_attach (GET_LWP (new_lp->ptid)); > > + add_thread (new_lp->ptid); > > + } > > + > > This may be trouble. Sometimes the thread state is not > atomically updated, so peeking at it right after creation but before > an event can fail. > Oh, that's not nice. Is this something that's worth and/or possible to fix in libthreaddb? > Why is it necessary? We already know the ptid since we made them > independent of thread_db TID some time ago. attach_thread should cope > if the thread is already in GDB's thread list when the event > eventually arrives. So we should be able to just add the new > thread directly. That's right, the only thing we'll miss if we do that, is the thread_db id of the thread in output like: [New Thread 0xf7e11b90 (LWP 26100)] ^^^^^^^^ And info threads: 2 Thread 0xf7e11b90 (LWP 26100) (running) ^^^^^^^^ Those will only show up on the next stop event (of any thread). It may take a while, if all threads are running (unless we do momentarily stop threads trick). Having a TARGET_WAITKIND_NEW_THREAD so we could pass the event to the thread-db layer (and do the immediate resume either there, or in handle_inferior_event would also get rid of the target_find_new_threads call, but it has then the same race issue... > > > @@ -2796,13 +2915,26 @@ static int > > kill_callback (struct lwp_info *lp, void *data) > > { > > errno = 0; > > - ptrace (PTRACE_KILL, GET_LWP (lp->ptid), 0, 0); > > - if (debug_linux_nat) > > - fprintf_unfiltered (gdb_stdlog, > > - "KC: PTRACE_KILL %s, 0, 0 (%s)\n", > > - target_pid_to_str (lp->ptid), > > - errno ? safe_strerror (errno) : "OK"); > > > > + /* PTRACE_KILL doesn't work when the thread is running. */ > > + if (!lp->stopped) > > + { > > + kill_lwp (GET_LWP (lp->ptid), SIGKILL); > > + if (debug_linux_nat) > > + fprintf_unfiltered (gdb_stdlog, > > + "KC: kill_lwp (SIGKILL) %s (%s)\n", > > + target_pid_to_str (lp->ptid), > > + errno ? safe_strerror (errno) : "OK"); > > + } > > + else > > + { > > + ptrace (PTRACE_KILL, GET_LWP (lp->ptid), 0, 0); > > + if (debug_linux_nat) > > + fprintf_unfiltered (gdb_stdlog, > > + "KC: PTRACE_KILL %s, 0, 0 (%s)\n", > > + target_pid_to_str (lp->ptid), > > + errno ? safe_strerror (errno) : "OK"); > > + } > > return 0; > > } > > SIGKILL should work even if the thread is stopped. I think I'll need a SIGCONT as well in that case. For some reason, I wasn't getting that to work all the times. I'll experiment some more. As always, thanks much for the review. -- Pedro Alves