Unbreak 'catch syscall' + multi-threading

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* Unbreak 'catch syscall' + multi-threading
@ 2009-10-01  0:48 Pedro Alves
  2009-10-01  9:51 ` Doug Evans
  0 siblings, 1 reply; 4+ messages in thread
From: Pedro Alves @ 2009-10-01  0:48 UTC (permalink / raw)
  To: gdb-patches

As we were discussing yesterday, 'catch syscall' is unfortunately
broken with multi-threading in the mix, plus it has a few other
problems (present on 7.0 too, of course).  This patch fixes all the
issues I found.

 - "detach" when stopped at a syscall event dies horribly.  This was
   because we'd try to pass signal 0x85 to the inferior (SIGTRAP | 0x80, with
   0x80 being the magic sysgood/syscall bit).  We should never pass a
   ptrace SIGTRAP to the inferior.  get_pending_status was fixed to handle
   that., and I ended up simpl

 - In a multi-threaded app+all-stop: when gdb stops all threads to report
   an event, and some other thread hits a syscall trap, that event is left
   pending, but then a few issues trigger: the entry/return state of the
   syscall ends up inverted; if the user removes the syscall catchpoint
   before resuming, the pending event ends up reported as a spurious
   SIGTRAP or ??? signal (SIGTRAP | 0x80).  I've spend a good bit
   figuring out how/when syscall traps are reported, and what
   happens (at least on my kernel 2.6.24 x86_64) when we PTRACE_CONT
   to collect a pending SIGSTOP.  I've added hopefully compreensive
   (and correct!) explanations to the patch.  I wouldn't mind a double
   check.

 - Related to the previous point, even when not interested in a given
   syscall event, we'd still always stop all threads, report the event
   to the core, only to have the core immediately resume all threads
   again.  I've made it so that if we're ignoring the event, we
   resume the lwp hat reported the syscall immediately on the target
   side, bypassing the slow all-threads stop/resume.

 - Yet related to the above points, TRAP_IS_SYSCALL is gone.  Right after
   handling the syscall event, we now remove the sysgood bit, and the rest
   of linux-nat.c doesn't need to know about this special SIGTRAP, we
   already have lp->waitstatus for that.  The event is handled just like
   the other extended wait statuses, except that on those, WSTOPSIG
   always reported SIGTRAP already, so we never needed to consider
   this masking.  The code now uses (SIGTRAP | 0x80) directly in the couple
   of places that need it, since that is exactly how the event is
   described in the ptrace man page.

 - When handling a PTRACE_EVENT_CLONE event both parent and child threads
   are currently resumed with PTRACE_CONT instead of PTRACE_SYSCALL, even
   if we have a syscall catchpoint.  This means that no syscalls are caught
   after the clone, in neither the parent nor the child.  When
   pthreads/thread_db are loaded, the thread event breakpoints happen
   to paper over the problem --- they cause an all-threads stop/resume
   sequence that resumes them all with PTRACE_SYSCALL.  Pedantically, any
   syscall between clone and the event breakpoint is still missed
   without the fix.  And the fix is to use linux_ops->to_resume instead of
   ptrace(PTRACE_CONT) directly, as the former knows to use PT_SYSCALL if
   needed.

 - infrun.c:deal_with_syscall_event had a few minor-ish issues,
   that I noticed by code inspection that are mostly harmless in
   all-stop/single-process, not so in non-stop/multi-process.

Tested on x86_64-linux without regressions.  Any comments on this
(or anyone wants to try it) before I check this in?

-- 
Pedro Alves

2009-10-01  Pedro Alves  <pedro@codesourcery.com>

	* linux-nat.c (TRAP_IS_SYSCALL, TRAP_REMOVE_SYSCALL_FLAG): Delete.
	(status_to_str): Adjust.
	(get_pending_status): Pending events in lp->waitstatus don't map
	to any signal.  Simplify.
	(linux_handle_syscall_trap): New.
	(linux_handle_extended_wait): When handling PTRACE_EVENT_CLONE
	events, use linux_ops->to_resume instead of direct ptrace with
	PTRACE_CONT.  Remove all TRAP_IS_SYSCALL handling.
	(wait_lwp): Handle syscall traps with linux_handle_syscall_trap,
	and clear the sysgood bit.
	(status_callback): Make it clearer and add comments.
	(cancel_breakpoints_callback): Ignore if LP has waitstatus set.
	(linux_nat_filter_event): Handle syscall traps with
	linux_handle_syscall_trap, and clear the sysgood bit.  Move the
	check for storing siginfo to after handling extended statuses and
	syscall traps.
	(linux_wait_1): Don't clear the sysgood bit here.

	* infrun.c (deal_with_syscall_event): Rename to ...
	(handle_syscall_event): ... this.  Always context switch and set
	stop_pc, even if not catching the syscall.  If not catching the
	syscall, always resume with keep_going.
	(handle_inferior_event): Adjust.

---
 gdb/infrun.c    |   60 ++++----
 gdb/linux-nat.c |  375 ++++++++++++++++++++++++++++++++++++--------------------
 2 files changed, 273 insertions(+), 162 deletions(-)

Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c	2009-10-01 01:07:00.000000000 +0100
+++ src/gdb/linux-nat.c	2009-10-01 01:07:41.000000000 +0100
@@ -67,11 +67,6 @@
 # endif
 #endif /* HAVE_PERSONALITY */
 
-/* To be used when one needs to know wether a
-   WSTOPSIG (status) is a syscall */
-#define TRAP_IS_SYSCALL (SIGTRAP | 0x80)
-#define TRAP_REMOVE_SYSCALL_FLAG(status) ((status) & ~(0x80 << 8))
-
 /* This comment documents high-level logic of this file. 
 
 Waiting for events in sync mode
@@ -982,7 +977,7 @@ status_to_str (int status)
 
   if (WIFSTOPPED (status))
     {
-      if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
+      if (WSTOPSIG (status) == (SIGTRAP | 0x80))
 	snprintf (buf, sizeof (buf), "%s (stopped at syscall)",
 		  strsignal (SIGTRAP));
       else
@@ -1514,74 +1509,78 @@ linux_nat_attach (struct target_ops *ops
 static int
 get_pending_status (struct lwp_info *lp, int *status)
 {
-  struct target_waitstatus last;
-  ptid_t last_ptid;
-
-  get_last_target_status (&last_ptid, &last);
+  enum target_signal signo = TARGET_SIGNAL_0;
 
-  /* If this lwp is the ptid that GDB is processing an event from, the
-     signal will be in stop_signal.  Otherwise, we may cache pending
-     events in lp->status while trying to stop all threads (see
-     stop_wait_callback).  */
+  /* If we paused threads momentarily, we may have stored pending
+     events in lp->status or lp->waitstatus (see stop_wait_callback),
+     and GDB core hasn't seen any signal for those threads.
+     Otherwise, the last signal reported to the core is found in the
+     thread object's stop_signal.
+
+     There's a corner case that isn't handled here at present.  Only
+     if the thread stopped with a TARGET_WAITKIND_STOPPED does
+     stop_signal make sense as a real signal to pass to the inferior.
+     Some catchpoint related events, like
+     TARGET_WAITKIND_(V)FORK|EXEC|SYSCALL, have their stop_signal set
+     to TARGET_SIGNAL_SIGTRAP when the catchpoint triggers.  But,
+     those traps are debug API (ptrace in our case) related and
+     induced; the inferior wouldn't see them if it wasn't being
+     traced.  Hence, we should never pass them to the inferior, even
+     when set to pass state.  Since this corner case isn't handled by
+     infrun.c when proceeding with a signal, for consistency, neither
+     do we handle it here (or elsewhere in the file we check for
+     signal pass state).  Normally SIGTRAP isn't set to pass state, so
+     this is really a corner case.  */
 
-  *status = 0;
-
-  if (non_stop)
+  if (lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
+    signo = TARGET_SIGNAL_0; /* a pending ptrace event, not a real signal.  */
+  else if (lp->status)
+    signo = target_signal_from_host (WSTOPSIG (lp->status));
+  else if (non_stop && !is_executing (lp->ptid))
     {
-      enum target_signal signo = TARGET_SIGNAL_0;
+      struct thread_info *tp = find_thread_ptid (lp->ptid);
+      signo = tp->stop_signal;
+    }
+  else if (!non_stop)
+    {
+      struct target_waitstatus last;
+      ptid_t last_ptid;
 
-      if (is_executing (lp->ptid))
-	{
-	  /* If the core thought this lwp was executing --- e.g., the
-	     executing property hasn't been updated yet, but the
-	     thread has been stopped with a stop_callback /
-	     stop_wait_callback sequence (see linux_nat_detach for
-	     example) --- we can only have pending events in the local
-	     queue.  */
-	  signo = target_signal_from_host (WSTOPSIG (lp->status));
-	}
-      else
-	{
-	  /* If the core knows the thread is not executing, then we
-	     have the last signal recorded in
-	     thread_info->stop_signal.  */
+      get_last_target_status (&last_ptid, &last);
 
+      if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
+	{
 	  struct thread_info *tp = find_thread_ptid (lp->ptid);
 	  signo = tp->stop_signal;
 	}
+    }
 
-      if (signo != TARGET_SIGNAL_0
-	  && !signal_pass_state (signo))
-	{
-	  if (debug_linux_nat)
-	    fprintf_unfiltered (gdb_stdlog, "\
-GPT: lwp %s had signal %s, but it is in no pass state\n",
-				target_pid_to_str (lp->ptid),
-				target_signal_to_string (signo));
-	}
-      else
-	{
-	  if (signo != TARGET_SIGNAL_0)
-	    *status = W_STOPCODE (target_signal_to_host (signo));
+  *status = 0;
 
-	  if (debug_linux_nat)
-	    fprintf_unfiltered (gdb_stdlog,
-				"GPT: lwp %s as pending signal %s\n",
-				target_pid_to_str (lp->ptid),
-				target_signal_to_string (signo));
-	}
+  if (signo == TARGET_SIGNAL_0)
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "GPT: lwp %s has no pending signal\n",
+			    target_pid_to_str (lp->ptid));
+    }
+  else if (!signal_pass_state (signo))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog, "\
+GPT: lwp %s had signal %s, but it is in no pass state\n",
+			    target_pid_to_str (lp->ptid),
+			    target_signal_to_string (signo));
     }
   else
     {
-      if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
-	{
-	  struct thread_info *tp = find_thread_ptid (lp->ptid);
-	  if (tp->stop_signal != TARGET_SIGNAL_0
-	      && signal_pass_state (tp->stop_signal))
-	    *status = W_STOPCODE (target_signal_to_host (tp->stop_signal));
-	}
-      else
-	*status = lp->status;
+      *status = W_STOPCODE (target_signal_to_host (signo));
+
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "GPT: lwp %s has pending signal %s\n",
+			    target_pid_to_str (lp->ptid),
+			    target_signal_to_string (signo));
     }
 
   return 0;
@@ -1893,6 +1892,130 @@ kill_lwp (int lwpid, int signo)
   return kill (lwpid, signo);
 }
 
+/* Handle a GNU/Linux syscall trap wait response.  If we see a syscall
+   event, check if the core is interested in it: if not, ignore the
+   event, and keep waiting; otherwise, we need to toggle the LWP's
+   syscall entry/exit status, since the ptrace event itself doesn't
+   indicate it, and report the trap to higher layers.  */
+
+static int
+linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
+{
+  struct target_waitstatus *ourstatus = &lp->waitstatus;
+  struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
+  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, lp->ptid);
+
+  if (stopping)
+    {
+      /* If we're stopping threads, there's a SIGSTOP pending, which
+	 makes it so that the LWP reports an immediate syscall return,
+	 followed by the SIGSTOP.  Skip seeing that "return" using
+	 PTRACE_CONT directly, and let stop_wait_callback collect the
+	 SIGSTOP.  Later when the thread is resumed, a new syscall
+	 entry event.  If we didn't do this (and returned 0), we'd
+	 leave a syscall entry pending, and our caller, by using
+	 PTRACE_CONT to collect the SIGSTOP, skips the syscall return
+	 itself.  Later, when the user re-resumes this LWP, we'd see
+	 another syscall entry event and we'd mistake it for a return.
+
+	 If stop_wait_callback didn't force the SIGSTOP out of the LWP
+	 (leaving immediately with LWP->signalled set, without issuing
+	 a PTRACE_CONT), it would still be problematic to leave this
+	 syscall enter pending, as later when the thread is resumed,
+	 it would then see the same syscall exit mentioned above,
+	 followed by the delayed SIGSTOP, while the syscall didn't
+	 actually get to execute.  It seems it would be even more
+	 confusing to the user.  */
+
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: ignoring syscall %d "
+			    "for LWP %ld (stopping threads), "
+			    "resuming with PTRACE_CONT for SIGSTOP\n",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+
+      lp->syscall_state = TARGET_WAITKIND_IGNORE;
+      ptrace (PTRACE_CONT, GET_LWP (lp->ptid), 0, 0);
+      return 1;
+    }
+
+  if (catch_syscall_enabled () != 0)
+    {
+      /* Always update the entry/return state, even if this particular
+	 syscall isn't interesting to the core now.  In async mode,
+	 the user could install a new catchpoint for this syscall
+	 between syscall enter/return, and we'll need to know to
+	 report a syscall return if that happens.  */
+      lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+			   ? TARGET_WAITKIND_SYSCALL_RETURN
+			   : TARGET_WAITKIND_SYSCALL_ENTRY);
+
+      if (catching_syscall_number (syscall_number))
+	{
+	  /* Alright, an event to report.  */
+	  ourstatus->kind = lp->syscall_state;
+	  ourstatus->value.syscall_number = syscall_number;
+
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LHST: stopping for %s of syscall %d"
+				" for LWP %ld\n",
+				lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+				? "entry" : "return",
+				syscall_number,
+				GET_LWP (lp->ptid));
+	  return 0;
+	}
+      else if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: ignoring %s of syscall %d "
+			    "for LWP %ld\n",
+			    lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+			    ? "entry" : "return",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+    }
+  else
+    {
+      /* If we had been syscall tracing, and hence used PT_SYSCALL
+	 before on this LWP, it could happen that the user removes all
+	 syscall catchpoints before we get to process this event.
+	 There are two noteworthy issues here:
+
+	 - if we last single-stepped this thread, then this event
+	   can't be a syscall enter.  Only PT_SYSCALL catches syscall
+	   enters.  This has to be a syscall exit them.
+
+	 - when stopped at a syscall entry event, resuming with
+	   PT_STEP still reports a syscall return.
+
+	 The points above mean that the next resume, be it PT_STEP or
+	 PT_CONTINUE, can not trigger a syscall trace event.  */
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: caught syscall event with no syscall catchpoints."
+			    " %d for LWP %ld, ignoring\n",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+      lp->syscall_state = TARGET_WAITKIND_IGNORE;
+    }
+
+  /* The core isn't interested in this event.  For efficiency, avoid
+     stopping all threads only to have the core resume them all again.
+     Since we're not stopping threads, if we're still syscall tracing
+     and not stepping, we can't use PTRACE_CONT here, as we'd miss any
+     subsequent syscall.  Simply resume using the inf-ptrace layer,
+     which knows when to use PT_SYSCALL or PT_CONTINUE.  */
+
+  /* Note that gdbarch_get_syscall_number may access registers, hence
+     fill a regcache.  */
+  registers_changed ();
+  linux_ops->to_resume (linux_ops, pid_to_ptid (GET_LWP (lp->ptid)),
+			lp->step, TARGET_SIGNAL_0);
+  return 1;
+}
+
 /* Handle a GNU/Linux extended wait response.  If we see a clone
    event, we need to add the new LWP to our list (and not report the
    trap to higher layers).  This function returns non-zero if the
@@ -2018,19 +2141,30 @@ linux_handle_extended_wait (struct lwp_i
 		}
 	    }
 
+	  /* Note the need to use the low target ops to resume, to
+	     handle resuming with PT_SYSCALL if we have syscall
+	     catchpoints.  */
 	  if (!stopping)
 	    {
+	      int signo;
+
 	      new_lp->stopped = 0;
 	      new_lp->resumed = 1;
-	      ptrace (PTRACE_CONT, new_pid, 0,
-		      status ? WSTOPSIG (status) : 0);
+
+	      signo = (status
+		       ? target_signal_from_host (WSTOPSIG (status))
+		       : TARGET_SIGNAL_0);
+
+	      linux_ops->to_resume (linux_ops, pid_to_ptid (new_pid),
+				    0, signo);
 	    }
 
 	  if (debug_linux_nat)
 	    fprintf_unfiltered (gdb_stdlog,
 				"LHEW: Got clone event from LWP %ld, resuming\n",
 				GET_LWP (lp->ptid));
-	  ptrace (PTRACE_CONT, GET_LWP (lp->ptid), 0, 0);
+	  linux_ops->to_resume (linux_ops, pid_to_ptid (GET_LWP (lp->ptid)),
+				0, TARGET_SIGNAL_0);
 
 	  return 1;
 	}
@@ -2070,47 +2204,6 @@ linux_handle_extended_wait (struct lwp_i
       return 0;
     }
 
-  /* Used for 'catch syscall' feature.  */
-  if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
-    {
-      if (catch_syscall_enabled () == 0)
-	  ourstatus->kind = TARGET_WAITKIND_IGNORE;
-      else
-	{
-	  struct regcache *regcache = get_thread_regcache (lp->ptid);
-	  struct gdbarch *gdbarch = get_regcache_arch (regcache);
-
-	  ourstatus->value.syscall_number =
-	    (int) gdbarch_get_syscall_number (gdbarch, lp->ptid);
-
-	  /* If we are catching this specific syscall number, then we
-	     should update the target_status to reflect which event
-	     has occurred.  But if this syscall is not to be caught,
-	     then we can safely mark the event as a SYSCALL_RETURN.
-
-	     This is particularly needed if:
-
-	       - We are catching any syscalls, or
-	       - We are catching the syscall "exit"
-
-	     In this case, as the syscall "exit" *doesn't* return,
-	     then GDB would be confused because it would mark the last
-	     syscall event as a SYSCALL_ENTRY.  After that, if we re-ran the
-	     inferior GDB will think that the first syscall event is
-	     the opposite of a SYSCALL_ENTRY, which is the SYSCALL_RETURN.
-	     Therefore, GDB would report inverted syscall events.  */
-	  if (catching_syscall_number (ourstatus->value.syscall_number))
-	    ourstatus->kind = 
-	      (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY) ?
-	      TARGET_WAITKIND_SYSCALL_RETURN : TARGET_WAITKIND_SYSCALL_ENTRY;
-	  else
-	    ourstatus->kind = TARGET_WAITKIND_SYSCALL_RETURN;
-
-	  lp->syscall_state = ourstatus->kind;
-	}
-      return 0;
-    }
-
   internal_error (__FILE__, __LINE__,
 		  _("unknown ptrace event %d"), event);
 }
@@ -2176,6 +2269,17 @@ wait_lwp (struct lwp_info *lp)
 
   gdb_assert (WIFSTOPPED (status));
 
+  /* Handle GNU/Linux's syscall SIGTRAPs.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == (SIGTRAP | 0x80))
+    {
+      /* No longer need the sysgood bit.  The ptrace event ends up
+	 recorded in lp->waitstatus if we care for it, we can carry on
+	 handling the event like a regular SIGTRAP from here on.  */
+      status = W_STOPCODE (SIGTRAP);
+      if (linux_handle_syscall_trap (lp, 1))
+	return wait_lwp (lp);
+    }
+
   /* Handle GNU/Linux's extended waitstatus for trace events.  */
   if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)
     {
@@ -2435,19 +2539,31 @@ stop_wait_callback (struct lwp_info *lp,
   return 0;
 }
 
-/* Return non-zero if LP has a wait status pending.  */
+/* Return non-zero if LP has a wait status pending.  If an LP is
+   stopped at a SYSCALL event that we're no longer caring for, resume
+   it.  */
 
 static int
 status_callback (struct lwp_info *lp, void *data)
 {
   /* Only report a pending wait status if we pretend that this has
      indeed been resumed.  */
-  /* We check for lp->waitstatus in addition to lp->status, because we
-     can have pending process exits recorded in lp->waitstatus, and
-     W_EXITCODE(0,0) == 0.  */
-  return ((lp->status != 0
-	   || lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
-	  && lp->resumed);
+  if (!lp->resumed)
+    return 0;
+
+  if (lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
+    {
+      /* A ptrace event, like PTRACE_FORK|VFORK|EXEC, syscall event,
+	 or a a pending process exit.  Note that `W_EXITCODE(0,0) ==
+	 0', so a clean process exit can't be stored in lp->status).
+	 or this LWP has already exited and */
+      return 1;
+    }
+
+  if (lp->status != 0)
+    return 1;
+
+  return 0;
 }
 
 /* Return non-zero if LP isn't stopped.  */
@@ -2557,7 +2673,8 @@ cancel_breakpoints_callback (struct lwp_
      delete or disable the breakpoint, but the LWP will have already
      tripped on it.  */
 
-  if (lp->status != 0
+  if (lp->waitstatus.kind == TARGET_WAITKIND_IGNORE
+      && lp->status != 0
       && WIFSTOPPED (lp->status) && WSTOPSIG (lp->status) == SIGTRAP
       && cancel_breakpoint (lp))
     /* Throw away the SIGTRAP.  */
@@ -2708,17 +2825,19 @@ linux_nat_filter_event (int lwpid, int s
       add_thread (lp->ptid);
     }
 
-  /* Save the trap's siginfo in case we need it later.  */
-  if (WIFSTOPPED (status)
-      && (WSTOPSIG (status) == SIGTRAP || WSTOPSIG (status) == TRAP_IS_SYSCALL))
-    save_siginfo (lp);
+  /* Handle GNU/Linux's syscall SIGTRAPs.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == (SIGTRAP | 0x80))
+    {
+      /* No longer need the sysgood bit.  The ptrace event ends up
+	 recorded in lp->waitstatus if we care for it, we can carry on
+	 handling the event like a regular SIGTRAP from here on.  */
+      status = W_STOPCODE (SIGTRAP);
+      if (linux_handle_syscall_trap (lp, 0))
+	return NULL;
+    }
 
-  /* Handle GNU/Linux's extended waitstatus for trace events.
-     It is necessary to check if WSTOPSIG is signaling that
-     the inferior is entering/exiting a system call.  */
-  if (WIFSTOPPED (status)
-      && ((WSTOPSIG (status) == TRAP_IS_SYSCALL)
-          || (WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)))
+  /* Handle GNU/Linux's extended waitstatus for trace events.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)
     {
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
@@ -2728,6 +2847,10 @@ linux_nat_filter_event (int lwpid, int s
 	return NULL;
     }
 
+  /* Save the trap's siginfo in case we need it later.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP)
+    save_siginfo (lp);
+
   /* Check if the thread has exited.  */
   if ((WIFEXITED (status) || WIFSIGNALED (status))
       && num_lwps (GET_PID (lp->ptid)) > 1)
@@ -3013,12 +3136,6 @@ retry:
 
 	  lp = linux_nat_filter_event (lwpid, status, options);
 
-	  /* If this was a syscall trap, we no longer need or want
-	     the 0x80 flag, remove it.  */
-	  if (WIFSTOPPED (status)
-	      && WSTOPSIG (status) == TRAP_IS_SYSCALL)
-	    status = TRAP_REMOVE_SYSCALL_FLAG (status);
-
 	  if (lp
 	      && ptid_is_pid (ptid)
 	      && ptid_get_pid (lp->ptid) != ptid_get_pid (ptid))
Index: src/gdb/infrun.c
===================================================================
--- src.orig/gdb/infrun.c	2009-10-01 01:07:00.000000000 +0100
+++ src/gdb/infrun.c	2009-10-01 01:07:41.000000000 +0100
@@ -2386,13 +2386,22 @@ stepped_in_from (struct frame_info *fram
    It returns 1 if the inferior should keep going (and GDB
    should ignore the event), or 0 if the event deserves to be
    processed.  */
+
 static int
-deal_with_syscall_event (struct execution_control_state *ecs)
+handle_syscall_event (struct execution_control_state *ecs)
 {
-  struct regcache *regcache = get_thread_regcache (ecs->ptid);
-  struct gdbarch *gdbarch = get_regcache_arch (regcache);
-  int syscall_number = gdbarch_get_syscall_number (gdbarch,
-                                                   ecs->ptid);
+  struct regcache *regcache;
+  struct gdbarch *gdbarch;
+  int syscall_number;
+
+  if (!ptid_equal (ecs->ptid, inferior_ptid))
+    context_switch (ecs->ptid);
+
+  regcache = get_thread_regcache (ecs->ptid);
+  gdbarch = get_regcache_arch (regcache);
+  syscall_number = gdbarch_get_syscall_number (gdbarch, ecs->ptid);
+  stop_pc = regcache_read_pc (regcache);
+
   target_last_waitstatus.value.syscall_number = syscall_number;
 
   if (catch_syscall_enabled () > 0
@@ -2401,35 +2410,22 @@ deal_with_syscall_event (struct executio
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: syscall number = '%d'\n",
                             syscall_number);
-      ecs->event_thread->stop_signal = TARGET_SIGNAL_TRAP;
-
-      if (!ptid_equal (ecs->ptid, inferior_ptid))
-        {
-          context_switch (ecs->ptid);
-          reinit_frame_cache ();
-        }
-
-      stop_pc = regcache_read_pc (get_thread_regcache (ecs->ptid));
 
       ecs->event_thread->stop_bpstat = bpstat_stop_status (stop_pc, ecs->ptid);
-
       ecs->random_signal = !bpstat_explains_signal (ecs->event_thread->stop_bpstat);
 
-      /* If no catchpoint triggered for this, then keep going.  */
-      if (ecs->random_signal)
-        {
-          ecs->event_thread->stop_signal = TARGET_SIGNAL_0;
-          keep_going (ecs);
-          return 1;
-        }
-      return 0;
-    }
-  else
-    {
-      resume (0, TARGET_SIGNAL_0);
-      prepare_to_wait (ecs);
-      return 1;
+      if (!ecs->random_signal)
+	{
+	  /* Catchpoint hit.  */
+	  ecs->event_thread->stop_signal = TARGET_SIGNAL_TRAP;
+	  return 0;
+	}
     }
+
+  /* If no catchpoint triggered for this, then keep going.  */
+  ecs->event_thread->stop_signal = TARGET_SIGNAL_0;
+  keep_going (ecs);
+  return 1;
 }
 
 /* Given an execution control state that has been freshly filled in
@@ -2753,10 +2749,9 @@ handle_inferior_event (struct execution_
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_ENTRY\n");
       /* Getting the current syscall number */
-      if (deal_with_syscall_event (ecs) != 0)
+      if (handle_syscall_event (ecs) != 0)
         return;
       goto process_event_stop_test;
-      break;
 
       /* Before examining the threads further, step this thread to
          get it entirely out of the syscall.  (We get notice of the
@@ -2766,10 +2761,9 @@ handle_inferior_event (struct execution_
     case TARGET_WAITKIND_SYSCALL_RETURN:
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_RETURN\n");
-      if (deal_with_syscall_event (ecs) != 0)
+      if (handle_syscall_event (ecs) != 0)
         return;
       goto process_event_stop_test;
-      break;
 
     case TARGET_WAITKIND_STOPPED:
       if (debug_infrun)


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Unbreak 'catch syscall' + multi-threading
  2009-10-01  0:48 Unbreak 'catch syscall' + multi-threading Pedro Alves
@ 2009-10-01  9:51 ` Doug Evans
  2009-10-02 16:55   ` Pedro Alves
  0 siblings, 1 reply; 4+ messages in thread
From: Doug Evans @ 2009-10-01  9:51 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

On Wed, Sep 30, 2009 at 5:48 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> As we were discussing yesterday, 'catch syscall' is unfortunately
> broken with multi-threading in the mix, plus it has a few other
> problems (present on 7.0 too, of course).  This patch fixes all the
> issues I found.

Thanks.

> The code now uses (SIGTRAP | 0x80) directly in the couple
>   of places that need it, since that is exactly how the event is
>   described in the ptrace man page.

nit: 0x80 is still a magic  number no different than others (ISTM anyway).

IWBN to keep TRAP_IS_SYSCALL (change the name however you like &/|
only record 0x80 in it if you like).


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Unbreak 'catch syscall' + multi-threading
  2009-10-01  9:51 ` Doug Evans
@ 2009-10-02 16:55   ` Pedro Alves
  2009-10-03  3:17     ` Sérgio Durigan Júnior
  0 siblings, 1 reply; 4+ messages in thread
From: Pedro Alves @ 2009-10-02 16:55 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches

Revised per comments, a few comments cleaned up, retested and applied
to mainline as below.

-- 
Pedro Alves

2009-10-02  Pedro Alves  <pedro@codesourcery.com>

	* linux-nat.c (TRAP_IS_SYSCALL, TRAP_REMOVE_SYSCALL_FLAG): Delete.
	(SYSCALL_SIGTRAP): New.
	(status_to_str): Adjust.
	(get_pending_status): Pending events in lp->waitstatus don't map
	to any signal.  Simplify.
	(linux_handle_syscall_trap): New.
	(linux_handle_extended_wait): When handling PTRACE_EVENT_CLONE
	events, use linux_ops->to_resume instead of direct ptrace with
	PTRACE_CONT.  Remove all TRAP_IS_SYSCALL handling.
	(wait_lwp): Handle syscall traps with linux_handle_syscall_trap,
	and clear the sysgood bit.
	(status_callback): Make it clearer and add comments.
	(cancel_breakpoints_callback): Ignore if LP has waitstatus set.
	(linux_nat_filter_event): Handle syscall traps with
	linux_handle_syscall_trap, and clear the sysgood bit.  Move the
	check for storing siginfo to after handling extended statuses and
	syscall traps.  Store status in the lwp object.
	(linux_wait_1): Don't swap the pending status out of the lwp
	object until after deciding we found an lwp with an interesting
	event.  Requeue a new pending signal if we find one while getting
	rid or a pending SIGSTOP we sent ourselves.  Don't clear the
	sysgood bit here.

	* infrun.c (deal_with_syscall_event): Rename to ...
	(handle_syscall_event): ... this.  Always context switch and set
	stop_pc, even if not catching the syscall.  If not catching the
	syscall, always resume with keep_going.
	(handle_inferior_event): Adjust.

---
 gdb/infrun.c    |   60 +++----
 gdb/linux-nat.c |  430 +++++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 312 insertions(+), 178 deletions(-)

Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c	2009-10-02 17:40:37.000000000 +0100
+++ src/gdb/linux-nat.c	2009-10-02 17:44:29.000000000 +0100
@@ -67,11 +67,6 @@
 # endif
 #endif /* HAVE_PERSONALITY */
 
-/* To be used when one needs to know wether a
-   WSTOPSIG (status) is a syscall */
-#define TRAP_IS_SYSCALL (SIGTRAP | 0x80)
-#define TRAP_REMOVE_SYSCALL_FLAG(status) ((status) & ~(0x80 << 8))
-
 /* This comment documents high-level logic of this file. 
 
 Waiting for events in sync mode
@@ -189,6 +184,11 @@ blocked.  */
 
 #endif /* PTRACE_EVENT_FORK */
 
+/* Unlike other extended result codes, WSTOPSIG (status) on
+   PTRACE_O_TRACESYSGOOD syscall events doesn't return SIGTRAP, but
+   instead SIGTRAP with bit 7 set.  */
+#define SYSCALL_SIGTRAP (SIGTRAP | 0x80)
+
 /* We can't always assume that this flag is available, but all systems
    with the ptrace event handlers also have __WALL, so it's safe to use
    here.  */
@@ -982,7 +982,7 @@ status_to_str (int status)
 
   if (WIFSTOPPED (status))
     {
-      if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
+      if (WSTOPSIG (status) == SYSCALL_SIGTRAP)
 	snprintf (buf, sizeof (buf), "%s (stopped at syscall)",
 		  strsignal (SIGTRAP));
       else
@@ -1514,74 +1514,78 @@ linux_nat_attach (struct target_ops *ops
 static int
 get_pending_status (struct lwp_info *lp, int *status)
 {
-  struct target_waitstatus last;
-  ptid_t last_ptid;
-
-  get_last_target_status (&last_ptid, &last);
-
-  /* If this lwp is the ptid that GDB is processing an event from, the
-     signal will be in stop_signal.  Otherwise, we may cache pending
-     events in lp->status while trying to stop all threads (see
-     stop_wait_callback).  */
+  enum target_signal signo = TARGET_SIGNAL_0;
 
-  *status = 0;
+  /* If we paused threads momentarily, we may have stored pending
+     events in lp->status or lp->waitstatus (see stop_wait_callback),
+     and GDB core hasn't seen any signal for those threads.
+     Otherwise, the last signal reported to the core is found in the
+     thread object's stop_signal.
+
+     There's a corner case that isn't handled here at present.  Only
+     if the thread stopped with a TARGET_WAITKIND_STOPPED does
+     stop_signal make sense as a real signal to pass to the inferior.
+     Some catchpoint related events, like
+     TARGET_WAITKIND_(V)FORK|EXEC|SYSCALL, have their stop_signal set
+     to TARGET_SIGNAL_SIGTRAP when the catchpoint triggers.  But,
+     those traps are debug API (ptrace in our case) related and
+     induced; the inferior wouldn't see them if it wasn't being
+     traced.  Hence, we should never pass them to the inferior, even
+     when set to pass state.  Since this corner case isn't handled by
+     infrun.c when proceeding with a signal, for consistency, neither
+     do we handle it here (or elsewhere in the file we check for
+     signal pass state).  Normally SIGTRAP isn't set to pass state, so
+     this is really a corner case.  */
 
-  if (non_stop)
+  if (lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
+    signo = TARGET_SIGNAL_0; /* a pending ptrace event, not a real signal.  */
+  else if (lp->status)
+    signo = target_signal_from_host (WSTOPSIG (lp->status));
+  else if (non_stop && !is_executing (lp->ptid))
     {
-      enum target_signal signo = TARGET_SIGNAL_0;
+      struct thread_info *tp = find_thread_ptid (lp->ptid);
+      signo = tp->stop_signal;
+    }
+  else if (!non_stop)
+    {
+      struct target_waitstatus last;
+      ptid_t last_ptid;
 
-      if (is_executing (lp->ptid))
-	{
-	  /* If the core thought this lwp was executing --- e.g., the
-	     executing property hasn't been updated yet, but the
-	     thread has been stopped with a stop_callback /
-	     stop_wait_callback sequence (see linux_nat_detach for
-	     example) --- we can only have pending events in the local
-	     queue.  */
-	  signo = target_signal_from_host (WSTOPSIG (lp->status));
-	}
-      else
-	{
-	  /* If the core knows the thread is not executing, then we
-	     have the last signal recorded in
-	     thread_info->stop_signal.  */
+      get_last_target_status (&last_ptid, &last);
 
+      if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
+	{
 	  struct thread_info *tp = find_thread_ptid (lp->ptid);
 	  signo = tp->stop_signal;
 	}
+    }
 
-      if (signo != TARGET_SIGNAL_0
-	  && !signal_pass_state (signo))
-	{
-	  if (debug_linux_nat)
-	    fprintf_unfiltered (gdb_stdlog, "\
-GPT: lwp %s had signal %s, but it is in no pass state\n",
-				target_pid_to_str (lp->ptid),
-				target_signal_to_string (signo));
-	}
-      else
-	{
-	  if (signo != TARGET_SIGNAL_0)
-	    *status = W_STOPCODE (target_signal_to_host (signo));
+  *status = 0;
 
-	  if (debug_linux_nat)
-	    fprintf_unfiltered (gdb_stdlog,
-				"GPT: lwp %s as pending signal %s\n",
-				target_pid_to_str (lp->ptid),
-				target_signal_to_string (signo));
-	}
+  if (signo == TARGET_SIGNAL_0)
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "GPT: lwp %s has no pending signal\n",
+			    target_pid_to_str (lp->ptid));
+    }
+  else if (!signal_pass_state (signo))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog, "\
+GPT: lwp %s had signal %s, but it is in no pass state\n",
+			    target_pid_to_str (lp->ptid),
+			    target_signal_to_string (signo));
     }
   else
     {
-      if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
-	{
-	  struct thread_info *tp = find_thread_ptid (lp->ptid);
-	  if (tp->stop_signal != TARGET_SIGNAL_0
-	      && signal_pass_state (tp->stop_signal))
-	    *status = W_STOPCODE (target_signal_to_host (tp->stop_signal));
-	}
-      else
-	*status = lp->status;
+      *status = W_STOPCODE (target_signal_to_host (signo));
+
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "GPT: lwp %s has pending signal %s\n",
+			    target_pid_to_str (lp->ptid),
+			    target_signal_to_string (signo));
     }
 
   return 0;
@@ -1893,6 +1897,133 @@ kill_lwp (int lwpid, int signo)
   return kill (lwpid, signo);
 }
 
+/* Handle a GNU/Linux syscall trap wait response.  If we see a syscall
+   event, check if the core is interested in it: if not, ignore the
+   event, and keep waiting; otherwise, we need to toggle the LWP's
+   syscall entry/exit status, since the ptrace event itself doesn't
+   indicate it, and report the trap to higher layers.  */
+
+static int
+linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
+{
+  struct target_waitstatus *ourstatus = &lp->waitstatus;
+  struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
+  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, lp->ptid);
+
+  if (stopping)
+    {
+      /* If we're stopping threads, there's a SIGSTOP pending, which
+	 makes it so that the LWP reports an immediate syscall return,
+	 followed by the SIGSTOP.  Skip seeing that "return" using
+	 PTRACE_CONT directly, and let stop_wait_callback collect the
+	 SIGSTOP.  Later when the thread is resumed, a new syscall
+	 entry event.  If we didn't do this (and returned 0), we'd
+	 leave a syscall entry pending, and our caller, by using
+	 PTRACE_CONT to collect the SIGSTOP, skips the syscall return
+	 itself.  Later, when the user re-resumes this LWP, we'd see
+	 another syscall entry event and we'd mistake it for a return.
+
+	 If stop_wait_callback didn't force the SIGSTOP out of the LWP
+	 (leaving immediately with LWP->signalled set, without issuing
+	 a PTRACE_CONT), it would still be problematic to leave this
+	 syscall enter pending, as later when the thread is resumed,
+	 it would then see the same syscall exit mentioned above,
+	 followed by the delayed SIGSTOP, while the syscall didn't
+	 actually get to execute.  It seems it would be even more
+	 confusing to the user.  */
+
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: ignoring syscall %d "
+			    "for LWP %ld (stopping threads), "
+			    "resuming with PTRACE_CONT for SIGSTOP\n",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+
+      lp->syscall_state = TARGET_WAITKIND_IGNORE;
+      ptrace (PTRACE_CONT, GET_LWP (lp->ptid), 0, 0);
+      return 1;
+    }
+
+  if (catch_syscall_enabled ())
+    {
+      /* Always update the entry/return state, even if this particular
+	 syscall isn't interesting to the core now.  In async mode,
+	 the user could install a new catchpoint for this syscall
+	 between syscall enter/return, and we'll need to know to
+	 report a syscall return if that happens.  */
+      lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+			   ? TARGET_WAITKIND_SYSCALL_RETURN
+			   : TARGET_WAITKIND_SYSCALL_ENTRY);
+
+      if (catching_syscall_number (syscall_number))
+	{
+	  /* Alright, an event to report.  */
+	  ourstatus->kind = lp->syscall_state;
+	  ourstatus->value.syscall_number = syscall_number;
+
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LHST: stopping for %s of syscall %d"
+				" for LWP %ld\n",
+				lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+				? "entry" : "return",
+				syscall_number,
+				GET_LWP (lp->ptid));
+	  return 0;
+	}
+
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: ignoring %s of syscall %d "
+			    "for LWP %ld\n",
+			    lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+			    ? "entry" : "return",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+    }
+  else
+    {
+      /* If we had been syscall tracing, and hence used PT_SYSCALL
+	 before on this LWP, it could happen that the user removes all
+	 syscall catchpoints before we get to process this event.
+	 There are two noteworthy issues here:
+
+	 - When stopped at a syscall entry event, resuming with
+	   PT_STEP still resumes executing the syscall and reports a
+	   syscall return.
+
+	 - Only PT_SYSCALL catches syscall enters.  If we last
+	   single-stepped this thread, then this event can't be a
+	   syscall enter.  If we last single-stepped this thread, this
+	   has to be a syscall exit.
+
+	 The points above mean that the next resume, be it PT_STEP or
+	 PT_CONTINUE, can not trigger a syscall trace event.  */
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LHST: caught syscall event with no syscall catchpoints."
+			    " %d for LWP %ld, ignoring\n",
+			    syscall_number,
+			    GET_LWP (lp->ptid));
+      lp->syscall_state = TARGET_WAITKIND_IGNORE;
+    }
+
+  /* The core isn't interested in this event.  For efficiency, avoid
+     stopping all threads only to have the core resume them all again.
+     Since we're not stopping threads, if we're still syscall tracing
+     and not stepping, we can't use PTRACE_CONT here, as we'd miss any
+     subsequent syscall.  Simply resume using the inf-ptrace layer,
+     which knows when to use PT_SYSCALL or PT_CONTINUE.  */
+
+  /* Note that gdbarch_get_syscall_number may access registers, hence
+     fill a regcache.  */
+  registers_changed ();
+  linux_ops->to_resume (linux_ops, pid_to_ptid (GET_LWP (lp->ptid)),
+			lp->step, TARGET_SIGNAL_0);
+  return 1;
+}
+
 /* Handle a GNU/Linux extended wait response.  If we see a clone
    event, we need to add the new LWP to our list (and not report the
    trap to higher layers).  This function returns non-zero if the
@@ -2018,19 +2149,30 @@ linux_handle_extended_wait (struct lwp_i
 		}
 	    }
 
+	  /* Note the need to use the low target ops to resume, to
+	     handle resuming with PT_SYSCALL if we have syscall
+	     catchpoints.  */
 	  if (!stopping)
 	    {
+	      int signo;
+
 	      new_lp->stopped = 0;
 	      new_lp->resumed = 1;
-	      ptrace (PTRACE_CONT, new_pid, 0,
-		      status ? WSTOPSIG (status) : 0);
+
+	      signo = (status
+		       ? target_signal_from_host (WSTOPSIG (status))
+		       : TARGET_SIGNAL_0);
+
+	      linux_ops->to_resume (linux_ops, pid_to_ptid (new_pid),
+				    0, signo);
 	    }
 
 	  if (debug_linux_nat)
 	    fprintf_unfiltered (gdb_stdlog,
 				"LHEW: Got clone event from LWP %ld, resuming\n",
 				GET_LWP (lp->ptid));
-	  ptrace (PTRACE_CONT, GET_LWP (lp->ptid), 0, 0);
+	  linux_ops->to_resume (linux_ops, pid_to_ptid (GET_LWP (lp->ptid)),
+				0, TARGET_SIGNAL_0);
 
 	  return 1;
 	}
@@ -2070,47 +2212,6 @@ linux_handle_extended_wait (struct lwp_i
       return 0;
     }
 
-  /* Used for 'catch syscall' feature.  */
-  if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
-    {
-      if (catch_syscall_enabled () == 0)
-	  ourstatus->kind = TARGET_WAITKIND_IGNORE;
-      else
-	{
-	  struct regcache *regcache = get_thread_regcache (lp->ptid);
-	  struct gdbarch *gdbarch = get_regcache_arch (regcache);
-
-	  ourstatus->value.syscall_number =
-	    (int) gdbarch_get_syscall_number (gdbarch, lp->ptid);
-
-	  /* If we are catching this specific syscall number, then we
-	     should update the target_status to reflect which event
-	     has occurred.  But if this syscall is not to be caught,
-	     then we can safely mark the event as a SYSCALL_RETURN.
-
-	     This is particularly needed if:
-
-	       - We are catching any syscalls, or
-	       - We are catching the syscall "exit"
-
-	     In this case, as the syscall "exit" *doesn't* return,
-	     then GDB would be confused because it would mark the last
-	     syscall event as a SYSCALL_ENTRY.  After that, if we re-ran the
-	     inferior GDB will think that the first syscall event is
-	     the opposite of a SYSCALL_ENTRY, which is the SYSCALL_RETURN.
-	     Therefore, GDB would report inverted syscall events.  */
-	  if (catching_syscall_number (ourstatus->value.syscall_number))
-	    ourstatus->kind = 
-	      (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY) ?
-	      TARGET_WAITKIND_SYSCALL_RETURN : TARGET_WAITKIND_SYSCALL_ENTRY;
-	  else
-	    ourstatus->kind = TARGET_WAITKIND_SYSCALL_RETURN;
-
-	  lp->syscall_state = ourstatus->kind;
-	}
-      return 0;
-    }
-
   internal_error (__FILE__, __LINE__,
 		  _("unknown ptrace event %d"), event);
 }
@@ -2176,6 +2277,18 @@ wait_lwp (struct lwp_info *lp)
 
   gdb_assert (WIFSTOPPED (status));
 
+  /* Handle GNU/Linux's syscall SIGTRAPs.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SYSCALL_SIGTRAP)
+    {
+      /* No longer need the sysgood bit.  The ptrace event ends up
+	 recorded in lp->waitstatus if we care for it.  We can carry
+	 on handling the event like a regular SIGTRAP from here
+	 on.  */
+      status = W_STOPCODE (SIGTRAP);
+      if (linux_handle_syscall_trap (lp, 1))
+	return wait_lwp (lp);
+    }
+
   /* Handle GNU/Linux's extended waitstatus for trace events.  */
   if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)
     {
@@ -2442,12 +2555,23 @@ status_callback (struct lwp_info *lp, vo
 {
   /* Only report a pending wait status if we pretend that this has
      indeed been resumed.  */
-  /* We check for lp->waitstatus in addition to lp->status, because we
-     can have pending process exits recorded in lp->waitstatus, and
-     W_EXITCODE(0,0) == 0.  */
-  return ((lp->status != 0
-	   || lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
-	  && lp->resumed);
+  if (!lp->resumed)
+    return 0;
+
+  if (lp->waitstatus.kind != TARGET_WAITKIND_IGNORE)
+    {
+      /* A ptrace event, like PTRACE_FORK|VFORK|EXEC, syscall event,
+	 or a a pending process exit.  Note that `W_EXITCODE(0,0) ==
+	 0', so a clean process exit can not be stored pending in
+	 lp->status, it is indistinguishable from
+	 no-pending-status.  */
+      return 1;
+    }
+
+  if (lp->status != 0)
+    return 1;
+
+  return 0;
 }
 
 /* Return non-zero if LP isn't stopped.  */
@@ -2557,7 +2681,8 @@ cancel_breakpoints_callback (struct lwp_
      delete or disable the breakpoint, but the LWP will have already
      tripped on it.  */
 
-  if (lp->status != 0
+  if (lp->waitstatus.kind == TARGET_WAITKIND_IGNORE
+      && lp->status != 0
       && WIFSTOPPED (lp->status) && WSTOPSIG (lp->status) == SIGTRAP
       && cancel_breakpoint (lp))
     /* Throw away the SIGTRAP.  */
@@ -2708,17 +2833,20 @@ linux_nat_filter_event (int lwpid, int s
       add_thread (lp->ptid);
     }
 
-  /* Save the trap's siginfo in case we need it later.  */
-  if (WIFSTOPPED (status)
-      && (WSTOPSIG (status) == SIGTRAP || WSTOPSIG (status) == TRAP_IS_SYSCALL))
-    save_siginfo (lp);
+  /* Handle GNU/Linux's syscall SIGTRAPs.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SYSCALL_SIGTRAP)
+    {
+      /* No longer need the sysgood bit.  The ptrace event ends up
+	 recorded in lp->waitstatus if we care for it.  We can carry
+	 on handling the event like a regular SIGTRAP from here
+	 on.  */
+      status = W_STOPCODE (SIGTRAP);
+      if (linux_handle_syscall_trap (lp, 0))
+	return NULL;
+    }
 
-  /* Handle GNU/Linux's extended waitstatus for trace events.
-     It is necessary to check if WSTOPSIG is signaling that
-     the inferior is entering/exiting a system call.  */
-  if (WIFSTOPPED (status)
-      && ((WSTOPSIG (status) == TRAP_IS_SYSCALL)
-          || (WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)))
+  /* Handle GNU/Linux's extended waitstatus for trace events.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP && status >> 16 != 0)
     {
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
@@ -2728,6 +2856,10 @@ linux_nat_filter_event (int lwpid, int s
 	return NULL;
     }
 
+  /* Save the trap's siginfo in case we need it later.  */
+  if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP)
+    save_siginfo (lp);
+
   /* Check if the thread has exited.  */
   if ((WIFEXITED (status) || WIFSIGNALED (status))
       && num_lwps (GET_PID (lp->ptid)) > 1)
@@ -2849,6 +2981,7 @@ linux_nat_filter_event (int lwpid, int s
 
   /* An interesting event.  */
   gdb_assert (lp);
+  lp->status = status;
   return lp;
 }
 
@@ -2908,13 +3041,10 @@ retry:
       lp = iterate_over_lwps (ptid, status_callback, NULL);
       if (lp)
 	{
-	  status = lp->status;
-	  lp->status = 0;
-
-	  if (debug_linux_nat && status)
+	  if (debug_linux_nat && lp->status)
 	    fprintf_unfiltered (gdb_stdlog,
 				"LLW: Using pending wait status %s for %s.\n",
-				status_to_str (status),
+				status_to_str (lp->status),
 				target_pid_to_str (lp->ptid));
 	}
 
@@ -2933,13 +3063,11 @@ retry:
       /* We have a specific LWP to check.  */
       lp = find_lwp_pid (ptid);
       gdb_assert (lp);
-      status = lp->status;
-      lp->status = 0;
 
-      if (debug_linux_nat && status)
+      if (debug_linux_nat && lp->status)
 	fprintf_unfiltered (gdb_stdlog,
 			    "LLW: Using pending wait status %s for %s.\n",
-			    status_to_str (status),
+			    status_to_str (lp->status),
 			    target_pid_to_str (lp->ptid));
 
       /* If we have to wait, take into account whether PID is a cloned
@@ -2952,7 +3080,7 @@ retry:
 	 because we can have pending process exits recorded in
 	 lp->status and W_EXITCODE(0,0) == 0.  We should probably have
 	 an additional lp->status_p flag.  */
-      if (status == 0 && lp->waitstatus.kind == TARGET_WAITKIND_IGNORE)
+      if (lp->status == 0 && lp->waitstatus.kind == TARGET_WAITKIND_IGNORE)
 	lp = NULL;
     }
 
@@ -2980,8 +3108,26 @@ retry:
       lp->stopped = 0;
       gdb_assert (lp->resumed);
 
-      /* This should catch the pending SIGSTOP.  */
+      /* Catch the pending SIGSTOP.  */
+      status = lp->status;
+      lp->status = 0;
+
       stop_wait_callback (lp, NULL);
+
+      /* If the lp->status field isn't empty, we caught another signal
+	 while flushing the SIGSTOP.  Return it back to the event
+	 queue of the LWP, as we already have an event to handle.  */
+      if (lp->status)
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LLW: kill %s, %s\n",
+				target_pid_to_str (lp->ptid),
+				status_to_str (lp->status));
+	  kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (lp->status));
+	}
+
+      lp->status = status;
     }
 
   if (!target_can_async_p ())
@@ -3013,12 +3159,6 @@ retry:
 
 	  lp = linux_nat_filter_event (lwpid, status, options);
 
-	  /* If this was a syscall trap, we no longer need or want
-	     the 0x80 flag, remove it.  */
-	  if (WIFSTOPPED (status)
-	      && WSTOPSIG (status) == TRAP_IS_SYSCALL)
-	    status = TRAP_REMOVE_SYSCALL_FLAG (status);
-
 	  if (lp
 	      && ptid_is_pid (ptid)
 	      && ptid_get_pid (lp->ptid) != ptid_get_pid (ptid))
@@ -3027,12 +3167,10 @@ retry:
 		fprintf (stderr, "LWP %ld got an event %06x, leaving pending.\n",
 			 ptid_get_lwp (lp->ptid), status);
 
-	      if (WIFSTOPPED (status))
+	      if (WIFSTOPPED (lp->status))
 		{
-		  if (WSTOPSIG (status) != SIGSTOP)
+		  if (WSTOPSIG (lp->status) != SIGSTOP)
 		    {
-		      lp->status = status;
-
 		      stop_callback (lp, NULL);
 
 		      /* Resume in order to collect the sigstop.  */
@@ -3059,7 +3197,6 @@ retry:
 		     about the exit code/signal, leave the status
 		     pending for the next time we're able to report
 		     it.  */
-		  lp->status = status;
 
 		  /* Prevent trying to stop this thread again.  We'll
 		     never try to resume it because it has a pending
@@ -3072,7 +3209,7 @@ retry:
 
 		  /* Store the pending event in the waitstatus as
 		     well, because W_EXITCODE(0,0) == 0.  */
-		  store_waitstatus (&lp->waitstatus, status);
+		  store_waitstatus (&lp->waitstatus, lp->status);
 		}
 
 	      /* Keep looking.  */
@@ -3128,6 +3265,9 @@ retry:
 
   gdb_assert (lp);
 
+  status = lp->status;
+  lp->status = 0;
+
   /* Don't report signals that GDB isn't interested in, such as
      signals that are neither printed nor stopped upon.  Stopping all
      threads can be a bit time-consuming so if we want decent
Index: src/gdb/infrun.c
===================================================================
--- src.orig/gdb/infrun.c	2009-10-02 17:40:37.000000000 +0100
+++ src/gdb/infrun.c	2009-10-02 17:44:29.000000000 +0100
@@ -2386,13 +2386,22 @@ stepped_in_from (struct frame_info *fram
    It returns 1 if the inferior should keep going (and GDB
    should ignore the event), or 0 if the event deserves to be
    processed.  */
+
 static int
-deal_with_syscall_event (struct execution_control_state *ecs)
+handle_syscall_event (struct execution_control_state *ecs)
 {
-  struct regcache *regcache = get_thread_regcache (ecs->ptid);
-  struct gdbarch *gdbarch = get_regcache_arch (regcache);
-  int syscall_number = gdbarch_get_syscall_number (gdbarch,
-                                                   ecs->ptid);
+  struct regcache *regcache;
+  struct gdbarch *gdbarch;
+  int syscall_number;
+
+  if (!ptid_equal (ecs->ptid, inferior_ptid))
+    context_switch (ecs->ptid);
+
+  regcache = get_thread_regcache (ecs->ptid);
+  gdbarch = get_regcache_arch (regcache);
+  syscall_number = gdbarch_get_syscall_number (gdbarch, ecs->ptid);
+  stop_pc = regcache_read_pc (regcache);
+
   target_last_waitstatus.value.syscall_number = syscall_number;
 
   if (catch_syscall_enabled () > 0
@@ -2401,35 +2410,22 @@ deal_with_syscall_event (struct executio
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: syscall number = '%d'\n",
                             syscall_number);
-      ecs->event_thread->stop_signal = TARGET_SIGNAL_TRAP;
-
-      if (!ptid_equal (ecs->ptid, inferior_ptid))
-        {
-          context_switch (ecs->ptid);
-          reinit_frame_cache ();
-        }
-
-      stop_pc = regcache_read_pc (get_thread_regcache (ecs->ptid));
 
       ecs->event_thread->stop_bpstat = bpstat_stop_status (stop_pc, ecs->ptid);
-
       ecs->random_signal = !bpstat_explains_signal (ecs->event_thread->stop_bpstat);
 
-      /* If no catchpoint triggered for this, then keep going.  */
-      if (ecs->random_signal)
-        {
-          ecs->event_thread->stop_signal = TARGET_SIGNAL_0;
-          keep_going (ecs);
-          return 1;
-        }
-      return 0;
-    }
-  else
-    {
-      resume (0, TARGET_SIGNAL_0);
-      prepare_to_wait (ecs);
-      return 1;
+      if (!ecs->random_signal)
+	{
+	  /* Catchpoint hit.  */
+	  ecs->event_thread->stop_signal = TARGET_SIGNAL_TRAP;
+	  return 0;
+	}
     }
+
+  /* If no catchpoint triggered for this, then keep going.  */
+  ecs->event_thread->stop_signal = TARGET_SIGNAL_0;
+  keep_going (ecs);
+  return 1;
 }
 
 /* Given an execution control state that has been freshly filled in
@@ -2753,10 +2749,9 @@ handle_inferior_event (struct execution_
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_ENTRY\n");
       /* Getting the current syscall number */
-      if (deal_with_syscall_event (ecs) != 0)
+      if (handle_syscall_event (ecs) != 0)
         return;
       goto process_event_stop_test;
-      break;
 
       /* Before examining the threads further, step this thread to
          get it entirely out of the syscall.  (We get notice of the
@@ -2766,10 +2761,9 @@ handle_inferior_event (struct execution_
     case TARGET_WAITKIND_SYSCALL_RETURN:
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_RETURN\n");
-      if (deal_with_syscall_event (ecs) != 0)
+      if (handle_syscall_event (ecs) != 0)
         return;
       goto process_event_stop_test;
-      break;
 
     case TARGET_WAITKIND_STOPPED:
       if (debug_infrun)


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Unbreak 'catch syscall' + multi-threading
  2009-10-02 16:55   ` Pedro Alves
@ 2009-10-03  3:17     ` Sérgio Durigan Júnior
  0 siblings, 0 replies; 4+ messages in thread
From: Sérgio Durigan Júnior @ 2009-10-03  3:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Pedro Alves, Doug Evans

Hi Pedro,
On Friday 02 October 2009, Pedro Alves wrote:
> Revised per comments, a few comments cleaned up, retested and applied
> to mainline as below.

Thank you very much for this.  It was really great to study your patch and 
understand my errors.

Regards,

-- 
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-10-03  3:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-01  0:48 Unbreak 'catch syscall' + multi-threading Pedro Alves
2009-10-01  9:51 ` Doug Evans
2009-10-02 16:55   ` Pedro Alves
2009-10-03  3:17     ` Sérgio Durigan Júnior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox