Re: [patch] Fix for PR gdb/10757

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Paul Pluzhnikov <ppluzhnikov@google.com>
To: Pedro Alves <pedro@codesourcery.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [patch] Fix for PR gdb/10757
Date: Mon, 26 Oct 2009 21:25:00 -0000	[thread overview]
Message-ID: <8ac60eac0910261425x363afd0bp43299ab0b18e576a@mail.gmail.com> (raw)
In-Reply-To: <200910252354.55084.pedro@codesourcery.com>

[-- Attachment #1: Type: text/plain, Size: 3034 bytes --]

On Sun, Oct 25, 2009 at 4:54 PM, Pedro Alves <pedro@codesourcery.com> wrote:

> FTR, I've took a peek at nptl/nptl_db, to try to understand
> a bit better why this happens (please do correct me if I'm
> wrong):
>
>  - First, it is worth remind us while nptl takes locks and barriers
>   to protect its internal structures, thread_db/ntpl_db doesn't,
>   since it runs in gdb's context.

I wonder if it *could* take some lock that would make all other threads
wait in pthread_create -- after all, thread_db does have access to inferior
via ps_pdwrite. That would certainly make the interface cleaner, at a cost
of an extra global lock on pthread_create path.

Regardless of whether that's a good idea or not, I am stuck on glibc-2.3.6
for the time being, and even if this was to go into glibc-2.11, the "broken"
glibc versions will be with us for a long time, and GDB should be able
to deal with them.

>  - nptl adds new threads to one of the __stack_user or stack_used
>   lists.  New nodes are added to the _head_ of the lists.
>
>  - nptl_db/td_ta_thr_iter.c:td_ta_thr_iter iterates over
>   these thread lists from head to tail.
>
> From these last two points alone, one can infer that an iteration
> over the thread list can miss new threads if they're added just while
> the list is being iterated.

Right.

[While nptl_db is fresh in your mind, would you mind looking at gdb/10838
to double-check my diagnosis there?
http://sourceware.org/bugzilla/show_bug.cgi?id=10838

This is related to current patch only in that linux/libthread_db is
also involved.]

> Another way to tackle this, would be iterate over
> /proc/pid/task/ and attach to all lwps that way, instead of
> relying on thread_db to do the initial iteration over all
> threads.  We'd still iterate using the thread_db interface
> once afterwards, so to learn the matching pthread
> ids of the lwps.

I should note that /proc may not be mounted. I am not sure whether GDB
currently requires /proc in other places.

> With /proc/pid/task, you'd still need to keep iterating until
> no new thread comes out, but, I think we could make the
> number of loops unbounded much safely, since we wouldn't be
> at risk of reading stale inferior data.  Assuming no kernel
> bugs, that is.  Note that this is something that would be
> a useful addition to gdb anyway, so that we could be able to
> attach to multi-threaded apps that use raw clone instead of
> pthreads.

I'll put that on my TODO list :-)

>> +      /* Require NUM_LOOP successive iterations which do not find any new threads.  */
>
> This line too long.

Fixed.

>> +      for (i = 0, loop = 0; loop < 4; ++i, ++loop)
>
> Err, s/4/num_loop/ ?  A comment explain _why_ we need this is missing
> as well.  The patch looks OK otherwise.

Fixed.

> Actually, I'd rename the `num_loop' parameter into
> say, `until_no_new', and do this instead:

Good idea, done.

I'll commit attached patch tomorrow if there are no further comments.
A similar patch is required for gdbserver as well.


Thanks,
-- 
Paul Pluzhnikov

[-- Attachment #2: gdb-pr10757-20091026.txt --]
[-- Type: text/plain, Size: 6938 bytes --]

Index: linux-thread-db.c
===================================================================
RCS file: /cvs/src/src/gdb/linux-thread-db.c,v
retrieving revision 1.65
diff -u -p -u -r1.65 linux-thread-db.c
--- linux-thread-db.c	19 Oct 2009 09:51:41 -0000	1.65
+++ linux-thread-db.c	26 Oct 2009 21:20:31 -0000
@@ -160,6 +160,7 @@ struct thread_db_info
 struct thread_db_info *thread_db_list;
 
 static void thread_db_find_new_threads_1 (ptid_t ptid);
+static void thread_db_find_new_threads_2 (ptid_t ptid, int until_no_new);
 
 /* Add the current inferior to the list of processes using libpthread.
    Return a pointer to the newly allocated object that was added to
@@ -229,8 +230,8 @@ delete_thread_db_info (int pid)
 }
 
 /* Prototypes for local functions.  */
-static void attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
-			   const td_thrinfo_t *ti_p);
+static int attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
+			  const td_thrinfo_t *ti_p);
 static void detach_thread (ptid_t ptid);
 \f
 
@@ -597,7 +598,7 @@ thread_db_find_new_threads_silently (pti
 
   TRY_CATCH (except, RETURN_MASK_ERROR)
     {
-      thread_db_find_new_threads_1 (ptid);
+      thread_db_find_new_threads_2 (ptid, 1);
     }
 
   if (except.reason < 0 && info_verbose)
@@ -977,9 +978,9 @@ thread_db_new_objfile (struct objfile *o
 
 /* Attach to a new thread.  This function is called when we receive a
    TD_CREATE event or when we iterate over all threads and find one
-   that wasn't already in our list.  */
+   that wasn't already in our list.  Returns true on success.  */
 
-static void
+static int
 attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
 	       const td_thrinfo_t *ti_p)
 {
@@ -1013,7 +1014,7 @@ attach_thread (ptid_t ptid, const td_thr
       if (tp->private != NULL)
 	{
 	  if (!tp->private->dying)
-	    return;
+	    return 0;
 
 	  delete_thread (ptid);
 	  tp = NULL;
@@ -1023,12 +1024,12 @@ attach_thread (ptid_t ptid, const td_thr
   check_thread_signals ();
 
   if (ti_p->ti_state == TD_THR_UNKNOWN || ti_p->ti_state == TD_THR_ZOMBIE)
-    return;			/* A zombie thread -- do not attach.  */
+    return 0;			/* A zombie thread -- do not attach.  */
 
   /* Under GNU/Linux, we have to attach to each and every thread.  */
   if (tp == NULL
       && lin_lwp_attach_lwp (BUILD_LWP (ti_p->ti_lid, GET_PID (ptid))) < 0)
-    return;
+    return 0;
 
   /* Construct the thread's private data.  */
   private = xmalloc (sizeof (struct private_thread_info));
@@ -1056,6 +1057,7 @@ attach_thread (ptid_t ptid, const td_thr
   if (err != TD_OK)
     error (_("Cannot enable thread event reporting for %s: %s"),
 	   target_pid_to_str (ptid), thread_db_err_str (err));
+  return 1;
 }
 
 static void
@@ -1282,6 +1284,12 @@ thread_db_mourn_inferior (struct target_
     unpush_target (ops);
 }
 
+struct callback_data
+{
+  struct thread_db_info *info;
+  int new_threads;
+};
+
 static int
 find_new_threads_callback (const td_thrhandle_t *th_p, void *data)
 {
@@ -1289,7 +1297,8 @@ find_new_threads_callback (const td_thrh
   td_err_e err;
   ptid_t ptid;
   struct thread_info *tp;
-  struct thread_db_info *info = data;
+  struct callback_data *cb_data = data;
+  struct thread_db_info *info = cb_data->info;
 
   err = info->td_thr_get_info_p (th_p, &ti);
   if (err != TD_OK)
@@ -1332,21 +1341,76 @@ find_new_threads_callback (const td_thrh
   ptid = ptid_build (info->pid, ti.ti_lid, 0);
   tp = find_thread_ptid (ptid);
   if (tp == NULL || tp->private == NULL)
-    attach_thread (ptid, th_p, &ti);
+    {
+      if (attach_thread (ptid, th_p, &ti))
+	cb_data->new_threads += 1;
+      else
+	/* Problem attaching this thread; perhaps it exited before we
+	   could attach it?
+	   This could mean that the thread list inside glibc itself is in
+	   inconsistent state, and libthread_db could go on looping forever
+	   (observed with glibc-2.3.6).  To prevent that, terminate
+	   iteration: thread_db_find_new_threads_2 will retry.  */
+	return 1;
+    }
 
   return 0;
 }
 
+/* Helper for thread_db_find_new_threads_2.
+   Returns number of new threads found.  */
+
+static int
+find_new_threads_once (struct thread_db_info *info, int iteration,
+		       int *errp)
+{
+  volatile struct gdb_exception except;
+  struct callback_data data;
+  int err = TD_ERR;
+
+  data.info = info;
+  data.new_threads = 0;
+
+  TRY_CATCH (except, RETURN_MASK_ERROR)
+    {
+      /* Iterate over all user-space threads to discover new threads.  */
+      err = info->td_ta_thr_iter_p (info->thread_agent,
+				    find_new_threads_callback,
+				    &data,
+				    TD_THR_ANY_STATE,
+				    TD_THR_LOWEST_PRIORITY,
+				    TD_SIGNO_MASK,
+				    TD_THR_ANY_USER_FLAGS);
+    }
+
+  if (info_verbose)
+    {
+      if (except.reason < 0)
+	exception_fprintf (gdb_stderr, except,
+			   "Warning: find_new_threads_once: ");
+
+      printf_filtered (_("Found %d new threads in iteration %d.\n"),
+		       data.new_threads, iteration);
+    }
+
+  if (errp != NULL)
+    *errp = err;
+
+  return data.new_threads;
+}
+
 /* Search for new threads, accessing memory through stopped thread
-   PTID.  */
+   PTID.  If UNTIL_NO_NEW is true, repeat searching until several
+   searches in a row do not discover any new threads.  */
 
 static void
-thread_db_find_new_threads_1 (ptid_t ptid)
+thread_db_find_new_threads_2 (ptid_t ptid, int until_no_new)
 {
   td_err_e err;
   struct lwp_info *lp;
   struct thread_db_info *info;
   int pid = ptid_get_pid (ptid);
+  int i, loop;
 
   /* In linux, we can only read memory through a stopped lwp.  */
   ALL_LWPS (lp, ptid)
@@ -1361,15 +1425,36 @@ thread_db_find_new_threads_1 (ptid_t pti
 
   /* Access an lwp we know is stopped.  */
   info->proc_handle.ptid = ptid;
-  /* Iterate over all user-space threads to discover new threads.  */
-  err = info->td_ta_thr_iter_p (info->thread_agent, find_new_threads_callback,
-				info, TD_THR_ANY_STATE, TD_THR_LOWEST_PRIORITY,
-				TD_SIGNO_MASK, TD_THR_ANY_USER_FLAGS);
-  if (err != TD_OK)
-    error (_("Cannot find new threads: %s"), thread_db_err_str (err));
+
+  if (until_no_new)
+    {
+      /* Require 4 successive iterations which do not find any new threads.
+	 The 4 is a heuristic: there is an inherent race here, and I have
+	 seen that 2 iterations in a row are not always sufficient to
+	 "capture" all threads.  */
+      for (i = 0, loop = 0; loop < 4; ++i, ++loop)
+	if (find_new_threads_once (info, i, NULL) != 0)
+	  /* Found some new threads.  Restart the loop from beginning.	*/
+	  loop = -1;
+    }
+  else
+    {
+      int err;
+
+      find_new_threads_once (info, 0, &err);
+      if (err != TD_OK)
+	error (_("Cannot find new threads: %s"), thread_db_err_str (err));
+    }
 }
 
 static void
+thread_db_find_new_threads_1 (ptid_t ptid)
+{
+  thread_db_find_new_threads_2 (ptid, 0);
+}
+
+
+static void
 thread_db_find_new_threads (struct target_ops *ops)
 {
   struct thread_db_info *info;

next prev parent reply	other threads:[~2009-10-26 21:25 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-15 19:10 Paul Pluzhnikov
2009-10-17 23:26 ` Paul Pluzhnikov
2009-10-25 17:15   ` Paul Pluzhnikov
2009-10-25 23:54   ` Pedro Alves
2009-10-26 21:25     ` Paul Pluzhnikov [this message]
2009-10-26 21:34       ` Pedro Alves
2009-10-27  0:41         ` Paul Pluzhnikov
2009-10-27  9:43           ` Pedro Alves
2009-10-27 16:58             ` Paul Pluzhnikov
2009-10-27 21:38       ` Paul Pluzhnikov
2009-10-27 21:52         ` Pedro Alves
2009-10-28 17:04           ` Paul Pluzhnikov
2009-10-25 23:59 Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8ac60eac0910261425x363afd0bp43299ab0b18e576a@mail.gmail.com \
    --to=ppluzhnikov@google.com \
    --cc=gdb-patches@sourceware.org \
    --cc=pedro@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox