From: Paul Pluzhnikov <ppluzhnikov@google.com>
To: Pedro Alves <pedro@codesourcery.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [patch] Fix for PR gdb/10757
Date: Mon, 26 Oct 2009 21:25:00 -0000 [thread overview]
Message-ID: <8ac60eac0910261425x363afd0bp43299ab0b18e576a@mail.gmail.com> (raw)
In-Reply-To: <200910252354.55084.pedro@codesourcery.com>
[-- Attachment #1: Type: text/plain, Size: 3034 bytes --]
On Sun, Oct 25, 2009 at 4:54 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> FTR, I've took a peek at nptl/nptl_db, to try to understand
> a bit better why this happens (please do correct me if I'm
> wrong):
>
> - First, it is worth remind us while nptl takes locks and barriers
> to protect its internal structures, thread_db/ntpl_db doesn't,
> since it runs in gdb's context.
I wonder if it *could* take some lock that would make all other threads
wait in pthread_create -- after all, thread_db does have access to inferior
via ps_pdwrite. That would certainly make the interface cleaner, at a cost
of an extra global lock on pthread_create path.
Regardless of whether that's a good idea or not, I am stuck on glibc-2.3.6
for the time being, and even if this was to go into glibc-2.11, the "broken"
glibc versions will be with us for a long time, and GDB should be able
to deal with them.
> - nptl adds new threads to one of the __stack_user or stack_used
> lists. New nodes are added to the _head_ of the lists.
>
> - nptl_db/td_ta_thr_iter.c:td_ta_thr_iter iterates over
> these thread lists from head to tail.
>
> From these last two points alone, one can infer that an iteration
> over the thread list can miss new threads if they're added just while
> the list is being iterated.
Right.
[While nptl_db is fresh in your mind, would you mind looking at gdb/10838
to double-check my diagnosis there?
http://sourceware.org/bugzilla/show_bug.cgi?id=10838
This is related to current patch only in that linux/libthread_db is
also involved.]
> Another way to tackle this, would be iterate over
> /proc/pid/task/ and attach to all lwps that way, instead of
> relying on thread_db to do the initial iteration over all
> threads. We'd still iterate using the thread_db interface
> once afterwards, so to learn the matching pthread
> ids of the lwps.
I should note that /proc may not be mounted. I am not sure whether GDB
currently requires /proc in other places.
> With /proc/pid/task, you'd still need to keep iterating until
> no new thread comes out, but, I think we could make the
> number of loops unbounded much safely, since we wouldn't be
> at risk of reading stale inferior data. Assuming no kernel
> bugs, that is. Note that this is something that would be
> a useful addition to gdb anyway, so that we could be able to
> attach to multi-threaded apps that use raw clone instead of
> pthreads.
I'll put that on my TODO list :-)
>> + /* Require NUM_LOOP successive iterations which do not find any new threads. */
>
> This line too long.
Fixed.
>> + for (i = 0, loop = 0; loop < 4; ++i, ++loop)
>
> Err, s/4/num_loop/ ? A comment explain _why_ we need this is missing
> as well. The patch looks OK otherwise.
Fixed.
> Actually, I'd rename the `num_loop' parameter into
> say, `until_no_new', and do this instead:
Good idea, done.
I'll commit attached patch tomorrow if there are no further comments.
A similar patch is required for gdbserver as well.
Thanks,
--
Paul Pluzhnikov
[-- Attachment #2: gdb-pr10757-20091026.txt --]
[-- Type: text/plain, Size: 6938 bytes --]
Index: linux-thread-db.c
===================================================================
RCS file: /cvs/src/src/gdb/linux-thread-db.c,v
retrieving revision 1.65
diff -u -p -u -r1.65 linux-thread-db.c
--- linux-thread-db.c 19 Oct 2009 09:51:41 -0000 1.65
+++ linux-thread-db.c 26 Oct 2009 21:20:31 -0000
@@ -160,6 +160,7 @@ struct thread_db_info
struct thread_db_info *thread_db_list;
static void thread_db_find_new_threads_1 (ptid_t ptid);
+static void thread_db_find_new_threads_2 (ptid_t ptid, int until_no_new);
/* Add the current inferior to the list of processes using libpthread.
Return a pointer to the newly allocated object that was added to
@@ -229,8 +230,8 @@ delete_thread_db_info (int pid)
}
/* Prototypes for local functions. */
-static void attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
- const td_thrinfo_t *ti_p);
+static int attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
+ const td_thrinfo_t *ti_p);
static void detach_thread (ptid_t ptid);
\f
@@ -597,7 +598,7 @@ thread_db_find_new_threads_silently (pti
TRY_CATCH (except, RETURN_MASK_ERROR)
{
- thread_db_find_new_threads_1 (ptid);
+ thread_db_find_new_threads_2 (ptid, 1);
}
if (except.reason < 0 && info_verbose)
@@ -977,9 +978,9 @@ thread_db_new_objfile (struct objfile *o
/* Attach to a new thread. This function is called when we receive a
TD_CREATE event or when we iterate over all threads and find one
- that wasn't already in our list. */
+ that wasn't already in our list. Returns true on success. */
-static void
+static int
attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
const td_thrinfo_t *ti_p)
{
@@ -1013,7 +1014,7 @@ attach_thread (ptid_t ptid, const td_thr
if (tp->private != NULL)
{
if (!tp->private->dying)
- return;
+ return 0;
delete_thread (ptid);
tp = NULL;
@@ -1023,12 +1024,12 @@ attach_thread (ptid_t ptid, const td_thr
check_thread_signals ();
if (ti_p->ti_state == TD_THR_UNKNOWN || ti_p->ti_state == TD_THR_ZOMBIE)
- return; /* A zombie thread -- do not attach. */
+ return 0; /* A zombie thread -- do not attach. */
/* Under GNU/Linux, we have to attach to each and every thread. */
if (tp == NULL
&& lin_lwp_attach_lwp (BUILD_LWP (ti_p->ti_lid, GET_PID (ptid))) < 0)
- return;
+ return 0;
/* Construct the thread's private data. */
private = xmalloc (sizeof (struct private_thread_info));
@@ -1056,6 +1057,7 @@ attach_thread (ptid_t ptid, const td_thr
if (err != TD_OK)
error (_("Cannot enable thread event reporting for %s: %s"),
target_pid_to_str (ptid), thread_db_err_str (err));
+ return 1;
}
static void
@@ -1282,6 +1284,12 @@ thread_db_mourn_inferior (struct target_
unpush_target (ops);
}
+struct callback_data
+{
+ struct thread_db_info *info;
+ int new_threads;
+};
+
static int
find_new_threads_callback (const td_thrhandle_t *th_p, void *data)
{
@@ -1289,7 +1297,8 @@ find_new_threads_callback (const td_thrh
td_err_e err;
ptid_t ptid;
struct thread_info *tp;
- struct thread_db_info *info = data;
+ struct callback_data *cb_data = data;
+ struct thread_db_info *info = cb_data->info;
err = info->td_thr_get_info_p (th_p, &ti);
if (err != TD_OK)
@@ -1332,21 +1341,76 @@ find_new_threads_callback (const td_thrh
ptid = ptid_build (info->pid, ti.ti_lid, 0);
tp = find_thread_ptid (ptid);
if (tp == NULL || tp->private == NULL)
- attach_thread (ptid, th_p, &ti);
+ {
+ if (attach_thread (ptid, th_p, &ti))
+ cb_data->new_threads += 1;
+ else
+ /* Problem attaching this thread; perhaps it exited before we
+ could attach it?
+ This could mean that the thread list inside glibc itself is in
+ inconsistent state, and libthread_db could go on looping forever
+ (observed with glibc-2.3.6). To prevent that, terminate
+ iteration: thread_db_find_new_threads_2 will retry. */
+ return 1;
+ }
return 0;
}
+/* Helper for thread_db_find_new_threads_2.
+ Returns number of new threads found. */
+
+static int
+find_new_threads_once (struct thread_db_info *info, int iteration,
+ int *errp)
+{
+ volatile struct gdb_exception except;
+ struct callback_data data;
+ int err = TD_ERR;
+
+ data.info = info;
+ data.new_threads = 0;
+
+ TRY_CATCH (except, RETURN_MASK_ERROR)
+ {
+ /* Iterate over all user-space threads to discover new threads. */
+ err = info->td_ta_thr_iter_p (info->thread_agent,
+ find_new_threads_callback,
+ &data,
+ TD_THR_ANY_STATE,
+ TD_THR_LOWEST_PRIORITY,
+ TD_SIGNO_MASK,
+ TD_THR_ANY_USER_FLAGS);
+ }
+
+ if (info_verbose)
+ {
+ if (except.reason < 0)
+ exception_fprintf (gdb_stderr, except,
+ "Warning: find_new_threads_once: ");
+
+ printf_filtered (_("Found %d new threads in iteration %d.\n"),
+ data.new_threads, iteration);
+ }
+
+ if (errp != NULL)
+ *errp = err;
+
+ return data.new_threads;
+}
+
/* Search for new threads, accessing memory through stopped thread
- PTID. */
+ PTID. If UNTIL_NO_NEW is true, repeat searching until several
+ searches in a row do not discover any new threads. */
static void
-thread_db_find_new_threads_1 (ptid_t ptid)
+thread_db_find_new_threads_2 (ptid_t ptid, int until_no_new)
{
td_err_e err;
struct lwp_info *lp;
struct thread_db_info *info;
int pid = ptid_get_pid (ptid);
+ int i, loop;
/* In linux, we can only read memory through a stopped lwp. */
ALL_LWPS (lp, ptid)
@@ -1361,15 +1425,36 @@ thread_db_find_new_threads_1 (ptid_t pti
/* Access an lwp we know is stopped. */
info->proc_handle.ptid = ptid;
- /* Iterate over all user-space threads to discover new threads. */
- err = info->td_ta_thr_iter_p (info->thread_agent, find_new_threads_callback,
- info, TD_THR_ANY_STATE, TD_THR_LOWEST_PRIORITY,
- TD_SIGNO_MASK, TD_THR_ANY_USER_FLAGS);
- if (err != TD_OK)
- error (_("Cannot find new threads: %s"), thread_db_err_str (err));
+
+ if (until_no_new)
+ {
+ /* Require 4 successive iterations which do not find any new threads.
+ The 4 is a heuristic: there is an inherent race here, and I have
+ seen that 2 iterations in a row are not always sufficient to
+ "capture" all threads. */
+ for (i = 0, loop = 0; loop < 4; ++i, ++loop)
+ if (find_new_threads_once (info, i, NULL) != 0)
+ /* Found some new threads. Restart the loop from beginning. */
+ loop = -1;
+ }
+ else
+ {
+ int err;
+
+ find_new_threads_once (info, 0, &err);
+ if (err != TD_OK)
+ error (_("Cannot find new threads: %s"), thread_db_err_str (err));
+ }
}
static void
+thread_db_find_new_threads_1 (ptid_t ptid)
+{
+ thread_db_find_new_threads_2 (ptid, 0);
+}
+
+
+static void
thread_db_find_new_threads (struct target_ops *ops)
{
struct thread_db_info *info;
next prev parent reply other threads:[~2009-10-26 21:25 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-15 19:10 Paul Pluzhnikov
2009-10-17 23:26 ` Paul Pluzhnikov
2009-10-25 17:15 ` Paul Pluzhnikov
2009-10-25 23:54 ` Pedro Alves
2009-10-26 21:25 ` Paul Pluzhnikov [this message]
2009-10-26 21:34 ` Pedro Alves
2009-10-27 0:41 ` Paul Pluzhnikov
2009-10-27 9:43 ` Pedro Alves
2009-10-27 16:58 ` Paul Pluzhnikov
2009-10-27 21:38 ` Paul Pluzhnikov
2009-10-27 21:52 ` Pedro Alves
2009-10-28 17:04 ` Paul Pluzhnikov
2009-10-25 23:59 Pedro Alves
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8ac60eac0910261425x363afd0bp43299ab0b18e576a@mail.gmail.com \
--to=ppluzhnikov@google.com \
--cc=gdb-patches@sourceware.org \
--cc=pedro@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox