* [patch] Fix internal-error on dead LWPs with no associated thread
@ 2009-06-29 10:09 Jan Kratochvil
2009-06-29 10:35 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 10:09 UTC (permalink / raw)
To: gdb-patches
Hi,
original bugreport:
https://bugzilla.redhat.com/show_bug.cgi?id=471819
linux-nat.c:1662: internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
A reproducing testcase included.
No regressions on {x86_64,i686}-fedora-linux-gnu.
Thanks,
Jan
gdb/
2009-06-27 Jan Kratochvil <jan.kratochvil@redhat.com>
Fix internal error on dead LWPs with no associated thread.
* linux-nat.c (num_lwps): Do not count DEAD LWPs.
(delete_lwp): Only set LP as DEAD if it is INFERIOR_PTID.
(iterate_over_lwps): Skip DEAD LPs. Discard DEAD LPs not in
INFERIOR_PTID.
(linux_nat_resume): Do not resume DEAD LPs. Extend the debug message.
* linux-nat.h (struct lwp_info): New field `dead'.
gdb/testsuite/
2009-06-27 Jan Kratochvil <jan.kratochvil@redhat.com>
* gdb.threads/current-lwp-dead.exp, gdb.threads/current-lwp-dead.c: New.
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -941,7 +941,7 @@ num_lwps (int pid)
struct lwp_info *lp;
for (lp = lwp_list; lp; lp = lp->next)
- if (ptid_get_pid (lp->ptid) == pid)
+ if (ptid_get_pid (lp->ptid) == pid && !lp->dead)
count++;
return count;
@@ -991,6 +991,13 @@ delete_lwp (ptid_t ptid)
if (!lp)
return;
+ if (ptid_equal (ptid, inferior_ptid))
+ {
+ /* Delay the deletion the same way as in delete_thread_1. */
+ lp->dead = 1;
+ return;
+ }
+
if (lpprev)
lpprev->next = lp->next;
else
@@ -1063,6 +1070,15 @@ iterate_over_lwps (ptid_t filter,
{
lpnext = lp->next;
+ /* Ignored already dead LWPs. Moreover delete them if we no longer have
+ to keep them around. */
+ if (lp->dead)
+ {
+ if (!ptid_equal (lp->ptid, inferior_ptid))
+ delete_lwp (lp->ptid);
+ continue;
+ }
+
if (ptid_match (lp->ptid, filter))
{
if ((*callback) (lp, data))
@@ -1732,15 +1748,19 @@ linux_nat_resume (struct target_ops *ops,
/* Convert to something the lower layer understands. */
ptid = pid_to_ptid (GET_LWP (lp->ptid));
- linux_ops->to_resume (linux_ops, ptid, step, signo);
- memset (&lp->siginfo, 0, sizeof (lp->siginfo));
+ if (!lp->dead)
+ {
+ linux_ops->to_resume (linux_ops, ptid, step, signo);
+ memset (&lp->siginfo, 0, sizeof (lp->siginfo));
+ }
if (debug_linux_nat)
fprintf_unfiltered (gdb_stdlog,
- "LLR: %s %s, %s (resume event thread)\n",
+ "LLR: %s %s, %s, %s (resume event thread)\n",
step ? "PTRACE_SINGLESTEP" : "PTRACE_CONT",
target_pid_to_str (ptid),
- signo ? strsignal (signo) : "0");
+ signo ? strsignal (signo) : "0",
+ lp->dead ? "dead" : "alive");
restore_child_signals_mask (&prev_mask);
if (target_can_async_p ())
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -32,6 +32,12 @@ struct lwp_info
and overall process id. */
ptid_t ptid;
+ /* If this flag is set, the lwp is known to be dead already (exit
+ event already received in a my_waitpid()). Such LWP should not longer be
+ iterated and it is being kept only temporarily if it is still referenced
+ by INFERIOR_PTID. */
+ int dead;
+
/* Non-zero if this LWP is cloned. In this context "cloned" means
that the LWP is reporting to its parent using a signal other than
SIGCHLD. */
new file mode 100644
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/current-lwp-dead.c
@@ -0,0 +1,75 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+ Copyright 2009 Free Software Foundation, Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+ Do not use threads as we need to exploit a bug in LWP code masked by the
+ threads code otherwise.
+
+ INFERIOR_PTID must point to exited LWP. Here we use the initial LWP as it
+ is automatically INFERIOR_PTID for GDB.
+
+ Finally we need to call target_resume (RESUME_ALL, ...) which we invoke by
+ NEW_THREAD_EVENT (called from the new LWP as initial LWP is exited now). */
+
+#define _GNU_SOURCE
+#include <sched.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#define STACK_SIZE 0x1000
+
+static int
+fn_abort (void *unused)
+{
+ return 0; /* at-fn_abort */
+}
+
+static int
+fn (void *unused)
+{
+ int i;
+ unsigned char *stack;
+ int new_pid;
+
+ i = sleep (1);
+ assert (i == 0);
+
+ stack = malloc (STACK_SIZE);
+ assert (stack != NULL);
+
+ new_pid = clone (fn_abort, stack + STACK_SIZE, CLONE_FILES | CLONE_VM, NULL,
+ NULL, NULL, NULL);
+ assert (new_pid > 0);
+
+ return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+ unsigned char *stack;
+ int new_pid;
+
+ stack = malloc (STACK_SIZE);
+ assert (stack != NULL);
+
+ new_pid = clone (fn, stack + STACK_SIZE, CLONE_FILES | CLONE_VM, NULL, NULL,
+ NULL, NULL);
+ assert (new_pid > 0);
+
+ return 0;
+}
new file mode 100644
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/current-lwp-dead.exp
@@ -0,0 +1,31 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2009 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Please email any bugs, comments, and/or additions to this file to:
+# bug-gdb@gnu.org
+
+if { [prepare_for_testing current-lwp-dead.exp current-lwp-dead] } {
+ return -1
+}
+
+if {[runto_main] <= 0} {
+ untested current-lwp-dead.exp
+ return -1
+}
+
+gdb_breakpoint "fn_abort"
+gdb_continue_to_breakpoint "fn_abort" ".*at-fn_abort.*"
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 10:09 [patch] Fix internal-error on dead LWPs with no associated thread Jan Kratochvil
@ 2009-06-29 10:35 ` Pedro Alves
2009-06-29 17:51 ` Jan Kratochvil
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2009-06-29 10:35 UTC (permalink / raw)
To: gdb-patches; +Cc: Jan Kratochvil
On Monday 29 June 2009 11:09:22, Jan Kratochvil wrote:
> original bugreport:
> https://bugzilla.redhat.com/show_bug.cgi?id=471819
>
> linux-nat.c:1662: internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
>
> A reproducing testcase included.
> No regressions on {x86_64,i686}-fedora-linux-gnu.
Please try the below patchlet instead. Do you have any other case where this
would be needed?
Does anyone know why does the new_thread_event bit need to resume the target
at all? Removing that resume should fix the issue too.
---
gdb/infrun.c | 2 ++
1 file changed, 2 insertions(+)
Index: src/gdb/infrun.c
===================================================================
--- src.orig/gdb/infrun.c 2009-06-29 11:29:58.000000000 +0100
+++ src/gdb/infrun.c 2009-06-29 11:30:36.000000000 +0100
@@ -2746,6 +2746,8 @@ targets should add new threads to the th
in either the OS or the native code). Therefore we need to
continue all threads in order to make progress. */
+ if (!ptid_equal (ecs->ptid, inferior_ptid))
+ context_switch (ecs->ptid);
target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
prepare_to_wait (ecs);
return;
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 10:35 ` Pedro Alves
@ 2009-06-29 17:51 ` Jan Kratochvil
2009-06-29 18:30 ` Pedro Alves
2009-06-29 18:31 ` Jan Kratochvil
0 siblings, 2 replies; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 17:51 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb-patches
On Mon, 29 Jun 2009 12:36:28 +0200, Pedro Alves wrote:
> On Monday 29 June 2009 11:09:22, Jan Kratochvil wrote:
>
> > original bugreport:
> > https://bugzilla.redhat.com/show_bug.cgi?id=471819
> >
> > linux-nat.c:1662: internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
[...]
> Please try the below patchlet instead. Do you have any other case where this
> would be needed?
[...]
> + if (!ptid_equal (ecs->ptid, inferior_ptid))
> + context_switch (ecs->ptid);
> target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
> prepare_to_wait (ecs);
> return;
Your patch works for my testcase and I failed to find any countercase for it.
It has no regressions on {x86_64,i686}-fedora-linux-gnu.
Could you please check it in? Also the attached testcase needs an approval.
> Does anyone know why does the new_thread_event bit need to resume the target
> at all? Removing that resume should fix the issue too.
I find it more as a cleanup and going to send a followup with a patch made now
for it.
Thanks,
Jan
gdb/testsuite/
2009-06-29 Jan Kratochvil <jan.kratochvil@redhat.com>
* gdb.threads/current-lwp-dead.exp, gdb.threads/current-lwp-dead.c: New.
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/current-lwp-dead.c
@@ -0,0 +1,75 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+ Copyright 2009 Free Software Foundation, Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+ Do not use threads as we need to exploit a bug in LWP code masked by the
+ threads code otherwise.
+
+ INFERIOR_PTID must point to exited LWP. Here we use the initial LWP as it
+ is automatically INFERIOR_PTID for GDB.
+
+ Finally we need to call target_resume (RESUME_ALL, ...) which we invoke by
+ NEW_THREAD_EVENT (called from the new LWP as initial LWP is exited now). */
+
+#define _GNU_SOURCE
+#include <sched.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#define STACK_SIZE 0x1000
+
+static int
+fn_return (void *unused)
+{
+ return 0; /* at-fn_return */
+}
+
+static int
+fn (void *unused)
+{
+ int i;
+ unsigned char *stack;
+ int new_pid;
+
+ i = sleep (1);
+ assert (i == 0);
+
+ stack = malloc (STACK_SIZE);
+ assert (stack != NULL);
+
+ new_pid = clone (fn_return, stack + STACK_SIZE, CLONE_FILES | CLONE_VM, NULL,
+ NULL, NULL, NULL);
+ assert (new_pid > 0);
+
+ return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+ unsigned char *stack;
+ int new_pid;
+
+ stack = malloc (STACK_SIZE);
+ assert (stack != NULL);
+
+ new_pid = clone (fn, stack + STACK_SIZE, CLONE_FILES | CLONE_VM, NULL, NULL,
+ NULL, NULL);
+ assert (new_pid > 0);
+
+ return 0;
+}
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/current-lwp-dead.exp
@@ -0,0 +1,31 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2009 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Please email any bugs, comments, and/or additions to this file to:
+# bug-gdb@gnu.org
+
+if { [prepare_for_testing current-lwp-dead.exp current-lwp-dead] } {
+ return -1
+}
+
+if {[runto_main] <= 0} {
+ untested current-lwp-dead.exp
+ return -1
+}
+
+gdb_breakpoint "fn_return"
+gdb_continue_to_breakpoint "fn_return" ".*at-fn_return.*"
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 17:51 ` Jan Kratochvil
@ 2009-06-29 18:30 ` Pedro Alves
2009-06-29 18:59 ` Jan Kratochvil
2009-06-29 18:31 ` Jan Kratochvil
1 sibling, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2009-06-29 18:30 UTC (permalink / raw)
To: Jan Kratochvil; +Cc: gdb-patches
On Monday 29 June 2009 18:49:29, Jan Kratochvil wrote:
> Your patch works for my testcase and I failed to find any countercase for it.
> It has no regressions on {x86_64,i686}-fedora-linux-gnu.
Thanks for testing.
> Could you please check it in?
Done. We're now a bit closer to always context-switching
to the event thread in all-stop too. :-) Everytime we don't,
it's most likely a bug waiting to happen.
> Also the attached testcase needs an approval.
Approved.
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 17:51 ` Jan Kratochvil
2009-06-29 18:30 ` Pedro Alves
@ 2009-06-29 18:31 ` Jan Kratochvil
2009-06-29 18:41 ` Pedro Alves
1 sibling, 1 reply; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 18:31 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb-patches
On Mon, 29 Jun 2009 19:49:29 +0200, Jan Kratochvil wrote:
> > Does anyone know why does the new_thread_event bit need to resume the target
> > at all? Removing that resume should fix the issue too.
Attached here. (this one is for CVS HEAD so it still includes your patch above)
Expecting you are right but only for GNU/Linux.
This patch has no regressions on {x86_64,i686}-fedora-linux-gnu.
Thanks,
Jan
gdb/
2009-06-29 Jan Kratochvil <jan.kratochvil@redhat.com>
Attach new threads in the GNU/Linux all-stop mode immediately.
* inferior.h (default_new_thread_found): New prototype.
* infrun.c (handle_inferior_event <new_thread_event>): Call
target_new_thread_found, move the original code to ...
(default_new_thread_found): ... a new function.
* linux-nat.c (linux_handle_extended_wait): Create the GDB structures
for new thread even in the all-stop mode. New comment point 3.
(linux_nat_new_thread_found): New function.
(linux_nat_add_target): Register linux_nat_new_thread_found.
* target.c (update_current_target): Initialize to_new_thread_found.
* target.h (struct target_ops): New field `to_new_thread_found'.
(target_new_thread_found): New macro.
--- a/gdb/inferior.h
+++ b/gdb/inferior.h
@@ -244,6 +244,8 @@ extern void ensure_not_running (void);
void set_step_info (struct frame_info *frame, struct symtab_and_line sal);
+extern void default_new_thread_found (ptid_t ptid);
+
/* From infcmd.c */
extern void tty_command (char *, int);
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2732,21 +2732,7 @@ handle_inferior_event (struct execution_control_state *ecs)
if (ecs->new_thread_event)
{
- if (non_stop)
- /* Non-stop assumes that the target handles adding new threads
- to the thread list. */
- internal_error (__FILE__, __LINE__, "\
-targets should add new threads to the thread list themselves in non-stop mode.");
-
- /* We may want to consider not doing a resume here in order to
- give the user a chance to play with the new thread. It might
- be good to make that a user-settable option. */
-
- /* At this point, all threads are stopped (happens automatically
- in either the OS or the native code). Therefore we need to
- continue all threads in order to make progress. */
-
- target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
+ target_new_thread_found (ecs->ptid);
prepare_to_wait (ecs);
return;
}
@@ -4116,6 +4102,34 @@ infrun: not switching back to stepped thread, it has vanished\n");
keep_going (ecs);
}
+/* Handle new unexpected LWP PTID. At this point, both the parent and the new
+ child LWP are stopped (happens automatically in either the OS or the native
+ code). Therefore we need to continue all threads in order to make progress.
+
+ GNU/Linux support uses linux_nat_new_thread_found instead. */
+
+void
+default_new_thread_found (ptid_t ptid)
+{
+ if (non_stop)
+ /* Non-stop assumes that the target handles adding new threads
+ to the thread list. */
+ internal_error (__FILE__, __LINE__, "\
+targets should add new threads to the thread list themselves in non-stop mode.");
+
+ /* We may want to consider not doing a resume here in order to
+ give the user a chance to play with the new thread. It might
+ be good to make that a user-settable option. */
+
+ /* target_resume requires valid INFERIOR_PTID while it may have been deleted
+ in the meantime. At least the new PTID was now created by add_thread. */
+
+ if (!ptid_equal (ptid, inferior_ptid))
+ context_switch (ptid);
+
+ target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
+}
+
/* Is thread TP in the middle of single-stepping? */
static int
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1843,35 +1843,32 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
else
status = 0;
- if (non_stop)
- {
- /* Add the new thread to GDB's lists as soon as possible
- so that:
+ /* Add the new thread to GDB's lists as soon as possible
+ so that:
- 1) the frontend doesn't have to wait for a stop to
- display them, and,
+ 1) the frontend doesn't have to wait for a stop to
+ display them, and,
- 2) we tag it with the correct running state. */
+ 2) we tag it with the correct running state.
- /* If the thread_db layer is active, let it know about
- this new thread, and add it to GDB's list. */
- if (!thread_db_attach_lwp (new_lp->ptid))
- {
- /* We're not using thread_db. Add it to GDB's
- list. */
- target_post_attach (GET_LWP (new_lp->ptid));
- add_thread (new_lp->ptid);
- }
+ 3) we do not consider the first breakpoint just as
+ a NEW_THREAD_EVENT. */
- if (!stopping)
- {
- set_running (new_lp->ptid, 1);
- set_executing (new_lp->ptid, 1);
- }
+ /* If the thread_db layer is active, let it know about
+ this new thread, and add it to GDB's list. */
+ if (!thread_db_attach_lwp (new_lp->ptid))
+ {
+ /* We're not using thread_db. Add it to GDB's
+ list. */
+ target_post_attach (GET_LWP (new_lp->ptid));
+ add_thread (new_lp->ptid);
}
if (!stopping)
{
+ set_running (new_lp->ptid, 1);
+ set_executing (new_lp->ptid, 1);
+
new_lp->stopped = 0;
new_lp->resumed = 1;
ptrace (PTRACE_CONT, new_pid, 0,
@@ -1926,6 +1923,20 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
_("unknown ptrace event %d"), event);
}
+/* linux_handle_extended_wait already creates the GDB structures for the new
+ LWP PTID. LWPs therefore should not appear unexpectedly. */
+
+static void
+linux_nat_new_thread_found (ptid_t ptid)
+{
+ if (linux_supports_tracefork (GET_LWP (ptid)))
+ internal_error (__FILE__, __LINE__, ("\
+Unexpected %s despite PTRACE_O_TRACEFORK is supported on this GNU/Linux OS."),
+ target_pid_to_str (ptid));
+
+ linux_ops->to_new_thread_found (ptid);
+}
+
/* Wait for LP to stop. Returns the wait status, or 0 if the LWP has
exited. */
@@ -4599,6 +4610,8 @@ linux_nat_add_target (struct target_ops *t)
t->to_supports_multi_process = linux_nat_supports_multi_process;
+ t->to_new_thread_found = linux_nat_new_thread_found;
+
/* We don't change the stratum; this target will sit at
process_stratum and thread_db will set at thread_stratum. This
is a little strange, since this is a multi-threaded-capable
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -634,6 +634,7 @@ update_current_target (void)
INHERIT (to_get_ada_task_ptid, t);
/* Do not inherit to_search_memory. */
INHERIT (to_supports_multi_process, t);
+ INHERIT (to_new_thread_found, t);
INHERIT (to_magic, t);
/* Do not inherit to_memory_map. */
/* Do not inherit to_flash_erase. */
@@ -777,6 +778,8 @@ update_current_target (void)
de_fault (to_supports_multi_process,
(int (*) (void))
return_zero);
+ de_fault (to_new_thread_found,
+ default_new_thread_found);
#undef de_fault
/* Finally, position the target-stack beneath the squashed
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -542,6 +542,9 @@ struct target_ops
simultaneously? */
int (*to_supports_multi_process) (void);
+ /* An event was received for a so far unknown PTID. */
+ void (*to_new_thread_found) (ptid_t ptid);
+
int to_magic;
/* Need sub-structure for target machine related rather than comm related?
*/
@@ -1118,6 +1121,9 @@ extern char *normal_pid_to_str (ptid_t ptid);
(current_target.to_can_execute_reverse ? \
current_target.to_can_execute_reverse () : 0)
+#define target_new_thread_found \
+ (*current_target.to_new_thread_found)
+
extern const struct target_desc *target_read_description (struct target_ops *);
#define target_get_ada_task_ptid(lwp, tid) \
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 18:31 ` Jan Kratochvil
@ 2009-06-29 18:41 ` Pedro Alves
2009-06-29 21:35 ` Jan Kratochvil
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2009-06-29 18:41 UTC (permalink / raw)
To: Jan Kratochvil; +Cc: gdb-patches
On Monday 29 June 2009 19:28:52, Jan Kratochvil wrote:
> On Mon, 29 Jun 2009 19:49:29 +0200, Jan Kratochvil wrote:
> > > Does anyone know why does the new_thread_event bit need to resume the target
> > > at all? Removing that resume should fix the issue too.
>
> Attached here. (this one is for CVS HEAD so it still includes your patch above)
Haya! This is much more than I said...
1 - I'm not convinced currently that adding threads immediately to the list in
all-stop mode in linux_handle_extended_wait is a good idea. See here for
thoughts around that:
http://sourceware.org/ml/gdb/2009-05/msg00067.html
2 - If the target has let the thread escape this far without having added it to
the list, *and* the target needs to book-keep extra thread info associated
with the thread structure, than your patch looks like paparing over a bug.
It's just a simple to handle it in the target's target_wait implementation,
just before returning an event.
3 - I really just meant to just remove this whole block:
- if (ecs->new_thread_event)
- {
- if (non_stop)
- /* Non-stop assumes that the target handles adding new threads
- to the thread list. */
- internal_error (__FILE__, __LINE__, "\
-targets should add new threads to the thread list themselves in non-stop mode.");
-
- /* We may want to consider not doing a resume here in order to
- give the user a chance to play with the new thread. It might
- be good to make that a user-settable option. */
-
- /* At this point, all threads are stopped (happens automatically
- in either the OS or the native code). Therefore we need to
- continue all threads in order to make progress. */
-
- if (!ptid_equal (ecs->ptid, inferior_ptid))
- context_switch (ecs->ptid);
- target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
- prepare_to_wait (ecs);
- return;
- }
>
> Expecting you are right but only for GNU/Linux.
>
> This patch has no regressions on {x86_64,i686}-fedora-linux-gnu.
>
>
> Thanks,
> Jan
>
>
> gdb/
> 2009-06-29 Jan Kratochvil <jan.kratochvil@redhat.com>
>
> Attach new threads in the GNU/Linux all-stop mode immediately.
> * inferior.h (default_new_thread_found): New prototype.
> * infrun.c (handle_inferior_event <new_thread_event>): Call
> target_new_thread_found, move the original code to ...
> (default_new_thread_found): ... a new function.
> * linux-nat.c (linux_handle_extended_wait): Create the GDB structures
> for new thread even in the all-stop mode. New comment point 3.
> (linux_nat_new_thread_found): New function.
> (linux_nat_add_target): Register linux_nat_new_thread_found.
> * target.c (update_current_target): Initialize to_new_thread_found.
> * target.h (struct target_ops): New field `to_new_thread_found'.
> (target_new_thread_found): New macro.
>
> --- a/gdb/inferior.h
> +++ b/gdb/inferior.h
> @@ -244,6 +244,8 @@ extern void ensure_not_running (void);
>
> void set_step_info (struct frame_info *frame, struct symtab_and_line sal);
>
> +extern void default_new_thread_found (ptid_t ptid);
> +
> /* From infcmd.c */
>
> extern void tty_command (char *, int);
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2732,21 +2732,7 @@ handle_inferior_event (struct execution_control_state *ecs)
>
> if (ecs->new_thread_event)
> {
> - if (non_stop)
> - /* Non-stop assumes that the target handles adding new threads
> - to the thread list. */
> - internal_error (__FILE__, __LINE__, "\
> -targets should add new threads to the thread list themselves in non-stop mode.");
> -
> - /* We may want to consider not doing a resume here in order to
> - give the user a chance to play with the new thread. It might
> - be good to make that a user-settable option. */
> -
> - /* At this point, all threads are stopped (happens automatically
> - in either the OS or the native code). Therefore we need to
> - continue all threads in order to make progress. */
> -
> - target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
> + target_new_thread_found (ecs->ptid);
> prepare_to_wait (ecs);
> return;
> }
> @@ -4116,6 +4102,34 @@ infrun: not switching back to stepped thread, it has vanished\n");
> keep_going (ecs);
> }
>
> +/* Handle new unexpected LWP PTID. At this point, both the parent and the new
> + child LWP are stopped (happens automatically in either the OS or the native
> + code). Therefore we need to continue all threads in order to make progress.
> +
> + GNU/Linux support uses linux_nat_new_thread_found instead. */
> +
> +void
> +default_new_thread_found (ptid_t ptid)
> +{
> + if (non_stop)
> + /* Non-stop assumes that the target handles adding new threads
> + to the thread list. */
> + internal_error (__FILE__, __LINE__, "\
> +targets should add new threads to the thread list themselves in non-stop mode.");
> +
> + /* We may want to consider not doing a resume here in order to
> + give the user a chance to play with the new thread. It might
> + be good to make that a user-settable option. */
> +
> + /* target_resume requires valid INFERIOR_PTID while it may have been deleted
> + in the meantime. At least the new PTID was now created by add_thread. */
> +
> + if (!ptid_equal (ptid, inferior_ptid))
> + context_switch (ptid);
> +
> + target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
> +}
> +
> /* Is thread TP in the middle of single-stepping? */
>
> static int
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1843,35 +1843,32 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
> else
> status = 0;
>
> - if (non_stop)
> - {
> - /* Add the new thread to GDB's lists as soon as possible
> - so that:
> + /* Add the new thread to GDB's lists as soon as possible
> + so that:
>
> - 1) the frontend doesn't have to wait for a stop to
> - display them, and,
> + 1) the frontend doesn't have to wait for a stop to
> + display them, and,
>
> - 2) we tag it with the correct running state. */
> + 2) we tag it with the correct running state.
>
> - /* If the thread_db layer is active, let it know about
> - this new thread, and add it to GDB's list. */
> - if (!thread_db_attach_lwp (new_lp->ptid))
> - {
> - /* We're not using thread_db. Add it to GDB's
> - list. */
> - target_post_attach (GET_LWP (new_lp->ptid));
> - add_thread (new_lp->ptid);
> - }
> + 3) we do not consider the first breakpoint just as
> + a NEW_THREAD_EVENT. */
>
> - if (!stopping)
> - {
> - set_running (new_lp->ptid, 1);
> - set_executing (new_lp->ptid, 1);
> - }
> + /* If the thread_db layer is active, let it know about
> + this new thread, and add it to GDB's list. */
> + if (!thread_db_attach_lwp (new_lp->ptid))
> + {
> + /* We're not using thread_db. Add it to GDB's
> + list. */
> + target_post_attach (GET_LWP (new_lp->ptid));
> + add_thread (new_lp->ptid);
> }
>
> if (!stopping)
> {
> + set_running (new_lp->ptid, 1);
> + set_executing (new_lp->ptid, 1);
> +
> new_lp->stopped = 0;
> new_lp->resumed = 1;
> ptrace (PTRACE_CONT, new_pid, 0,
> @@ -1926,6 +1923,20 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
> _("unknown ptrace event %d"), event);
> }
>
> +/* linux_handle_extended_wait already creates the GDB structures for the new
> + LWP PTID. LWPs therefore should not appear unexpectedly. */
> +
> +static void
> +linux_nat_new_thread_found (ptid_t ptid)
> +{
> + if (linux_supports_tracefork (GET_LWP (ptid)))
> + internal_error (__FILE__, __LINE__, ("\
> +Unexpected %s despite PTRACE_O_TRACEFORK is supported on this GNU/Linux OS."),
> + target_pid_to_str (ptid));
> +
> + linux_ops->to_new_thread_found (ptid);
> +}
> +
> /* Wait for LP to stop. Returns the wait status, or 0 if the LWP has
> exited. */
>
> @@ -4599,6 +4610,8 @@ linux_nat_add_target (struct target_ops *t)
>
> t->to_supports_multi_process = linux_nat_supports_multi_process;
>
> + t->to_new_thread_found = linux_nat_new_thread_found;
> +
> /* We don't change the stratum; this target will sit at
> process_stratum and thread_db will set at thread_stratum. This
> is a little strange, since this is a multi-threaded-capable
> --- a/gdb/target.c
> +++ b/gdb/target.c
> @@ -634,6 +634,7 @@ update_current_target (void)
> INHERIT (to_get_ada_task_ptid, t);
> /* Do not inherit to_search_memory. */
> INHERIT (to_supports_multi_process, t);
> + INHERIT (to_new_thread_found, t);
> INHERIT (to_magic, t);
> /* Do not inherit to_memory_map. */
> /* Do not inherit to_flash_erase. */
> @@ -777,6 +778,8 @@ update_current_target (void)
> de_fault (to_supports_multi_process,
> (int (*) (void))
> return_zero);
> + de_fault (to_new_thread_found,
> + default_new_thread_found);
> #undef de_fault
>
> /* Finally, position the target-stack beneath the squashed
> --- a/gdb/target.h
> +++ b/gdb/target.h
> @@ -542,6 +542,9 @@ struct target_ops
> simultaneously? */
> int (*to_supports_multi_process) (void);
>
> + /* An event was received for a so far unknown PTID. */
> + void (*to_new_thread_found) (ptid_t ptid);
> +
> int to_magic;
> /* Need sub-structure for target machine related rather than comm related?
> */
> @@ -1118,6 +1121,9 @@ extern char *normal_pid_to_str (ptid_t ptid);
> (current_target.to_can_execute_reverse ? \
> current_target.to_can_execute_reverse () : 0)
>
> +#define target_new_thread_found \
> + (*current_target.to_new_thread_found)
> +
> extern const struct target_desc *target_read_description (struct target_ops *);
>
> #define target_get_ada_task_ptid(lwp, tid) \
>
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 18:30 ` Pedro Alves
@ 2009-06-29 18:59 ` Jan Kratochvil
0 siblings, 0 replies; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 18:59 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb-patches
On Mon, 29 Jun 2009 20:31:21 +0200, Pedro Alves wrote:
> We're now a bit closer to always context-switching
> to the event thread in all-stop too. :-) Everytime we don't,
> it's most likely a bug waiting to happen.
Assuming THREAD_EXITED may get removed one day when there is no need for dead
LWPs.
> On Monday 29 June 2009 18:49:29, Jan Kratochvil wrote:
> > Also the attached testcase needs an approval.
>
> Approved.
Checked-in.
http://sourceware.org/ml/gdb-cvs/2009-06/msg00195.html
Thanks,
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 18:41 ` Pedro Alves
@ 2009-06-29 21:35 ` Jan Kratochvil
2009-06-29 21:42 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 21:35 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb-patches
On Mon, 29 Jun 2009 20:41:51 +0200, Pedro Alves wrote:
> 1 - I'm not convinced currently that adding threads immediately to the list in
> all-stop mode in linux_handle_extended_wait is a good idea. See here for
> thoughts around that:
>
> http://sourceware.org/ml/gdb/2009-05/msg00067.html
I do not share this opinion - not attaching short-lived threads should not be
allowed for performance reasons. When such short-lived thread crashes GDB
should catch such crash.
> 2 - If the target has let the thread escape this far without having added it to
> the list, *and* the target needs to book-keep extra thread info associated
> with the thread structure, than your patch looks like paparing over a bug.
> It's just a simple to handle it in the target's target_wait implementation,
> just before returning an event.
>
> 3 - I really just meant to just remove this whole block:
>
> - if (ecs->new_thread_event)
> - {
> - if (non_stop)
> - /* Non-stop assumes that the target handles adding new threads
> - to the thread list. */
> - internal_error (__FILE__, __LINE__, "\
> -targets should add new threads to the thread list themselves in non-stop mode.");
> -
> - /* We may want to consider not doing a resume here in order to
> - give the user a chance to play with the new thread. It might
> - be good to make that a user-settable option. */
> -
> - /* At this point, all threads are stopped (happens automatically
> - in either the OS or the native code). Therefore we need to
> - continue all threads in order to make progress. */
> -
> - if (!ptid_equal (ecs->ptid, inferior_ptid))
> - context_switch (ecs->ptid);
> - target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
> - prepare_to_wait (ecs);
> - return;
> - }
Such a review for myself why:
On GNU/Linux if linux_test_for_tracefork() fails then the new LWP is left
unstopped - so it needs no target_resume.
On GNU/Linux if linux_test_for_tracefork() succeeds then the new LWP is left
unstopped immediately after catching its clone event by
linux_handle_extended_wait as it is called with STOPPING 0. So no
target_resume is needed.
For other OSes expectin no one stops the new LWP.
(I currently do not have an opinion on this removal, it is just a dead code.)
Thanks,
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 21:35 ` Jan Kratochvil
@ 2009-06-29 21:42 ` Pedro Alves
2009-06-29 21:48 ` Jan Kratochvil
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2009-06-29 21:42 UTC (permalink / raw)
To: Jan Kratochvil; +Cc: gdb-patches
On Monday 29 June 2009 22:33:33, Jan Kratochvil wrote:
> I do not share this opinion - not attaching short-lived threads should not be
> allowed for performance reasons. When such short-lived thread crashes GDB
> should catch such crash.
>
Sorry, this makes no sense. GDB is automaticaly attached to cloned
threads. You won't miss any crash. You're thinking of discovering
threads with thread_db?
The performance issue is related with communicating the
new-thread/thread-exited to the frontend, or with printing it to
the screen with printf_unfiltered, not related to the thread
management per-se.
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix internal-error on dead LWPs with no associated thread
2009-06-29 21:42 ` Pedro Alves
@ 2009-06-29 21:48 ` Jan Kratochvil
0 siblings, 0 replies; 10+ messages in thread
From: Jan Kratochvil @ 2009-06-29 21:48 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb-patches
On Mon, 29 Jun 2009 23:43:23 +0200, Pedro Alves wrote:
> On Monday 29 June 2009 22:33:33, Jan Kratochvil wrote:
> > I do not share this opinion - not attaching short-lived threads should not be
> > allowed for performance reasons. Â When such short-lived thread crashes GDB
> > should catch such crash.
>
> Sorry, this makes no sense. GDB is automaticaly attached to cloned
> threads. You won't miss any crash. You're thinking of discovering
> threads with thread_db?
I agree my comment made no sense. I was thinking of the case when
linux_test_for_tracefork() fails and PTRACE_O_TRACEFORK is not in effect; but
that is not possible on the current systems.
Thanks,
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-06-29 21:48 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-29 10:09 [patch] Fix internal-error on dead LWPs with no associated thread Jan Kratochvil
2009-06-29 10:35 ` Pedro Alves
2009-06-29 17:51 ` Jan Kratochvil
2009-06-29 18:30 ` Pedro Alves
2009-06-29 18:59 ` Jan Kratochvil
2009-06-29 18:31 ` Jan Kratochvil
2009-06-29 18:41 ` Pedro Alves
2009-06-29 21:35 ` Jan Kratochvil
2009-06-29 21:42 ` Pedro Alves
2009-06-29 21:48 ` Jan Kratochvil
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox