* [PATCH] Fix crash of gdbserver when kill threads
@ 2014-06-23 10:10 Hui Zhu
2014-06-29 3:28 ` Hui Zhu
0 siblings, 1 reply; 10+ messages in thread
From: Hui Zhu @ 2014-06-23 10:10 UTC (permalink / raw)
To: gdb-patches ml
gdbserver :1234 gdb.base/watch_thread_num
gdb gdb.base/watch_thread_num
(gdb) b 48
Breakpoint 1 at 0x400737: file ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c, line 48.
(gdb) c
Continuing.
Breakpoint 1, main () at ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c:48
48 thread_result = thread_function ((void *) i);
(gdb) k
Kill the program being debugged? (y or n) y
gdbserver :1234 gdb.base/watch_thread_num
Process gdb.base/watch_thread_num created; pid = 9719
Listening on port 1234
Remote debugging from host 127.0.0.1
Killing all inferiors
Segmentation fault (core dumped)
Backtrace:
(gdb) bt
#0 find_inferior (list=<optimized out>, func=func@entry=0x423990 <kill_one_lwp_callback>, arg=arg@entry=0x7fffe97405dc)
at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:199
#1 0x0000000000425bff in linux_kill (pid=10130) at ../../../binutils-gdb/gdb/gdbserver/linux-low.c:966
#2 0x000000000040ae8c in kill_inferior_callback (entry=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:2934
#3 0x0000000000405c61 in for_each_inferior (list=<optimized out>, action=action@entry=0x40ae60 <kill_inferior_callback>)
at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:57
#4 0x000000000040d5e2 in process_serial_event () at ../../../binutils-gdb/gdb/gdbserver/server.c:3767
#5 handle_serial_event (err=<optimized out>, client_data=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:3880
#6 0x0000000000412cda in handle_file_event (event_file_desc=event_file_desc@entry=4)
at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:434
#7 0x000000000041357a in process_event () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:189
#8 start_event_loop () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:552
#9 0x0000000000403088 in main (argc=3, argv=0x7fffe9740938) at ../../../binutils-gdb/gdb/gdbserver/server.c:3283
The cause of this issue is that linux_kill calls "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
to kill all the lwps of pid.
linux_wait_for_event will call delete_lwp on any lwp in all_threads when it
gets an exit event for it.  That makes find_inferior crash.
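To see why deleting a list entry other than the current one is a problem, here is a minimal sketch. It assumes the walk saves the successor pointer before invoking the callback, which is an assumption about how find_inferior is implemented and not something taken from gdbserver's sources here. The names struct entry, delete_entry, walk_caching_next, delete_self, delete_successor and demo are all hypothetical. Entries are tombstoned instead of freed so that the demo stays well defined; in the real code the stale entry would be freed memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inferior list entry.  */
struct entry
{
  int id;
  int deleted;          /* Tombstone instead of free (), to avoid real UB.  */
  struct entry *next;
};

/* Unlink E from the list at *HEAD and mark it dead.  */
static void
delete_entry (struct entry **head, struct entry *e)
{
  struct entry **p;

  for (p = head; *p != NULL; p = &(*p)->next)
    if (*p == e)
      {
        *p = e->next;
        e->deleted = 1;
        return;
      }
}

/* Walk the list, caching the successor before calling CB.  Return how
   many already-deleted entries the walk ended up visiting.  */
static int
walk_caching_next (struct entry **head,
                   void (*cb) (struct entry **, struct entry *))
{
  struct entry *e, *next;
  int stale_visits = 0;

  assert (head != NULL);
  for (e = *head; e != NULL; e = next)
    {
      next = e->next;   /* Saved before the callback runs.  */
      if (e->deleted)
        stale_visits++; /* With free (), this would be a use-after-free.  */
      else
        cb (head, e);
    }
  return stale_visits;
}

/* Callback that deletes the entry currently being visited.  */
static void
delete_self (struct entry **head, struct entry *e)
{
  delete_entry (head, e);
}

/* Callback that deletes the entry after the one being visited.  */
static void
delete_successor (struct entry **head, struct entry *e)
{
  if (e->next != NULL)
    delete_entry (head, e->next);
}

/* Build a three-entry list and walk it with CB.  */
static int
demo (void (*cb) (struct entry **, struct entry *))
{
  struct entry c = { 3, 0, NULL };
  struct entry b = { 2, 0, &c };
  struct entry a = { 1, 0, &b };
  struct entry *head = &a;

  return walk_caching_next (&head, cb);
}
```

In this sketch demo (delete_self) visits no stale entry, while demo (delete_successor) visits one: the cached next pointer still refers to the entry the callback just removed.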
I made a patch that makes kill_one_lwp_callback return 1, so that
find_inferior returns right after linux_wait_for_event has been called
(all_threads may have changed by then).
It also changes the call "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
into a loop, which stops when all_threads no longer has any lwp that belongs to pid.
It passes the regression tests on x86_64 Linux.
Thanks,
Hui
2014-06-23 Hui Zhu <hui@codesourcery.com>
* linux-low.c (kill_one_lwp_callback): Change last return to 1.
(linux_kill): Call find_inferior with a loop.
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
} while (pid > 0 && WIFSTOPPED (wstat));
- return 0;
+ /* Let find_inferior return because maybe other lwp in the list will
+ be deleted by delete_lwp. */
+ return 1;
}
static int
@@ -963,7 +965,9 @@ linux_kill (int pid)
first, as PTRACE_KILL will not work otherwise. */
stop_all_lwps (0, NULL);
- find_inferior (&all_threads, kill_one_lwp_callback , &pid);
+ /* Keep call kill_one_lwp_callback until find_inferior cannot find any
+ lwps that is for pid. */
+ while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);
/* See the comment in linux_kill_one_lwp. We did not kill the first
thread in the list, so do so now. */
* Re: [PATCH] Fix crash of gdbserver when kill threads
2014-06-23 10:10 [PATCH] Fix crash of gdbserver when kill threads Hui Zhu
@ 2014-06-29 3:28 ` Hui Zhu
2014-07-02 9:07 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Hui Zhu @ 2014-06-29 3:28 UTC (permalink / raw)
To: Pedro Alves, gdb-patches ml
Hi Pedro,
This issue is introduced by the patches for PR 12702 that was pushed by you.
Maybe you could take a look on this patch.
Thanks,
Hui
* Re: [PATCH] Fix crash of gdbserver when kill threads
2014-06-29 3:28 ` Hui Zhu
@ 2014-07-02 9:07 ` Pedro Alves
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2014-07-02 9:07 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml

On 06/29/2014 04:28 AM, Hui Zhu wrote:
> Hi Pedro,
>
> This issue is introduced by the patches for PR 12702 that was pushed by you.
> Maybe you could take a look on this patch.

Thanks.  I'll need to think a bit about whether this is the right
solution.

I've added it to the 7.8 regression queue at:

  https://sourceware.org/gdb/wiki/GDB_7.8_Release

I'll look better at it as soon as I have a chance.

--
Pedro Alves
* [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-02 9:07 ` Pedro Alves
@ 2014-07-10 15:17 ` Pedro Alves
2014-07-11 8:21 ` Hui Zhu
2015-07-13 16:07 ` Yao Qi
0 siblings, 2 replies; 10+ messages in thread
From: Pedro Alves @ 2014-07-10 15:17 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml

Hi Hui,

On 07/02/2014 10:07 AM, Pedro Alves wrote:
>> This issue is introduced by the patches for PR 12702 that was pushed by you.
>> Maybe you could take a look on this patch.
>
> Thanks.  I'll need to think a bit about whether this is the right
> solution.
>
> I've added it to the 7.8 regression queue at:
>
>   https://sourceware.org/gdb/wiki/GDB_7.8_Release
>
> I'll look better at it as soon as I have a chance.
>

First, it's really annoying that the testsuite doesn't catch this!
We're really missing such a basic test, so I went ahead and wrote one.
I was actually quite surprised that we didn't have any
test that makes sure "kill" works without complaints/spurious output.

However, unfortunately, the new test revealed that the fix isn't right.  :-/

Sometimes the test passed, but other times (that'll depend on
machine, of course), I'd get:

 kill
 Kill the program being debugged? (y or n) y
 Ignoring packet error, continuing...
 (gdb) FAIL: gdb.threads/kill.exp: kill

That means GDBserver didn't reply to the vKill packet.

Attaching to GDBserver at this point shows that it's stuck forever
in the new loop:

+  /* Keep call kill_one_lwp_callback until find_inferior cannot find any
+     lwps that is for pid.  */
+  while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);

The reason was that the leader lwp goes zombie before the other
lwps are reaped, and linux_wait_for_event would delete it.  We'd be
left with e.g., only one lwp that is _not_ the leader.

That lwp had disappeared already (due to SIGKILL), but there's
no status to collect (waitpid would return -1).  Nothing in linux-low.c
would notice the lwp is already gone.
Because that lwp has lwp->stopped == 0,
linux_wait_for_event would return, here:

  if ((find_inferior (&all_threads,
                      not_stopped_callback,
                      &wait_ptid) == NULL))
    {
      if (debug_threads)
        debug_printf ("LLW: exit (no unwaited-for LWP)\n");
      sigprocmask (SIG_SETMASK, &prev_mask, NULL);
      return -1;
    }

So this would return true:

--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
       pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
     } while (pid > 0 && WIFSTOPPED (wstat));
 
-  return 0;
+  /* Let find_inferior return because maybe other lwp in the list will
+     be deleted by delete_lwp.  */
+  return 1;
 }

So the new loop would never break.

Here's a different patch that passes the test cleanly for me.
Let me know if it works for you.

8<-------------------
From ed3e42e5447e0276d6fc759a27408d06e9ccf0b7 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Thu, 10 Jul 2014 15:13:26 +0100
Subject: [PATCH] GDBserver crashes when killing a multi-thread process

Here's an example, with the new test:

 gdbserver :9999 gdb.threads/kill
 gdb gdb.threads/kill
 (gdb) b 52
 Breakpoint 1 at 0x4007f4: file kill.c, line 52.
 Continuing.

 Breakpoint 1, main () at kill.c:52
 52        return 0; /* set break here */
 (gdb) k
 Kill the program being debugged? (y or n) y

 gdbserver :9999 gdb.threads/kill
 Process gdb.base/watch_thread_num created; pid = 9719
 Listening on port 1234
 Remote debugging from host 127.0.0.1
 Killing all inferiors
 Segmentation fault (core dumped)

Backtrace:

 (gdb) bt
 #0  0x00000000004068a0 in find_inferior (list=0x66b060 <all_threads>, func=0x427637 <kill_one_lwp_callback>, arg=0x7fffffffd3fc) at src/gdb/gdbserver/inferiors.c:199
 #1  0x00000000004277b6 in linux_kill (pid=15708) at src/gdb/gdbserver/linux-low.c:966
 #2  0x000000000041354d in kill_inferior (pid=15708) at src/gdb/gdbserver/target.c:163
 #3  0x00000000004107e9 in kill_inferior_callback (entry=0x6704f0) at src/gdb/gdbserver/server.c:2934
 #4  0x0000000000406522 in for_each_inferior (list=0x66b050 <all_processes>, action=0x4107a6 <kill_inferior_callback>) at src/gdb/gdbserver/inferiors.c:57
 #5  0x0000000000412377 in process_serial_event () at src/gdb/gdbserver/server.c:3767
 #6  0x000000000041267c in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:3880
 #7  0x00000000004189ff in handle_file_event (event_file_desc=4) at src/gdb/gdbserver/event-loop.c:434
 #8  0x00000000004181c6 in process_event () at src/gdb/gdbserver/event-loop.c:189
 #9  0x0000000000418f45 in start_event_loop () at src/gdb/gdbserver/event-loop.c:552
 #10 0x0000000000411272 in main (argc=3, argv=0x7fffffffd8d8) at src/gdb/gdbserver/server.c:3283

The problem is that linux_wait_for_event deletes lwps that have exited
(even those not passed in as lwps of interest), while the lwp/thread
list is being walked on with find_inferior.  find_inferior can handle
the current iterated inferior being deleted, but not others.

When killing lwps, we don't really care about any of the pending
status handling of linux_wait_for_event.  We can just waitpid the lwps
directly, which is also what GDB does (see
linux-nat.c:kill_wait_callback).  This way the lwps are not deleted
while we're walking the list.  They'll be deleted by linux_mourn
afterwards.
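The direct-waitpid idea described above can be sketched outside gdbserver. This is a minimal illustration under stated assumptions, not the patch itself: kill_and_reap, waitpid_retry and demo_kill_child are hypothetical names, and a plain kill (pid, SIGKILL) on a forked, untraced child stands in for linux_kill_one_lwp acting on a ptraced lwp. The __WCLONE retry mirrors the fallback the patch uses when the first waitpid fails with ECHILD.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __WCLONE
#define __WCLONE 0x80000000   /* Linux-specific; fallback definition.  */
#endif

/* waitpid wrapper that retries when interrupted by a signal.  */
static pid_t
waitpid_retry (pid_t pid, int *wstat, int flags)
{
  pid_t res;

  do
    res = waitpid (pid, wstat, flags);
  while (res == -1 && errno == EINTR);

  return res;
}

/* Send SIGKILL to PID and collect its wait statuses directly, without
   going through any shared event machinery.  Returns the pid that was
   reaped, or -1 if there was nothing to wait for.  */
static pid_t
kill_and_reap (pid_t pid, int *wstat)
{
  pid_t res;

  do
    {
      kill (pid, SIGKILL);

      res = waitpid_retry (pid, wstat, 0);
      if (res == -1 && errno == ECHILD)
        res = waitpid_retry (pid, wstat, __WCLONE);
    }
  while (res > 0 && WIFSTOPPED (*wstat));

  return res;
}

/* Fork a child that only sleeps, then kill and reap it.  Returns the
   signal that terminated the child, or -1 on failure.  */
static int
demo_kill_child (void)
{
  int wstat = 0;
  pid_t child = fork ();

  if (child == -1)
    return -1;

  if (child == 0)
    {
      for (;;)
        pause ();
    }

  if (kill_and_reap (child, &wstat) != child)
    return -1;

  return WIFSIGNALED (wstat) ? WTERMSIG (wstat) : -1;
}
```

Because only the one pid is waited on, nothing else in the caller's bookkeeping is touched while it iterates; cleaning up the list is left to a later step, which is the role linux_mourn plays in the description above.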
This crash triggers several times when running the testsuite against
GDBserver with the native-gdbserver board (target remote), but as GDB
can't distinguish between GDBserver crashing and "kill" being
successful, as in both cases the connection is closed (the 'k' packet
doesn't require a reply), and the inferior is gone, that results in no
FAIL.

The patch adds a generic test that catches the issue with
extended-remote mode (and works fine with native testing too).  Here's
how it fails with the native-extended-gdbserver board without the fix:

 (gdb) info threads
   Id   Target Id         Frame
   6    Thread 15367.15374 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
   5    Thread 15367.15373 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
   4    Thread 15367.15372 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
   3    Thread 15367.15371 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
   2    Thread 15367.15370 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
 * 1    Thread 15367.15367 main () at .../gdb.threads/kill.c:52
 (gdb) kill
 Kill the program being debugged? (y or n) y
 Remote connection closed
 ^^^^^^^^^^^^^^^^^^^^^^^^
 (gdb) FAIL: gdb.threads/kill.exp: kill

Extended remote should remain connected after the kill.

gdb/gdbserver/
2014-07-10  Pedro Alves  <palves@redhat.com>

        * linux-low.c (kill_wait_lwp): New function, based on
        kill_one_lwp_callback, but use my_waitpid directly.
        (kill_one_lwp_callback, linux_kill): Use it.

gdb/testsuite/
2014-07-10  Pedro Alves  <palves@redhat.com>

        * gdb.threads/kill.c: New file.
        * gdb.threads/kill.exp: New file.
---
 gdb/gdbserver/linux-low.c          | 68 ++++++++++++++++++++-------------
 gdb/testsuite/gdb.threads/kill.c   | 64 +++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.threads/kill.exp | 77 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 183 insertions(+), 26 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/kill.c
 create mode 100644 gdb/testsuite/gdb.threads/kill.exp

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 61552f4..bdc9aaa 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -909,6 +909,46 @@ linux_kill_one_lwp (struct lwp_info *lwp)
                   errno ? strerror (errno) : "OK");
 }
 
+/* Kill LWP and wait for it to die.  */
+
+static void
+kill_wait_lwp (struct lwp_info *lwp)
+{
+  struct thread_info *thr = get_lwp_thread (lwp);
+  int pid = ptid_get_pid (ptid_of (thr));
+  int lwpid = ptid_get_lwp (ptid_of (thr));
+  int wstat;
+  int res;
+
+  if (debug_threads)
+    debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
+
+  do
+    {
+      linux_kill_one_lwp (lwp);
+
+      /* Make sure it died.  Notes:
+
+         - The loop is most likely unnecessary.
+
+         - We don't use linux_wait_for_event as that could delete lwps
+           while we're iterating over them.  We're not interested in
+           any pending status at this point, only in making sure all
+           wait status on the kernel side are collected until the
+           process is reaped.
+
+         - We don't use __WALL here as the __WALL emulation relies on
+           SIGCHLD, and killing a stopped process doesn't generate
+           one, nor an exit status.
+      */
+      res = my_waitpid (lwpid, &wstat, 0);
+      if (res == -1 && errno == ECHILD)
+        res = my_waitpid (lwpid, &wstat, __WCLONE);
+    } while (res > 0 && WIFSTOPPED (wstat));
+
+  gdb_assert (res > 0);
+}
+
 /* Callback for `find_inferior'.  Kills an lwp of a given process,
    except the leader.  */
 
@@ -917,7 +957,6 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
 {
   struct thread_info *thread = (struct thread_info *) entry;
   struct lwp_info *lwp = get_thread_lwp (thread);
-  int wstat;
   int pid = * (int *) args;
 
   if (ptid_get_pid (entry->id) != pid)
@@ -936,14 +975,7 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
       return 0;
     }
 
-  do
-    {
-      linux_kill_one_lwp (lwp);
-
-      /* Make sure it died.  The loop is most likely unnecessary.  */
-      pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
-    } while (pid > 0 && WIFSTOPPED (wstat));
-
+  kill_wait_lwp (lwp);
   return 0;
 }
 
@@ -952,8 +984,6 @@ linux_kill (int pid)
 {
   struct process_info *process;
   struct lwp_info *lwp;
-  int wstat;
-  int lwpid;
 
   process = find_process_pid (pid);
   if (process == NULL)
@@ -976,21 +1006,7 @@ linux_kill (int pid)
                       pid);
     }
   else
-    {
-      struct thread_info *thr = get_lwp_thread (lwp);
-
-      if (debug_threads)
-        debug_printf ("lk_1: killing lwp %ld, for pid: %d\n",
-                      lwpid_of (thr), pid);
-
-      do
-        {
-          linux_kill_one_lwp (lwp);
-
-          /* Make sure it died.  The loop is most likely unnecessary.  */
-          lwpid = linux_wait_for_event (thr->entry.id, &wstat, __WALL);
-        } while (lwpid > 0 && WIFSTOPPED (wstat));
-    }
+    kill_wait_lwp (lwp);
 
   the_target->mourn (process);
 
diff --git a/gdb/testsuite/gdb.threads/kill.c b/gdb/testsuite/gdb.threads/kill.c
new file mode 100644
index 0000000..aefbb06
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/kill.c
@@ -0,0 +1,64 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef USE_THREADS
+
+#include <unistd.h>
+#include <pthread.h>
+
+#define NUM 5
+
+pthread_barrier_t barrier;
+
+void *
+thread_function (void *arg)
+{
+  volatile unsigned int counter = 1;
+
+  pthread_barrier_wait (&barrier);
+
+  while (counter > 0)
+    {
+      counter++;
+      usleep (1);
+    }
+
+  pthread_exit (NULL);
+}
+
+#endif /* USE_THREADS */
+
+void
+setup (void)
+{
+#ifdef USE_THREADS
+  pthread_t threads[NUM];
+  int i;
+
+  pthread_barrier_init (&barrier, NULL, NUM + 1);
+  for (i = 0; i < NUM; i++)
+    pthread_create (&threads[i], NULL, thread_function, NULL);
+  pthread_barrier_wait (&barrier);
+#endif /* USE_THREADS */
+}
+
+int
+main (void)
+{
+  setup ();
+  return 0; /* set break here */
+}
diff --git a/gdb/testsuite/gdb.threads/kill.exp b/gdb/testsuite/gdb.threads/kill.exp
new file mode 100644
index 0000000..a3f921f
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/kill.exp
@@ -0,0 +1,77 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+standard_testfile
+
+# Run the test proper.  THREADED indicates whether to build a threaded
+# program and spawn several threads before trying to kill the program.
+
+proc test {threaded} {
+    global testfile srcfile
+
+    with_test_prefix [expr ($threaded)?"threaded":"non-threaded"] {
+
+        set options {debug}
+        if {$threaded} {
+            lappend options "pthreads"
+            lappend options "additional_flags=-DUSE_THREADS"
+            set prog ${testfile}_threads
+        } else {
+            set prog ${testfile}_nothreads
+        }
+
+        if {[prepare_for_testing "failed to prepare" $prog $srcfile $options] == -1} {
+            return -1
+        }
+
+        if { ![runto main] } then {
+            fail "run to main"
+            return
+        }
+
+        set linenum [gdb_get_line_number "set break here"]
+        gdb_breakpoint "$srcfile:$linenum"
+        gdb_continue_to_breakpoint "break here" ".*break here.*"
+
+        if {$threaded} {
+            gdb_test "info threads" "6.*5.*4.*3.*2.*1.*" "all threads started"
+        }
+
+        # This kills and ensures no output other than the prompt comes out,
+        # like:
+        #
+        # (gdb) kill
+        # Kill the program being debugged? (y or n) y
+        # (gdb)
+        #
+        # If we instead saw more output, like e.g., with an extended-remote
+        # connection:
+        #
+        # (gdb) kill
+        # Kill the program being debugged? (y or n) y
+        # Remote connection closed
+        # (gdb)
+        #
+        # the above would mean that the remote end crashed.
+
+        gdb_test "kill" "^y" "kill program" "Kill the program being debugged\\? \\(y or n\\) $" "y"
+    }
+}
+
+foreach threaded {true false} {
+    test $threaded
+}
--
1.9.3
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process 2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves @ 2014-07-11 8:21 ` Hui Zhu 2014-07-11 10:53 ` [PUSHED+7.8] " Pedro Alves 2015-07-13 16:07 ` Yao Qi 1 sibling, 1 reply; 10+ messages in thread From: Hui Zhu @ 2014-07-11 8:21 UTC (permalink / raw) To: Pedro Alves, gdb-patches ml Hi Pedro, Thanks for your patch. It fixed the issue in my part. Best, Hui On 07/10/14 23:16, Pedro Alves wrote: > Hi Hui, > > On 07/02/2014 10:07 AM, Pedro Alves wrote: >>> This issue is introduced by the patches for PR 12702 that was pushed by you. >>> Maybe you could take a look on this patch. >> >> Thanks. I'll need to think a bit about whether this is the right >> solution. >> >> I've added it to the 7.8 regression queue at: >> >> https://sourceware.org/gdb/wiki/GDB_7.8_Release >> >> I'll look better at it as soon as I have a chance. >> > > First, it's really annoying that the testsuite doesn't catch this! > We're really missing such a basic test, so I went ahead and wrote one. > I was actually quite surprised that we didn't have any > test that makes sure "kill" works without complaints/spurious output. > > However, unfortunately, the new test revealed that the fix isn't right. :-/ > > Sometimes the test passed, but other times (that'll depend on > machine, of course), I'd get: > > kill > Kill the program being debugged? (y or n) y > Ignoring packet error, continuing... > (gdb) FAIL: gdb.threads/kill.exp: kill > > That means GDBserver didn't reply to the vKill packet. > > Attaching to GDBserver at this point shows that it's stuck forever > in the new loop: > > > + /* Keep call kill_one_lwp_callback until find_inferior cannot find any > + lwps that is for pid. */ > + while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL); > > > The reason was that the leader lwp goes zombie before the other > lwps are reaped, and linux_wait_for_event would delete it. 
We'd be > left with e.g., only one lwp that is _not_ the leader. > > That lwp had disappeared already (due to SIGKILL), but there's > no status to collect (waitpid would return -1). Nothing in linux-low.c > would notice the lwp is already gone. Because that lwp has lwp->stopped == 0, > linux_wait_for_event would return, here: > > if ((find_inferior (&all_threads, > not_stopped_callback, > &wait_ptid) == NULL)) > { > if (debug_threads) > debug_printf ("LLW: exit (no unwaited-for LWP)\n"); > sigprocmask (SIG_SETMASK, &prev_mask, NULL); > return -1; > } > > So this would return true: > > --- a/gdb/gdbserver/linux-low.c > +++ b/gdb/gdbserver/linux-low.c > @@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l > pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL); > } while (pid > 0 && WIFSTOPPED (wstat)); > > - return 0; > + /* Let find_inferior return because maybe other lwp in the list will > + be deleted by delete_lwp. */ > + return 1; > } > > So the new loop would never break. > > Here's a different patch that passes the test cleanly for me. > Let me know if it works for you. > > 8<------------------- > From ed3e42e5447e0276d6fc759a27408d06e9ccf0b7 Mon Sep 17 00:00:00 2001 > From: Pedro Alves <palves@redhat.com> > Date: Thu, 10 Jul 2014 15:13:26 +0100 > Subject: [PATCH] GDBserver crashes when killing a multi-thread process > > Here's an example, with the new test: > > gdbserver :9999 gdb.threads/kill > gdb gdb.threads/kill > (gdb) b 52 > Breakpoint 1 at 0x4007f4: file kill.c, line 52. > Continuing. > > Breakpoint 1, main () at kill.c:52 > 52 return 0; /* set break here */ > (gdb) k > Kill the program being debugged? 
(y or n) y > > gdbserver :9999 gdb.threads/kill > Process gdb.base/watch_thread_num created; pid = 9719 > Listening on port 1234 > Remote debugging from host 127.0.0.1 > Killing all inferiors > Segmentation fault (core dumped) > > Backtrace: > > (gdb) bt > #0 0x00000000004068a0 in find_inferior (list=0x66b060 <all_threads>, func=0x427637 <kill_one_lwp_callback>, arg=0x7fffffffd3fc) at src/gdb/gdbserver/inferiors.c:199 > #1 0x00000000004277b6 in linux_kill (pid=15708) at src/gdb/gdbserver/linux-low.c:966 > #2 0x000000000041354d in kill_inferior (pid=15708) at src/gdb/gdbserver/target.c:163 > #3 0x00000000004107e9 in kill_inferior_callback (entry=0x6704f0) at src/gdb/gdbserver/server.c:2934 > #4 0x0000000000406522 in for_each_inferior (list=0x66b050 <all_processes>, action=0x4107a6 <kill_inferior_callback>) at src/gdb/gdbserver/inferiors.c:57 > #5 0x0000000000412377 in process_serial_event () at src/gdb/gdbserver/server.c:3767 > #6 0x000000000041267c in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:3880 > #7 0x00000000004189ff in handle_file_event (event_file_desc=4) at src/gdb/gdbserver/event-loop.c:434 > #8 0x00000000004181c6 in process_event () at src/gdb/gdbserver/event-loop.c:189 > #9 0x0000000000418f45 in start_event_loop () at src/gdb/gdbserver/event-loop.c:552 > #10 0x0000000000411272 in main (argc=3, argv=0x7fffffffd8d8) at src/gdb/gdbserver/server.c:3283 > > The problem is that linux_wait_for_event deletes lwps that have exited > (even those not passed in as lwps of interest), while the lwp/thread > list is being walked on with find_inferior. find_inferior can handle > the current iterated inferior being deleted, but not others. > > When killing lwps, we don't really care about any of the pending > status handling of linux_wait_for_event. We can just waitpid the lwps > directly, which is also what GDB does (see > linux-nat.c:kill_wait_callback). This way the lwps are not deleted > while we're walking the list. 
They'll be deleted by linux_mourn > afterwards. > > This crash triggers several times when running the testsuite against > GDBserver with the native-gdbserver board (target remote), but as GDB > can't distinguish between GDBserver crashing and "kill" being > sucessful, as in both cases the connection is closed (the 'k' packet > doesn't require a reply), and the inferior is gone, that results in no > FAIL. > > The patch adds a generic test that catches the issue with > extended-remote mode (and works fine with native testing too). Here's > how it fails with the native-extended-gdbserver board without the fix: > > (gdb) info threads > Id Target Id Frame > 6 Thread 15367.15374 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81 > 5 Thread 15367.15373 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81 > 4 Thread 15367.15372 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81 > 3 Thread 15367.15371 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81 > 2 Thread 15367.15370 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81 > * 1 Thread 15367.15367 main () at .../gdb.threads/kill.c:52 > (gdb) kill > Kill the program being debugged? (y or n) y > Remote connection closed > ^^^^^^^^^^^^^^^^^^^^^^^^ > (gdb) FAIL: gdb.threads/kill.exp: kill > > Extended remote should remain connected after the kill. > > gdb/gdbserver/ > 2014-07-10 Pedro Alves <palves@redhat.com> > > * linux-low.c (kill_wait_lwp): New function, based on > kill_one_lwp_callback, but use my_waitpid directly. > (kill_one_lwp_callback, linux_kill): Use it. > > gdb/testsuite/ > 2014-07-10 Pedro Alves <palves@redhat.com> > > * gdb.threads/kill.c: New file. > * gdb.threads/kill.exp: New file. 
> ---
>  gdb/gdbserver/linux-low.c          | 68 ++++++++++++++++++++-------------
>  gdb/testsuite/gdb.threads/kill.c   | 64 +++++++++++++++++++++++++++++++
>  gdb/testsuite/gdb.threads/kill.exp | 77 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 183 insertions(+), 26 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.threads/kill.c
>  create mode 100644 gdb/testsuite/gdb.threads/kill.exp
>
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 61552f4..bdc9aaa 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -909,6 +909,46 @@ linux_kill_one_lwp (struct lwp_info *lwp)
>                    errno ? strerror (errno) : "OK");
>  }
>
> +/* Kill LWP and wait for it to die. */
> +
> +static void
> +kill_wait_lwp (struct lwp_info *lwp)
> +{
> +  struct thread_info *thr = get_lwp_thread (lwp);
> +  int pid = ptid_get_pid (ptid_of (thr));
> +  int lwpid = ptid_get_lwp (ptid_of (thr));
> +  int wstat;
> +  int res;
> +
> +  if (debug_threads)
> +    debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
> +
> +  do
> +    {
> +      linux_kill_one_lwp (lwp);
> +
> +      /* Make sure it died. Notes:
> +
> +         - The loop is most likely unnecessary.
> +
> +         - We don't use linux_wait_for_event as that could delete lwps
> +           while we're iterating over them. We're not interested in
> +           any pending status at this point, only in making sure all
> +           wait status on the kernel side are collected until the
> +           process is reaped.
> +
> +         - We don't use __WALL here as the __WALL emulation relies on
> +           SIGCHLD, and killing a stopped process doesn't generate
> +           one, nor an exit status.
> +      */
> +      res = my_waitpid (lwpid, &wstat, 0);
> +      if (res == -1 && errno == ECHILD)
> +        res = my_waitpid (lwpid, &wstat, __WCLONE);
> +    } while (res > 0 && WIFSTOPPED (wstat));
> +
> +  gdb_assert (res > 0);
> +}
> +
>  /* Callback for `find_inferior'. Kills an lwp of a given process,
>     except the leader. */
>
> @@ -917,7 +957,6 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
>  {
>    struct thread_info *thread = (struct thread_info *) entry;
>    struct lwp_info *lwp = get_thread_lwp (thread);
> -  int wstat;
>    int pid = * (int *) args;
>
>    if (ptid_get_pid (entry->id) != pid)
> @@ -936,14 +975,7 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
>        return 0;
>      }
>
> -  do
> -    {
> -      linux_kill_one_lwp (lwp);
> -
> -      /* Make sure it died. The loop is most likely unnecessary. */
> -      pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
> -    } while (pid > 0 && WIFSTOPPED (wstat));
> -
> +  kill_wait_lwp (lwp);
>    return 0;
>  }
>
> @@ -952,8 +984,6 @@ linux_kill (int pid)
>  {
>    struct process_info *process;
>    struct lwp_info *lwp;
> -  int wstat;
> -  int lwpid;
>
>    process = find_process_pid (pid);
>    if (process == NULL)
> @@ -976,21 +1006,7 @@ linux_kill (int pid)
>                      pid);
>      }
>    else
> -    {
> -      struct thread_info *thr = get_lwp_thread (lwp);
> -
> -      if (debug_threads)
> -        debug_printf ("lk_1: killing lwp %ld, for pid: %d\n",
> -                      lwpid_of (thr), pid);
> -
> -      do
> -        {
> -          linux_kill_one_lwp (lwp);
> -
> -          /* Make sure it died. The loop is most likely unnecessary. */
> -          lwpid = linux_wait_for_event (thr->entry.id, &wstat, __WALL);
> -        } while (lwpid > 0 && WIFSTOPPED (wstat));
> -    }
> +    kill_wait_lwp (lwp);
>
>    the_target->mourn (process);
>
> diff --git a/gdb/testsuite/gdb.threads/kill.c b/gdb/testsuite/gdb.threads/kill.c
> new file mode 100644
> index 0000000..aefbb06
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/kill.c
> @@ -0,0 +1,64 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2014 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +#ifdef USE_THREADS
> +
> +#include <unistd.h>
> +#include <pthread.h>
> +
> +#define NUM 5
> +
> +pthread_barrier_t barrier;
> +
> +void *
> +thread_function (void *arg)
> +{
> +  volatile unsigned int counter = 1;
> +
> +  pthread_barrier_wait (&barrier);
> +
> +  while (counter > 0)
> +    {
> +      counter++;
> +      usleep (1);
> +    }
> +
> +  pthread_exit (NULL);
> +}
> +
> +#endif /* USE_THREADS */
> +
> +void
> +setup (void)
> +{
> +#ifdef USE_THREADS
> +  pthread_t threads[NUM];
> +  int i;
> +
> +  pthread_barrier_init (&barrier, NULL, NUM + 1);
> +  for (i = 0; i < NUM; i++)
> +    pthread_create (&threads[i], NULL, thread_function, NULL);
> +  pthread_barrier_wait (&barrier);
> +#endif /* USE_THREADS */
> +}
> +
> +int
> +main (void)
> +{
> +  setup ();
> +  return 0; /* set break here */
> +}
> diff --git a/gdb/testsuite/gdb.threads/kill.exp b/gdb/testsuite/gdb.threads/kill.exp
> new file mode 100644
> index 0000000..a3f921f
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/kill.exp
> @@ -0,0 +1,77 @@
> +# This testcase is part of GDB, the GNU debugger.
> +
> +# Copyright 2014 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +standard_testfile
> +
> +# Run the test proper. THREADED indicates whether to build a threaded
> +# program and spawn several threads before trying to kill the program.
> +
> +proc test {threaded} {
> +    global testfile srcfile
> +
> +    with_test_prefix [expr ($threaded)?"threaded":"non-threaded"] {
> +
> +        set options {debug}
> +        if {$threaded} {
> +            lappend options "pthreads"
> +            lappend options "additional_flags=-DUSE_THREADS"
> +            set prog ${testfile}_threads
> +        } else {
> +            set prog ${testfile}_nothreads
> +        }
> +
> +        if {[prepare_for_testing "failed to prepare" $prog $srcfile $options] == -1} {
> +            return -1
> +        }
> +
> +        if { ![runto main] } then {
> +            fail "run to main"
> +            return
> +        }
> +
> +        set linenum [gdb_get_line_number "set break here"]
> +        gdb_breakpoint "$srcfile:$linenum"
> +        gdb_continue_to_breakpoint "break here" ".*break here.*"
> +
> +        if {$threaded} {
> +            gdb_test "info threads" "6.*5.*4.*3.*2.*1.*" "all threads started"
> +        }
> +
> +        # This kills and ensures no output other than the prompt comes out,
> +        # like:
> +        #
> +        # (gdb) kill
> +        # Kill the program being debugged? (y or n) y
> +        # (gdb)
> +        #
> +        # If we instead saw more output, like e.g., with an extended-remote
> +        # connection:
> +        #
> +        # (gdb) kill
> +        # Kill the program being debugged? (y or n) y
> +        # Remote connection closed
> +        # (gdb)
> +        #
> +        # the above would mean that the remote end crashed.
> +
> +        gdb_test "kill" "^y" "kill program" "Kill the program being debugged\\? \\(y or n\\) $" "y"
> +    }
> +}
> +
> +foreach threaded {true false} {
> +    test $threaded
> +}
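The wait strategy used by kill_wait_lwp in the patch above (plain waitpid first, then a retry with __WCLONE on ECHILD) can be tried outside GDBserver. The following is only a minimal sketch under stated assumptions: kill_and_reap, spawn_fork_child, spawn_clone_child and sleep_forever are hypothetical names invented for this illustration, plain waitpid stands in for GDBserver's my_waitpid, and no ptrace is involved, so the stop-status branch of the loop is never taken here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of every demo child: sleep until a signal kills it.  */
static int
sleep_forever (void *unused)
{
  (void) unused;
  for (;;)
    pause ();
  return 0;
}

/* Hypothetical stand-in for kill_wait_lwp: SIGKILL the child, then
   collect wait statuses until something other than a stop is reported.
   Returns the last waitpid result and stores the status in *WSTAT.  */
static pid_t
kill_and_reap (pid_t lwpid, int *wstat)
{
  pid_t res;

  /* kill (0, ...) or kill (-1, ...) would signal far more than one child.  */
  assert (lwpid > 0);

  do
    {
      kill (lwpid, SIGKILL);

      /* Without __WCLONE, waitpid only considers children whose exit
         signal is SIGCHLD; retry with __WCLONE for "clone" children.  */
      res = waitpid (lwpid, wstat, 0);
      if (res == -1 && errno == ECHILD)
        res = waitpid (lwpid, wstat, __WCLONE);
    }
  while (res > 0 && WIFSTOPPED (*wstat));

  return res;
}

/* Start a child with fork, whose exit signal is SIGCHLD.  */
static pid_t
spawn_fork_child (void)
{
  pid_t pid = fork ();

  if (pid == 0)
    {
      sleep_forever (NULL);
      _exit (0);
    }
  return pid;
}

/* Start a child with clone and no exit signal, so that only __WCLONE
   (or __WALL) can wait for it.  The stack is deliberately leaked.  */
static pid_t
spawn_clone_child (void)
{
  const size_t stack_size = 64 * 1024;
  char *stack = malloc (stack_size);

  assert (stack != NULL);
  /* The stack grows down on the usual Linux targets, so pass its top.  */
  return clone (sleep_forever, stack + stack_size, 0, NULL);
}
```

Calling kill_and_reap on the result of spawn_fork_child succeeds on the first waitpid, while the child from spawn_clone_child is only collected by the __WCLONE retry.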
* [PUSHED+7.8] Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-11 8:21 ` Hui Zhu
@ 2014-07-11 10:53 ` Pedro Alves
0 siblings, 0 replies; 10+ messages in thread
From: Pedro Alves @ 2014-07-11 10:53 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml

On 07/11/2014 08:33 AM, Hui Zhu wrote:
> Hi Pedro,
>
> Thanks for your patch. It fixed the issue in my part.

Thank you. I went ahead and pushed it to both master and gdb-7.8-branch.

--
Pedro Alves
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
2014-07-11 8:21 ` Hui Zhu
@ 2015-07-13 16:07 ` Yao Qi
2015-07-13 17:32 ` Pedro Alves
1 sibling, 1 reply; 10+ messages in thread
From: Yao Qi @ 2015-07-13 16:07 UTC (permalink / raw)
To: Pedro Alves, gdb-patches ml

On 10/07/14 16:16, Pedro Alves wrote:
> +static void
> +kill_wait_lwp (struct lwp_info *lwp)
> +{
> +  struct thread_info *thr = get_lwp_thread (lwp);
> +  int pid = ptid_get_pid (ptid_of (thr));
> +  int lwpid = ptid_get_lwp (ptid_of (thr));
> +  int wstat;
> +  int res;
> +
> +  if (debug_threads)
> +    debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
> +
> +  do
> +    {
> +      linux_kill_one_lwp (lwp);
> +
> +      /* Make sure it died. Notes:
> +
> +         - The loop is most likely unnecessary.
> +
> +         - We don't use linux_wait_for_event as that could delete lwps
> +           while we're iterating over them. We're not interested in
> +           any pending status at this point, only in making sure all
> +           wait status on the kernel side are collected until the
> +           process is reaped.
> +
> +         - We don't use __WALL here as the __WALL emulation relies on
> +           SIGCHLD, and killing a stopped process doesn't generate
> +           one, nor an exit status.
> +      */
> +      res = my_waitpid (lwpid, &wstat, 0);
> +      if (res == -1 && errno == ECHILD)
> +        res = my_waitpid (lwpid, &wstat, __WCLONE);
> +    } while (res > 0 && WIFSTOPPED (wstat));
> +
> +  gdb_assert (res > 0);
> +}

Hi Pedro,
do you still remember why you added this assert? It wasn't
mentioned in the mail
https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html

I am looking at a GDBserver internal error on x86_64 when I run
gdb.threads/thread-unwindonsignal.exp with GDBserver,

continue^M
Continuing.^M
warning: Remote failure reply: E.No unwaited-for children left.^M
PC register is not available^M
(gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
Remote debugging from host 127.0.0.1^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
monitor exit^M
Killing process(es): 30694^M
(gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A problem internal to GDBserver has been detected.^M
kill_wait_lwp: Assertion `res > 0' failed.

After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
GDBserver starts to swallow errors if the LWP is gone. Then, when
GDBserver kills a non-existent LWP, the assert will be triggered.

Why don't we implement kill_wait_lwp like its counterpart in GDB
linux-nat.c:kill_wait_callback? we can loop and assert like this
patch below, (note that this patch fixes the internal error, and
the FAIL is still there).

--
Yao (齐尧)

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 7bb9f7f..07d051a 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1101,9 +1101,9 @@ kill_wait_lwp (struct lwp_info *lwp)
       res = my_waitpid (lwpid, &wstat, 0);
       if (res == -1 && errno == ECHILD)
         res = my_waitpid (lwpid, &wstat, __WCLONE);
-    } while (res > 0 && WIFSTOPPED (wstat));
+    } while (res == lwpid);

-  gdb_assert (res > 0);
+  gdb_assert (res == -1 && errno == ECHILD);
 }

 /* Callback for `find_inferior'. Kills an lwp of a given process,
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-13 16:07 ` Yao Qi
@ 2015-07-13 17:32 ` Pedro Alves
2015-07-14 8:00 ` Yao Qi
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2015-07-13 17:32 UTC (permalink / raw)
To: Yao Qi, gdb-patches ml

On 07/13/2015 05:07 PM, Yao Qi wrote:

> Hi Pedro,
> do you still remember why you added this assert? It wasn't
> mentioned in the mail
> https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html
>

Simply because getting here was supposed to indicate something
went wrong elsewhere, but at the time I didn't consider that the
child could die while ptrace-stopped.

> I am looking at a GDBserver internal error on x86_64 when I run
> gdb.threads/thread-unwindonsignal.exp with GDBserver,
>
> continue^M
> Continuing.^M
> warning: Remote failure reply: E.No unwaited-for children left.^M
> PC register is not available^M
> (gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
> Remote debugging from host 127.0.0.1^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> monitor exit^M
> Killing process(es): 30694^M
> (gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A
> problem internal to GDBserver has been detected.^M
> kill_wait_lwp: Assertion `res > 0' failed.
>
> After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
> GDBserver starts to swallow errors if the LWP is gone. Then, when
> GDBserver kills a non-existent LWP, the assert will be triggered.
>

Looks like I forgot to push the rest of that series:

 https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html

What do you think of that one?

> Why don't we implement kill_wait_lwp like its counterpart in GDB
> linux-nat.c:kill_wait_callback? we can loop and assert like this
> patch below, (note that this patch fixes the internal error, and
> the FAIL is still there).
>

Seems to me it's not 100% correct to waitpid the pid one more time
after we've already reaped it, because there's a minuscule chance
another process that we're debugging could clone a new lwp that reuses
the PID of the one we've just killed/reaped, and then another iteration
could collect the initial SIGSTOP of the wrong LWP and we'd kill it:

 -> kill (pid1, SIGKILL);
 <- waitpid (pid1) returns pid1/WSIGNALLED
 -> on iteration1: new pid1 clone lwp is spawned
 -> ret==pid1, continue iterating
 -> kill (pid1, SIGKILL); // killing wrong process
 <- waitpid (pid1) returns either SIGSTOP or WSIGNALLED
 ...

Thanks,
Pedro Alves
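The ordering concern above can be illustrated with a small standalone sketch. It is not the fix from the series linked in this message, whose contents are not shown in this thread; it is a hypothetical variant, with invented names (reap_until_dead, kill_and_reap_once, enum reap_result), that sends SIGKILL once, stops waiting at the first non-stop status, and reports ECHILD as "already gone" instead of asserting. Plain waitpid is used instead of GDBserver's my_waitpid, and the __WCLONE retry is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum reap_result { REAP_OK, REAP_ALREADY_GONE, REAP_ERROR };

/* Collect wait statuses for PID until a terminal one (exit or death by
   signal) arrives, and stop there.  Once waitpid has returned a
   terminal status the kernel may hand PID to a new task, so the caller
   must not signal or wait on it again.  */
static enum reap_result
reap_until_dead (pid_t pid)
{
  int wstat = 0;
  pid_t res;

  do
    res = waitpid (pid, &wstat, 0);
  while ((res == -1 && errno == EINTR)
         || (res == pid && WIFSTOPPED (wstat)));

  if (res == pid)
    return REAP_OK;
  if (errno == ECHILD)
    return REAP_ALREADY_GONE;   /* Nothing left to wait for.  */
  return REAP_ERROR;
}

/* Send SIGKILL exactly once, then reap.  */
static enum reap_result
kill_and_reap_once (pid_t pid)
{
  /* kill (0, ...) or kill (-1, ...) would signal far more than one child.  */
  assert (pid > 0);

  kill (pid, SIGKILL);
  return reap_until_dead (pid);
}
```

In this sketch a second reap_until_dead on the same PID returns REAP_ALREADY_GONE, which is the -1/ECHILD state the proposed assertion checks for. Getting there by looping back through kill is what opens the PID-reuse window described above.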
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-13 17:32 ` Pedro Alves
@ 2015-07-14 8:00 ` Yao Qi
2015-07-14 9:13 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Yao Qi @ 2015-07-14 8:00 UTC (permalink / raw)
To: Pedro Alves; +Cc: Yao Qi, gdb-patches ml

Pedro Alves <palves@redhat.com> writes:

> Looks like I forgot to push the rest of that series:
>
> https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html
>
> What do you think of that one?

Yes, it looks good to me. We also need it on 7.10 branch.

>
>> Why don't we implement kill_wait_lwp like its counterpart in GDB
>> linux-nat.c:kill_wait_callback? we can loop and assert like this
>> patch below, (note that this patch fixes the internal error, and
>> the FAIL is still there).
>>
>
> Seems to me it's not 100% correct to waitpid the pid one more time
> after we've already reaped it, because there's a minuscule chance
> another process that we're debugging could clone a new lwp that reuses
> the PID of the one we've just killed/reaped, and then another iteration
> could collect the initial SIGSTOP of the wrong LWP and we'd kill it:
>
> -> kill (pid1, SIGKILL);
> <- waitpid (pid1) returns pid1/WSIGNALLED
> -> on iteration1: new pid1 clone lwp is spawned
> -> ret==pid1, continue iterating
> -> kill (pid1, SIGKILL); // killing wrong process
> <- waitpid (pid1) returns either SIGSTOP or WSIGNALLED
> ...

Yes, that is possible.

--
Yao (齐尧)
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-14 8:00 ` Yao Qi
@ 2015-07-14 9:13 ` Pedro Alves
0 siblings, 0 replies; 10+ messages in thread
From: Pedro Alves @ 2015-07-14 9:13 UTC (permalink / raw)
To: Yao Qi; +Cc: gdb-patches ml

On 07/14/2015 09:00 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> Looks like I forgot to push the rest of that series:
>>
>> https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html
>>
>> What do you think of that one?
>
> Yes, it looks good to me. We also need it on 7.10 branch.

OK, I'll push it.

Thanks,
Pedro Alves
end of thread, other threads:[~2015-07-14 9:13 UTC | newest]

Thread overview: 10+ messages
2014-06-23 10:10 [PATCH] Fix crash of gdbserver when kill threads Hui Zhu
2014-06-29 3:28 ` Hui Zhu
2014-07-02 9:07 ` Pedro Alves
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
2014-07-11 8:21 ` Hui Zhu
2014-07-11 10:53 ` [PUSHED+7.8] " Pedro Alves
2015-07-13 16:07 ` Yao Qi
2015-07-13 17:32 ` Pedro Alves
2015-07-14 8:00 ` Yao Qi
2015-07-14 9:13 ` Pedro Alves