From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 22742 invoked by alias); 15 Mar 2014 19:44:10 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 22732 invoked by uid 89); 15 Mar 2014 19:44:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mail-qc0-f201.google.com Received: from mail-qc0-f201.google.com (HELO mail-qc0-f201.google.com) (209.85.216.201) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Sat, 15 Mar 2014 19:44:07 +0000 Received: by mail-qc0-f201.google.com with SMTP id x13so486081qcv.0 for ; Sat, 15 Mar 2014 12:44:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:mime-version:content-type :content-transfer-encoding:message-id:date:to:cc:subject:in-reply-to :references; bh=EJ4p4ExFe0SJU8hkYAFFUA3+3gTfwU9hEr4Z68dHPpg=; b=IeYVR9vruv0Qj74iz+S+XHDwZwOt0CrMq/4Lqt7oVB628ntSfqumD0uTrkc9fKf7qv gG3k+iSl7szaGajhhlSG7xIP1w3UVNJOIV/NFzi6UXZXmunQdiIDhQE0uNbL00P1iiHW UqbD0RLSmr3AFDdTyrRHiLTWO4q+fy4c31bCaTnWkdv4mF71RHTi8P+ZypzG5hCfeNDp /8iXHS+Inzao/CEPZca379PTQIJYvcKbaLlHS2JpE/uj7wewMVamSi55kQ/gLv+JIeWa S/V/xg+Hp8tO7iunhXE1kLGLrQoaB00HBJBUfsAChI/052VB9UybCE3CCK5SuL8oNZfS n9SA== X-Gm-Message-State: ALoCoQl0KAXwfI+quCkQ6JasiIZS4Kdn9lmfu8YluHb3I4OvNAvNBCvzATm6486SJqgburWYrZY0ghNQH6kDpcjKX0BruoIGiqdjcvhjJB+g++1tx6kmvpNXLnoqabjmGNwD9FbXkLCiiQsK1ZlwYGJa8TKr1x7RZI1L+wKLmjKg4jPocVQ2erOBjxeoFSkzRpevUZUlSbRXE7yKFOLutO54iB/Mzw4+mA== X-Received: by 10.58.134.20 with SMTP id pg20mr6206846veb.0.1394912644954; Sat, 15 Mar 2014 12:44:04 -0700 (PDT) Received: from corp2gmr1-2.hot.corp.google.com (corp2gmr1-2.hot.corp.google.com [172.24.189.93]) by gmr-mx.google.com with ESMTPS id k45si1518626yhn.4.2014.03.15.12.44.04 for (version=TLSv1.1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 15 Mar 2014 12:44:04 -0700 (PDT) Received: from ruffy.mtv.corp.google.com (ruffy.mtv.corp.google.com [172.17.128.44]) by corp2gmr1-2.hot.corp.google.com (Postfix) with ESMTP id 65A025A4166; Sat, 15 Mar 2014 12:44:04 -0700 (PDT) From: Doug Evans MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21284.44419.745786.47756@ruffy.mtv.corp.google.com> Date: Sat, 15 Mar 2014 19:44:00 -0000 To: Jan Kratochvil Cc: gdb-patches@sourceware.org Subject: Re: [patchv2] Fix SIGTERM signal safety (PR gdb/15358) [refresh] In-Reply-To: <20140314184535.GA30853@host2.jankratochvil.net> References: <20140314183709.GA28050@host2.jankratochvil.net> <20140314184535.GA30853@host2.jankratochvil.net> X-IsSubscribed: yes X-SW-Source: 2014-03/txt/msg00342.txt.bz2 Jan Kratochvil writes: > [...] Hi. I have some comments and requests for this patch. A high level comment is that I guess this patch means it's now more important to fix the spots of gdb where QUITs aren't called for long times (I was working on a common one, debug info reading, but had to defer that to work on something else). > The sync mode is a bit difficult - assuming it is safe to call quit_force from > any place of QUIT;. OTOH the patch above assumes it can do: > if (check_quit_flag ()) > send_interrupt_sequence (); > which clears quit_flag but we used set_quit_flag () to call quit_force, not > just to throw the quit exception. This is why QUIT now checks also for > SYNC_QUIT_FORCE_RUN. > > The change in linux-nat.c comes from testing i386-linux-nat.c (therefore > 32-bit host GDB). i386_linux_resume there calls QUIT; via target_read (). > This is a bug on its own, filed as: > http://sourceware.org/bugzilla/show_bug.cgi?id=15713 > > But I have seen another bug in linux-nat.c, it was depending on PTRACE_KILL > but at least Linux kernel ptrace expert Oleg Nesterov considers PTRACE_KILL > superseded by kill(SIGKILL). Therefore I used there (also) more safe SIGKILL > so that the possibly inconsistent state of inferior from i386_linux_resume > does not matter. > > > No regressions on {x86_64,x86_64-m32,i686}-fedora19pre-linux-gnu and in > gdbserver mode. > > > Thanks, > Jan > > > gdb/ > 2013-07-02 Jan Kratochvil > > PR gdb/15358 > * defs.h (sync_quit_force_run): New declaration. > (QUIT): Check also SYNC_QUIT_FORCE_RUN. > * event-top.c (async_sigterm_handler): New declaration. > (async_sigterm_token): New variable. > (async_init_signals): Create also async_sigterm_token. > (async_sigterm_handler): New function. > (sync_quit_force_run): New variable. > (handle_sigterm): Replace quit_force call by other calls. > * linux-nat.c (linux_nat_kill): Use kill_callback first. > Extend the comment for stop_callback. > * utils.c (quit): Call quit_force if SYNC_QUIT_FORCE_RUN. > > gdb/testsuite/ > 2013-07-02 Jan Kratochvil > > PR gdb/15358 > * gdb.base/gdb-sigterm.c: New file. > * gdb.base/gdb-sigterm.exp: New file. > > diff --git a/gdb/defs.h b/gdb/defs.h > index 480133d..47da43a 100644 > --- a/gdb/defs.h > +++ b/gdb/defs.h > @@ -171,6 +171,9 @@ extern int check_quit_flag (void); > /* * Set the quit flag. */ > extern void set_quit_flag (void); > > +/* Flag that function quit should call quit_force. */ > +extern volatile int sync_quit_force_run; > + > extern int immediate_quit; > > extern void quit (void); > @@ -183,7 +186,7 @@ extern void quit (void); > needed. */ > > #define QUIT { \ > - if (check_quit_flag ()) quit (); \ > + if (check_quit_flag () || sync_quit_force_run) quit (); \ > if (deprecated_interactive_hook) deprecated_interactive_hook (); \ > } > > diff --git a/gdb/event-top.c b/gdb/event-top.c > index fbe89bd..af1562c 100644 > --- a/gdb/event-top.c > +++ b/gdb/event-top.c > @@ -72,6 +72,7 @@ static void async_float_handler (gdb_client_data); > #ifdef STOP_SIGNAL > static void async_stop_sig (gdb_client_data); > #endif > +static void async_sigterm_handler (gdb_client_data arg); > > /* Readline offers an alternate interface, via callback > functions. These are all included in the file callback.c in the > @@ -135,6 +136,7 @@ static struct async_signal_handler *sigfpe_token; > #ifdef STOP_SIGNAL > static struct async_signal_handler *sigtstp_token; > #endif > +static struct async_signal_handler *async_sigterm_token; > > /* Structure to save a partially entered command. This is used when > the user types '\' at the end of a command line. This is necessary > @@ -770,6 +772,8 @@ async_init_signals (void) > create_async_signal_handler (async_stop_sig, NULL); > #endif > > + async_sigterm_token = > + create_async_signal_handler (async_sigterm_handler, NULL); > } I realize SIGTERM isn't handled exactly like SIGINT, but IWBN to keep all the SIGTERM bits together in this function. How about moving the above snippet to after this: signal (SIGINT, handle_sigint); sigint_token = create_async_signal_handler (async_request_quit, NULL); signal (SIGTERM, handle_sigterm); btw, the rule has been effectively changed to require = being put on the next line. While I certainly wouldn't ask for whitespace change in this patch, [IOW I'm not asking for the existing code to be fixed] I'm curious what the rule is for handling this here. For the patch at hand, should it be written as this: async_sigterm_token = create_async_signal_handler (async_sigterm_handler, NULL); or this: async_sigterm_token = create_async_signal_handler (async_sigterm_handler, NULL); ? I honestly don't know. It's not important of course, just wondering. > > /* Tell the event loop what to do if SIGINT is received. > @@ -797,13 +801,31 @@ handle_sigint (int sig) > gdb_call_async_signal_handler (sigint_token, immediate_quit); > } > > +/* Handle GDB exit upon receiving SIGTERM if target_can_async_p (). */ > + > +static void > +async_sigterm_handler (gdb_client_data arg) > +{ > + quit_force (NULL, stdin == instream); > +} > + > +/* See defs.h. */ > +volatile int sync_quit_force_run; > + > /* Quit GDB if SIGTERM is received. > GDB would quit anyway, but this way it will clean up properly. */ > void > handle_sigterm (int sig) > { > signal (sig, handle_sigterm); > - quit_force ((char *) 0, stdin == instream); I think we need a comment here explaining *why* we can't just call quit_force here. > + > + if (target_can_async_p ()) > + mark_async_signal_handler (async_sigterm_token); > + else > + { > + sync_quit_force_run = 1; > + set_quit_flag (); > + } > } > > /* Do the quit. All the checks have been done by the caller. */ > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c > index b615423..ec84188 100644 > --- a/gdb/linux-nat.c > +++ b/gdb/linux-nat.c > @@ -3777,8 +3777,15 @@ linux_nat_kill (struct target_ops *ops) > { > ptid_t ptid = pid_to_ptid (ptid_get_pid (inferior_ptid)); > > + /* Kill all LWP's before trying to stop them. In rare cases the > + lwp_info state may not match the inferior and > + stop_wait_callback could lock up. */ > + iterate_over_lwps (ptid, kill_callback, NULL); > + > /* Stop all threads before killing them, since ptrace requires > - that the thread is stopped to sucessfully PTRACE_KILL. */ > + that the thread is stopped to sucessfully PTRACE_KILL. > + kill_callback normally already turned the inferior into a zombie > + except for old Linux kernels 2.4.x. */ > iterate_over_lwps (ptid, stop_callback, NULL); > /* ... and wait until all of them have reported back that > they're no longer running. */ This part feels sufficiently outside the scope of this patch that IWBN if this were a separate patch. > diff --git a/gdb/utils.c b/gdb/utils.c > index 9d068c5..364470c 100644 > --- a/gdb/utils.c > +++ b/gdb/utils.c > @@ -999,6 +999,12 @@ print_sys_errmsg (const char *string, int errcode) > void > quit (void) > { > + if (sync_quit_force_run) > + { > + sync_quit_force_run = 0; > + quit_force (NULL, stdin == instream); > + } > + Bleah. Not your problem, and obviously I'm not suggesting fixing the names in this patch, but "quit" is overloaded. quit is really quit_current_command (or some such), and quit_force is really quit_and_exit (or some such). I raise the issue because the patch is conflating them, and I'm wondering if that's a good idea. > [...] > > diff --git a/gdb/testsuite/gdb.base/gdb-sigterm.exp b/gdb/testsuite/gdb.base/gdb-sigterm.exp > new file mode 100644 > index 0000000..8baeb96 > --- /dev/null > +++ b/gdb/testsuite/gdb.base/gdb-sigterm.exp > @@ -0,0 +1,94 @@ > +# This testcase is part of GDB, the GNU debugger. > +# > +# Copyright 2013 Free Software Foundation, Inc. > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > + > +standard_testfile > + > +if { [build_executable ${testfile}.exp ${testfile}] == -1 } { > + return -1 > +} > + > +proc do_test { pass } { > + global testfile gdb_prompt binfile pf_prefix > + > + if ![runto_main] { > + return -1 > + } > + > + gdb_breakpoint "${testfile}.c:[gdb_get_line_number "loop-line" ${testfile}.c]" \ > + temporary > + > + # gdb_continue_to_breakpoint would print a pass message. > + gdb_test "continue" "Temporary breakpoint .* loop-line .*" "" > + > + gdb_test_no_output "set range-stepping off" "" > + gdb_test_no_output "set debug infrun 1" "" > + gdb_test_no_output "set debug lin-lwp 1" "" > + > + set test "run a bit #$pass" > + set abort 1 > + gdb_test_multiple "step" $test { > + -re "infrun: stepping inside range" { > + # Suppress pass $test > + verbose -log "$pf_prefix $test" > + set abort 0 > + } > + } > + if $abort { > + return > + } > + > + set gdb_pid [exp_pid -i [board_info host fileid]] > + remote_exec host "kill -TERM ${gdb_pid}" > + > + set test "expect eof #$pass" > + set abort 1 > + set stepping 0 > + gdb_test_multiple "" $test { > + eof { > + verbose -log "$pf_prefix $test" > + set abort 0 > + } > + -re "infrun: stepping inside range" { > + incr stepping > + if { $stepping > 200 } { > + fail "$test (stepping inside range $stepping times)" > + } else { > + exp_continue > + } > + } > + } > + if $abort { > + return > + } > +} > + > + > +for {set pass 0} {$pass < 50} {incr pass} { How come 50 passes? At the least a comment is required explaining why. > + > + clean_restart ${testfile} > + gdb_test_no_output "set target-async off" "" > + with_test_prefix "sync" { > + do_test $pass > + } > + > + clean_restart ${testfile} > + gdb_test_no_output "set target-async on" "" > + with_test_prefix "async" { > + do_test $pass > + } > +} > +pass "$pass SIGTERM passes" -- /dje