From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id 6xPYORmYx2CbSwAAWB0awg (envelope-from ) for ; Mon, 14 Jun 2021 13:55:37 -0400 Received: by simark.ca (Postfix, from userid 112) id DF5DF1F163; Mon, 14 Jun 2021 13:55:37 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 86B351E813 for ; Mon, 14 Jun 2021 13:55:35 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0597D3959C1C for ; Mon, 14 Jun 2021 17:55:35 +0000 (GMT) Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) by sourceware.org (Postfix) with ESMTPS id 1E43E384841E for ; Mon, 14 Jun 2021 17:55:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1E43E384841E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f47.google.com with SMTP id i94so15427848wri.4 for ; Mon, 14 Jun 2021 10:55:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:from:to:cc:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=uvOUhNCMOgUewIXHD0PfOgfq3v1ynLFHY+mdcYV6R4Q=; b=csZ90XrJnnw84ZQia9QZPJleDANQODCLlHkURJ85VHJnRUDfAyhMY1gDp7+dC3lfiC rhj5ACp2krrnTz80JwF6sgzAviRIDX6sP+ZFsxHCBVdDHVC9fQC4mRtJsVhNQXSCOs2I VRq8nv0aIohrn8BrVVYfEWY2MnwlDdctrI8Q/G1Znkvo7VXPDr7UVejn323jbKAV6Y7L 7YjcEy0gY0ue5x5ooiwRmyF5ZG1WQuzM4Lpm+ujdJg0XAsrjOkwo8Pm/OQrPTnj457/a Sf3GRQfqzlLPSuNyI75A9GdVmHcB7Sp+EfUiIpFKW0vqtopWggFkaf+V+yjr126MCSbt W/5Q== X-Gm-Message-State: AOAM530/WldW9gx7IYsGQ7bImBNB2TkGRQGcvA5LMqJ+jeDsX3IS2LYF cdoWF1g6aJ8Le7nOOLZmsEW4yn/pKE/+Aw== X-Google-Smtp-Source: ABdhPJxQDTC0IGCIoYTMQRrcPQaBl3lqc0i8WVTao+iSPgvnRyW0TacYYpZggr5+/fL+cPgWyiMQtw== X-Received: by 2002:a5d:564a:: with SMTP id j10mr20156213wrw.171.1623693320269; Mon, 14 Jun 2021 10:55:20 -0700 (PDT) Received: from ?IPv6:2001:8a0:f932:6a00:6b6e:c7b6:c5a7:aac3? ([2001:8a0:f932:6a00:6b6e:c7b6:c5a7:aac3]) by smtp.gmail.com with ESMTPSA id t1sm16304827wrx.28.2021.06.14.10.55.18 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 14 Jun 2021 10:55:19 -0700 (PDT) Subject: Re: [PATCH 00/17] Interrupting programs that block/ignore SIGINT From: Pedro Alves To: Eli Zaretskii References: <20210603190243.2609886-1-pedro@palves.net> <835yyuwwuu.fsf@gnu.org> Message-ID: <2af9e861-69f2-80c5-606b-2936b6d2d212@palves.net> Date: Mon, 14 Jun 2021 18:55:17 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <835yyuwwuu.fsf@gnu.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: gdb-patches@sourceware.org Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" On 2021-06-03 8:51 p.m., Eli Zaretskii wrote: >> From: Pedro Alves >> Date: Thu, 3 Jun 2021 20:02:26 +0100 >> >> Currently, on GNU/Linux, it is not possible to interrupt with Ctrl-C >> programs that block or ignore SIGINT, with e.g., sigprocmask or >> signal(SIGINT, SIG_IGN). You type Ctrl-C, but nothing happens. > > I'm not sure I understand why this is a problem. If a debuggee blocks > or ignores SIGINT, then SIGINT cannot be used to stop it, neither when > it runs outside the debugger nor when it runs under a debugger. There > are enough other methods to stop such a debuggee (e.g., send it a > different signal), but basically such a program clearly tells that it > doesn't want to be interrupted by SIGINT, period. It sounds like you're taking the view that Ctrl-C should be seen by the inferior first, and behave exactly as if you were not debugging the program? My view is that, as a user, I see Ctrl-C as the standard way across configurations to interrupt programs so I can inspect them. When I type Ctrl-C, I'm telling GDB that I want to stop the program. I'm not really thinking that I'm typing Ctrl-C in the program's terminal. I.e., I'm not specifically interested in the SIGINT signal itself. If the program doesn't stop, then I see that as a GDB bug. It's not unlike how when you're remote debugging, or debugging a program you've attached to, you can type Ctrl-C in GDB's terminal, and that stops the program. It happens to stop it with SIGINT, but in those cases we're not really typing the Ctrl-C in the program's terminal either. Making the inferior have its own terminal makes native debugging work similarly to remote debugging or attaching, in the sense that GDB gets the signal first, and then is free to decide what to do with it, instead of being at the mercy of how the inferior configured its signal masks. > > Btw, what about programs that install a SIGINT handler that does > something when SIGINT is delivered, but don't block or ignore SIGINT? With current master, you type Ctrl-C, and a SIGINT reaches the inferior directly without stopping the program. After my patches, you type Ctrl-C, and GDB stops the program. If you want the program to get a SIGINT, you can then still do "signal SIGINT" to pass a SIGINT to the program. It's the same as if with current master you had "handle SIGINT stop print nopass" and then typed Ctrl-C but still wanted to pass the signal to the inferior after GDB intercepts it. Like so: ~~~~~~~~~~~~ $ cat sigint.c #include #include static void handler (int sig) { printf ("Got SIGINT\n"); } int main () { signal (SIGINT, handler); while (1) ; return 0; } ~~~~~~~~~~~~ With my patches: ... (gdb) r Starting program: sigint # press ctrl-c Program stopped. main () at sigint.c:15 15 while (1) (gdb) c Continuing. # press ctrl-c again Program stopped. main () at sigint.c:15 15 while (1) (gdb) signal SIGINT Continuing with signal SIGINT. Got SIGINT With current master GDB, or with my patches + "(gdb) tty /dev/tty": (gdb) handle SIGINT print stop nopass SIGINT is used by the debugger. Are you sure you want to change it? (y or n) y Signal Stop Print Pass to program Description SIGINT Yes Yes No Interrupt (gdb) r Starting program: sigint ^C Program received signal SIGINT, Interrupt. main () at sigint.c:15 15 while (1) (gdb) c Continuing. ^C Program received signal SIGINT, Interrupt. main () at sigint.c:15 15 while (1) (gdb) signal SIGINT Continuing with signal SIGINT. Got SIGINT I guess I could mention this in the new interrupting programs section in the manual. > >> Similarly, if a program uses sigwait to wait for SIGINT, and the >> program receives a SIGINT, the SIGINT is _not_ intercepted by ptrace, >> it goes straight to the inferior. > > This sounds like a more serious issue, but how many programs use this > technique? Obviously I can't know that number. I only know the issue has been reported several times over the years. PRs gdb/9425, gdb/14559 are the oldest bugs I've seen about it, but I've seen people ask about this a number of times over the years. Note that the interrupting programs that block/ignore SIGINT part was really the use case among several that I picked as being good standalone justification for the series. The being able to take control of inferior input/output when we switch to the TUI would be another one. That the TUI gets messed up when the inferior prints something is also a recurring wart people complain about. Also, SIGINT handling in multi-target scenarios is quite awkward. Say you're debugging a multi-target program, and under control of one GDB, you have: - one native process - one remote process Currently, whether GDB gets the SIGINT first, or whether the kernel sends a SIGINT to the native process, is basically random -- it depends on which process GDB resumed first -- if the native process, then GDB puts that process in the foreground and SIGINT will reach it first. If the remote target, then GDB gets the SIGINT first, and then GDB tries to forward it to one of the processes, via target_pass_ctrlc. Now let's imagine one step forward, and consider a different kind of multi-target debugging -- debugging a native GNU/Linux process which uses some some accelerator co-processor threads, GPU threads, etc. Imagine GDB is able to debug all that, where "info threads" shows both CPU and device threads. You have all threads of the CPU and device stopped. You now resume one of the accelerator device threads, leaving the CPU threads stopped. E.g., say the device thread is 100, and you do (gdb) set scheduler-locking on (gdb) thread 100 (gdb) c As a response to the resumption, GDB puts the process's terminal settings in effect and puts the inferior in the foreground of its terminal. And now you type Ctrl-C to stop the program. But, instead nothing happens, Ctrl-C just does nothing. That's because while the process is in the foreground, no CPU thread has been resumed, so there's no thread that can actually receive the SIGINT. Unix signals don't exist on the accelerator device. In this situation, there would be no way to get back the prompt, unless the device thread hits a breakpoint or some other kind of event on its own. This wasn't the original use case I wrote this series for, but I can actually trigger this scenario with AMD ROCm GDB (debug CPU + GPU threads all in one). If GDB always gets the signal first, then it can stop all processes in a more controlled fashion. I'm not sure I mentioned this earlier, but we could even add a knob to make Ctrl-C _not_ stop any process, but just drop to the prompt with the inferior still running. That can be interesting in scenarios that do not want to disturb the inferior if possible, as for example "set observer on". Curiously, I remember that when the non-stop mode was being designed back in 2007/2008, ctrl-c just giving you back the prompt was on the table, but it wasn't possible to implement it back then so it was dropped. So I see a good number of problems with the current way we handle SIGINT, and I think that putting the inferior in its own terminal so that GDB gets the SIGINT first finally lets us solve them. > > I guess what I'm asking is whether these issues really justify the > complications of setting up a separate terminal. I think they do. The biggest wart seems to be the SIGHUP after detach, but today I figured out a way to avoid it. See below. > The fact that the > inferior could be stopped by a different signal, the fact that it > could be hit by SIGHUP, the fact that users will need to decide > whether or not they use the separate-terminal method (which will be > the default, a backward-incompatible change) -- are all these > complications really worth the (IMO) minor issues they solve? Note that I think that the decision of whether to use a separate terminal only really matters for the "run + detach" use case. Our testcases mostly do that (run+detach) because they want to exercise something about detach, and using "run" is the easiest way to launch the program. I.e., it's just a convenience. I don't think real users want to do this very often. If they do need to do it, the patch made GDB inform you know how to fix it. But, given your poke, I went back a bit to the drawing board, and found out how to avoid the SIGHUP in the first place. So thank you! I was thinking -- if users really need to detach after run, then their programs will already need to handle SIGHUP, because they will get the SIGHUP today already it they close the terminal where GDB was running after detaching their program -- but that's actually incorrect. Today, after you run+detach, and then close gdb's terminal, the detached process loses its terminal, but remains alive. How could it be? The reason is that in current master, by the time the process is detached, we have target_terminal::ours() in effect, which means that the inferior's pgrp is not the foreground process group. And it's only processes in the foreground process group that get the SIGHUP. Background processes do not get it. So that got me wondering, how to have the same with my changes. And it turns out not to be very hard. We still need to make the inferior be the foreground process group during normal debugging just as today. But, we can make the session leader process make itself the foreground pgrp after the detach, but before GDB closes the terminal. This actually works nicely. The inferior output has nowhere to go after the detach, but then that's what you would get if you closed GDB's terminal after the detach. If you weren't going to do that, then you wouldn't need to detach in the first place. So I think that's a minor detail, much less concerning than the SIGHUP. I think that with this, I can drop the new query from GDB, the one suggesting "tty /dev/tty" in the problem scenario. And in turn, that lets us not issue "tty /dev/tty" in all testcases touched by patch #8 except one, which still needs it for another reason. See below. I'll folding this into the right spot in the series, and adjust the manual changes accordingly too. I'll also look at documenting the "signal SIGINT" scenario discussed above. WDTY? >From ac842dc45bbe373e0c9a6c0cb839d8883da64fd0 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Mon, 14 Jun 2021 15:49:06 +0100 Subject: [PATCH] no SIGHUP Change-Id: I0a0170fab01f6566bed4d2a8a0d07e083d4fb768 --- gdb/infcmd.c | 23 ----------------------- gdb/inflow.c | 16 +++++++++++++--- gdb/nat/fork-inferior.c | 31 +++++++++++++++++++++++++++++-- 3 files changed, 42 insertions(+), 28 deletions(-) diff --git a/gdb/infcmd.c b/gdb/infcmd.c index 56fd48f97cb..5f777b5138b 100644 --- a/gdb/infcmd.c +++ b/gdb/infcmd.c @@ -2716,27 +2716,6 @@ notice_new_inferior (thread_info *thr, bool leave_running, int from_tty) attach_post_wait (from_tty, mode); } -/* Check if the current inferior is associated with a session/terminal - created by GDB. If so, query the user if he/she really wants to - detach. */ - -static void -query_detach_if_gdb_owns_session () -{ - if (child_gdb_owns_session (current_inferior ())) - { - if (!query (_("\ -warning: The process was started by GDB and associated with its\n\ -own session/tty. If you detach, GDB closes the process's terminal\n\ -and the likely consequence is that the process is killed by SIGHUP,\n\ -unless it explicitly handles or ignores that signal.\n\ -One way to avoid this is by starting the program in the same session\n\ -as GDB using the \"tty /dev/tty\" command.\n\ -Detach anyway? "))) - error (_("Not confirmed.")); - } -} - /* * detach_command -- * takes a program previously attached to and detaches it. @@ -2758,8 +2737,6 @@ detach_command (const char *args, int from_tty) scoped_disable_commit_resumed disable_commit_resumed ("detaching"); - query_detach_if_gdb_owns_session (); - query_if_trace_running (from_tty); disconnect_tracing (); diff --git a/gdb/inflow.c b/gdb/inflow.c index 51700f971ad..38e29b26517 100644 --- a/gdb/inflow.c +++ b/gdb/inflow.c @@ -995,7 +995,14 @@ inflow_inferior_exit (struct inferior *inf) /* Flush any pending output and close the pty. */ delete_file_handler (run_terminal->pty_fd); child_terminal_flush_stdout (run_terminal); - close (run_terminal->pty_fd); + + /* Explicitly send a SIGHUP instead of just closing + the terminal and letting the kernel send it, + because we want the session leader to have a + chance to put itself in the foreground, so that + its children, if any (e.g., we're detaching), + don't get a SIGHUP too. */ + kill (run_terminal->session_leader, SIGHUP); /* Since we closed the controlling terminal, the session leader should exit now, killed by SIGHUP. @@ -1015,13 +1022,16 @@ inflow_inferior_exit (struct inferior *inf) inf->num, (int) run_terminal->session_leader, errno, safe_strerror (errno)); else if (res != run_terminal->session_leader - || !WIFSIGNALED (status) - || WTERMSIG (status) != SIGHUP) + || !WIFEXITED (status) + || WEXITSTATUS (status) != 0) warning (_("unexpected waitstatus " "reaping session leader for inf %d (sid=%d): " "res=%d, status=0x%x"), inf->num, (int) run_terminal->session_leader, res, status); + + /* We can now close the terminal. */ + close (run_terminal->pty_fd); } #endif /* GDB_MANAGED_TERMINALS */ delete run_terminal; diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c index 31725673fd3..f140242ace5 100644 --- a/gdb/nat/fork-inferior.c +++ b/gdb/nat/fork-inferior.c @@ -85,6 +85,33 @@ class execv_argv std::string m_storage; }; +#if GDB_MANAGED_TERMINALS + +/* SIGHUP handler for the session leader processes. GDB sends this + explicitly, though it'll also be called if GDB crashes and the + terminal is abruptly closed. */ + +static void +session_leader_hup (int sig) +{ + scoped_ignore_sigttou ignore_sigttou; + + /* We put the inferior (a child of the session leader) in the + foreground, so by default, on detach, if we did nothing else, the + inferior would get a SIGHUP when the terminal is closed by GDB. + That SIGHUP would likely kill the inferior. To avoid it, we put + the session leader in the foreground before the terminal is + closed. Only processes in the foreground process group get the + automatic SIGHUP, so the detached process doesn't get it. */ + int res = tcsetpgrp (0, getpid ()); + if (res == -1) + trace_start_error (_("tcsetpgrp failed in session leader\n")); + + _exit (0); +} + +#endif + /* Create argument vector for straight call to execvp. Breaks up ALLARGS into an argument vector suitable for passing to execvp and stores it in M_ARGV. E.g., on "run a b c d" this routine would get @@ -433,8 +460,8 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, /* This is the parent / session leader process. It just stays around until GDB closes the terminal. */ - /* We want to exit when GDB closes our terminal. */ - signal (SIGHUP, SIG_DFL); + /* Gracefully handle SIGHUP. */ + signal (SIGHUP, session_leader_hup); managed_tty_debug_printf (_("session-leader (sid=%d): waiting for child pid=%d exit\n"), -- 2.26.2