From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 29213 invoked by alias); 19 Feb 2020 13:40:35 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 29203 invoked by uid 89); 19 Feb 2020 13:40:35 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-13.0 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_LOTSOFHASH,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=inclined, non_stop X-HELO: mail-qk1-f194.google.com Received: from mail-qk1-f194.google.com (HELO mail-qk1-f194.google.com) (209.85.222.194) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 19 Feb 2020 13:40:33 +0000 Received: by mail-qk1-f194.google.com with SMTP id z19so105112qkj.5 for ; Wed, 19 Feb 2020 05:40:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=sBni3ow9VCD6FKJ8yqEqYmWCK4VCzB6A2ykYlCztxEA=; b=hr+CdEUCM7edP4xXu7oIzTpOtlbHQct1pG4kylR/+ciscweVoKc5C1dQxIwOnAPsOJ aVniVzj7Kfn4Wj0Z/8/S4onTquiHUtUYJUoeYB+Ipi4Fmfi3c1EnrbE2oRKHJeHdwISC bZj5FyZdptGA4MPBlvghxgQp7q5H1gS/vdifE/qDm3COJkG6ynXhGwNckJIO4Bs0Hk0g ynunYNq4g4IbG1JM9W6aglaim/+arl80KYLF1vHANxD2mq2n8HZ5FPDUaI3UdIK2Z875 Afa0BaZ9ivHsOkaesnFTpCJEkRZj1oHV+vZxkX9km8BoxDzwKnKDnjXmWLCtty15aYNp nZWg== Return-Path: Received: from [192.168.0.185] ([191.34.219.95]) by smtp.gmail.com with ESMTPSA id 13sm975876qkk.136.2020.02.19.05.40.28 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 19 Feb 2020 05:40:30 -0800 (PST) Subject: Re: [PATCH v2 4/4] Unregister the inferior from the event loop if failed to resume To: "Maciej W. Rozycki" , Pedro Alves , gdb-patches@sourceware.org Cc: Jim Wilson References: From: Luis Machado Message-ID: <9b88fef7-8c95-c670-a1ad-d51a8b73e289@linaro.org> Date: Wed, 19 Feb 2020 13:40:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2020-02/txt/msg00772.txt.bz2 On 11/6/19 5:52 PM, Maciej W. Rozycki wrote: > Fix an issue with GDB starting to effectively busy-loop (though still > accepting and interpreting commands) using the full processing power > available when a remote stub has gone closing the connection while GDB > is waiting for user input once the inferior has failed to resume. > > This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux > `gdbserver' being developed and an attempt to step over the `ret' aka > `c.jr ra' instruction where the value held in the `ra' aka `x1' register > and therefore the address of a software-stepping breakpoint to insert is > 0. Here's a record of a remote debug session leading to the issue: > > Sending packet: $vCont?#49...Packet received: vCont;c;C;t > Packet vCont (verbose-resume) is supported > Sending packet: $vCont;c:p6857.-1#b8...infrun: infrun_async(1) > infrun: prepare_to_wait > infrun: target_wait (-1.0.0, status) = > infrun: -1.0.0 [process -1], > infrun: status->kind = ignore > infrun: handle_inferior_event status->kind = ignore > infrun: prepare_to_wait > Packet received: T05swbreak:;02:f0f8ffff3f000000;20:b807565515000000;thread:p6857.6857;core:1; > infrun: target_wait (-1.0.0, status) = > infrun: 26711.26711.0 [Thread 26711.26711], > infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP > infrun: handle_inferior_event status->kind = stopped, signal = GDB_SIGNAL_TRAP > infrun: stop_pc = 0x15555607b8 > Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l > infrun: BPSTAT_WHAT_SINGLE > infrun: thread [Thread 26711.26711] still needs step-over > infrun: skipping breakpoint: stepping past insn at: 0x15555607b8 > Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK > infrun: skipping breakpoint: stepping past insn at: 0x15555607b8 > infrun: skipping breakpoint: stepping past insn at: 0x15555607b8 > infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 26711.26711] at 0x15555607b8 > Sending packet: $m15555607b8,2#07...Packet received: 8280 > Sending packet: $m15555607b8,2#07...Packet received: 8280 > Sending packet: $g#67...Packet received: 00000000000000000000000000000000f0f8ffff3f000000d0c1b9aa2a00000020f76d55150000000f00000000000000ec6e555515000000ffffff6f00000000f0faffff3f00000090f0565515000000000000000000000052e57464000000000500000000000000887801000000000028f1565515000000010000000000000008f8ffff3f000000722f6c696236342f60f156551500000000000000000000000000000000000000000000000000000008f75655150000000100000000000000f5feffefffffffff80de56551500000050f9ffff3f00000068f9ffff3f000000ae585655150000000b00000000000000fffdff6f00000000fcffffffffffffff[552 bytes omitted] > Sending packet: $m0,40#2d...Packet received: E01 > Sending packet: $m0,1#fa...Packet received: E01 > Sending packet: $m0,40#2d...Packet received: E01 > Sending packet: $m0,1#fa...Packet received: E01 > infrun: skipping breakpoint: stepping past insn at: 0x15555607b8 > infrun: clear_step_over_info > Cannot access memory at address 0x0 > (gdb) > > At this point a command was issued to kill `gdbserver' and GDB responded > immediately by entering the event loop: > > infrun: target_wait (-1.0.0, status) = > infrun: -1.0.0 [process -1], > infrun: status->kind = no-resumed > infrun: handle_inferior_event status->kind = no-resumed > infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg) > infrun: prepare_to_wait > infrun: target_wait (-1.0.0, status) = > infrun: -1.0.0 [process -1], > infrun: status->kind = no-resumed > infrun: handle_inferior_event status->kind = no-resumed > infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg) > infrun: prepare_to_wait > infrun: target_wait (-1.0.0, status) = > infrun: -1.0.0 [process -1], > infrun: status->kind = no-resumed > infrun: handle_inferior_event status->kind = no-resumed > infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg) > infrun: prepare_to_wait > [...] > > This is because `remote_target::resume' enables the asynchronous event > loop, as indicated by the first `infrun: infrun_async(1)' record above, > and then the EOF condition on the remote connection is delivered as an > event, but there are no resumed children to be waited for and no other > event waiting that would stop the loop. > > Correct that then by disabling the asynchronous event as execution > completion would, where applicable, i.e. in the all-stop mode. This > makes the final sequence of the session look like: > > Sending packet: $m0,40#2d...Packet received: E01 > Sending packet: $m0,1#fa...Packet received: E01 > Sending packet: $m0,40#2d...Packet received: E01 > Sending packet: $m0,1#fa...Packet received: E01 > infrun: infrun_async(0) > infrun: skipping breakpoint: stepping past insn at: 0x15555607b8 > infrun: clear_step_over_info > Cannot access memory at address 0x0 > (gdb) > > and GDB does not trigger the loop at this point if `gdbserver' goes away > as the asynchronous event loop has already been disabled. > > gdb/ > * infrun.c (resume): In the `catch' clause also unregister the > inferior from the event loop. > --- > Hi, > > It might be that the non-stop mode requires additional clean-up here, > however I have no way to work with that at the moment, so I'll be leaving > any investigation to someone who has and will be inclined to look into it. > > Maciej > > New change in v2. > --- > gdb/infrun.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > gdb-resume-async.diff > Index: binutils-gdb/gdb/infrun.c > =================================================================== > --- binutils-gdb.orig/gdb/infrun.c > +++ binutils-gdb/gdb/infrun.c > @@ -2614,11 +2614,17 @@ resume (gdb_signal sig) > } > catch (const gdb_exception &ex) > { > - /* If resuming is being aborted for any reason, delete any > - single-step breakpoint resume_1 may have created, to avoid > - confusing the following resumption, and to avoid leaving > - single-step breakpoints perturbing other threads, in case > - we're running in non-stop mode. */ > + /* If resuming is being aborted for any reason, and we are in > + the all-stop mode, then unregister the inferior from the event > + loop. This is done so that when the inferior is not running > + we don't get distracted by spurious inferior output. */ > + if (!non_stop && target_has_execution && target_can_async_p ()) > + target_async (0); > + > + /* Also delete any single-step breakpoint resume_1 may have > + created, to avoid confusing the following resumption, and to > + avoid leaving single-step breakpoints perturbing other threads, > + in case we're running in non-stop mode. */ > if (inferior_ptid != null_ptid) > { > thread_info *tp = inferior_thread (); > This looks reasonable, though the call to target_async (0) is always a bit cryptic to me. Makes me wonder if we're missing such invocations in other places to make GDB more recoverable on failures. Does this fix also address 02/04? Since 02/04 is also a failure during resume? Not sure if it goes through the catch block.