Date: Sat, 12 Dec 2020 22:13:50 +0000
From: Andrew Burgess
To: Pedro Alves
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/7] Fix spurious unhandled remote %Stop notifications
Message-ID: <20201212221350.GE2945@embecosm.com>
References: <20200706190252.22552-1-pedro@palves.net> <20200706190252.22552-2-pedro@palves.net>
In-Reply-To: <20200706190252.22552-2-pedro@palves.net>

* Pedro Alves [2020-07-06 20:02:46 +0100]:

> In non-stop mode, remote targets mark an async event source whose
> callback is supposed to result in calling remote_target::wait_ns to
> either process the event queue, or acknowledge an incoming %Stop
> notification.
>
> The callback in question is remote_async_inferior_event_handler, where
> we call inferior_event_handler, to end up in fetch_inferior_event ->
> target_wait -> remote_target::wait -> remote_target::wait_ns.
>
> A problem here however is that when debugging multiple targets,
> fetch_inferior_event can pull events out of any target picked at
> random, for event fairness. This means that when
> remote_async_inferior_event_handler returns, remote_target::wait may
> have not been called at all, and thus pending notifications may have
> not been acked.
> Because async event sources auto-clear, when
> remote_async_inferior_event_handler returns the async event handler is
> no longer marked, so the event loop won't automatically call
> remote_async_inferior_event_handler again to try to process the
> pending remote notifications/queue. The result is that stop events
> may end up not processed, e.g., "interrupt -a" seemingly not managing
> to stop all threads.
>
> Fix this by making remote_async_inferior_event_handler mark the event
> handler again before returning, if necessary.
>
> Maybe a better fix would be to make async event handlers not
> auto-clear themselves, make that the responsibility of the callback,
> so that the event loop would keep calling the callback automatically.
> Or, we could try making so that fetch_inferior_event would optionally
> handle events only for the target that it got passed down via
> parameter. However, I don't think now just before branching is the
> time to try to do any such change.
>
> gdb/ChangeLog:
>
>     PR gdb/26199
>     * remote.c (remote_target::open_1): Pass remote target pointer as
>     data to create_async_event_handler.
>     (remote_async_inferior_event_handler): Mark async event handler
>     before returning if the remote target still has either pending
>     events or unacknowledged notifications.
> ---
>  gdb/remote.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f7f99dc24f..59075cb09f 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -5605,7 +5605,7 @@ remote_target::open_1 (const char *name, int from_tty, int extended_p)
>
>    /* Register extra event sources in the event loop.  */
>    rs->remote_async_inferior_event_token
> -    = create_async_event_handler (remote_async_inferior_event_handler, NULL);
> +    = create_async_event_handler (remote_async_inferior_event_handler, remote);
>    rs->notif_state = remote_notif_state_allocate (remote);
>
>    /* Reset the target state; these things will be queried either by
> @@ -14164,6 +14164,19 @@ static void
>  remote_async_inferior_event_handler (gdb_client_data data)
>  {
>    inferior_event_handler (INF_REG_EVENT);
> +
> +  remote_target *remote = (remote_target *) data;
> +  remote_state *rs = remote->get_remote_state ();
> +
> +  /* inferior_event_handler may have consumed an event pending on the
> +     infrun side without calling target_wait on the REMOTE target, or
> +     may have pulled an event out of a different target.  Keep trying
> +     for this remote target as long it still has either pending events
> +     or unacknowledged notifications.  */
> +
> +  if (rs->notif_state->pending_event[notif_client_stop.id] != NULL
> +      || !rs->stop_reply_queue.empty ())
> +    mark_async_event_handler (rs->remote_async_inferior_event_token);
>  }

Pedro,

This patch introduced a use-after-free issue here. This can be seen
by running the test:

  make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver gdb.base/inferior-died.exp"

For me this fails maybe 1 in 5 times.

I've done some initial investigation, and the problem is obvious once
you see the following stack trace:

  #0  remote_state::~remote_state (this=0x338d548, __in_chrg=) at ../../src.dev-3/gdb/remote.c:1097
  #1  0x0000000000acf3b3 in remote_target::~remote_target (this=0x338d530, __in_chrg=) at ../../src.dev-3/gdb/remote.c:4078
  #2  0x0000000000acf3f6 in remote_target::~remote_target (this=0x338d530, __in_chrg=) at ../../src.dev-3/gdb/remote.c:4097
  #3  0x0000000000acf2fa in remote_target::close (this=0x338d530) at ../../src.dev-3/gdb/remote.c:4075
  #4  0x0000000000c75bfd in target_close (targ=0x338d530) at ../../src.dev-3/gdb/target.c:3126
  #5  0x0000000000c62ca4 in decref_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:545
  #6  0x0000000000c62ec7 in target_stack::unpush (this=0x3666d50, t=0x338d530) at ../../src.dev-3/gdb/target.c:633
  #7  0x0000000000c7796c in inferior::unpush_target (this=0x3666ba0, t=0x338d530) at ../../src.dev-3/gdb/inferior.h:357
  #8  0x0000000000c62de1 in unpush_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:595
  #9  0x0000000000c62ee7 in unpush_target_and_assert (target=0x338d530) at ../../src.dev-3/gdb/target.c:643
  #10 0x0000000000c62fb6 in pop_all_targets_at_and_above (stratum=process_stratum) at ../../src.dev-3/gdb/target.c:666
  #11 0x0000000000ad21a4 in remote_unpush_target (target=0x338d530) at ../../src.dev-3/gdb/remote.c:5524
  #12 0x0000000000adc619 in remote_target::mourn_inferior (this=0x338d530) at ../../src.dev-3/gdb/remote.c:9962
  #13 0x0000000000c65e79 in target_mourn_inferior (ptid=...) at ../../src.dev-3/gdb/target.c:2136
  #14 0x000000000086d0a5 in handle_inferior_event (ecs=0x7fffffffb3b0) at ../../src.dev-3/gdb/infrun.c:5234
  #15 0x0000000000869beb in fetch_inferior_event () at ../../src.dev-3/gdb/infrun.c:3863
  #16 0x000000000084a922 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src.dev-3/gdb/inf-loop.c:42
  #17 0x0000000000ae73a9 in remote_async_inferior_event_handler (data=0x338d530) at ../../src.dev-3/gdb/remote.c:14177
  #18 0x00000000004ea759 in check_async_event_handlers () at ../../src.dev-3/gdb/async-event.c:328
  #19 0x0000000001449e7a in gdb_do_one_event () at ../../src.dev-3/gdbsupport/event-loop.cc:216
  #20 0x00000000009102d0 in start_event_loop () at ../../src.dev-3/gdb/main.c:347
  #21 0x00000000009103f0 in captured_command_loop () at ../../src.dev-3/gdb/main.c:407
  #22 0x0000000000911be6 in captured_main (data=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1239
  #23 0x0000000000911c4c in gdb_main (args=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1254
  #24 0x000000000041755d in main (argc=5, argv=0x7fffffffb748) at ../../src.dev-3/gdb/gdb.c:32

The inferior event being processed is the inferior-exited event. This
is the last remote inferior, and so the remote target is unpushed and,
as frames #0 to #5 show, closed and destroyed. GDB then returns to
remote_async_inferior_event_handler, where we hit the code you added
above, which proceeds to make use of the now-freed remote target :-/

Like I say, the problem is now obvious, but the solution less so!

Reading what you originally wrote in the patch, I wondered about your
idea of making the callback responsible for clearing the async event
handler.

I haven't tried to fix this yet, but thought I'd share my findings so
far with you.

Thanks,
Andrew
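
For illustration only, the hazard described above can be modelled outside GDB. The sketch below is not GDB code and is not a proposed fix: fake_target, target_stack_ref, process_one_event and async_handler are hypothetical stand-ins, and std::shared_ptr / std::weak_ptr are assumptions used to model the target's lifetime (GDB uses its own reference counting). It shows a callback that re-checks whether its target still exists after event processing, before touching it again.

```cpp
#include <cassert>
#include <memory>

/* Hypothetical stand-in for remote_target; not GDB's class.  */
struct fake_target
{
  int queued_events = 0;   /* Models stop_reply_queue.  */
  bool marked = false;     /* Models the handler's "marked" flag.  */
};

/* Models the reference held by the target stack (an assumption).  */
static std::shared_ptr<fake_target> target_stack_ref;

/* Models inferior_event_handler: consume one event, and if it was the
   last one (the "inferior exited" case), drop the target, which
   destroys it.  */
static void
process_one_event (fake_target *t)
{
  assert (t != nullptr);
  if (t->queued_events > 0)
    --t->queued_events;
  if (t->queued_events == 0)
    target_stack_ref.reset ();   /* T dangles from this point on.  */
}

/* Models the async callback.  Returns true if the target survived
   event processing.  */
static bool
async_handler (const std::weak_ptr<fake_target> &weak)
{
  fake_target *raw;
  {
    std::shared_ptr<fake_target> t = weak.lock ();
    if (t == nullptr)
      return false;
    raw = t.get ();
  } /* Temporary reference dropped so processing can destroy the target.  */

  process_one_event (raw);

  /* Re-check liveness instead of dereferencing RAW again.  */
  std::shared_ptr<fake_target> t = weak.lock ();
  if (t == nullptr)
    return false;
  if (t->queued_events > 0)
    t->marked = true;            /* Safe: the target is still alive.  */
  return true;
}
```

With a target holding two queued events, the first call to async_handler leaves the target alive and re-marked; the second call consumes the last event, the target is destroyed during processing, and the handler returns without touching it.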