From: Thiago Jung Bauermann
To: Pedro Alves
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
In-Reply-To: (Pedro Alves's message of "Fri, 22 Mar 2024 16:52:09 +0000")
References: <20240321231149.519549-1-thiago.bauermann@linaro.org> <20240321231149.519549-4-thiago.bauermann@linaro.org>
Date: Tue, 16 Apr 2024 01:48:53 -0300
Message-ID: <87msptgbey.fsf@linaro.org>

Hello Pedro,

[ I just now noticed that I replied to you in private last time. Sorry about
  that, I thought I had cc'd the list.
]

Pedro Alves writes:

> On 2024-03-21 23:11, Thiago Jung Bauermann wrote:
>> When GDB attaches to a multi-threaded process, it calls
>> linux_proc_attach_tgid_threads () to go through all threads found in
>> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
>> them. If it does that twice without the callback reporting that a new
>> thread was found, then it considers that all inferior threads have been
>> found and returns.
>>
>> The problem is that the callback considers any thread that it hasn't
>> attached to yet as new. This causes problems if the process has one or
>> more zombie threads, because GDB can't attach to it and the loop will
>> always "find" a new thread (the zombie one), and get stuck in an
>> infinite loop.
>>
>> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
>> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
>> its test program constantly creates and finishes joinable threads so the
>> chance of having zombie threads is high.
>
> Hmm. How about simply not restarting the loop if attach_lwp tries to attach to
> a zombie lwp (and silently fails)?
>
> I.e., in gdb, make attach_proc_task_lwp_callback return false/0 here:
>
>       if (ptrace (PTRACE_ATTACH, lwpid, 0, 0) < 0)
>         {
>           int err = errno;
>
>           /* Be quiet if we simply raced with the thread exiting.
>              EPERM is returned if the thread's task still exists, and
>              is marked as exited or zombie, as well as other
>              conditions, so in that case, confirm the status in
>              /proc/PID/status. */
>           if (err == ESRCH
>               || (err == EPERM && linux_proc_pid_is_gone (lwpid)))
>             {
>               linux_nat_debug_printf
>                 ("Cannot attach to lwp %d: thread is gone (%d: %s)",
>                  lwpid, err, safe_strerror (err));
>
>               return 0;       <<<< NEW RETURN
>             }
>
> Similar thing for gdbserver, of course.

Thank you for the suggestion. I tried doing that, and in fact I attached
a patch with that change in comment #17 of PR 31312 when I was
investigating a fix¹.
I called it a workaround because I also had to increase the number of
iterations in linux_proc_attach_tgid_threads from 2 to 100; otherwise GDB
gives up on waiting for new inferior threads too early and the inferior
dies with a SIGTRAP because some new, unnoticed thread hit the
breakpoint.

Because of the need to increase the number of iterations, I didn't
consider it a good solution and went with the approach in this patch
series instead. Now I finally understand why I had to increase the
number of iterations (though I still don't see a way around it other
than what this patch series does):

With the approach in this patch series, even if a new thread becomes a
zombie by the time GDB tries to attach to it, it still causes
new_threads_found to be set the first time GDB notices it, and the loop
in linux_proc_attach_tgid_threads starts over.

With the approach above, a new thread that becomes a zombie before GDB
has a chance to attach to it never causes the loop to start over, and so
the loop exits earlier.

I think it's a matter of opinion whether one approach or the other can
be considered the better one. Even with this patch series, it's not
guaranteed that two iterations without finding new threads are enough to
ensure that GDB has found all threads in the inferior. I left the test
running in a loop overnight with the patch series applied and it failed
after about 2500 runs.

--
Thiago

¹ https://sourceware.org/bugzilla/attachment.cgi?id=15405&action=diff
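
To make the difference in termination behaviour concrete, here is a toy
model of the scan loop. It is not GDB code: sim_thread, sim_attach_callback,
sim_scan_passes and the three policy names are hypothetical, and the
"first sighting" policy is only a simplified reading of the patch series as
described above (a zombie counts as new the first time it is listed), not
the actual patch. The loop shape (two consecutive passes without a new
thread, restart otherwise) follows the description quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for what the callback reports as "new".  */
enum policy
{
  POLICY_ANY_UNATTACHED,  /* any thread not attached yet counts as new */
  POLICY_IGNORE_ZOMBIES,  /* failed attach to a zombie returns 0 */
  POLICY_FIRST_SIGHTING   /* assumed model: zombie is new only once */
};

/* One simulated entry in the task list.  */
struct sim_thread
{
  int first_pass;  /* first scan (1-based) in which the task is listed */
  bool zombie;     /* attaching to it always fails */
  bool attached;   /* set once the simulated attach succeeds */
  bool seen;       /* set once any scan has visited it */
};

/* Stand-in for the attach callback: returns true if the thread should
   make the scan loop start over.  */
static bool
sim_attach_callback (enum policy policy, struct sim_thread *t)
{
  bool first_sighting = !t->seen;

  t->seen = true;
  if (t->attached)
    return false;

  if (!t->zombie)
    {
      t->attached = true;  /* live thread: attach succeeds */
      return true;
    }

  /* Zombie: the attach fails, the policy decides what to report.  */
  switch (policy)
    {
    case POLICY_ANY_UNATTACHED:
      return true;
    case POLICY_IGNORE_ZOMBIES:
      return false;
    case POLICY_FIRST_SIGHTING:
      return first_sighting;
    }
  return false;
}

/* Scan until two consecutive passes find nothing new.  Returns the
   number of passes made, or -1 if MAX_PASSES was reached first (which
   stands in for the infinite loop).  */
static int
sim_scan_passes (enum policy policy, struct sim_thread *threads,
                 size_t n, int max_passes)
{
  int passes = 0;

  assert (max_passes > 0);
  for (int iterations = 0; iterations < 2; iterations++)
    {
      bool new_threads_found = false;

      if (passes == max_passes)
        return -1;
      passes++;

      for (size_t i = 0; i < n; i++)
        if (threads[i].first_pass <= passes
            && sim_attach_callback (policy, &threads[i]))
          new_threads_found = true;

      if (new_threads_found)
        iterations = -1;  /* start over */
    }
  return passes;
}
```

With one live thread listed from the first scan and one zombie that first
shows up on the second scan, this model makes 3 passes under
POLICY_IGNORE_ZOMBIES and 4 under POLICY_FIRST_SIGHTING, and hits the cap
(returns -1) under POLICY_ANY_UNATTACHED. The extra pass is the only
difference the model captures; it says nothing about how many passes are
enough in practice.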