Message-ID: <29215aa0-8387-4dee-8b8d-3cbf64e6abe3@palves.net>
Date: Wed, 17 Apr 2024 17:28:12 +0100
Subject: Re: [RFC PATCH 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
To: Thiago Jung Bauermann, gdb-patches@sourceware.org
References: <20240321231149.519549-1-thiago.bauermann@linaro.org> <20240321231149.519549-4-thiago.bauermann@linaro.org>
From: Pedro Alves
In-Reply-To: <20240321231149.519549-4-thiago.bauermann@linaro.org>
List-Id: Gdb-patches mailing list

On 2024-03-21 23:11, Thiago Jung Bauermann wrote:
> When GDB attaches to a multi-threaded process, it calls
> linux_proc_attach_tgid_threads () to go through all threads found in
> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
> them.  If it does that twice without the callback reporting that a new
> thread was found, then it considers that all inferior threads have been
> found and returns.
>
> The problem is that the callback considers any thread that it hasn't
> attached to yet as new.
> This causes problems if the process has one or
> more zombie threads, because GDB can't attach to it and the loop will
> always "find" a new thread (the zombie one), and get stuck in an
> infinite loop.
>
> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
> its test program constantly creates and finishes joinable threads so the
> chance of having zombie threads is high.
>
> This problem causes the following failures:
>
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted on (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the inferior (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted off (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints, watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
> ERROR: breakpoints not deleted
>
> The iteration number is random, and all tests in the subsequent iterations
> fail too, because GDB is stuck in the attach command at the beginning of
> the iteration.
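
To make the failure mode described above concrete, here is a small standalone model of the scan loop. It is only a sketch and not GDB code: `task`, `attach_lwp_sketch` and `scan_passes` are hypothetical names, and the model assumes, as the description says, that a thread which could not be attached keeps being reported as new.

```cpp
#include <cassert>
#include <set>
#include <vector>

/* Hypothetical stand-in for one entry of /proc/PID/task/.  */
struct task
{
  int lwp;
  bool zombie;
};

/* Hypothetical stand-in for the attach callback.  Returns true when
   the LWP looks "new", i.e. it is not in ATTACHED yet.  A zombie can
   never be attached, so it is never recorded and looks new forever.  */
static bool
attach_lwp_sketch (const task &t, std::set<int> &attached)
{
  if (attached.count (t.lwp) != 0)
    return false;
  if (!t.zombie)
    attached.insert (t.lwp);
  return true;
}

/* Scan TASKS repeatedly until two consecutive passes find nothing
   new.  Returns the number of passes needed, or -1 if MAX_PASSES is
   reached first (the model's version of "stuck").  */
static int
scan_passes (const std::vector<task> &tasks, int max_passes)
{
  assert (max_passes > 0);

  std::set<int> attached;
  int quiet_passes = 0;

  for (int pass = 1; pass <= max_passes; pass++)
    {
      bool new_threads_found = false;

      for (const task &t : tasks)
        if (attach_lwp_sketch (t, attached))
          new_threads_found = true;

      quiet_passes = new_threads_found ? 0 : quiet_passes + 1;
      if (quiet_passes == 2)
        return pass;
    }

  return -1;
}
```

With only live threads the model stops after three passes (one that attaches, two quiet ones). With a single zombie in the list it never sees a quiet pass and runs into the pass limit, which corresponds to the hang in the attach command.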
>
> The solution is to make linux_proc_attach_tgid_threads () remember when it
> has already processed a given LWP and skip it in the subsequent iterations.
>
> PR testsuite/31312
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312
> ---
>  gdb/nat/linux-osdata.c | 22 ++++++++++++++++++++++
>  gdb/nat/linux-osdata.h |  4 ++++
>  gdb/nat/linux-procfs.c | 19 +++++++++++++++++++
>  3 files changed, 45 insertions(+)
>
> diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
> index c254f2e4f05b..998279377433 100644
> --- a/gdb/nat/linux-osdata.c
> +++ b/gdb/nat/linux-osdata.c
> @@ -112,6 +112,28 @@ linux_common_core_of_thread (ptid_t ptid)
>    return core;
>  }
>
> +/* See linux-osdata.h.  */
> +
> +std::optional
> +linux_get_starttime (ptid_t ptid)

Ditto re. moving this to linux-procfs.  This has nothing to do with "info osdata".

> index a82fb08b998e..1cdc687aa9cf 100644
> --- a/gdb/nat/linux-osdata.h
> +++ b/gdb/nat/linux-osdata.h
> @@ -27,4 +27,8 @@ extern int linux_common_core_of_thread (ptid_t ptid);
>  extern LONGEST linux_common_xfer_osdata (const char *annex, gdb_byte *readbuf,
>                                           ULONGEST offset, ULONGEST len);
>
> +/* Get the start time of thread PTID.  */
> +
> +extern std::optional linux_get_starttime (ptid_t ptid);
> +
>  #endif /* NAT_LINUX_OSDATA_H */
> diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
> index b17e3120792e..b01bf36c0b53 100644
> --- a/gdb/nat/linux-procfs.c
> +++ b/gdb/nat/linux-procfs.c
> @@ -17,10 +17,13 @@
>     along with this program.  If not, see .  */
>
>  #include "gdbsupport/common-defs.h"
> +#include "linux-osdata.h"
>  #include "linux-procfs.h"

linux-procfs.h is the main header of this source file, so it should be first.

But then again, I think this shouldn't be consuming things from linux-osdata, it
should be the other way around.

>  #include "gdbsupport/filestuff.h"
>  #include
>  #include
> +#include
> +#include
>
>  /* Return the TGID of LWPID from /proc/pid/status.  Returns -1 if not
>     found.  */
> @@ -290,6 +293,10 @@ linux_proc_attach_tgid_threads (pid_t pid,
>        return;
>      }
>
> +  /* Keeps track of the LWPs we have already visited in /proc,
> +     identified by their PID and starttime to detect PID reuse.  */
> +  std::set> visited_lwps;

Missing space before ULONGEST.

AFAICT, you don't rely on order, so this could be an unordered_set?

> +
>    /* Scan the task list for existing threads.  While we go through the
>       threads, new threads may be spawned.  Cycle through the list of
>       threads until we have done two iterations without finding new
> @@ -308,6 +315,18 @@ linux_proc_attach_tgid_threads (pid_t pid,
>            if (lwp != 0)
>              {
>                ptid_t ptid = ptid_t (pid, lwp);
> +              std::optional starttime = linux_get_starttime (ptid);
> +
> +              if (starttime.has_value ())
> +                {
> +                  std::pair key (lwp, *starttime);

Space before ULONGEST.

> +
> +                  /* If we already visited this LWP, skip it this time.  */
> +                  if (visited_lwps.find (key) != visited_lwps.cend ())
> +                    continue;
> +
> +                  visited_lwps.insert (key);
> +                }
>
>                if (attach_lwp (ptid))
>                  new_threads_found = 1;
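
For reference, a minimal sketch of what the unordered_set variant might look like. One wrinkle is that the standard library provides no std::hash specialization for std::pair, so a hasher has to be supplied. Everything here is an assumption made for illustration: the names `lwp_key`, `lwp_key_hash` and `visited_lwp_set` are hypothetical, and the key types (`unsigned long` and `std::uint64_t`) merely stand in for whatever types the patch ends up using.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>

/* Hypothetical key: (LWP id, start time).  The types are placeholders,
   not necessarily the ones used in the patch.  */
using lwp_key = std::pair<unsigned long, std::uint64_t>;

/* Hasher for lwp_key, needed because std::hash has no specialization
   for std::pair.  Combines the hashes of both members.  */
struct lwp_key_hash
{
  std::size_t
  operator() (const lwp_key &key) const noexcept
  {
    std::size_t h1 = std::hash<unsigned long> () (key.first);
    std::size_t h2 = std::hash<std::uint64_t> () (key.second);

    /* Common hash-combine formula; any reasonable mix would do.  */
    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
  }
};

/* Remembers which (LWP, start time) pairs were already processed.  */
class visited_lwp_set
{
public:
  /* Record KEY.  Returns true if KEY had not been seen before, false
     if it was already present (i.e. the caller should skip it).  */
  bool
  mark_visited (const lwp_key &key)
  {
    return m_seen.insert (key).second;
  }

  std::size_t
  size () const
  {
    return m_seen.size ();
  }

private:
  std::unordered_set<lwp_key, lwp_key_hash> m_seen;
};
```

Using the bool returned by insert () folds the find-then-insert sequence into a single lookup. An entry with the same LWP id but a different start time is treated as a distinct thread, which is the PID-reuse case the comment in the patch is about.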