From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 56449 invoked by alias); 17 Apr 2019 17:38:49 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 56220 invoked by uid 89); 17 Apr 2019 17:38:49 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-4.9 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=bullet, our X-HELO: rock.gnat.com Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 17 Apr 2019 17:38:46 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id ED361117E41; Wed, 17 Apr 2019 13:38:44 -0400 (EDT) Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id RLPQjQ0yvIwh; Wed, 17 Apr 2019 13:38:44 -0400 (EDT) Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id BAADF117BAD; Wed, 17 Apr 2019 13:38:44 -0400 (EDT) Received: by joel.gnat.com (Postfix, from userid 1000) id 0693A83C06; Wed, 17 Apr 2019 10:38:43 -0700 (PDT) Date: Wed, 17 Apr 2019 17:38:00 -0000 From: Joel Brobecker To: Eli Zaretskii Cc: gdb-patches@sourceware.org Subject: Re: [RFA 2/2][master only] gdb/windows-nat.c: Get rid of main_thread_id global Message-ID: <20190417173842.GA14817@adacore.com> References: <1555453982-77808-1-git-send-email-brobecker@adacore.com> <1555453982-77808-3-git-send-email-brobecker@adacore.com> <83imvcg0ud.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <83imvcg0ud.fsf@gnu.org> User-Agent: Mutt/1.9.4 (2018-02-28) X-SW-Source: 2019-04/txt/msg00297.txt.bz2 > > This global is meant to point to the "main" thread of execution of > > the program we are debugging. It is set when attaching to a process > > or when receiving a CREATE_PROCESS_DEBUG_EVENT event. The theory at > > the time was that this was also going to be the thread receiving > > the EXIT_PROCESS_DEBUG_EVENT event. > > > > Unfortunately, we have discovered since then that this is actually > > not guaranteed. What this means in practice is that there is moderate > > risk that main_thread_id refers to a thread which no longer exists. > > > > This global is used in 3 situations: > > - OUTPUT_DEBUG_STRING_EVENT > > - LOAD_DLL_DEBUG_EVENT > > - UNLOAD_DLL_DEBUG_EVENT > > > > It's not clear why we would need to use the main_thread_id in those cases > > instead of using the thread ID provided by the kernel events itself. > > So this patch implements this approach, which then allows us to delete > > the main_thread_id global. > > > > gdb/testsuite: > > > > * windows-nat.c (main_thread_id): Delete. > > (handle_output_debug_string): Replace main_thread_id by > > current_event.dwThreadId. > > (fake_create_process): Likewise. > > (get_windows_debug_event) : > > Do not set main_thread_id. > > : Replace main_thread_id by > > current_event.dwThreadId. > > : Likewise. > > Thanks, but isn't this throwing the baby with the bathwater, though? > AFAIK, the thread whose ID we get in the CREATE_PROCESS_DEBUG_EVENT > _is_ indeed the main thread, at least that's my reading of everything > I see about this on the Internet. What might _not_ be true is that > the main thread exits last. In fact, I could write a legitimate > Windows program whose main thread exited long before the process > terminated (although doing that in a console application that uses > 'main' would be somewhat tricky at best). Perhaps that's what happens > in the cases where you saw the problem. It happens completely randomly, and in all cases, the program itself has a main program which waits for all software threads to complete before itself terminating. Not sure what Windows is doing under the hood. > AFAIU, the real problem with this global is that we delete the main > thread when its thread-termination notification is received, and we > also assume that EXIT_PROCESS_DEBUG_EVENT tells us that the main > thread exited. But if this is the problem, then we could do one of > two things: > > . Keep the main_thread global, but use it just for avoiding the > announcement of the main thread. > > . Install some logic that would avoid trying to delete a thread if > its ID is equal to main_thread, when main_thread cannot be found > in the list of threads. > > This would allow us to avoid the crashes, but still not announce the > main thread as any other thread. The crash is already avoided by patch #1, which does _not_ delete the global. The reason why I want to delete the global is because it references a thread that the Windows kernel told us does not exist anymore, and I fear that keeping it around will cause us to pass it back to the Windows kernel, thus triggering an error. As far as I can tell from our uses of it as part of pure event handling, I don't see why we would need that global at all. So, rather than stay on shaky grounds regarding that question, I prefer we bite the bullet and try to do it right. The second question is about the main thread creation/exit notifications. At the moment, what we are relying on to emit or silence the notification is the kind of event we received. So, for CREATE_PROCESS_DEBUG_EVENT/EXIT_PROCESS_DEBUG_EVENT events, we silence the notification. Unfortunately, we have now discovered that does not work, because EXIT_PROCESS_DEBUG_EVENT doesn't always point the same thread as CREATE_PROCESS_DEBUG_EVENT. If I understand you correctly, you want to preserve the global so we can avoid those notifications. In my opinion, doing that just for the purpose of filtering one notification is simply not worth the trouble of keeping a global around. Generally speaking, we tend to try to avoid globals as much as we can, as they are hard to track, and makes the code harder to understand and maintain. Modifying the logic to use the main_thread_id global instead of the kind of event is a step backwards, in my opinion. -- Joel