From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id 6uIpOYiu4WE6CAAAWB0awg (envelope-from ) for ; Fri, 14 Jan 2022 12:10:32 -0500 Received: by simark.ca (Postfix, from userid 112) id D4AB01F34E; Fri, 14 Jan 2022 12:10:32 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,NICE_REPLY_A,RDNS_DYNAMIC, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id AEA251EAA4 for ; Fri, 14 Jan 2022 12:10:31 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 001CC3864C69 for ; Fri, 14 Jan 2022 17:10:30 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 001CC3864C69 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1642180231; bh=6o7Q16cZH4VbKuhxKRSNBSz+f0qL/qrWqEOU66cpyOE=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=xVPK3tvLb5wRDUwrNhtnh+Kjad0iuGr3RNB7cQLL8xnDBNImQ2xIs69GW3MvBaGPm TrNH0iwpvjnQKPzDpJZzcNuNjEk8EGvCR4kPRXfvjvGhaaYiW/jYsCr8dquJg8buHc +064ZNCRb/y/rrKhgJugm3WGVglJxrNbTzrXn0vo= Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id 5CAE63858D35 for ; Fri, 14 Jan 2022 17:10:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5CAE63858D35 Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 20EHA2Xc026659 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 14 Jan 2022 12:10:06 -0500 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 20EHA2Xc026659 Received: from [172.16.0.95] (192-222-180-24.qc.cable.ebox.net [192.222.180.24]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id B213D1EAA4; Fri, 14 Jan 2022 12:10:01 -0500 (EST) Subject: Re: [PATCHv3] gdb: make thread_info executing and resumed state more consistent To: Andrew Burgess , gdb-patches@sourceware.org References: <0491ec20c3c5f489f625383f756d3853a8e48074.1630353626.git.andrew.burgess@embecosm.com> <20220113183406.3577620-1-aburgess@redhat.com> Message-ID: Date: Fri, 14 Jan 2022 12:10:00 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: <20220113183406.3577620-1-aburgess@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: tl Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Fri, 14 Jan 2022 17:10:02 +0000 X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Simon Marchi via Gdb-patches Reply-To: Simon Marchi Cc: Andrew Burgess Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" On 2022-01-13 1:34 p.m., Andrew Burgess wrote: > From: Andrew Burgess > > I finally got back around to working on this patch, and have a new > version that I'd like to propose. > > The biggest change in this version is that all of the thread creation > functions now have a new parameter which controls if the thread is > started resumed or not. > > There was also a great discussion last time about whether the > executing and resumed flags (within thread_info) should be combined > into a single enum. Simon argued in favour of this change, while I > remain unconvinced. > > For now, I'd like to propose that the enum change be deferred. The > patch as it stands is already pretty large, and changing how we manage > the two flags would only make the patch larger. The change as I have > it right now maintains the flags as they are, but makes their use > consistent. If we later want to change the flags to an enum then it > feels like this would be better done as a separate step. > > There are still things that I would like to improve in the area of > this code, I'm still not completely happy with how the thread state is > managed around the target_ops::create_inferior call, but I though the > code as it stands isn't great, at least things are consistent after > this patch. > > I'm as sure as I feel I can be that I've not broken anything for > Linux, but it's almost certain that something will be broken for other > targets. I've details the additional testing I've done at the end of > my commit message. > > My commit message also includes a ChangeLog like log, in this I've > tried to mention those areas of the change that I know are untested, > or lightly tested. > > I welcome all review feedback, as well as any additional testing that > people might like to do. > > Thanks, > Andrew > > --- > > This commit was inspired by this comment from Simon: > > https://sourceware.org/pipermail/gdb-patches/2021-July/180694.html > > The comment is about two thread_info flags m_executing and m_resumed. > The m_resumed flag indicates that at the infrun level GDB has set the > thread running, while the m_executing flag indicates that the thread > is actually running at the target level. > > It is very common that a thread can be m_resumed, but not m_executing, > that is, core GDB thinks the thread is, or should be, running, but at > the target level the thread isn't currently running. > > The comment Simon made was in reply to a patch I posted where I found > a case where a thread was marked m_executing, but not m_resumed, that > is, at the infrun level GDB thought the thread was stopped, but at the > target level the thread was actually running. Simon suggests that > this feels like an invalid state, and after thinking about it, I > agree. > > So, the goal of this commit is to add some asserts to GDB. The core > idea here is that the resumed and executing flags should only change > in the following pattern, to begin with everything is set false: > > Resumed: false > Executing: false > > Then infrun marks the thread resumed: > > Resumed: true > Executing: false > > Then a target starts the thread executing: > > Resumed: true > Executing: true > > The thread stops, the target spots this and marks the thread no longer > executing: > > Resumed: true > Executing: false > > And finally, infrun acknowledges that the thread is now no longer > resumed: > > Resumed: false > Executing: false > > Notice that at no point is resumed false, while executing is true. > > And so we can add these asserts: > > 1. The resumed flag should only change state when the executing flag > is false, and > > 2. The executing flag should only change state when the resumed flag > is true. > > I added these asserts and .... > > .... it turns out these rules are broken all over the place in GDB, we > have problems like: > > (a) When new threads appear they are marked executing, but not > resumed, and > > (b) In some places we mark a single thread as resumed, but then > actually set multiple threads executing. > > For (a) it could be argued that this is a legitimate state - this is > actually the problem I addressed in the original patch that Simon was > replying too, however, we don't need to support this as a separate > state, so if we can make this case follow the expected set of state > transitions then it allows us to reduce the number of states that GDB > can be in, which I think is a good thing. > > Case (b) seems to just be a bug to me. > > My initial approach was to retain the idea that threads are always > created in the non-executing, non-resumed state, and then find all the > places where new threads are created (which should be in the resumed > state), and mark the thread resumed in that case. > > However, after looking at the changes for a while, it felt like it > would be simpler to create threads as resumed by default and instead > change the threads back to non-resumed in the few cases where this was > appropriate. The appropriate case being (as far as I can tell), just > the initial threads created as part of starting up a new inferior. > > So, that's what this patch does. The thread creation routines now > take a flag to indicate if the new thread should be created resumed or > not. Almost all threads are started in the resumed state, except for > a few cases associated with initial target creation. > > In an ideal world, I would have liked that all threads created as part > of the ::create_inferior process would also be created in the > non-resumed state, but that doesn't seem easily achievable right now. > Though I could easily see changing the Linux target, other targets that > I can't test will be harder to get right. So, for now, after calling > ::create_process I assert that the threads are currently resumed, and > then mark the threads as non-resumed. > > I do have a plan that I might be able to make further improvements in > the ::create_inferior area, however, I want to see whether this patch > is accepted first before investing further time on this. I think that > where this patch gets to is good enough for now. > > On Testing: > > I've tested this primarily on x86-64 GNU/Linux with the unix, > native-gdbserver, and native-extended-gdbserver boards, and see no > regressions. > > I have also tried to test some other targets, though the testing has > been pretty minimal: > > I installed FreeBSD in an x86-64 VM. I can build GDB fine, but when I > try to run dejagnu the VM falls over with an out of swap space error, > no matter how much swap I give the machine. However, before falling > over a few hundred tests do run, and I see no regressions in those > tests. I've also manually checked that the gdb.base/attach.exp test > works (to do some basic attach testing). So I'm reasonably confident > that this target should be mostly OK. > > As I was working on GNU/Hurd for another bug I had an i386 GNU/Hurd VM > available, so I built GDB in here too. Due to other issues with this > target the dejagnu testing is pretty useless here, but I checked that > I can start multi-threaded targets, step/continue past thread > creation, and also that I can attach to a multi-threaded target. > > I have also built this branch on Windows, using mingw. I was unable > to get dejagnu testing working on this target, but again I did some > basic testing that GDB could start up with multi-threaded inferiors, > and correctly see new threads as they appear. > > I also built a target with a simulator, and checked that I could > startup and run basic tests their too. > > I did try making use of the GCC machine farm to test on AIX and > Solaris, but I was unable to get GDB building on any of the machines I > tried in the farm. Hi Andrew, I see nothing in the patch that stands out as obviously wrong. But like you, I have a hard time convincing myself what is the right resumed state to pass to add_thread at each place. And I'm not sure that just staring at the code longer will help much. It's hard to know without actually testing. For example, I am not sure that passing "true" in all update_thread_list methods is correct. update_thread_list is sometimes called when everything is stopped, on all-stop targets. So in that context, I suppose that new threads discovered this way should be created non-resumed. However, I think you did a pretty good job of testing what you could, given the context. Simon