From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id IOx/HtcgRGLQHgAAWB0awg (envelope-from ) for ; Wed, 30 Mar 2022 05:20:23 -0400 Received: by simark.ca (Postfix, from userid 112) id 7B2FA1F163; Wed, 30 Mar 2022 05:20:23 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RDNS_DYNAMIC,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.2 Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id AA5D21F0BB for ; Wed, 30 Mar 2022 05:20:19 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 237A0388458B for ; Wed, 30 Mar 2022 09:20:19 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 237A0388458B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1648632019; bh=BUlox96ih4X4H3jAhHqXIDdAyMigr0sogFN7hgrRtfI=; h=To:Subject:In-Reply-To:References:Date:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=ClxAuKqTOYmPr/MureDY7S8ZxG3cq4n5XIb0yh7gSNCZdw7tQZt1LO9aSwTqq/NSg NFRtuAR8H02JqI9y56lQjMqJTEqp+EnI1AxpzG0tgDLh62hkUEJbE16I2cOBvMkNf6 zB4w39MWBpW7tNvObtyNVw4ETn640R72h/pEcogo= Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 2D7173886C46 for ; Wed, 30 Mar 2022 09:19:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2D7173886C46 Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-505-XzADw2ZIPBCUl1yX8-1I8w-1; Wed, 30 Mar 2022 05:19:46 -0400 X-MC-Unique: XzADw2ZIPBCUl1yX8-1I8w-1 Received: by mail-wm1-f69.google.com with SMTP id l7-20020a05600c1d0700b0038c9c48f1e7so830901wms.2 for ; Wed, 30 Mar 2022 02:19:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version; bh=BUlox96ih4X4H3jAhHqXIDdAyMigr0sogFN7hgrRtfI=; b=trL3WasDWvUXIR1xSqM9Bd6TUp0EaTpvDGk61TWwJCvgJ974HmCX0VqC4hWRSRF+YG Olis9bL5ZZ4U3Vcj4B5ONnn0sg7I4ve5RrAGN/iKzYk6Jx2B+58nn0+95KFHmT6bjNJm /DnCrzuQMKFE7hSqsWBNT3YFHZxbihA9TfGugALA69dWGQ48Vs7tBxu7ytzKPNTXUm3I XHgG+yen5PHCfLUSd0rW6rVLPRBPwv76X0rSDXZ6oJTIfuhIWrnWgtipMd8QVh6lvJnL albXU5F59eWtS4PS3x8DH8VVQqTqvpEBdtF8eaSq8q7DczPG65r6Ay07HkE332jLjpUf 8n4w== X-Gm-Message-State: AOAM530UjfOFHQhfAChDi9MPcK4r35RM431o/N4faBVg8508p4sbOFCf gHC/D2VOANzYhqcu0TVhfZQNpvR6Fbhz0fBiHaWEBa2u8ihsbqtGGkzPK8EPi9QC+nhOsdEEpoI BIW94+MRD9uDqLZXbrQAZLA== X-Received: by 2002:a05:600c:4e53:b0:38c:987e:cdc with SMTP id e19-20020a05600c4e5300b0038c987e0cdcmr3511899wmq.154.1648631984799; Wed, 30 Mar 2022 02:19:44 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxQoUGpj2dFksQ1gvbF1a3qe7fjdYNxcT5kIlMD8Lj17tUBa3mkugf7hIzR3NSfBeRIpQ3hPQ== X-Received: by 2002:a05:600c:4e53:b0:38c:987e:cdc with SMTP id e19-20020a05600c4e5300b0038c987e0cdcmr3511876wmq.154.1648631984458; Wed, 30 Mar 2022 02:19:44 -0700 (PDT) Received: from localhost (host86-169-131-113.range86-169.btcentralplus.com. [86.169.131.113]) by smtp.gmail.com with ESMTPSA id t4-20020a05600001c400b00203fb5dcf29sm16507670wrx.40.2022.03.30.02.19.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Mar 2022 02:19:43 -0700 (PDT) To: Simon Marchi , gdb-patches@sourceware.org Subject: Re: [PATCHv3] gdb: make thread_info executing and resumed state more consistent In-Reply-To: <87bkywb6u1.fsf@redhat.com> References: <0491ec20c3c5f489f625383f756d3853a8e48074.1630353626.git.andrew.burgess@embecosm.com> <20220113183406.3577620-1-aburgess@redhat.com> <87bkywb6u1.fsf@redhat.com> Date: Wed, 30 Mar 2022 10:19:43 +0100 Message-ID: <877d8bajbk.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Andrew Burgess via Gdb-patches Reply-To: Andrew Burgess Cc: Andrew Burgess Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" Andrew Burgess writes: > Simon Marchi via Gdb-patches writes: > >> On 2022-01-13 1:34 p.m., Andrew Burgess wrote: >>> From: Andrew Burgess >>> >>> I finally got back around to working on this patch, and have a new >>> version that I'd like to propose. >>> >>> The biggest change in this version is that all of the thread creation >>> functions now have a new parameter which controls if the thread is >>> started resumed or not. >>> >>> There was also a great discussion last time about whether the >>> executing and resumed flags (within thread_info) should be combined >>> into a single enum. Simon argued in favour of this change, while I >>> remain unconvinced. >>> >>> For now, I'd like to propose that the enum change be deferred. The >>> patch as it stands is already pretty large, and changing how we manage >>> the two flags would only make the patch larger. The change as I have >>> it right now maintains the flags as they are, but makes their use >>> consistent. If we later want to change the flags to an enum then it >>> feels like this would be better done as a separate step. >>> >>> There are still things that I would like to improve in the area of >>> this code, I'm still not completely happy with how the thread state is >>> managed around the target_ops::create_inferior call, but I though the >>> code as it stands isn't great, at least things are consistent after >>> this patch. >>> >>> I'm as sure as I feel I can be that I've not broken anything for >>> Linux, but it's almost certain that something will be broken for other >>> targets. I've details the additional testing I've done at the end of >>> my commit message. >>> >>> My commit message also includes a ChangeLog like log, in this I've >>> tried to mention those areas of the change that I know are untested, >>> or lightly tested. >>> >>> I welcome all review feedback, as well as any additional testing that >>> people might like to do. >>> >>> Thanks, >>> Andrew >>> >>> --- >>> >>> This commit was inspired by this comment from Simon: >>> >>> https://sourceware.org/pipermail/gdb-patches/2021-July/180694.html >>> >>> The comment is about two thread_info flags m_executing and m_resumed. >>> The m_resumed flag indicates that at the infrun level GDB has set the >>> thread running, while the m_executing flag indicates that the thread >>> is actually running at the target level. >>> >>> It is very common that a thread can be m_resumed, but not m_executing, >>> that is, core GDB thinks the thread is, or should be, running, but at >>> the target level the thread isn't currently running. >>> >>> The comment Simon made was in reply to a patch I posted where I found >>> a case where a thread was marked m_executing, but not m_resumed, that >>> is, at the infrun level GDB thought the thread was stopped, but at the >>> target level the thread was actually running. Simon suggests that >>> this feels like an invalid state, and after thinking about it, I >>> agree. >>> >>> So, the goal of this commit is to add some asserts to GDB. The core >>> idea here is that the resumed and executing flags should only change >>> in the following pattern, to begin with everything is set false: >>> >>> Resumed: false >>> Executing: false >>> >>> Then infrun marks the thread resumed: >>> >>> Resumed: true >>> Executing: false >>> >>> Then a target starts the thread executing: >>> >>> Resumed: true >>> Executing: true >>> >>> The thread stops, the target spots this and marks the thread no longer >>> executing: >>> >>> Resumed: true >>> Executing: false >>> >>> And finally, infrun acknowledges that the thread is now no longer >>> resumed: >>> >>> Resumed: false >>> Executing: false >>> >>> Notice that at no point is resumed false, while executing is true. >>> >>> And so we can add these asserts: >>> >>> 1. The resumed flag should only change state when the executing flag >>> is false, and >>> >>> 2. The executing flag should only change state when the resumed flag >>> is true. >>> >>> I added these asserts and .... >>> >>> .... it turns out these rules are broken all over the place in GDB, we >>> have problems like: >>> >>> (a) When new threads appear they are marked executing, but not >>> resumed, and >>> >>> (b) In some places we mark a single thread as resumed, but then >>> actually set multiple threads executing. >>> >>> For (a) it could be argued that this is a legitimate state - this is >>> actually the problem I addressed in the original patch that Simon was >>> replying too, however, we don't need to support this as a separate >>> state, so if we can make this case follow the expected set of state >>> transitions then it allows us to reduce the number of states that GDB >>> can be in, which I think is a good thing. >>> >>> Case (b) seems to just be a bug to me. >>> >>> My initial approach was to retain the idea that threads are always >>> created in the non-executing, non-resumed state, and then find all the >>> places where new threads are created (which should be in the resumed >>> state), and mark the thread resumed in that case. >>> >>> However, after looking at the changes for a while, it felt like it >>> would be simpler to create threads as resumed by default and instead >>> change the threads back to non-resumed in the few cases where this was >>> appropriate. The appropriate case being (as far as I can tell), just >>> the initial threads created as part of starting up a new inferior. >>> >>> So, that's what this patch does. The thread creation routines now >>> take a flag to indicate if the new thread should be created resumed or >>> not. Almost all threads are started in the resumed state, except for >>> a few cases associated with initial target creation. >>> >>> In an ideal world, I would have liked that all threads created as part >>> of the ::create_inferior process would also be created in the >>> non-resumed state, but that doesn't seem easily achievable right now. >>> Though I could easily see changing the Linux target, other targets that >>> I can't test will be harder to get right. So, for now, after calling >>> ::create_process I assert that the threads are currently resumed, and >>> then mark the threads as non-resumed. >>> >>> I do have a plan that I might be able to make further improvements in >>> the ::create_inferior area, however, I want to see whether this patch >>> is accepted first before investing further time on this. I think that >>> where this patch gets to is good enough for now. >>> >>> On Testing: >>> >>> I've tested this primarily on x86-64 GNU/Linux with the unix, >>> native-gdbserver, and native-extended-gdbserver boards, and see no >>> regressions. >>> >>> I have also tried to test some other targets, though the testing has >>> been pretty minimal: >>> >>> I installed FreeBSD in an x86-64 VM. I can build GDB fine, but when I >>> try to run dejagnu the VM falls over with an out of swap space error, >>> no matter how much swap I give the machine. However, before falling >>> over a few hundred tests do run, and I see no regressions in those >>> tests. I've also manually checked that the gdb.base/attach.exp test >>> works (to do some basic attach testing). So I'm reasonably confident >>> that this target should be mostly OK. >>> >>> As I was working on GNU/Hurd for another bug I had an i386 GNU/Hurd VM >>> available, so I built GDB in here too. Due to other issues with this >>> target the dejagnu testing is pretty useless here, but I checked that >>> I can start multi-threaded targets, step/continue past thread >>> creation, and also that I can attach to a multi-threaded target. >>> >>> I have also built this branch on Windows, using mingw. I was unable >>> to get dejagnu testing working on this target, but again I did some >>> basic testing that GDB could start up with multi-threaded inferiors, >>> and correctly see new threads as they appear. >>> >>> I also built a target with a simulator, and checked that I could >>> startup and run basic tests their too. >>> >>> I did try making use of the GCC machine farm to test on AIX and >>> Solaris, but I was unable to get GDB building on any of the machines I >>> tried in the farm. >> >> Hi Andrew, >> >> I see nothing in the patch that stands out as obviously wrong. But like >> you, I have a hard time convincing myself what is the right resumed >> state to pass to add_thread at each place. And I'm not sure that just >> staring at the code longer will help much. It's hard to know without >> actually testing. >> >> For example, I am not sure that passing "true" in all update_thread_list >> methods is correct. update_thread_list is sometimes called when >> everything is stopped, on all-stop targets. So in that context, I >> suppose that new threads discovered this way should be created >> non-resumed. >> >> However, I think you did a pretty good job of testing what you could, >> given the context. > > Simon, > > Thanks for the feedback, and sorry for not following up sooner. This > got pushed down my list of priorities. > > Given how close we are to the GDB 12 branch point, my current plan is to > wait until shortly after we branch, and then update, retest, and push > this patch. > > That will give us (me) as much time as possible to help resolve any > problems that this introduces. I'm expecting there will be some issues > on the targets I've not tested as much, but hopefully they should be > easy enough to resolve. Now GDB 12 has branched, I'm planning to rebase and retest this patch, and then merge it! This is just a warning. I expect it to take a few days for the rebase and retest, so if there are any objections, now is the time to shout about it. I'm fully expecting that this patch will cause some regressions - it impacts all targets, some of which I can't test at all, others I can only do limited testing on. I'm fully committed to help debug issues relating to this patch over the coming months. I'll post again just prior to pushing. Thanks, Andrew