From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id R+pBC2SL3GBjTQAAWB0awg (envelope-from ) for ; Wed, 30 Jun 2021 11:19:00 -0400 Received: by simark.ca (Postfix, from userid 112) id 1E3141F1F2; Wed, 30 Jun 2021 11:19:00 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-0.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,RDNS_DYNAMIC,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id E2A031E939 for ; Wed, 30 Jun 2021 11:18:58 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A6DA5385AC2F for ; Wed, 30 Jun 2021 15:18:58 +0000 (GMT) Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by sourceware.org (Postfix) with ESMTPS id 203593857C40 for ; Wed, 30 Jun 2021 15:18:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 203593857C40 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=embecosm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=embecosm.com Received: by mail-wr1-x432.google.com with SMTP id a13so4088069wrf.10 for ; Wed, 30 Jun 2021 08:18:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=3cNw8xez1HteUhBNuGK8Uf2XOSVYZT2+0WZfQE1VLCQ=; b=Pj/cRtiVMlkX7q7hPxiulLOfJlXLSKuBkYybELniUWvYbPOw0RfriNz0J67DsGknpk 6mllFqs9PCiBxNRSK9RgxawgMTQGmLfr7EFHZzgNW13iL+VOX4nBJWCp/x82iLrCaiRQ Fzs9f96qoAocs8vpLTelBPu4YNEz6ocCC6sICYUoYHRn430j+PxaHn8ovc2niB4rI0CH IkBsLDCr1HhLstpwf3n7FH+kNBBEvJQTVTappPJpynnsrqZOh3gsAIcj0+HiPB5WwKGq sSX+I1NIplQyYveiycYGzV+C5aidKvxd/FKIDCjwQyCNe8jOxhcdbjW5KVfV4xZ7HGtM fd6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=3cNw8xez1HteUhBNuGK8Uf2XOSVYZT2+0WZfQE1VLCQ=; b=ZCm0JSV8vPMnvyf5mHXloehOI//IbREAg213rzwyD9cRPXxVFmF3xXSVymTrOZ8faM 3VLsEcGRBNuRGZUOKoJ4u3YWrSidfSQ0O1SDgUNppHyTSQOIXR7stYnfCyfrst742PkZ IP/O8Yp2/2awWQxYVN1rM+N4DMbbfD+lCJfgj4a0Xtqhe5P8q6QF0QVgFpIoA3qXvY59 Ah/a4JzksBhIWHtOuEFJsMa/wFXynpGRko8gIQLHSMJRvKyj4wNuxB1j9mojoVXfbBNS hSsekKbElly+t9Huz+w4ekcF62+vx6OQzCfDiC6lmqthrX0ieoNeRrMBewm4JLm/W70u IgFQ== X-Gm-Message-State: AOAM531hHxR3MGmTNRQVeThvhNyqFa7/bTQtM9WHPDGZnSWAIdYrMmrm Mnlk/G36tggBLR+zvlDsuKduxQ== X-Google-Smtp-Source: ABdhPJwMgHdx+sm+JPKWrwCXbeVc39iFfYlfTc1/znvas1Kr0NEmPDkH5hc6lQoNZL2cGlYOaYqwXA== X-Received: by 2002:a5d:6dd2:: with SMTP id d18mr12286831wrz.94.1625066325210; Wed, 30 Jun 2021 08:18:45 -0700 (PDT) Received: from localhost (host86-140-92-85.range86-140.btcentralplus.com. [86.140.92.85]) by smtp.gmail.com with ESMTPSA id z9sm6470040wmf.43.2021.06.30.08.18.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Jun 2021 08:18:44 -0700 (PDT) Date: Wed, 30 Jun 2021 16:18:43 +0100 From: Andrew Burgess To: Simon Marchi Subject: Re: [PATCHv2 2/2] gdb: remove VALUE_FRAME_ID Message-ID: <20210630151843.GC2568@embecosm.com> References: <26dd0f26-7e7f-c0e8-263b-ef65554a93de@polymtl.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <26dd0f26-7e7f-c0e8-263b-ef65554a93de@polymtl.ca> X-Operating-System: Linux/5.8.18-100.fc31.x86_64 (x86_64) X-Uptime: 16:07:22 up 13 days, 22:58, X-Editor: GNU Emacs [ http://www.gnu.org/software/emacs ] X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: gdb-patches@sourceware.org Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" * Simon Marchi [2021-06-29 13:53:20 -0400]: > > > On 2021-06-21 3:46 p.m., Andrew Burgess wrote: > > While working on the previous commit I happened to switch on 'set > > debug frame 1', and ran into a different assertion than the one I was > > trying to fix. > > > > The new problem I see is this assertion triggering: > > > > gdb/frame.c:622: internal-error: frame_id get_frame_id(frame_info*): Assertion `fi->this_id.p != frame_id_status::COMPUTING' failed. > > > > We attempt to get the frame_id for a frame while we are computing the > > frame_id for that same frame. > > > > What happens is we have a stack like this: > > > > normal_frame -> inline_frame -> sentinel_frame > > > > When we initially stop, GDB creates a frame for inline_frame but > > doesn't sniff its type, or try to fill in its frame_id (see code near > > the top of get_prev_frame_if_no_cycle). > > > > Later on during the stop, this happens: > > > > process_event_stop_test > > get_stack_frame_id > > skip_artificial_frames > > get_frame_type > > > > The call to get_frame_type causes inline_frame to sniff its type, > > establishing that it is an INLINE_FRAME, but does not cause the frame > > to build its frame_id. > > > > The same skip_artificial_frames call then calls get_prev_frame_always > > to unwind the stack, this will then try to get the frame previous to > > inline_frame, i.e. normal_frame. > > > > So, we create a new frame, but unlike frame #0, in > > get_prev_frame_if_no_cycle, we immediately try to compute the frame_id > > for this new frame #1. > > > > Computing the frame_id for frame #1 invokes the sniffer. If this > > sniffer tries to read a register then we will create a lazy register > > value by calling value_of_register_lazy. As the next frame (frame #0) > > is an INLINE_FRAME, we will skip this and instead create the lazy > > register value using the frame_id for frame #-1 (the sentinel frame). > > > > Now, when we try to resolve the lazy register value, in the value.c > > function value_fetch_lazy_register, we want to print which frame the > > lazy register value was for (in debug mode), so we call: > > > > frame = frame_find_by_id (VALUE_FRAME_ID (val)); > > > > where: > > > > #define VALUE_FRAME_ID(val) (get_prev_frame_id_by_id (VALUE_NEXT_FRAME_ID (val))) > > > > In our case we call get_prev_frame_id_by_id with the frame_id of the > > sentinel_frame frame, we then call frame_find_by_id, followed by > > get_prev_frame (which gives us frame #0), followed by get_frame_id. > > > > Frame #0 has not yet computed its frame_id, so we start that process. > > The first thing we need when computing the frame_id of an inline frame > > is the frame_id of the previous frame, so that's what we ask for. > > Unfortunately, that's where we started this journey, we are already > > computing the frame_id for frame #1, and thus the assert fires. > > > > Solving the assertion failure is pretty easy, if we consider the code > > in value_fetch_lazy_register and get_prev_frame_id_by_id then what we > > do is: > > > > 1. Start with a frame_id taken from a value, > > 2. Lookup the corresponding frame, > > 3. Find the previous frame, > > 4. Get the frame_id for that frame, and > > 5. Lookup the corresponding frame > > 6. Print the frame's level > > > > Notice that steps 3 and 5 give us the exact same result, step 4 is > > just wasted effort. We could shorten this process such that we drop > > steps 4 and 5, thus: > > > > 1. Start with a frame_id taken from a value, > > 2. Lookup the corresponding frame, > > 3. Find the previous frame, > > 6. Print the frame's level > > > > This will give the exact same frame as a result, and this is what I > > have done in this patch by removing the use of VALUE_FRAME_ID from > > value_fetch_lazy_register. > > > > Out of curiosity I looked to see how widely VALUE_FRAME_ID was used, > > and saw it was only used in one other place in valops.c:value_assign, > > where, once again, we take the result of VALUE_FRAME_ID and pass it to > > frame_find_by_id, thus introducing a redundant frame_id lookup. > > > > I don't think the value_assign case risks triggering the assertion > > though, as we are unlikely to call value_assign while computing the > > frame_id for a frame, however, we could make value_assign slightly > > more efficient, with no real additional complexity, by removing the > > use of VALUE_FRAME_ID. > > > > So, in this commit, I completely remove VALUE_FRAME_ID, and replace it > > with a use of VALUE_NEXT_FRAME_ID, followed by a direct call to > > get_prev_frame_always, this should make no difference in either case, > > and resolves the assertion issue from value.c. > > > > One thing that is worth noticing here is that in these two situations > > we don't end up getting the frame we expected to, though this is not a > > result of my change, we were not getting the expected result before > > either. > > > > Consider the debug printing case, the stack is: > > > > normal_frame -> inline_frame -> sentinel_frame > > > > We read a register from normal_frame (frame #1), the value of which is > > fetched from sentinel_frame (frame #-1). The debug print is trying to > > say: > > > > frame=1,regnum=.... > > > > However, as the lazy register value points at frame #-1, we will > > actually (incorrectly) print: > > > > frame=0,regnum=.... > > > > Like I said, this bug existed before this commit, and so, if we didn't > > assert (i.e. the lazy register read occurred in some context other > > than during the frame sniffer), then the debug print would previous > > have, and will continue to, print the wrong frame level. Thankfully, > > this is only in debug output, so not something a normal user should > > see. > > Are you sure this is not expected? An inline frame does not have > registers of its own, so it could be normal/expected, that to unwind > registers, we just skip over inline frames, so that a register value's > next frame id skips over inline frame. > > See this commit (that you wrote :)): > > https://gitlab.com/gnutools/gdb/-/commit/9fc501fdfe5dc82b5e5388cde4ac2ab70ed69d75 > > I might not understand correctly what you mean though. It was unexpected to me :) The frame number being reported here is the frame that wants the register, not the frame that provides the register, so given a stack like this (now with frame numbers): #1:normal_frame -> #0:inline_frame -> #-1:sentinel_frame And GDB says: { value_fetch_lazy (frame=1,regnum=....... GDB is trying to say that frame #1 asked to read a register, and the value returned as .... whatever .... BUT, what we _actually_ end up saying is: { value_fetch_lazy (frame=0,regnum=....... Which just isn't correct, frame 0 didn't cause the lazy register value to be created, frame 1 did. A developer reading these logs needs to understand that frame 0 is inline, and that the value in frame 1 is the same as the value in frame 0. You are correct that the patch you linked is indeed what introduced this mess in the first place, but I think the reasoning behind that original patch is still good. If this makes more sense then I will update the commit message to hopefully make things clearer - what do you think? Thanks, Andrew