From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id DpaxNS1e22BnMwAAWB0awg (envelope-from ) for ; Tue, 29 Jun 2021 13:53:49 -0400 Received: by simark.ca (Postfix, from userid 112) id CB96A1F1F2; Tue, 29 Jun 2021 13:53:49 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id AA8781EE14 for ; Tue, 29 Jun 2021 13:53:48 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 04962385DC06 for ; Tue, 29 Jun 2021 17:53:48 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 04962385DC06 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1624989228; bh=6S8S2w6S8N5GK65D0PBKGNRMKW42K6ANZ8N8YbNjAoY=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=PPG99L8bCj88B7Fv+/oaMPTkk54SASwsJYgxB/0jJKUs+QNLA0fzANVik7h7A6O0E fFHF3NSsdx8Y8RqWdTs0dWl0ujKryUCafrax44oILYctUdYeYT88dlzRrRPGDscNiR 2TRCBBoeXY1pQ9rbKonCyZj7X4dUtlSgcaTkw4P0= Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id A8C413857C7F for ; Tue, 29 Jun 2021 17:53:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A8C413857C7F Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 15THrLPX021057 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 29 Jun 2021 13:53:26 -0400 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 15THrLPX021057 Received: from [10.0.0.11] (192-222-157-6.qc.cable.ebox.net [192.222.157.6]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 4B4291EE14; Tue, 29 Jun 2021 13:53:21 -0400 (EDT) Subject: Re: [PATCHv2 2/2] gdb: remove VALUE_FRAME_ID To: Andrew Burgess , gdb-patches@sourceware.org References: Message-ID: <26dd0f26-7e7f-c0e8-263b-ef65554a93de@polymtl.ca> Date: Tue, 29 Jun 2021 13:53:20 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Tue, 29 Jun 2021 17:53:21 +0000 X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Simon Marchi via Gdb-patches Reply-To: Simon Marchi Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" On 2021-06-21 3:46 p.m., Andrew Burgess wrote: > While working on the previous commit I happened to switch on 'set > debug frame 1', and ran into a different assertion than the one I was > trying to fix. > > The new problem I see is this assertion triggering: > > gdb/frame.c:622: internal-error: frame_id get_frame_id(frame_info*): Assertion `fi->this_id.p != frame_id_status::COMPUTING' failed. > > We attempt to get the frame_id for a frame while we are computing the > frame_id for that same frame. > > What happens is we have a stack like this: > > normal_frame -> inline_frame -> sentinel_frame > > When we initially stop, GDB creates a frame for inline_frame but > doesn't sniff its type, or try to fill in its frame_id (see code near > the top of get_prev_frame_if_no_cycle). > > Later on during the stop, this happens: > > process_event_stop_test > get_stack_frame_id > skip_artificial_frames > get_frame_type > > The call to get_frame_type causes inline_frame to sniff its type, > establishing that it is an INLINE_FRAME, but does not cause the frame > to build its frame_id. > > The same skip_artificial_frames call then calls get_prev_frame_always > to unwind the stack, this will then try to get the frame previous to > inline_frame, i.e. normal_frame. > > So, we create a new frame, but unlike frame #0, in > get_prev_frame_if_no_cycle, we immediately try to compute the frame_id > for this new frame #1. > > Computing the frame_id for frame #1 invokes the sniffer. If this > sniffer tries to read a register then we will create a lazy register > value by calling value_of_register_lazy. As the next frame (frame #0) > is an INLINE_FRAME, we will skip this and instead create the lazy > register value using the frame_id for frame #-1 (the sentinel frame). > > Now, when we try to resolve the lazy register value, in the value.c > function value_fetch_lazy_register, we want to print which frame the > lazy register value was for (in debug mode), so we call: > > frame = frame_find_by_id (VALUE_FRAME_ID (val)); > > where: > > #define VALUE_FRAME_ID(val) (get_prev_frame_id_by_id (VALUE_NEXT_FRAME_ID (val))) > > In our case we call get_prev_frame_id_by_id with the frame_id of the > sentinel_frame frame, we then call frame_find_by_id, followed by > get_prev_frame (which gives us frame #0), followed by get_frame_id. > > Frame #0 has not yet computed its frame_id, so we start that process. > The first thing we need when computing the frame_id of an inline frame > is the frame_id of the previous frame, so that's what we ask for. > Unfortunately, that's where we started this journey, we are already > computing the frame_id for frame #1, and thus the assert fires. > > Solving the assertion failure is pretty easy, if we consider the code > in value_fetch_lazy_register and get_prev_frame_id_by_id then what we > do is: > > 1. Start with a frame_id taken from a value, > 2. Lookup the corresponding frame, > 3. Find the previous frame, > 4. Get the frame_id for that frame, and > 5. Lookup the corresponding frame > 6. Print the frame's level > > Notice that steps 3 and 5 give us the exact same result, step 4 is > just wasted effort. We could shorten this process such that we drop > steps 4 and 5, thus: > > 1. Start with a frame_id taken from a value, > 2. Lookup the corresponding frame, > 3. Find the previous frame, > 6. Print the frame's level > > This will give the exact same frame as a result, and this is what I > have done in this patch by removing the use of VALUE_FRAME_ID from > value_fetch_lazy_register. > > Out of curiosity I looked to see how widely VALUE_FRAME_ID was used, > and saw it was only used in one other place in valops.c:value_assign, > where, once again, we take the result of VALUE_FRAME_ID and pass it to > frame_find_by_id, thus introducing a redundant frame_id lookup. > > I don't think the value_assign case risks triggering the assertion > though, as we are unlikely to call value_assign while computing the > frame_id for a frame, however, we could make value_assign slightly > more efficient, with no real additional complexity, by removing the > use of VALUE_FRAME_ID. > > So, in this commit, I completely remove VALUE_FRAME_ID, and replace it > with a use of VALUE_NEXT_FRAME_ID, followed by a direct call to > get_prev_frame_always, this should make no difference in either case, > and resolves the assertion issue from value.c. > > One thing that is worth noticing here is that in these two situations > we don't end up getting the frame we expected to, though this is not a > result of my change, we were not getting the expected result before > either. > > Consider the debug printing case, the stack is: > > normal_frame -> inline_frame -> sentinel_frame > > We read a register from normal_frame (frame #1), the value of which is > fetched from sentinel_frame (frame #-1). The debug print is trying to > say: > > frame=1,regnum=.... > > However, as the lazy register value points at frame #-1, we will > actually (incorrectly) print: > > frame=0,regnum=.... > > Like I said, this bug existed before this commit, and so, if we didn't > assert (i.e. the lazy register read occurred in some context other > than during the frame sniffer), then the debug print would previous > have, and will continue to, print the wrong frame level. Thankfully, > this is only in debug output, so not something a normal user should > see. Are you sure this is not expected? An inline frame does not have registers of its own, so it could be normal/expected, that to unwind registers, we just skip over inline frames, so that a register value's next frame id skips over inline frame. See this commit (that you wrote :)): https://gitlab.com/gnutools/gdb/-/commit/9fc501fdfe5dc82b5e5388cde4ac2ab70ed69d75 I might not understand correctly what you mean though. The patch itself LGTM. Simon