Subject: Re: [PATCH 12/12] gdb: use two displaced step buffers on amd64/Linux
To: Simon Marchi , Simon Marchi , gdb-patches@sourceware.org
References: <20201110214614.2842615-1-simon.marchi@efficios.com> <20201110214614.2842615-13-simon.marchi@efficios.com> <7040e2ee-4d28-a83e-22df-20b2ace082bb@palves.net> <88518922-ffb6-f221-f3b8-569c5577ae5a@simark.ca> <91426053-1ce6-3154-3635-cef5390248f4@simark.ca> <9d717009-facf-66b8-6846-e687bee2ad1d@simark.ca>
From: Pedro Alves
Date: Sat, 28 Nov 2020 18:56:06 +0000
List-Id: Gdb-patches mailing list

On 11/26/20 9:43 PM, Simon Marchi wrote:
> On 2020-11-25 3:56 p.m., Simon Marchi wrote:
>> On 2020-11-25 3:07 p.m., Simon Marchi wrote:
>>> We still have to figure out why the performance regresses compared to
>>> master.
>>> One thing is that we still do an extra resume attempt each
>>> time.  We do one resume where the disp step prepare succeeds and one
>>> more where it fails.  That's twice as much work as before, where we did
>>> one successful prepare and then skipped the next ones.  I'll make an
>>> experimental change such that the arch can say "the prepare worked, but
>>> now I'm out of buffers", just to see how that improves (or not)
>>> performance.
>>
>> That made just a small difference.  Let's call this new patch E:
>>
>> #9 + A + B + C + D + E: 105,727
>
> Ok, I played a bit more with this and found a case where we do more
> work that may explain the difference.
>
> handle_signal_stop calls finish_step_over, which calls start_step_over.
> Let's say that the displaced step buffer is used by some thread and
> another thread hits the breakpoint; we go into start_step_over to
> see if we can start a new step over.  We can't, because the event thread
> isn't the one that is using the displaced stepping buffer; the buffer is
> still occupied.
>
> In the pre-patch code, that early check in start_step_over:
>
>      /* If this inferior already has a displaced step in process,
>         don't start a new one.  */
>      if (displaced_step_in_progress (tp->inf))
>        continue;
>
> means that we won't even try one resumption; we'll skip over all threads
> in the chain (the only cost is the cost of iteration).
>
> In the patch I previously called "C", start_step_over starts with the
> assumption that the inferior has an available buffer, and needs to do at
> least one resumption attempt that returns "unavailable" before it
> assumes there are no more available.  That one unnecessary resumption
> attempt happens over and over and ends up being costly.
>
> I changed my patch C to make the "unavailable" property persistent,
> instead of just using it locally in start_step_over.
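For illustration, the persistent "unavailable" bookkeeping described
above could be sketched roughly like this (all names here are
hypothetical, not the actual GDB code):

```cpp
#include <cassert>

// Hypothetical sketch of the persistent "unavailable" flag idea; the
// names and types do not match the real GDB implementation.
struct inferior_sketch
{
  // Set when a displaced-step prepare returned "unavailable"; cleared
  // when a displaced step for this inferior finishes.
  bool disp_step_buffers_full = false;

  // Assume a single buffer, as on pre-patch amd64/Linux.
  int free_buffers = 1;
};

enum class prepare_status { OK, UNAVAILABLE };

prepare_status
displaced_step_prepare (inferior_sketch &inf)
{
  // Cheap early-out: the last prepare said "no buffers", and no
  // displaced step finished since, so don't even attempt a resume.
  if (inf.disp_step_buffers_full)
    return prepare_status::UNAVAILABLE;

  if (inf.free_buffers == 0)
    {
      inf.disp_step_buffers_full = true;
      return prepare_status::UNAVAILABLE;
    }

  inf.free_buffers--;
  return prepare_status::OK;
}

void
displaced_step_finish (inferior_sketch &inf)
{
  inf.free_buffers++;
  // A buffer was released, so preparing may succeed again.
  inf.disp_step_buffers_full = false;
}
```

The point of the flag is that once it is set, start_step_over pays only
the cost of iteration, just like the pre-patch early check.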
> If a displaced step prepare returns "unavailable", we mark that this
> inferior has no more buffers available, and only reset this flag when a
> displaced step for this inferior finishes.  That way, when a thread hits
> the breakpoint and start_step_over is called, the unavailable flag is
> already set.  The last time we tried to prepare a displaced step, there
> were no buffers available, and if no displaced step has finished for
> this inferior since, then why bother asking?  As a result, we don't even
> try one resume if the buffers are already taken, just like with the
> pre-patch code.
>
> I re-did some tests with the new patch C:
>
> master:             110,736
> 9:                   19,973
> 9 + A + B + C:      107,711
> 9 + A + B + C + E:  111,135
> Everything up to
> using two displaced
> stepping buffers:   110,697
>
> Note that I dropped patch D, which was to break out of the
> start_step_over loop when getting "unavailable".  It didn't help much
> and wasn't desired functionality anyway.
>
> And as a reminder, patch E is that when returning "_OK", the arch can
> also tell us "that was the last one; any more requests would return
> unavailable".  That allows doing just one resumption attempt (at most)
> per start_step_over invocation, rather than two.
>
> What this shows is that by using everything up to patch E, we are
> pretty much on par performance-wise, but with the more flexible
> architecture.  That makes sense, as (I think) we do pretty much the
> same work.
>
> Re-thinking about patches C and E, I'm thinking that we could let the
> architecture's displaced stepping implementation decide when it sets the
> "unavailable" flag, instead of the core of GDB setting it.
> The architecture would then choose the policy it wants to adopt:
>
> - set the flag when it used its last buffer (equivalent to patch E)
> - set the flag when it returns "_UNAVAILABLE" (equivalent to patch C')
> - never set the unavailable flag (equivalent to patch 9)
>
> It also shows that using two buffers on x86-64 doesn't improve
> performance (unless there's another unfortunate bottleneck I haven't
> found).  But that's what I expected.  My hypothesis is that the actual
> step portion of the displaced step is so fast that we never really use
> the buffers in parallel.  If we start two displaced steps, by the time
> we prepare the second one, the first one has already finished and is
> waiting for GDB to clean it up.  If for some reason we could
> displaced-step an instruction that took, say, 1 second to execute, then
> having many displaced step buffers would be useful, because GDB would be
> able to start them all and have them really run in parallel.
>
> I'll prepare a user branch with all those fixup patches so you can take
> a look, and if you think it makes sense I'll be ready for a v2.

I took a look at your branch, and it makes sense to me.  I ran the perf
test, and the performance gap is practically gone on my end as well.
I'm happy with that.
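For illustration, the "patch E" idea quoted above, where a successful
prepare can also report "that was my last buffer" so the core can stop
attempting further step-overs without a wasted resume, could be
sketched roughly like this (hypothetical names, not the actual patch):

```cpp
#include <cassert>

// Hypothetical sketch: on success, the architecture distinguishes
// "OK, and more buffers remain" from "OK, but that was the last one".
// Names are illustrative only.
enum class prepare_status { OK, OK_LAST, UNAVAILABLE };

struct buffer_pool
{
  int free_buffers;
};

prepare_status
arch_prepare (buffer_pool &pool)
{
  if (pool.free_buffers == 0)
    return prepare_status::UNAVAILABLE;

  pool.free_buffers--;

  // Report whether any buffers remain, so the caller can mark the
  // inferior "full" now instead of discovering it on the next attempt.
  return pool.free_buffers == 0 ? prepare_status::OK_LAST
                                : prepare_status::OK;
}
```

Under this sketch, start_step_over would stop walking the step-over
chain as soon as it sees OK_LAST, giving at most one resumption attempt
per invocation, which matches the behavior the numbers above describe.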