Subject: Re: [PATCH 12/12] gdb: use two displaced step buffers on amd64/Linux
To: Pedro Alves, Simon Marchi, gdb-patches@sourceware.org
References: <20201110214614.2842615-1-simon.marchi@efficios.com> <20201110214614.2842615-13-simon.marchi@efficios.com> <7040e2ee-4d28-a83e-22df-20b2ace082bb@palves.net>
  <88518922-ffb6-f221-f3b8-569c5577ae5a@simark.ca>
From: Simon Marchi
Message-ID: <91426053-1ce6-3154-3635-cef5390248f4@simark.ca>
Date: Wed, 25 Nov 2020 15:07:36 -0500
In-Reply-To: <88518922-ffb6-f221-f3b8-569c5577ae5a@simark.ca>

On 2020-11-25 1:26 a.m., Simon Marchi wrote:
> Since patches C and D are about reducing the work in start_step_over, I
> think it shows clearly that looping over all threads there is what gives
> the biggest hit.  Before this patch, we break out as soon as we manage
> to initiate a displaced step, whereas in patch 9 we constantly go
> through all threads in the list

Sorry, that's incorrect.  Before this patch, when a displaced step is
already started in an inferior, we skip any subsequent thread of that
inferior.  This is done very early, before calling
thread_still_needs_step_over.  If I modify patch C to do the same, the
numbers look more like this:

(And to clarify my previous mail: the labels were unclear when I wrote
"#9 + A", "9 + B" and so on.  It should have said "#9 + A",
"#9 + A + B", and so on.  Otherwise it gives the impression that I
tested the fixup patches individually, which I didn't.)

  master:               109,362
  #9:                    19,815
  #9 + A:                19,463
  #9 + A + B:            20,152
  #9 + A + B + C:       101,170
  #9 + A + B + C + D:   103,948

And for fun, with two buffers:

  #9 + A + B + C + D + #10 to #12:  105,864

So, with those mitigations (especially patch C), it's not as bad as it
was, but it is still slower than master -- even when using two buffers,
which is meant to speed things up.  So it's not good.
So, patch C implements this: when a displaced step prepare returns
"unavailable" for an inferior, we don't try to start any more displaced
steps for that inferior (for the rest of that start_step_over call).
Unfortunately, that does not allow implementing "perfect" displaced
step buffer sharing.  Maybe we can settle for a compromise, though,
since sharing buffers is just an optimization and not required:
implement something like patch C, but also implement buffer sharing
for threads that need to step over the same PC.  If the threads are
ordered in the chain in a lucky way, such that multiple threads at the
same PC are handled before the buffers are all occupied, then these
threads will share a buffer.  Once all buffers are occupied, we quit,
even if there might be more threads further down the list able to
share a buffer.

To illustrate, let's say you have these threads in the step over chain,
which require stepping over PCs A, B and C:

  T1 at PC A
  T2 at PC A
  T3 at PC B
  T4 at PC A
  T5 at PC C
  T6 at PC A

With two buffers, we'll be able to accommodate the first 4 threads.
When we try to prepare the displaced step for T5, the implementation
will return "unavailable", so T6 won't even be considered.  So in the
end, T1, T2 and T4 will share a buffer, while T3 will be by itself in
the other buffer.

I think that strikes a good balance: as long as we haven't gotten an
"unavailable", the expensive work we do to prepare displaced steps is
useful work, since we'll actually resume these threads.  After that,
though, we would be plowing through a list of hundreds of threads,
doing expensive work for each, in the hope of finding some for which
the prepare may work.  That's not a very good investment (well, it
very much depends on the workload).

We still have to figure out why the performance regresses compared to
master.  One thing is that we still do an extra resume attempt each
time: one resume where the displaced step prepare succeeds, and one
more where it fails.
That's twice as much work as before, where we did one successful
prepare and then skipped the next ones.  I'll make an experimental
change such that the arch can say "the prepare worked, but now I'm out
of buffers", just to see whether (and by how much) that improves
performance.

But I'm hitting send on this email now, so that you don't spend too
much time rebuking the lies of my previous email :).

Simon