Message-ID: <805b45d3-66bb-3c49-d254-147525766ea4@suse.de>
Date: Sat, 23 Jul 2022 07:55:09 +0200
Subject: Re: [PATCH][gdbsupport] Use task size in parallel_for_each
To: Tom Tromey , Tom de Vries via Gdb-patches
References: <20220718194219.GA16823@delia.home> <4fc23fcd-c15d-7622-8b51-cc48cd3cba16@palves.net> <75931310-5dcd-059d-9221-6c94dbcd231f@suse.de> <87leslj786.fsf@tromey.com>
In-Reply-To: <87leslj786.fsf@tromey.com>
From: Tom de Vries via Gdb-patches
Reply-To: Tom de Vries
Cc: Pedro Alves

On 7/22/22 21:08, Tom Tromey wrote:
> Pedro> My passerby comment is that I wonder whether we should consider
> Pedro> switching to a work stealing implementation, so that threads that
> Pedro> are done with their chunk would grab another chunk from the work
> Pedro> pool.  I think this would spare us from having to worry about
> Pedro> these distribution heuristics.
>
> Tom> I also thought about a dynamic solution, but decided to try the
> Tom> simplest solution first.
>
> Tom> Anyway, with a dynamic solution you still might want to decide how
> Tom> big a chunk is, for which you could still need this type of
> Tom> heuristics.
>
> I think the idea of the work-stealing approach is that each worker grabs
> work as it can, and so no sizing is needed at all.  If one worker ends
> up with a very large CU, it will simply end up working on fewer CUs.
>

Right, I understand that.  My concern was that the current parallel-for
solution potentially exploits caching behaviour by accessing a sequence
of elements, whereas a pure work-stealing approach is more likely to
execute non-consecutive elements, which wouldn't exploit that.

But thinking about it now, that's reasoning from the point of view of a
single cpu, and the work-stealing approach would also allow for
exploiting caching behaviour between cpus.

Anyway, so I wondered whether a hybrid approach would be a good idea:
first schedule a percentage of the workload, say 80%, in slices
determined using size heuristics, and do the remaining 20% using
work-stealing, to counteract the discrepancy between the size heuristics
and the actual execution times.

It also occurred to me today that if you have a large CU as the last CU,
with pure work-stealing you may well end up with one thread processing
that CU while the other threads sit idle, whereas with size heuristics
you may force a thread to start on it early.

Thanks,
- Tom

> In this situation, work stealing might make the overall implementation
> simpler.  Perhaps the batching parameter patch (commit 82d734f7a) and
> the vector of results patch (commit f4565e4c9) could be removed.
>
> Then, rather than using parallel_for, the DWARF reader could send N jobs
> to the thread pool, and each job would simply take the next available CU
> by incrementing an atomic counter.  When the counter reached the number
> of CUs, a job would stop.
>
> Tom
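
As a rough, self-contained illustration of the atomic-counter scheme
described above (using plain std::thread rather than GDB's thread pool;
the names process_cu, num_cus and num_workers are hypothetical
placeholders, not GDB identifiers), it could look something like this:

#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

static void
process_cu (int cu_index)
{
  /* Stand-in for the per-CU work (e.g. DWARF indexing).  */
  std::printf ("processing CU %d\n", cu_index);
}

int
main ()
{
  const int num_cus = 100;
  const int num_workers = 4;

  std::atomic<int> next_cu {0};
  std::vector<std::thread> workers;

  for (int w = 0; w < num_workers; ++w)
    workers.emplace_back ([&] ()
      {
	/* Each worker claims the next available CU until none remain.
	   A worker that draws a large CU simply ends up claiming fewer
	   CUs overall; no up-front chunk sizing is involved.  */
	while (true)
	  {
	    int cu = next_cu.fetch_add (1, std::memory_order_relaxed);
	    if (cu >= num_cus)
	      break;
	    process_cu (cu);
	  }
      });

  for (std::thread &t : workers)
    t.join ();

  return 0;
}

The hybrid idea would only differ in that the first ~80% of the elements
would be pre-assigned in heuristically sized slices, with the counter
loop above covering the remaining tail.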