From: Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
To: Tom Tromey <tom@tromey.com>,
Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
Cc: Pedro Alves <pedro@palves.net>
Subject: Re: [PATCH][gdbsupport] Use task size in parallel_for_each
Date: Sat, 23 Jul 2022 07:55:09 +0200
Message-ID: <805b45d3-66bb-3c49-d254-147525766ea4@suse.de>
In-Reply-To: <87leslj786.fsf@tromey.com>
On 7/22/22 21:08, Tom Tromey wrote:
> Pedro> My passerby comment is that I wonder whether we should consider
> Pedro> switching to a work stealing implementation, so that threads that
> Pedro> are done with their chunk would grab another chunk from the work
> Pedro> pool. I think this would spare us from having to worry about
> Pedro> these distribution heuristics.
>
> Tom> I also thought about a dynamic solution, but decided to try the
> Tom> simplest solution first.
>
> Tom> Anyway, with a dynamic solution you still might want to decide how big
> Tom> a chunk is, for which you could still need this type of heuristic.
>
> I think the idea of the work-stealing approach is that each worker grabs
> work as it can, and so no sizing is needed at all. If one worker ends
> up with a very large CU, it will simply end up working on fewer CUs.
>
Right, I understand that.
My concern was that the current parallel-for solution potentially
exploits caching behaviour by having each thread access a contiguous
sequence of elements, whereas with a pure work-stealing approach a
thread is more likely to process non-adjacent elements, which wouldn't
exploit that. But thinking about it now, that's reasoning from the
point of view of a single cpu, and the work-stealing approach would
also allow for exploiting caching behaviour between cpus.
Anyway, so I wondered whether a hybrid approach would be a good idea:
first schedule a percentage of the workload, say 80%, as slices
determined using size heuristics, and then process the remaining 20%
using work-stealing, to counteract the discrepancy between the size
heuristics and the actual execution time.
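
To make the hybrid idea concrete, here is a rough sketch. It is not gdb
code: plan_static_slices and hybrid_for_each are made-up names, the 80%
is passed in as static_percent, and the "sizes" vector stands in for
whatever size heuristic is used. Note also that the dynamic part is a
shared atomic counter rather than true work stealing with per-thread
queues; that is just the simplest stand-in I could think of.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper, not a gdb API.  Returns n_threads + 1 boundaries:
// thread T statically owns elements [bounds[T], bounds[T + 1]).  Elements
// from bounds.back () onwards are left for the dynamic phase.
std::vector<std::size_t>
plan_static_slices (const std::vector<std::size_t> &sizes,
                    unsigned n_threads, unsigned static_percent)
{
  assert (n_threads > 0 && static_percent <= 100);
  std::size_t total
    = std::accumulate (sizes.begin (), sizes.end (), std::size_t {0});
  std::size_t static_total = total * static_percent / 100;

  std::vector<std::size_t> bounds {0};
  std::size_t i = 0;
  std::size_t acc = 0;
  for (unsigned t = 0; t < n_threads; ++t)
    {
      // Cumulative size that the first T + 1 slices may cover.
      std::size_t target = static_total * (t + 1) / n_threads;
      while (i < sizes.size () && acc + sizes[i] <= target)
        acc += sizes[i++];
      bounds.push_back (i);
    }
  return bounds;
}

// Hypothetical helper, not a gdb API.  Calls PROCESS exactly once for
// each element index.
void
hybrid_for_each (const std::vector<std::size_t> &sizes, unsigned n_threads,
                 unsigned static_percent,
                 const std::function<void (std::size_t)> &process)
{
  std::vector<std::size_t> bounds
    = plan_static_slices (sizes, n_threads, static_percent);
  // The dynamic phase starts where the static slices end.
  std::atomic<std::size_t> next {bounds.back ()};

  auto worker = [&] (unsigned t)
    {
      // Phase 1: the contiguous slice chosen by the size heuristic.
      for (std::size_t i = bounds[t]; i < bounds[t + 1]; ++i)
        process (i);

      // Phase 2: claim leftover elements one at a time.
      for (;;)
        {
          std::size_t i = next.fetch_add (1, std::memory_order_relaxed);
          if (i >= sizes.size ())
            break;
          process (i);
        }
    };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < n_threads; ++t)
    threads.emplace_back (worker, t);
  for (std::thread &th : threads)
    th.join ();
}
```

A thread that finishes its slice early simply picks up more of the
tail, which is where the mismatch between estimated size and actual
time would get absorbed.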
It also occurred to me today that if you have a large CU as the last
CU, with pure work-stealing you may well end up with one thread
processing that CU and the other threads idle, while if you use size
heuristics you may force a thread to start early on it.
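
A toy simulation of that effect, with made-up costs (these are not
measurements, and simulate_makespan is a hypothetical helper): each
idle worker takes the next item in list order, and we look at when the
last worker finishes.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper, not a gdb API.  Simulates N_WORKERS workers that
// each take the next item from COSTS, in order, as soon as they become
// idle.  Returns the time at which the last item finishes.
unsigned long
simulate_makespan (const std::vector<unsigned long> &costs,
                   unsigned n_workers)
{
  assert (n_workers > 0);

  // Min-heap holding the time at which each worker becomes idle.
  std::priority_queue<unsigned long, std::vector<unsigned long>,
                      std::greater<unsigned long>> free_at;
  for (unsigned w = 0; w < n_workers; ++w)
    free_at.push (0);

  unsigned long makespan = 0;
  for (unsigned long cost : costs)
    {
      // The worker that becomes idle first takes the next item.
      unsigned long finish = free_at.top () + cost;
      free_at.pop ();
      free_at.push (finish);
      makespan = std::max (makespan, finish);
    }
  return makespan;
}
```

With 4 workers, eight items of cost 1 followed by one item of cost 8
give a finish time of 10, because the big item only starts at time 2
while three workers sit idle. With the cost-8 item first, the finish
time is 8.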
Thanks,
- Tom
> In this situation, work stealing might make the overall implementation
> simpler. Perhaps the batching parameter patch (commit 82d734f7a) and
> the vector of results patch (commit f4565e4c9) could be removed.
>
> Then, rather than using parallel_for, the DWARF reader could send N jobs
> to the thread pool, and each job would simply take the next available CU
> by incrementing an atomic counter. When the counter reached the number
> of CUs, a job would stop.
>
> Tom
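
For reference, the atomic-counter scheme quoted above could look
roughly like the sketch below. This uses plain std::thread instead of
gdb's thread pool to stay self-contained, and run_with_atomic_counter
is a made-up name, not something in gdbsupport.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper, not a gdb API.  Runs PROCESS (I) once for every
// index I in [0, N_ITEMS), using N_JOBS threads that each claim the next
// unclaimed index from a shared atomic counter.
void
run_with_atomic_counter (std::size_t n_items, unsigned n_jobs,
                         const std::function<void (std::size_t)> &process)
{
  assert (n_jobs > 0);
  std::atomic<std::size_t> next {0};

  auto job = [&] ()
    {
      for (;;)
        {
          // fetch_add returns the previous value, so each index is
          // handed to exactly one job.
          std::size_t i = next.fetch_add (1, std::memory_order_relaxed);
          if (i >= n_items)
            break;  // The counter reached the number of items: stop.
          process (i);
        }
    };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < n_jobs; ++t)
    threads.emplace_back (job);
  for (std::thread &th : threads)
    th.join ();
}
```

In the DWARF reader case n_items would be the number of CUs, and a job
that lands on a very large CU just ends up claiming fewer indices
overall.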