Re: [PATCH][gdbsupport] Use task size in parallel_for_each

From: Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
To: Tom Tromey <tom@tromey.com>,
	Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
Cc: Pedro Alves <pedro@palves.net>
Subject: Re: [PATCH][gdbsupport] Use task size in parallel_for_each
Date: Sat, 23 Jul 2022 07:55:09 +0200
Message-ID: <805b45d3-66bb-3c49-d254-147525766ea4@suse.de>
In-Reply-To: <87leslj786.fsf@tromey.com>

On 7/22/22 21:08, Tom Tromey wrote:
> Pedro> My passerby comment is that I wonder whether we should consider
> Pedro> switching to a work stealing implementation, so that threads that
> Pedro> are done with their chunk would grab another chunk from the work
> Pedro> pool.  I think this would spare us from having to worry about
> Pedro> these distribution heuristics.
> 
> Tom> I also thought about a dynamic solution, but decided to try the
> Tom> simplest solution first.
> 
> Tom> Anyway, with a dynamic solution you still might want to decide how big
> Tom> a chunk is, for which you could still need this type of heuristic.
> 
> I think the idea of the work-stealing approach is that each worker grabs
> work as it can, and so no sizing is needed at all.  If one worker ends
> up with a very large CU, it will simply end up working on fewer CUs.
> 

Right, I understand that.

My concern was that the current parallel-for solution potentially
exploits caching behaviour by accessing a sequence of consecutive
elements, and a pure work-stealing approach is more likely to execute
non-consecutive elements, which wouldn't exploit that.  But thinking
about it now, that's reasoning from the point of view of a single cpu,
and the work-stealing approach would also allow for exploiting caching
behaviour between cpus.

Anyway, so I wondered if a hybrid approach would be a good idea: you
first schedule a percentage of the workload, say 80%, as slices
determined using size heuristics, and do the remaining 20% using
work-stealing, to counteract the discrepancy between the size
heuristics and the actual execution time.
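
To make the hybrid idea concrete, here is a minimal sketch.  It is
not gdb code: hybrid_plan, make_hybrid_plan and run_hybrid are
hypothetical names, plain std::thread stands in for the thread pool,
and the dynamic tail is handed out from a shared atomic counter as a
simple stand-in for real work-stealing.  The sizes vector is assumed
to hold the size-heuristic estimate for each element.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

struct hybrid_plan
{
  /* slice_end[t] is one past the last index of thread T's static
     slice; the slice starts at slice_end[t - 1], or 0 for T == 0.  */
  std::vector<std::size_t> slice_end;
  /* Elements from tail_start onwards are handed out dynamically.  */
  std::size_t tail_start = 0;
};

/* Split roughly STATIC_PERCENT of the total estimated size into
   N_THREADS contiguous slices of similar size.  */
hybrid_plan
make_hybrid_plan (const std::vector<std::size_t> &sizes,
		  unsigned n_threads, unsigned static_percent)
{
  assert (n_threads > 0 && static_percent <= 100);
  std::size_t total
    = std::accumulate (sizes.begin (), sizes.end (), std::size_t {0});
  std::size_t static_total = total * static_percent / 100;

  hybrid_plan plan;
  std::size_t i = 0, done = 0;
  for (unsigned t = 0; t < n_threads; ++t)
    {
      /* Cumulative size budget for threads 0..T.  */
      std::size_t limit = static_total * (t + 1) / n_threads;
      while (i < sizes.size () && done + sizes[i] <= limit)
	done += sizes[i++];
      plan.slice_end.push_back (i);
    }
  plan.tail_start = i;
  return plan;
}

/* Each thread first processes its static slice, then keeps claiming
   elements from the shared tail until none are left.  */
void
run_hybrid (const hybrid_plan &plan, std::size_t n_items,
	    const std::function<void (std::size_t)> &process)
{
  std::atomic<std::size_t> next {plan.tail_start};
  std::vector<std::thread> threads;
  for (std::size_t t = 0; t < plan.slice_end.size (); ++t)
    {
      std::size_t begin = t == 0 ? 0 : plan.slice_end[t - 1];
      std::size_t end = plan.slice_end[t];
      threads.emplace_back ([&, begin, end] ()
	{
	  for (std::size_t i = begin; i < end; ++i)
	    process (i);
	  for (;;)
	    {
	      std::size_t i = next.fetch_add (1, std::memory_order_relaxed);
	      if (i >= n_items)
		break;
	      process (i);
	    }
	});
    }
  for (std::thread &thr : threads)
    thr.join ();
}
```

With ten elements of equal size, two threads and an 80% static part,
this gives each thread a slice of four elements and leaves the last
two elements to whichever thread finishes first.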

It also occurred to me today that if you have a large CU as the last
CU, with pure work-stealing you may well end up with one thread
processing that CU while the other threads are idle, whereas with size
heuristics you may force a thread to start on it early.
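
A small simulation illustrates the effect.  This is only a sketch
under assumptions: simulate_makespan is a hypothetical helper, the
costs are made-up execution times, and each thread is assumed to take
the next element in order as soon as it becomes free.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

/* Return the time at which the last element finishes when N_THREADS
   threads each take the next element of COSTS, in order, as soon as
   they become free.  COSTS[i] is the assumed run time of element I.  */
std::size_t
simulate_makespan (const std::vector<std::size_t> &costs,
		   unsigned n_threads)
{
  assert (n_threads > 0);
  /* Min-heap of the times at which each thread becomes free.  */
  std::priority_queue<std::size_t, std::vector<std::size_t>,
		      std::greater<std::size_t>> free_at;
  for (unsigned t = 0; t < n_threads; ++t)
    free_at.push (0);

  std::size_t makespan = 0;
  for (std::size_t cost : costs)
    {
      /* The earliest-free thread picks up the next element.  */
      std::size_t finish = free_at.top () + cost;
      free_at.pop ();
      free_at.push (finish);
      makespan = std::max (makespan, finish);
    }
  return makespan;
}
```

With four threads, eight elements of cost 1 followed by one element of
cost 8 finish at time 10 in this model, because the large element only
starts at time 2.  If the large element is started first, everything
finishes at time 8.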

Thanks,
- Tom

> In this situation, work stealing might make the overall implementation
> simpler.  Perhaps the batching parameter patch (commit 82d734f7a) and
> the vector of results patch (commit f4565e4c9) could be removed.
> 
> Then, rather than using parallel_for, the DWARF reader could send N jobs
> to the thread pool, and each job would simply take the next available CU
> by incrementing an atomic counter.  When the counter reached the number
> of CUs, a job would stop.
> 
> Tom
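
The atomic-counter scheme described in the quoted text could be
sketched roughly as follows.  This is not the gdb implementation:
process_all is a hypothetical name, plain std::thread stands in for
jobs sent to the thread pool, and the process callback stands in for
whatever the DWARF reader does per CU.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

/* Call PROCESS (i) once for every index in [0, n_items), using
   N_THREADS workers.  Each worker claims the next unprocessed index
   by incrementing a shared atomic counter, and stops once the counter
   has reached N_ITEMS.  */
void
process_all (std::size_t n_items, unsigned n_threads,
	     const std::function<void (std::size_t)> &process)
{
  assert (n_threads > 0);
  std::atomic<std::size_t> next {0};

  auto worker = [&] ()
    {
      for (;;)
	{
	  /* fetch_add returns the old value, so each index is claimed
	     by exactly one worker.  */
	  std::size_t i = next.fetch_add (1, std::memory_order_relaxed);
	  if (i >= n_items)
	    break;
	  process (i);
	}
    };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < n_threads; ++t)
    threads.emplace_back (worker);
  for (std::thread &thr : threads)
    thr.join ();
}
```

No sizing is involved here: a worker that lands on a large element
simply claims fewer elements overall.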
