Re: [PATCH][gdbsupport] Use task size in parallel_for_each

Mirror of the gdb-patches mailing list

From: Pedro Alves <pedro@palves.net>
To: Tom de Vries <tdevries@suse.de>, gdb-patches@sourceware.org
Cc: Tom Tromey <tom@tromey.com>
Subject: Re: [PATCH][gdbsupport] Use task size in parallel_for_each
Date: Thu, 21 Jul 2022 18:35:09 +0100
Message-ID: <4fc23fcd-c15d-7622-8b51-cc48cd3cba16@palves.net>
In-Reply-To: <20220718194219.GA16823@delia.home>

On 2022-07-18 8:42 p.m., Tom de Vries via Gdb-patches wrote:
> Hi,
> 
> Ensuring a fair distribution over the worker threads and main thread in terms
> of number of CUs might not be the most efficient way, given that CUs can vary
> in size.
> 
> Fix this by:
> - adding a task_size_ptr parameter to parallel_for_each,
>   defaulting to nullptr,
> - using per_cu->get_length () as the task size in the parallel_for_each
>   in dwarf2_build_psymtabs_hard, and
> - using the task size in parallel_for_each to distribute similarly-sized tasks
>   to the threads.
> 
> I've used this experiment to verify the performance impact:
> ...
> $ for n in $(seq 1 10); do \
>     time gdb -q -batch ~/firefox/libxul.so-93.0-1.1.x86_64.debug \
>     2>&1 \
>     | grep "real:"; \
>   done
> ...
> and without the patch got:
> ...
> real: 4.71
> real: 4.88
> real: 4.29
> real: 4.30
> real: 4.65
> real: 4.27
> real: 4.27
> real: 4.27
> real: 4.75
> real: 4.41
> ...
> and with the patch:
> ...
> real: 3.68
> real: 3.81
> real: 3.80
> real: 3.68
> real: 3.75
> real: 3.69
> real: 3.69
> real: 3.74
> real: 3.67
> real: 3.74
> ...
> so that seems a reasonable improvement.

Nice.

> 
> With parallel_for_each_debug set to true, we get some more details about the
> difference in behaviour.  Without the patch we have:
> ...
> Parallel for: n_elements: 2818
> Parallel for: minimum elements per thread: 1
> Parallel for: elts_per_thread: 704
> Parallel for: elements on worker thread 0       : 705
> Parallel for: elements on worker thread 1       : 705
> Parallel for: elements on worker thread 2       : 704
> Parallel for: elements on worker thread 3       : 0
> Parallel for: elements on main thread           : 704
> ...
> and with the patch:
> ...
> Parallel for: n_elements: 2818
> Parallel for: total_size: 1483674865
> Parallel for: size_per_thread: 370918716
> Parallel for: elements on worker thread 0       : 752   (size: 371811790)
> Parallel for: elements on worker thread 1       : 360   (size: 371509370)
> Parallel for: elements on worker thread 2       : 1130  (size: 372681710)
> Parallel for: elements on worker thread 3       : 0     (size: 0)
> Parallel for: elements on main thread           : 576   (size: 367671995)
> ...
> 
> Tested on x86_64-linux.
> 
> Any comments?
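
For reference, the size-based distribution described in the quoted patch can be sketched roughly as below.  This is a minimal illustration and not the patch's code: split_by_size and its parameters are hypothetical names, and closing a chunk as soon as it reaches its share is an assumption that merely matches the debug output above, where the worker chunks slightly overshoot size_per_thread and the last chunk takes the remainder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not GDB's parallel_for_each: split SIZES into at
// most N_THREADS contiguous chunks of roughly equal total size.  Returns
// the exclusive end index of each chunk.
std::vector<std::size_t>
split_by_size (const std::vector<unsigned int> &sizes, std::size_t n_threads)
{
  assert (n_threads > 0);

  unsigned long long total_size = 0;
  for (unsigned int s : sizes)
    total_size += s;

  const unsigned long long size_per_thread = total_size / n_threads;

  std::vector<std::size_t> chunk_ends;
  unsigned long long chunk_size = 0;
  for (std::size_t i = 0; i < sizes.size (); ++i)
    {
      chunk_size += sizes[i];
      // Close the chunk once it reaches its share.  The last chunk stays
      // open so that it collects whatever remains.
      if (chunk_size >= size_per_thread && chunk_ends.size () + 1 < n_threads)
        {
          chunk_ends.push_back (i + 1);
          chunk_size = 0;
        }
    }
  chunk_ends.push_back (sizes.size ());
  return chunk_ends;
}
```

With sizes {8, 1, 1, 1, 1, 4, 4} and two threads, the first chunk closes after three elements (size 10) and the second takes the remaining four (size 10), even though the element counts differ.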

I'll defer to Tromey.  Below are just a passerby comment and, if we follow this
approach, a comment on the code itself.

My passerby comment is that I wonder whether we should consider switching to a
work-stealing implementation, so that threads that are done with their chunk would grab
another chunk from the work pool.  I think this would spare us from having to worry about
these distribution heuristics.
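
To make the idea concrete, here is a rough sketch of threads grabbing chunks from a shared pool.  It is not GDB code and not a proposal for the actual interface: for_each_dynamic, its chunk_size parameter, and the use of raw std::thread instead of GDB's thread pool are all assumptions for illustration.  Strictly speaking it is self-scheduling from a single shared cursor rather than per-thread deques with stealing, but it has the property described above: a thread that finishes early simply claims the next chunk.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch: every thread repeatedly claims the next fixed-size
// chunk from a shared atomic cursor until the range is exhausted.
template<typename RandomIt, typename RangeFunction>
void
for_each_dynamic (unsigned n_threads, std::size_t chunk_size,
                  RandomIt first, RandomIt last, RangeFunction callback)
{
  assert (n_threads > 0 && chunk_size > 0);

  const std::size_t n_elements = last - first;
  std::atomic<std::size_t> next (0);

  auto worker = [&] ()
    {
      for (;;)
        {
          // Claim [begin, end); fetch_add hands each chunk to one thread.
          std::size_t begin = next.fetch_add (chunk_size);
          if (begin >= n_elements)
            break;
          std::size_t end = std::min (begin + chunk_size, n_elements);
          callback (first + begin, first + end);
        }
    };

  std::vector<std::thread> threads;
  for (unsigned i = 1; i < n_threads; ++i)
    threads.emplace_back (worker);
  worker ();  // The calling thread takes part as well.
  for (std::thread &t : threads)
    t.join ();
}
```

The callback must be safe to call concurrently on disjoint ranges.  Smaller chunks balance better but add more contention on the cursor, so the chunk size would still need some tuning.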

> @@ -7059,6 +7059,13 @@ dwarf2_build_psymtabs_hard (dwarf2_per_objfile *per_objfile)
>  
>      using iter_type = decltype (per_bfd->all_comp_units.begin ());
>  
> +    std::function<unsigned int (iter_type)> task_size
> +      = [=] (iter_type iter)
> +      {
> +	dwarf2_per_cu_data *per_cu = iter->get ();
> +	return per_cu->length ();
> +      };
> +
>      /* Each thread returns a pair holding a cooked index, and a vector
>         of errors that should be printed.  The latter is done because
>         GDB's I/O system is not thread-safe.  run_on_main_thread could be
> @@ -7087,7 +7094,7 @@ dwarf2_build_psymtabs_hard (dwarf2_per_objfile *per_objfile)
>  	      }
>  	  }
>  	return result_type (thread_storage.release (), std::move (errors));
> -      });
> +      }, &task_size);
>  
>      /* Only show a given exception a single time.  */
>      std::unordered_set<gdb_exception> seen_exceptions;
> diff --git a/gdbsupport/parallel-for.h b/gdbsupport/parallel-for.h
> index bf40f125f0f..3c9269574df 100644
> --- a/gdbsupport/parallel-for.h
> +++ b/gdbsupport/parallel-for.h
> @@ -134,7 +134,9 @@ typename gdb::detail::par_for_accumulator<
>      typename std::result_of<RangeFunction (RandomIt, RandomIt)>::type
>    >::result_type
>  parallel_for_each (unsigned n, RandomIt first, RandomIt last,
> -		   RangeFunction callback)
> +		   RangeFunction callback,
> +		   std::function<unsigned int(RandomIt)> *task_size_ptr
> +		     = (std::function<unsigned int(RandomIt)> *)nullptr)

That use of a std::function pointer looks odd.  AFAICT, TASK_SIZE_PTR is only ever called
as a callback by parallel_for_each, for setup, in the calling thread, and isn't stored
anywhere, right?  If so, gdb::function_view should work instead: it is lightweight and
nullable, meaning you don't need a pointer.

And then, at the caller, just using a lambda instead of a std::function should work too:

    auto task_size = [=] (iter_type iter)
      {
        dwarf2_per_cu_data *per_cu = iter->get ();
        return per_cu->length ();
      };
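
To illustrate why a nullable, non-owning view removes the need for the pointer, here is a self-contained sketch.  callable_view below is a hypothetical stand-in written for this example, not gdb::function_view itself, and unit, iter_type and total_size are likewise made-up names.  The consumer is deliberately not a template, so the lambda converts implicitly at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a nullable, non-owning callable reference.
template<typename Sig> class callable_view;

template<typename Res, typename... Args>
class callable_view<Res (Args...)>
{
public:
  // A null view; also serves as the default constructor.
  callable_view (std::nullptr_t = nullptr) {}

  // Refer to CALLABLE without copying it.  The caller must keep it alive.
  template<typename Callable,
           typename = typename std::enable_if<
             !std::is_same<Callable, callable_view>::value>::type>
  callable_view (Callable &callable)
    : m_object (const_cast<void *> (static_cast<const void *> (&callable))),
      m_invoker ([] (void *obj, Args... args) -> Res
                 { return (*static_cast<Callable *> (obj)) (args...); })
  {}

  explicit operator bool () const { return m_invoker != nullptr; }

  Res operator() (Args... args) const
  {
    assert (m_invoker != nullptr);
    return m_invoker (m_object, args...);
  }

private:
  void *m_object = nullptr;
  Res (*m_invoker) (void *, Args...) = nullptr;
};

// Hypothetical element type and consumer, for illustration only.
struct unit { unsigned int length; };
using iter_type = std::vector<unit>::const_iterator;

// Sums task sizes; without a size callback every element counts as 1.
unsigned long
total_size (iter_type first, iter_type last,
            callable_view<unsigned int (iter_type)> task_size = nullptr)
{
  unsigned long total = 0;
  for (iter_type it = first; it != last; ++it)
    total += task_size ? task_size (it) : 1;
  return total;
}
```

A caller can then pass a plain lambda, or nothing at all:

    auto task_size = [] (iter_type iter) { return iter->length; };
    total_size (units.cbegin (), units.cend (), task_size);
    total_size (units.cbegin (), units.cend ());

Note that if the consumer is itself a template on the iterator type, as parallel_for_each is, the lambda cannot be deduced against the view parameter directly, so the caller may need to construct the view explicitly.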
