On 7/15/22 21:05, Tom Tromey wrote: >>>>>> "Tom" == Tom de Vries via Gdb-patches writes: > > Tom> This introduces a performance regression on a particular test-case I happened > Tom> to use: > Tom> ... > Tom> $ for n in $(seq 1 10); do \ > Tom> time gdb -q -batch libxul.so.debug 2>&1 | grep real:; \ > Tom> done > Tom> ... > Tom> so revert to the original schedule by reducing the worker threads: > Tom> ... > > This seems like making a change and then undoing it somewhere else? > > Tom> Still, the performance experiment yields a slight performance loss. > > Sounds bad. > > Tom> if (n_threads < 0) > Tom> - n_threads = std::thread::hardware_concurrency (); > Tom> + { > Tom> + n_threads = std::thread::hardware_concurrency (); > Tom> + if (n_threads > 0) > Tom> + /* Account for main thread. */ > Tom> + n_threads--; > Tom> + } > > I think it's better if the setting just directly controls how many > threads there are. Then elsewhere we can decide what that means -- like > if it performs better with the defaults to not do any work in the main > thread, then parallel_for_each can be modified to just send tasks to the > workers and do nothing in the main thread except wait for results. > > Tom> size_t elts_per_thread = 0; > [...] > Tom> + elts_per_thread = n_elements / n_threads; > > The initial declaration can be removed and then this latter line can > declare the variable as well. > > Tom> for (int i = 0; i < count; ++i) > Tom> { > Tom> RandomIt end = first + elts_per_thread; > Tom> + if (i < left_over) > Tom> + end++; > > It may be nice to mention the distribution of leftovers in a comment > somewhere. I've ended up committing this patch, which just does the leftovers distribution part, with some comments added. Thanks, - Tom