From: Simon Marchi <simon.marchi@polymtl.ca>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH v2 1/4] gdbsupport: use dynamic partitioning in gdb::parallel_for_each
Date: Mon, 8 Sep 2025 13:23:02 -0400 [thread overview]
Message-ID: <71b6ba0c-3463-4f8f-b7b4-da43c83ba667@polymtl.ca> (raw)
In-Reply-To: <e7c65bcc-f55e-47a7-b899-d9927576368a@polymtl.ca>
Ping.
On 8/28/25 1:00 PM, Simon Marchi wrote:
> Ping.
>
> On 8/5/25 1:01 PM, Simon Marchi wrote:
>> Ping.
>>
>> On 7/3/25 4:01 PM, simon.marchi@polymtl.ca wrote:
>>> From: Simon Marchi <simon.marchi@polymtl.ca>
>>>
>>> gdb::parallel_for_each uses static partitioning of the workload, meaning
>>> that each worker thread receives a similar number of work items. Change
>>> it to use dynamic partitioning, where worker threads pull work items
>>> from a shared work queue when they need to.
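[Editor's illustration, not part of the original mail: a rough sketch of the dynamic-partitioning idea, much simpler than the actual implementation in gdbsupport/parallel-for.h below. The names `parallel_double`, `n_items`, etc. are invented. Workers claim disjoint batches from a shared atomic cursor until the range is exhausted:]

```cpp
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

/* Each worker repeatedly claims a batch of BATCH_SIZE items via
   fetch_add, which hands out disjoint ranges, then processes them.
   Threads that draw "easy" batches simply come back for more, so the
   load balances itself.  */

std::vector<int>
parallel_double (int n_items, int n_threads = 4, int batch_size = 10)
{
  std::vector<int> results (n_items, 0);
  std::atomic<int> next {0};

  auto task = [&] ()
    {
      for (;;)
	{
	  /* Claim the next batch; no two threads get the same items.  */
	  int first = next.fetch_add (batch_size);
	  if (first >= n_items)
	    break;
	  int last = std::min (first + batch_size, n_items);
	  for (int i = first; i < last; ++i)
	    results[i] = i * 2;
	}
    };

  std::vector<std::thread> threads;
  for (int i = 0; i < n_threads; ++i)
    threads.emplace_back (task);
  for (auto &t : threads)
    t.join ();
  return results;
}
```

The real patch uses a compare-exchange loop instead of fetch_add so it can clamp the claimed range to the end of the input, but the load-balancing principle is the same.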
>>>
>>> Note that gdb::parallel_for_each is currently only used for processing
>>> minimal symbols in GDB. I am looking at improving the startup
>>> performance of GDB, where minimal symbol processing is one step.
>>>
>>> With static partitioning, there is a risk of workload imbalance if some
>>> threads receive "easier" work than others: some threads sit idle while
>>> others are still working through their share. This is not desirable,
>>> because gdb::parallel_for_each takes as long as the slowest thread.
>>>
>>> When loading a file with a lot of minimal symbols (~600k) in GDB, with
>>> "maint set per-command time on", I observe some imbalance:
>>>
>>> Time for "minsyms install worker": wall 0.732, user 0.550, sys 0.041, user+sys 0.591, 80.7 % CPU
>>> Time for "minsyms install worker": wall 0.881, user 0.722, sys 0.071, user+sys 0.793, 90.0 % CPU
>>> Time for "minsyms install worker": wall 2.107, user 1.804, sys 0.147, user+sys 1.951, 92.6 % CPU
>>> Time for "minsyms install worker": wall 2.351, user 2.003, sys 0.151, user+sys 2.154, 91.6 % CPU
>>> Time for "minsyms install worker": wall 2.611, user 2.322, sys 0.235, user+sys 2.557, 97.9 % CPU
>>> Time for "minsyms install worker": wall 3.074, user 2.729, sys 0.203, user+sys 2.932, 95.4 % CPU
>>> Time for "minsyms install worker": wall 3.486, user 3.074, sys 0.260, user+sys 3.334, 95.6 % CPU
>>> Time for "minsyms install worker": wall 3.927, user 3.475, sys 0.336, user+sys 3.811, 97.0 % CPU
>>> ^
>>> ----´
>>>
>>> The fastest thread took 0.732 seconds to complete its work (and then sat
>>> still), while the slowest took 3.927 seconds. This means the
>>> parallel_for_each took a bit less than 4 seconds.
>>>
>>> Even if the number of minimal symbols assigned to each worker is the
>>> same, I suppose that some symbols (e.g. those that need demangling) take
>>> longer to process, which could explain the imbalance.
>>>
>>> With this patch, things are much more balanced:
>>>
>>> Time for "minsym install worker": wall 2.807, user 2.222, sys 0.144, user+sys 2.366, 84.3 % CPU
>>> Time for "minsym install worker": wall 2.808, user 2.073, sys 0.131, user+sys 2.204, 78.5 % CPU
>>> Time for "minsym install worker": wall 2.804, user 1.994, sys 0.151, user+sys 2.145, 76.5 % CPU
>>> Time for "minsym install worker": wall 2.808, user 1.977, sys 0.135, user+sys 2.112, 75.2 % CPU
>>> Time for "minsym install worker": wall 2.808, user 2.061, sys 0.142, user+sys 2.203, 78.5 % CPU
>>> Time for "minsym install worker": wall 2.809, user 2.012, sys 0.146, user+sys 2.158, 76.8 % CPU
>>> Time for "minsym install worker": wall 2.809, user 2.178, sys 0.137, user+sys 2.315, 82.4 % CPU
>>> Time for "minsym install worker": wall 2.820, user 2.141, sys 0.170, user+sys 2.311, 82.0 % CPU
>>> ^
>>> ----´
>>>
>>> In this version, the parallel_for_each took about 2.8 seconds,
>>> representing a reduction of ~1.2 seconds for this step. Not
>>> life-changing, but it's still good I think.
>>>
>>> Note that this patch helps when loading big programs. My go-to test
>>> program for this is telegram-desktop that I built from source. For
>>> small programs (including loading gdb itself), it makes no perceptible
>>> difference.
>>>
>>> Now the technical bits:
>>>
>>> - One impact that this change has on the minimal symbol processing
>>> specifically is that not all calls to compute_and_set_names (a
>>> critical region guarded by a mutex) are done at the end of each
>>> worker thread's task anymore.
>>>
>>> Before this patch, each thread would compute the names and hash values for
>>> all the minimal symbols it has been assigned, and then would call
>>> compute_and_set_names for all of them, while holding the mutex (thus
>>> preventing other threads from doing this same step).
>>>
>>> With the shared work queue approach, each thread grabs a batch of
>>> minimal symbols, computes the names and hash values for them, and
>>> then calls compute_and_set_names (with the mutex held) for this batch
>>> only. It then repeats that until the work queue is empty.
>>>
>>> There are therefore more small and spread out compute_and_set_names
>>> critical sections, instead of just one per worker thread at the end.
>>> Given that before this patch the work was not well balanced among worker
>>> threads, I guess that threads would enter that critical region at
>>> different times, causing little contention.
>>>
>>> In the "with this patch" results, the CPU utilization numbers are not
>>> as good, suggesting that there is some contention. But I don't know
>>> if it's contention due to the compute_and_set_names critical section
>>> or the shared work queue critical section. That can be investigated
>>> later. In any case, what ultimately counts is the wall time, which
>>> improves.
>>>
>>> - One choice I had to make was to decide how many work items (in this
>>> case minimal symbols) each worker should pop when getting work from
>>> the shared queue. The general wisdom is that:
>>>
>>> - popping too few items, and the synchronization overhead becomes
>>> significant, and the total processing time increases
>>> - popping too many items, and we get some imbalance back, and the
>>> total processing time increases again
>>>
>>> I experimented using a dynamic batch size proportional to the number
>>> of remaining work items. It worked well in some cases but not
>>> always. So I decided to keep it simple, with a fixed batch size.
>>> That can always be tweaked later.
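[Editor's illustration, not part of the original mail: a hypothetical version of the "batch size proportional to remaining work" scheme the author experimented with, sometimes called guided scheduling. The function name and the divisor 4 are invented tuning choices, not anything from the patch:]

```cpp
#include <algorithm>
#include <cstddef>

/* Guided batch size: large batches early in the run (low
   synchronization overhead), shrinking batches near the end (better
   balance).  Never returns 0, so progress is always made.  */

std::size_t
guided_batch_size (std::size_t remaining, std::size_t n_threads)
{
  return std::max<std::size_t> (1, remaining / (n_threads * 4));
}
```

The patch deliberately avoids this in favor of a fixed batch size, since the dynamic scheme did not win consistently in the author's measurements.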
>>>
>>> - I want to still be able to use scoped_time_it to measure the time
>>> that each worker thread spent working on the task. I find it really
>>> handy when measuring the performance impact of changes.
>>>
>>> Unfortunately, the current interface of gdb::parallel_for_each, which
>>> receives a simple callback, is not well-suited for that, once I
>>> introduce the dynamic partitioning. The callback would get called
>>> once for each work item batch (multiple times per worker thread),
>>> so it's not possible to maintain a per-worker thread object for the
>>> duration of the parallel for.
>>>
>>> To allow this, I changed gdb::parallel_for_each to receive a worker
>>> type as a template parameter. Each worker thread creates one local
>>> instance of that type, and calls operator() on it for each work item
>>> batch. By having a scoped_time_it object as a field of that worker,
>>> we can get the timings per worker thread.
>>>
>>> The drawback of this approach is that we must now define the
>>> parallel task in a separate class and manually capture any context we
>>> need as fields of that class.
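[Editor's illustration, not part of the original mail: a miniature of the Worker-as-template-parameter pattern, with invented names (`run_batches`, `counting_worker`). The driver constructs one Worker per thread, forwarding extra arguments to its constructor, then invokes operator() once per batch; because the worker object outlives all its batches, per-thread state such as a scoped_time_it can live in its fields:]

```cpp
#include <algorithm>
#include <utility>

/* Drive WORKER over [FIRST, LAST) in batches of BATCH items,
   forwarding ARGS to the Worker constructor.  A real parallel-for
   would run one such loop per worker thread.  */

template<class Worker, class... Args>
void
run_batches (int first, int last, int batch, Args &&...args)
{
  Worker worker (std::forward<Args> (args)...);
  for (int lo = first; lo < last; lo += batch)
    worker (lo, std::min (lo + batch, last));
}

/* Example worker: accumulates how many items it processed across all
   of its batches, in a plain field -- no per-batch setup needed.  */

struct counting_worker
{
  explicit counting_worker (int &out) : m_out (out) {}
  void operator() (int lo, int hi) { m_out += hi - lo; }
private:
  int &m_out;
};
```

This mirrors how minimal_symbol_install_worker in the patch keeps its scoped_time_it alive for the whole run while operator() is called once per popped batch.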
>>>
>>> Change-Id: Ibf1fea65c91f76a95b9ed8f706fd6fa5ef52d9cf
>>> ---
>>> gdb/minsyms.c | 133 ++++++++++++--------
>>> gdb/unittests/parallel-for-selftests.c | 22 +++-
>>> gdbsupport/parallel-for.h | 164 ++++++++++++-------------
>>> 3 files changed, 178 insertions(+), 141 deletions(-)
>>>
>>> diff --git a/gdb/minsyms.c b/gdb/minsyms.c
>>> index 4a6459a6f2d2..e353a8a9ba00 100644
>>> --- a/gdb/minsyms.c
>>> +++ b/gdb/minsyms.c
>>> @@ -1392,6 +1392,81 @@ build_minimal_symbol_hash_tables
>>> }
>>> }
>>>
>>> +/* gdb::parallel_for_each worker to compute minimal symbol names and hashes. */
>>> +
>>> +class minimal_symbol_install_worker
>>> +{
>>> +public:
>>> + minimal_symbol_install_worker
>>> + (minimal_symbol *msymbols,
>>> + gdb::array_view<computed_hash_values> hash_values,
>>> + objfile_per_bfd_storage *per_bfd,
>>> + std::mutex &demangled_mutex)
>>> + : m_time_it ("minsym install worker"),
>>> + m_msymbols (msymbols),
>>> + m_hash_values (hash_values),
>>> + m_per_bfd (per_bfd),
>>> + m_demangled_mutex (demangled_mutex)
>>> + {}
>>> +
>>> + void operator() (minimal_symbol *start, minimal_symbol *end) noexcept
>>> + {
>>> + for (minimal_symbol *msym = start; msym < end; ++msym)
>>> + {
>>> + size_t idx = msym - m_msymbols;
>>> + m_hash_values[idx].name_length = strlen (msym->linkage_name ());
>>> +
>>> + if (!msym->name_set)
>>> + {
>>> + /* This will be freed later, by compute_and_set_names. */
>>> + gdb::unique_xmalloc_ptr<char> demangled_name
>>> + = symbol_find_demangled_name (msym, msym->linkage_name ());
>>> + msym->set_demangled_name (demangled_name.release (),
>>> + &m_per_bfd->storage_obstack);
>>> + msym->name_set = 1;
>>> + }
>>> +
>>> + /* This mangled_name_hash computation has to be outside of
>>> + the name_set check, or compute_and_set_names below will
>>> + be called with an invalid hash value. */
>>> + m_hash_values[idx].mangled_name_hash
>>> + = fast_hash (msym->linkage_name (), m_hash_values[idx].name_length);
>>> + m_hash_values[idx].minsym_hash = msymbol_hash (msym->linkage_name ());
>>> +
>>> + /* We only use this hash code if the search name differs
>>> + from the linkage name. See the code in
>>> + build_minimal_symbol_hash_tables. */
>>> + if (msym->search_name () != msym->linkage_name ())
>>> + m_hash_values[idx].minsym_demangled_hash
>>> + = search_name_hash (msym->language (), msym->search_name ());
>>> + }
>>> +
>>> + {
>>> + /* To limit how long we hold the lock, we only acquire it here
>>> + and not while we demangle the names above. */
>>> +#if CXX_STD_THREAD
>>> + std::lock_guard<std::mutex> guard (m_demangled_mutex);
>>> +#endif
>>> + for (minimal_symbol *msym = start; msym < end; ++msym)
>>> + {
>>> + size_t idx = msym - m_msymbols;
>>> + std::string_view name (msym->linkage_name (),
>>> + m_hash_values[idx].name_length);
>>> + hashval_t hashval = m_hash_values[idx].mangled_name_hash;
>>> +
>>> + msym->compute_and_set_names (name, false, m_per_bfd, hashval);
>>> + }
>>> + }
>>> + }
>>> +
>>> +private:
>>> + scoped_time_it m_time_it;
>>> + minimal_symbol *m_msymbols;
>>> + gdb::array_view<computed_hash_values> m_hash_values;
>>> + objfile_per_bfd_storage *m_per_bfd;
>>> + std::mutex &m_demangled_mutex;
>>> +};
>>> +
>>> /* Add the minimal symbols in the existing bunches to the objfile's official
>>> minimal symbol table. In most cases there is no minimal symbol table yet
>>> for this objfile, and the existing bunches are used to create one. Once
>>> @@ -1478,59 +1553,11 @@ minimal_symbol_reader::install ()
>>> std::vector<computed_hash_values> hash_values (mcount);
>>>
>>> msymbols = m_objfile->per_bfd->msymbols.get ();
>>> - /* Arbitrarily require at least 10 elements in a thread. */
>>> - gdb::parallel_for_each<10> (&msymbols[0], &msymbols[mcount],
>>> - [&] (minimal_symbol *start, minimal_symbol *end)
>>> - {
>>> - scoped_time_it time_it ("minsyms install worker");
>>> -
>>> - for (minimal_symbol *msym = start; msym < end; ++msym)
>>> - {
>>> - size_t idx = msym - msymbols;
>>> - hash_values[idx].name_length = strlen (msym->linkage_name ());
>>> - if (!msym->name_set)
>>> - {
>>> - /* This will be freed later, by compute_and_set_names. */
>>> - gdb::unique_xmalloc_ptr<char> demangled_name
>>> - = symbol_find_demangled_name (msym, msym->linkage_name ());
>>> - msym->set_demangled_name
>>> - (demangled_name.release (),
>>> - &m_objfile->per_bfd->storage_obstack);
>>> - msym->name_set = 1;
>>> - }
>>> - /* This mangled_name_hash computation has to be outside of
>>> - the name_set check, or compute_and_set_names below will
>>> - be called with an invalid hash value. */
>>> - hash_values[idx].mangled_name_hash
>>> - = fast_hash (msym->linkage_name (),
>>> - hash_values[idx].name_length);
>>> - hash_values[idx].minsym_hash
>>> - = msymbol_hash (msym->linkage_name ());
>>> - /* We only use this hash code if the search name differs
>>> - from the linkage name. See the code in
>>> - build_minimal_symbol_hash_tables. */
>>> - if (msym->search_name () != msym->linkage_name ())
>>> - hash_values[idx].minsym_demangled_hash
>>> - = search_name_hash (msym->language (), msym->search_name ());
>>> - }
>>> - {
>>> - /* To limit how long we hold the lock, we only acquire it here
>>> - and not while we demangle the names above. */
>>> -#if CXX_STD_THREAD
>>> - std::lock_guard<std::mutex> guard (demangled_mutex);
>>> -#endif
>>> - for (minimal_symbol *msym = start; msym < end; ++msym)
>>> - {
>>> - size_t idx = msym - msymbols;
>>> - msym->compute_and_set_names
>>> - (std::string_view (msym->linkage_name (),
>>> - hash_values[idx].name_length),
>>> - false,
>>> - m_objfile->per_bfd,
>>> - hash_values[idx].mangled_name_hash);
>>> - }
>>> - }
>>> - });
>>> +
>>> + gdb::parallel_for_each<1000, minimal_symbol *, minimal_symbol_install_worker>
>>> + (&msymbols[0], &msymbols[mcount], msymbols,
>>> + gdb::array_view<computed_hash_values> (hash_values),
>>> + m_objfile->per_bfd, demangled_mutex);
>>>
>>> build_minimal_symbol_hash_tables (m_objfile, hash_values);
>>> }
>>> diff --git a/gdb/unittests/parallel-for-selftests.c b/gdb/unittests/parallel-for-selftests.c
>>> index f5456141ff6e..86bf06c073a6 100644
>>> --- a/gdb/unittests/parallel-for-selftests.c
>>> +++ b/gdb/unittests/parallel-for-selftests.c
>>> @@ -91,12 +91,30 @@ test_one (int n_threads, do_foreach_t do_foreach)
>>> static void
>>> test_parallel_for_each ()
>>> {
>>> + struct test_worker
>>> + {
>>> + /* DUMMY is there to test passing multiple arguments to the worker
>>> + constructor. */
>>> + test_worker (foreach_callback_t callback, int dummy)
>>> + : m_callback (callback)
>>> + {
>>> + }
>>> +
>>> + void operator() (int first, int last) noexcept
>>> + {
>>> + return m_callback (first, last);
>>> + }
>>> +
>>> + private:
>>> + foreach_callback_t m_callback;
>>> + };
>>> +
>>> const std::vector<do_foreach_t> for_each_functions
>>> {
>>> [] (int start, int end, foreach_callback_t callback)
>>> - { gdb::parallel_for_each<1> (start, end, callback); },
>>> + { gdb::parallel_for_each<1, int, test_worker> (start, end, callback, 0); },
>>> [] (int start, int end, foreach_callback_t callback)
>>> - { gdb::sequential_for_each (start, end, callback);}
>>> + { gdb::sequential_for_each<int, test_worker> (start, end, callback, 0); },
>>> };
>>>
>>> for (int n_threads : { 0, 1, 3 })
>>> diff --git a/gdbsupport/parallel-for.h b/gdbsupport/parallel-for.h
>>> index b74c8068cf2c..bb41eb8700f0 100644
>>> --- a/gdbsupport/parallel-for.h
>>> +++ b/gdbsupport/parallel-for.h
>>> @@ -21,115 +21,107 @@
>>> #define GDBSUPPORT_PARALLEL_FOR_H
>>>
>>> #include <algorithm>
>>> -#include <type_traits>
>>> +#include <tuple>
>>> #include "gdbsupport/thread-pool.h"
>>> -#include "gdbsupport/function-view.h"
>>>
>>> namespace gdb
>>> {
>>>
>>> -/* A very simple "parallel for". This splits the range of iterators
>>> - into subranges, and then passes each subrange to the callback. The
>>> - work may or may not be done in separate threads.
>>> +/* A "parallel-for" implementation using a shared work queue. Work items get
>>> + popped in batches of size up to BATCH_SIZE from the queue and handed out to
>>> + worker threads.
>>>
>>> - This approach was chosen over having the callback work on single
>>> - items because it makes it simple for the caller to do
>>> - once-per-subrange initialization and destruction.
>>> + Each worker thread instantiates an object of type Worker, forwarding ARGS to
>>> + its constructor. The Worker object can be used to keep some per-worker
>>> + thread state.
>>>
>>> - The parameter N says how batching ought to be done -- there will be
>>> - at least N elements processed per thread. Setting N to 0 is not
>>> - allowed. */
>>> + Worker threads call Worker::operator() repeatedly until the queue is
>>> + empty. */
>>>
>>> -template<std::size_t n, class RandomIt, class RangeFunction>
>>> +template<std::size_t batch_size, class RandomIt, class Worker,
>>> + class... WorkerArgs>
>>> void
>>> -parallel_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>>> +parallel_for_each (const RandomIt first, const RandomIt last,
>>> + WorkerArgs &&...worker_args)
>>> {
>>> /* If enabled, print debug info about how the work is distributed across
>>> the threads. */
>>> const bool parallel_for_each_debug = false;
>>>
>>> - size_t n_worker_threads = thread_pool::g_thread_pool->thread_count ();
>>> - size_t n_threads = n_worker_threads;
>>> - size_t n_elements = last - first;
>>> - size_t elts_per_thread = 0;
>>> - size_t elts_left_over = 0;
>>> + gdb_assert (first <= last);
>>>
>>> - if (n_threads > 1)
>>> + if (parallel_for_each_debug)
>>> {
>>> - /* Require that there should be at least N elements in a
>>> - thread. */
>>> - gdb_assert (n > 0);
>>> - if (n_elements / n_threads < n)
>>> - n_threads = std::max (n_elements / n, (size_t) 1);
>>> - elts_per_thread = n_elements / n_threads;
>>> - elts_left_over = n_elements % n_threads;
>>> - /* n_elements == n_threads * elts_per_thread + elts_left_over. */
>>> + debug_printf ("Parallel for: n elements: %zu\n",
>>> + static_cast<std::size_t> (last - first));
>>> + debug_printf ("Parallel for: batch size: %zu\n", batch_size);
>>> }
>>>
>>> - size_t count = n_threads == 0 ? 0 : n_threads - 1;
>>> + const size_t n_worker_threads
>>> + = std::max<size_t> (thread_pool::g_thread_pool->thread_count (), 1);
>>> std::vector<gdb::future<void>> results;
>>>
>>> - if (parallel_for_each_debug)
>>> - {
>>> - debug_printf (_("Parallel for: n_elements: %zu\n"), n_elements);
>>> - debug_printf (_("Parallel for: minimum elements per thread: %zu\n"), n);
>>> - debug_printf (_("Parallel for: elts_per_thread: %zu\n"), elts_per_thread);
>>> - }
>>> + /* The next item to hand out. */
>>> + std::atomic<RandomIt> next = first;
>>>
>>> - for (int i = 0; i < count; ++i)
>>> - {
>>> - RandomIt end;
>>> - end = first + elts_per_thread;
>>> - if (i < elts_left_over)
>>> - /* Distribute the leftovers over the worker threads, to avoid having
>>> - to handle all of them in a single thread. */
>>> - end++;
>>> -
>>> - /* This case means we don't have enough elements to really
>>> - distribute them. Rather than ever submit a task that does
>>> - nothing, we short-circuit here. */
>>> - if (first == end)
>>> - end = last;
>>> -
>>> - if (end == last)
>>> - {
>>> - /* We're about to dispatch the last batch of elements, which
>>> - we normally process in the main thread. So just truncate
>>> - the result list here. This avoids submitting empty tasks
>>> - to the thread pool. */
>>> - count = i;
>>> - break;
>>> - }
>>> + /* The worker thread task.
>>> +
>>> + We need to capture args as a tuple, because it's not possible to capture
>>> + the parameter pack directly in C++17. Once we migrate to C++20, the
>>> + capture can be simplified to:
>>>
>>> - if (parallel_for_each_debug)
>>> + ... args = std::forward<Args>(args)
>>> +
>>> + and `args` can be used as-is in the lambda. */
>>> + auto args_tuple
>>> + = std::forward_as_tuple (std::forward<WorkerArgs> (worker_args)...);
>>> + auto task =
>>> + [&next, first, last, n_worker_threads, &args_tuple] () {
>>> + /* Instantiate the user-defined worker. */
>>> + auto worker = std::make_from_tuple<Worker> (args_tuple);
>>> +
>>> + for (;;)
>>> {
>>> - debug_printf (_("Parallel for: elements on worker thread %i\t: %zu"),
>>> - i, (size_t)(end - first));
>>> - debug_printf (_("\n"));
>>> + /* Grab a snapshot of NEXT. */
>>> + auto local_next = next.load ();
>>> + gdb_assert (local_next <= last);
>>> +
>>> + /* Number of remaining items. */
>>> + auto n_remaining = last - local_next;
>>> + gdb_assert (n_remaining >= 0);
>>> +
>>> + /* Are we done? */
>>> + if (n_remaining == 0)
>>> + break;
>>> +
>>> + const auto this_batch_size
>>> + = std::min<std::size_t> (batch_size, n_remaining);
>>> +
>>> + /* The range to process in this iteration. */
>>> + const auto this_batch_first = local_next;
>>> + const auto this_batch_last = local_next + this_batch_size;
>>> +
>>> + /* Update NEXT. If the current value of NEXT doesn't match
>>> + LOCAL_NEXT, it means another thread updated it concurrently,
>>> + restart. */
>>> + if (!next.compare_exchange_weak (local_next, this_batch_last))
>>> + continue;
>>> +
>>> + if (parallel_for_each_debug)
>>> + debug_printf ("Processing %zu items, range [%zu, %zu[\n",
>>> + this_batch_size,
>>> + static_cast<size_t> (this_batch_first - first),
>>> + static_cast<size_t> (this_batch_last - first));
>>> +
>>> + worker (this_batch_first, this_batch_last);
>>> }
>>> - results.push_back (gdb::thread_pool::g_thread_pool->post_task ([=] ()
>>> - {
>>> - return callback (first, end);
>>> - }));
>>> - first = end;
>>> - }
>>> -
>>> - for (int i = count; i < n_worker_threads; ++i)
>>> - if (parallel_for_each_debug)
>>> - {
>>> - debug_printf (_("Parallel for: elements on worker thread %i\t: 0"), i);
>>> - debug_printf (_("\n"));
>>> - }
>>> + };
>>>
>>> - /* Process all the remaining elements in the main thread. */
>>> - if (parallel_for_each_debug)
>>> - {
>>> - debug_printf (_("Parallel for: elements on main thread\t\t: %zu"),
>>> - (size_t)(last - first));
>>> - debug_printf (_("\n"));
>>> - }
>>> - callback (first, last);
>>> + /* Start N_WORKER_THREADS tasks. */
>>> + for (int i = 0; i < n_worker_threads; ++i)
>>> + results.push_back (gdb::thread_pool::g_thread_pool->post_task (task));
>>>
>>> + /* Wait for all of them to be finished. */
>>> for (auto &fut : results)
>>> fut.get ();
>>> }
>>> @@ -138,11 +130,11 @@ parallel_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>>> when debugging multi-threading behavior, and you want to limit
>>> multi-threading in a fine-grained way. */
>>>
>>> -template<class RandomIt, class RangeFunction>
>>> +template<class RandomIt, class Worker, class... WorkerArgs>
>>> void
>>> -sequential_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>>> +sequential_for_each (RandomIt first, RandomIt last, WorkerArgs &&...worker_args)
>>> {
>>> - callback (first, last);
>>> + Worker (std::forward<WorkerArgs> (worker_args)...) (first, last);
>>> }
>>>
>>> }
>>>
>>> base-commit: b7ff16c68a2c0bacc0416c4b36a44e65888ce72b
>>> --
>>> 2.50.0
>>>
>>
>