From: Simon Marchi <simon.marchi@polymtl.ca>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH v2 1/4] gdbsupport: use dynamic partitioning in gdb::parallel_for_each
Date: Thu, 28 Aug 2025 13:00:17 -0400 [thread overview]
Message-ID: <e7c65bcc-f55e-47a7-b899-d9927576368a@polymtl.ca> (raw)
In-Reply-To: <879f5c1e-14e6-41d4-b3d9-76d6c91bcf15@polymtl.ca>
Ping.
On 8/5/25 1:01 PM, Simon Marchi wrote:
> Ping.
>
> On 7/3/25 4:01 PM, simon.marchi@polymtl.ca wrote:
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>>
>> gdb::parallel_for_each uses static partitioning of the workload, meaning
>> that each worker thread receives a similar number of work items. Change
>> it to use dynamic partitioning, where worker threads pull work items
>> from a shared work queue when they need to.
>>
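For illustration, the shared-queue mechanism can be sketched in isolation like this (a minimal standalone example, not the patch's code; the function name, thread count and batch-claiming details are made up):

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

/* Minimal sketch of dynamic partitioning: worker threads repeatedly claim
   batches of up to BATCH_SIZE items from a shared atomic cursor until the
   range is exhausted.  */
template<std::size_t batch_size, typename T, typename Func>
void
dynamic_for_each (T *first, T *last, Func func)
{
  std::atomic<T *> next = first;

  auto task = [&next, last, &func] ()
    {
      for (;;)
	{
	  T *local_next = next.load ();
	  std::size_t n_remaining = last - local_next;
	  if (n_remaining == 0)
	    break;

	  T *batch_last = local_next + std::min (batch_size, n_remaining);

	  /* Claim the batch; retry if another thread moved NEXT first.  */
	  if (!next.compare_exchange_weak (local_next, batch_last))
	    continue;

	  func (local_next, batch_last);
	}
    };

  std::vector<std::thread> threads;
  for (int i = 0; i < 4; ++i)
    threads.emplace_back (task);
  for (std::thread &t : threads)
    t.join ();
}
```

With this shape, a slow batch only delays the thread processing it; the other threads keep draining the queue, which is what evens out the per-thread wall times.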
>> Note that gdb::parallel_for_each is currently only used for processing
>> minimal symbols in GDB.  I am looking at improving the startup
>> performance of GDB, where minimal symbol processing is one step.
>>
>> With static partitioning, there is a risk of workload imbalance if some
>> threads receive "easier" work than others.  Threads that finish their
>> share early then sit idle while the others keep working.  This is not
>> desirable, because gdb::parallel_for_each takes as long as its slowest
>> thread.
>>
>> When loading a file with a lot of minimal symbols (~600k) in GDB, with
>> "maint set per-command time on", I observe some imbalance:
>>
>> Time for "minsyms install worker": wall 0.732, user 0.550, sys 0.041, user+sys 0.591, 80.7 % CPU
>> Time for "minsyms install worker": wall 0.881, user 0.722, sys 0.071, user+sys 0.793, 90.0 % CPU
>> Time for "minsyms install worker": wall 2.107, user 1.804, sys 0.147, user+sys 1.951, 92.6 % CPU
>> Time for "minsyms install worker": wall 2.351, user 2.003, sys 0.151, user+sys 2.154, 91.6 % CPU
>> Time for "minsyms install worker": wall 2.611, user 2.322, sys 0.235, user+sys 2.557, 97.9 % CPU
>> Time for "minsyms install worker": wall 3.074, user 2.729, sys 0.203, user+sys 2.932, 95.4 % CPU
>> Time for "minsyms install worker": wall 3.486, user 3.074, sys 0.260, user+sys 3.334, 95.6 % CPU
>> Time for "minsyms install worker": wall 3.927, user 3.475, sys 0.336, user+sys 3.811, 97.0 % CPU
>> ^
>> ----´
>>
>> The fastest thread took 0.732 seconds to complete its work (and then sat
>> still), while the slowest took 3.927 seconds. This means the
>> parallel_for_each took a bit less than 4 seconds.
>>
>> Even if the number of minimal symbols assigned to each worker is the
>> same, I suppose that some symbols (e.g. those that need demangling) take
>> longer to process, which could explain the imbalance.
>>
>> With this patch, things are much more balanced:
>>
>> Time for "minsym install worker": wall 2.807, user 2.222, sys 0.144, user+sys 2.366, 84.3 % CPU
>> Time for "minsym install worker": wall 2.808, user 2.073, sys 0.131, user+sys 2.204, 78.5 % CPU
>> Time for "minsym install worker": wall 2.804, user 1.994, sys 0.151, user+sys 2.145, 76.5 % CPU
>> Time for "minsym install worker": wall 2.808, user 1.977, sys 0.135, user+sys 2.112, 75.2 % CPU
>> Time for "minsym install worker": wall 2.808, user 2.061, sys 0.142, user+sys 2.203, 78.5 % CPU
>> Time for "minsym install worker": wall 2.809, user 2.012, sys 0.146, user+sys 2.158, 76.8 % CPU
>> Time for "minsym install worker": wall 2.809, user 2.178, sys 0.137, user+sys 2.315, 82.4 % CPU
>> Time for "minsym install worker": wall 2.820, user 2.141, sys 0.170, user+sys 2.311, 82.0 % CPU
>> ^
>> ----´
>>
>> In this version, the parallel_for_each took about 2.8 seconds,
>> representing a reduction of ~1.2 seconds for this step. Not
>> life-changing, but it's still good I think.
>>
>> Note that this patch helps when loading big programs. My go-to test
>> program for this is telegram-desktop that I built from source. For
>> small programs (including loading gdb itself), it makes no perceptible
>> difference.
>>
>> Now the technical bits:
>>
>> - One impact that this change has on the minimal symbol processing
>> specifically is that not all calls to compute_and_set_names (a
>> critical region guarded by a mutex) are done at the end of each
>> worker thread's task anymore.
>>
>> Before this patch, each thread would compute the names and hash values for
>> all the minimal symbols it has been assigned, and then would call
>> compute_and_set_names for all of them, while holding the mutex (thus
>> preventing other threads from doing this same step).
>>
>> With the shared work queue approach, each thread grabs a batch of
>> minimal symbols, computes the names and hash values for them, and
>> then calls compute_and_set_names (with the mutex held) for this batch
>> only. It then repeats that until the work queue is empty.
>>
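The shape of that per-batch loop, reduced to its essentials (illustrative names and container, not the GDB code):

```cpp
#include <mutex>
#include <vector>

std::mutex g_names_mutex;
std::vector<int> g_published;

/* Two-phase pattern used per batch: do the expensive work (standing in
   for demangling and hashing) without the lock, then take the lock only
   to publish this batch's results.  */
void
process_batch (const std::vector<int> &batch)
{
  /* Phase 1: expensive per-item work, lock-free.  */
  std::vector<int> computed;
  computed.reserve (batch.size ());
  for (int item : batch)
    computed.push_back (item * item);

  /* Phase 2: short critical section, publishing this batch only.  */
  std::lock_guard<std::mutex> guard (g_names_mutex);
  g_published.insert (g_published.end (),
		      computed.begin (), computed.end ());
}
```

The critical sections become shorter and more numerous, but each one covers only a single batch's publish step.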
>> There are therefore more small and spread out compute_and_set_names
>> critical sections, instead of just one per worker thread at the end.
>> Given that before this patch the work was not well balanced among worker
>> threads, I guess that threads would enter that critical region at
>> different times, causing little contention.
>>
>> In the "with this patch" results, the CPU utilization numbers are not
>> as good, suggesting that there is some contention. But I don't know
>> if it's contention due to the compute_and_set_names critical section
>> or the shared work queue critical section. That can be investigated
>> later. In any case, what ultimately counts is the wall time, which
>> improves.
>>
>> - One choice I had to make was to decide how many work items (in this
>> case minimal symbols) each worker should pop when getting work from
>> the shared queue. The general wisdom is that:
>>
>> - popping too few items, and the synchronization overhead becomes
>> significant, and the total processing time increases
>> - popping too many items, and we get some imbalance back, and the
>> total processing time increases again
>>
>> I experimented using a dynamic batch size proportional to the number
>> of remaining work items. It worked well in some cases but not
>> always. So I decided to keep it simple, with a fixed batch size.
>> That can always be tweaked later.
>>
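To make the tradeoff concrete: each batch pop is one synchronization point on the shared queue, so the number of pops scales as ceil(n_items / batch_size). A toy helper (hypothetical, not part of the patch):

```cpp
#include <cstddef>

/* Number of times workers must touch the shared queue for a given batch
   size: each pop of a batch is one synchronization point.  */
constexpr std::size_t
queue_pops (std::size_t n_items, std::size_t batch_size)
{
  return (n_items + batch_size - 1) / batch_size;
}

/* With ~600k minimal symbols: a batch of 10 means 60000 pops, while a
   batch of 1000 means only 600, at the cost of coarser balancing.  */
```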
>> - I still want to be able to use scoped_time_it to measure the time
>> that each worker thread spent working on the task. I find it really
>> handy when measuring the performance impact of changes.
>>
>> Unfortunately, the current interface of gdb::parallel_for_each, which
>> receives a simple callback, is not well-suited for that, once I
>> introduce the dynamic partitioning. The callback would get called
>> once for each work item batch (multiple times per worker thread),
>> so it's not possible to maintain a per-worker thread object for the
>> duration of the parallel for.
>>
>> To allow this, I changed gdb::parallel_for_each to receive a worker
>> type as a template parameter. Each worker thread creates one local
>> instance of that type, and calls operator() on it for each work item
>> batch. By having a scoped_time_it object as a field of that worker,
>> we can get the timings per worker thread.
>>
>> The drawback of this approach is that we must now define the
>> parallel task in a separate class and manually capture any context we
>> need as fields of that class.
>>
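A toy worker illustrating the pattern (all names invented; the accumulator member stands in for the scoped_time_it field, living for the whole lifetime of one worker thread's participation):

```cpp
/* One instance of this type exists per worker thread, so its
   constructor and destructor bracket all the batches that thread
   processes, while operator() runs once per batch.  */
struct count_worker
{
  explicit count_worker (long *out) : m_out (out) {}

  /* Called once per batch; per-worker state accumulates across calls.  */
  void operator() (int first, int last) noexcept
  { m_count += last - first; }

  /* Flush the per-worker state when the worker thread is done.  */
  ~count_worker () { *m_out += m_count; }

private:
  long m_count = 0;
  long *m_out;
};
```

Context that a lambda would have captured implicitly now has to be passed to the constructor and stored as fields, which is the drawback mentioned above.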
>> Change-Id: Ibf1fea65c91f76a95b9ed8f706fd6fa5ef52d9cf
>> ---
>> gdb/minsyms.c | 133 ++++++++++++--------
>> gdb/unittests/parallel-for-selftests.c | 22 +++-
>> gdbsupport/parallel-for.h | 164 ++++++++++++-------------
>> 3 files changed, 178 insertions(+), 141 deletions(-)
>>
>> diff --git a/gdb/minsyms.c b/gdb/minsyms.c
>> index 4a6459a6f2d2..e353a8a9ba00 100644
>> --- a/gdb/minsyms.c
>> +++ b/gdb/minsyms.c
>> @@ -1392,6 +1392,81 @@ build_minimal_symbol_hash_tables
>> }
>> }
>>
>> +/* gdb::parallel_for_each worker to compute minimal symbol names and hashes. */
>> +
>> +class minimal_symbol_install_worker
>> +{
>> +public:
>> + minimal_symbol_install_worker
>> + (minimal_symbol *msymbols,
>> + gdb::array_view<computed_hash_values> hash_values,
>> + objfile_per_bfd_storage *per_bfd,
>> + std::mutex &demangled_mutex)
>> + : m_time_it ("minsym install worker"),
>> + m_msymbols (msymbols),
>> + m_hash_values (hash_values),
>> + m_per_bfd (per_bfd),
>> + m_demangled_mutex (demangled_mutex)
>> + {}
>> +
>> + void operator() (minimal_symbol *start, minimal_symbol *end) noexcept
>> + {
>> + for (minimal_symbol *msym = start; msym < end; ++msym)
>> + {
>> + size_t idx = msym - m_msymbols;
>> + m_hash_values[idx].name_length = strlen (msym->linkage_name ());
>> +
>> + if (!msym->name_set)
>> + {
>> + /* This will be freed later, by compute_and_set_names. */
>> + gdb::unique_xmalloc_ptr<char> demangled_name
>> + = symbol_find_demangled_name (msym, msym->linkage_name ());
>> + msym->set_demangled_name (demangled_name.release (),
>> + &m_per_bfd->storage_obstack);
>> + msym->name_set = 1;
>> + }
>> +
>> + /* This mangled_name_hash computation has to be outside of
>> + the name_set check, or compute_and_set_names below will
>> + be called with an invalid hash value. */
>> + m_hash_values[idx].mangled_name_hash
>> + = fast_hash (msym->linkage_name (), m_hash_values[idx].name_length);
>> + m_hash_values[idx].minsym_hash = msymbol_hash (msym->linkage_name ());
>> +
>> + /* We only use this hash code if the search name differs
>> + from the linkage name. See the code in
>> + build_minimal_symbol_hash_tables. */
>> + if (msym->search_name () != msym->linkage_name ())
>> + m_hash_values[idx].minsym_demangled_hash
>> + = search_name_hash (msym->language (), msym->search_name ());
>> + }
>> +
>> + {
>> + /* To limit how long we hold the lock, we only acquire it here
>> + and not while we demangle the names above. */
>> +#if CXX_STD_THREAD
>> + std::lock_guard<std::mutex> guard (m_demangled_mutex);
>> +#endif
>> + for (minimal_symbol *msym = start; msym < end; ++msym)
>> + {
>> + size_t idx = msym - m_msymbols;
>> + std::string_view name (msym->linkage_name (),
>> + m_hash_values[idx].name_length);
>> + hashval_t hashval = m_hash_values[idx].mangled_name_hash;
>> +
>> + msym->compute_and_set_names (name, false, m_per_bfd, hashval);
>> + }
>> + }
>> + }
>> +
>> +private:
>> + scoped_time_it m_time_it;
>> + minimal_symbol *m_msymbols;
>> + gdb::array_view<computed_hash_values> m_hash_values;
>> + objfile_per_bfd_storage *m_per_bfd;
>> + std::mutex &m_demangled_mutex;
>> +};
>> +
>> /* Add the minimal symbols in the existing bunches to the objfile's official
>> minimal symbol table. In most cases there is no minimal symbol table yet
>> for this objfile, and the existing bunches are used to create one. Once
>> @@ -1478,59 +1553,11 @@ minimal_symbol_reader::install ()
>> std::vector<computed_hash_values> hash_values (mcount);
>>
>> msymbols = m_objfile->per_bfd->msymbols.get ();
>> - /* Arbitrarily require at least 10 elements in a thread. */
>> - gdb::parallel_for_each<10> (&msymbols[0], &msymbols[mcount],
>> - [&] (minimal_symbol *start, minimal_symbol *end)
>> - {
>> - scoped_time_it time_it ("minsyms install worker");
>> -
>> - for (minimal_symbol *msym = start; msym < end; ++msym)
>> - {
>> - size_t idx = msym - msymbols;
>> - hash_values[idx].name_length = strlen (msym->linkage_name ());
>> - if (!msym->name_set)
>> - {
>> - /* This will be freed later, by compute_and_set_names. */
>> - gdb::unique_xmalloc_ptr<char> demangled_name
>> - = symbol_find_demangled_name (msym, msym->linkage_name ());
>> - msym->set_demangled_name
>> - (demangled_name.release (),
>> - &m_objfile->per_bfd->storage_obstack);
>> - msym->name_set = 1;
>> - }
>> - /* This mangled_name_hash computation has to be outside of
>> - the name_set check, or compute_and_set_names below will
>> - be called with an invalid hash value. */
>> - hash_values[idx].mangled_name_hash
>> - = fast_hash (msym->linkage_name (),
>> - hash_values[idx].name_length);
>> - hash_values[idx].minsym_hash
>> - = msymbol_hash (msym->linkage_name ());
>> - /* We only use this hash code if the search name differs
>> - from the linkage name. See the code in
>> - build_minimal_symbol_hash_tables. */
>> - if (msym->search_name () != msym->linkage_name ())
>> - hash_values[idx].minsym_demangled_hash
>> - = search_name_hash (msym->language (), msym->search_name ());
>> - }
>> - {
>> - /* To limit how long we hold the lock, we only acquire it here
>> - and not while we demangle the names above. */
>> -#if CXX_STD_THREAD
>> - std::lock_guard<std::mutex> guard (demangled_mutex);
>> -#endif
>> - for (minimal_symbol *msym = start; msym < end; ++msym)
>> - {
>> - size_t idx = msym - msymbols;
>> - msym->compute_and_set_names
>> - (std::string_view (msym->linkage_name (),
>> - hash_values[idx].name_length),
>> - false,
>> - m_objfile->per_bfd,
>> - hash_values[idx].mangled_name_hash);
>> - }
>> - }
>> - });
>> +
>> + gdb::parallel_for_each<1000, minimal_symbol *, minimal_symbol_install_worker>
>> + (&msymbols[0], &msymbols[mcount], msymbols,
>> + gdb::array_view<computed_hash_values> (hash_values),
>> + m_objfile->per_bfd, demangled_mutex);
>>
>> build_minimal_symbol_hash_tables (m_objfile, hash_values);
>> }
>> diff --git a/gdb/unittests/parallel-for-selftests.c b/gdb/unittests/parallel-for-selftests.c
>> index f5456141ff6e..86bf06c073a6 100644
>> --- a/gdb/unittests/parallel-for-selftests.c
>> +++ b/gdb/unittests/parallel-for-selftests.c
>> @@ -91,12 +91,30 @@ test_one (int n_threads, do_foreach_t do_foreach)
>> static void
>> test_parallel_for_each ()
>> {
>> + struct test_worker
>> + {
>> + /* DUMMY is there to test passing multiple arguments to the worker
>> + constructor. */
>> + test_worker (foreach_callback_t callback, int dummy)
>> + : m_callback (callback)
>> + {
>> + }
>> +
>> + void operator() (int first, int last) noexcept
>> + {
>> + return m_callback (first, last);
>> + }
>> +
>> + private:
>> + foreach_callback_t m_callback;
>> + };
>> +
>> const std::vector<do_foreach_t> for_each_functions
>> {
>> [] (int start, int end, foreach_callback_t callback)
>> - { gdb::parallel_for_each<1> (start, end, callback); },
>> + { gdb::parallel_for_each<1, int, test_worker> (start, end, callback, 0); },
>> [] (int start, int end, foreach_callback_t callback)
>> - { gdb::sequential_for_each (start, end, callback);}
>> + { gdb::sequential_for_each<int, test_worker> (start, end, callback, 0); },
>> };
>>
>> for (int n_threads : { 0, 1, 3 })
>> diff --git a/gdbsupport/parallel-for.h b/gdbsupport/parallel-for.h
>> index b74c8068cf2c..bb41eb8700f0 100644
>> --- a/gdbsupport/parallel-for.h
>> +++ b/gdbsupport/parallel-for.h
>> @@ -21,115 +21,107 @@
>> #define GDBSUPPORT_PARALLEL_FOR_H
>>
>> #include <algorithm>
>> -#include <type_traits>
>> +#include <tuple>
>> #include "gdbsupport/thread-pool.h"
>> -#include "gdbsupport/function-view.h"
>>
>> namespace gdb
>> {
>>
>> -/* A very simple "parallel for". This splits the range of iterators
>> - into subranges, and then passes each subrange to the callback. The
>> - work may or may not be done in separate threads.
>> +/* A "parallel-for" implementation using a shared work queue. Work items get
>> + popped in batches of size up to BATCH_SIZE from the queue and handed out to
>> + worker threads.
>>
>> - This approach was chosen over having the callback work on single
>> - items because it makes it simple for the caller to do
>> - once-per-subrange initialization and destruction.
>> + Each worker thread instantiates an object of type Worker, forwarding ARGS to
>> + its constructor. The Worker object can be used to keep some per-worker
>> + thread state.
>>
>> - The parameter N says how batching ought to be done -- there will be
>> - at least N elements processed per thread. Setting N to 0 is not
>> - allowed. */
>> + Worker threads call Worker::operator() repeatedly until the queue is
>> + empty. */
>>
>> -template<std::size_t n, class RandomIt, class RangeFunction>
>> +template<std::size_t batch_size, class RandomIt, class Worker,
>> + class... WorkerArgs>
>> void
>> -parallel_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>> +parallel_for_each (const RandomIt first, const RandomIt last,
>> + WorkerArgs &&...worker_args)
>> {
>> /* If enabled, print debug info about how the work is distributed across
>> the threads. */
>> const bool parallel_for_each_debug = false;
>>
>> - size_t n_worker_threads = thread_pool::g_thread_pool->thread_count ();
>> - size_t n_threads = n_worker_threads;
>> - size_t n_elements = last - first;
>> - size_t elts_per_thread = 0;
>> - size_t elts_left_over = 0;
>> + gdb_assert (first <= last);
>>
>> - if (n_threads > 1)
>> + if (parallel_for_each_debug)
>> {
>> - /* Require that there should be at least N elements in a
>> - thread. */
>> - gdb_assert (n > 0);
>> - if (n_elements / n_threads < n)
>> - n_threads = std::max (n_elements / n, (size_t) 1);
>> - elts_per_thread = n_elements / n_threads;
>> - elts_left_over = n_elements % n_threads;
>> - /* n_elements == n_threads * elts_per_thread + elts_left_over. */
>> + debug_printf ("Parallel for: n elements: %zu\n",
>> + static_cast<std::size_t> (last - first));
>> + debug_printf ("Parallel for: batch size: %zu\n", batch_size);
>> }
>>
>> - size_t count = n_threads == 0 ? 0 : n_threads - 1;
>> + const size_t n_worker_threads
>> + = std::max<size_t> (thread_pool::g_thread_pool->thread_count (), 1);
>> std::vector<gdb::future<void>> results;
>>
>> - if (parallel_for_each_debug)
>> - {
>> - debug_printf (_("Parallel for: n_elements: %zu\n"), n_elements);
>> - debug_printf (_("Parallel for: minimum elements per thread: %zu\n"), n);
>> - debug_printf (_("Parallel for: elts_per_thread: %zu\n"), elts_per_thread);
>> - }
>> + /* The next item to hand out. */
>> + std::atomic<RandomIt> next = first;
>>
>> - for (int i = 0; i < count; ++i)
>> - {
>> - RandomIt end;
>> - end = first + elts_per_thread;
>> - if (i < elts_left_over)
>> - /* Distribute the leftovers over the worker threads, to avoid having
>> - to handle all of them in a single thread. */
>> - end++;
>> -
>> - /* This case means we don't have enough elements to really
>> - distribute them. Rather than ever submit a task that does
>> - nothing, we short-circuit here. */
>> - if (first == end)
>> - end = last;
>> -
>> - if (end == last)
>> - {
>> - /* We're about to dispatch the last batch of elements, which
>> - we normally process in the main thread. So just truncate
>> - the result list here. This avoids submitting empty tasks
>> - to the thread pool. */
>> - count = i;
>> - break;
>> - }
>> + /* The worker thread task.
>> +
>> + We need to capture args as a tuple, because it's not possible to capture
>> + the parameter pack directly in C++17. Once we migrate to C++20, the
>> + capture can be simplified to:
>>
>> - if (parallel_for_each_debug)
>> + ... args = std::forward<Args>(args)
>> +
>> + and `args` can be used as-is in the lambda. */
>> + auto args_tuple
>> + = std::forward_as_tuple (std::forward<WorkerArgs> (worker_args)...);
>> + auto task =
>> + [&next, first, last, n_worker_threads, &args_tuple] () {
>> + /* Instantiate the user-defined worker. */
>> + auto worker = std::make_from_tuple<Worker> (args_tuple);
>> +
>> + for (;;)
>> {
>> - debug_printf (_("Parallel for: elements on worker thread %i\t: %zu"),
>> - i, (size_t)(end - first));
>> - debug_printf (_("\n"));
>> + /* Grab a snapshot of NEXT. */
>> + auto local_next = next.load ();
>> + gdb_assert (local_next <= last);
>> +
>> + /* Number of remaining items. */
>> + auto n_remaining = last - local_next;
>> + gdb_assert (n_remaining >= 0);
>> +
>> + /* Are we done? */
>> + if (n_remaining == 0)
>> + break;
>> +
>> + const auto this_batch_size
>> + = std::min<std::size_t> (batch_size, n_remaining);
>> +
>> + /* The range to process in this iteration. */
>> + const auto this_batch_first = local_next;
>> + const auto this_batch_last = local_next + this_batch_size;
>> +
>> + /* Update NEXT. If the current value of NEXT doesn't match
>> + LOCAL_NEXT, it means another thread updated it concurrently,
>> + restart. */
>> + if (!next.compare_exchange_weak (local_next, this_batch_last))
>> + continue;
>> +
>> + if (parallel_for_each_debug)
>> + debug_printf ("Processing %zu items, range [%zu, %zu[\n",
>> + this_batch_size,
>> + static_cast<size_t> (this_batch_first - first),
>> + static_cast<size_t> (this_batch_last - first));
>> +
>> + worker (this_batch_first, this_batch_last);
>> }
>> - results.push_back (gdb::thread_pool::g_thread_pool->post_task ([=] ()
>> - {
>> - return callback (first, end);
>> - }));
>> - first = end;
>> - }
>> -
>> - for (int i = count; i < n_worker_threads; ++i)
>> - if (parallel_for_each_debug)
>> - {
>> - debug_printf (_("Parallel for: elements on worker thread %i\t: 0"), i);
>> - debug_printf (_("\n"));
>> - }
>> + };
>>
>> - /* Process all the remaining elements in the main thread. */
>> - if (parallel_for_each_debug)
>> - {
>> - debug_printf (_("Parallel for: elements on main thread\t\t: %zu"),
>> - (size_t)(last - first));
>> - debug_printf (_("\n"));
>> - }
>> - callback (first, last);
>> + /* Start N_WORKER_THREADS tasks. */
>> + for (int i = 0; i < n_worker_threads; ++i)
>> + results.push_back (gdb::thread_pool::g_thread_pool->post_task (task));
>>
>> + /* Wait for all of them to be finished. */
>> for (auto &fut : results)
>> fut.get ();
>> }
>> @@ -138,11 +130,11 @@ parallel_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>> when debugging multi-threading behavior, and you want to limit
>> multi-threading in a fine-grained way. */
>>
>> -template<class RandomIt, class RangeFunction>
>> +template<class RandomIt, class Worker, class... WorkerArgs>
>> void
>> -sequential_for_each (RandomIt first, RandomIt last, RangeFunction callback)
>> +sequential_for_each (RandomIt first, RandomIt last, WorkerArgs &&...worker_args)
>> {
>> - callback (first, last);
>> + Worker (std::forward<WorkerArgs> (worker_args)...) (first, last);
>> }
>>
>> }
>>
>> base-commit: b7ff16c68a2c0bacc0416c4b36a44e65888ce72b
>> --
>> 2.50.0
>>
>