* [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
@ 2026-02-06 17:57 simon.marchi
2026-02-20 15:24 ` Simon Marchi
2026-02-20 19:05 ` Tom Tromey
0 siblings, 2 replies; 4+ messages in thread
From: simon.marchi @ 2026-02-06 17:57 UTC
To: gdb-patches; +Cc: Simon Marchi
From: Simon Marchi <simon.marchi@polymtl.ca>

I get frequent failures in gdb.replay/missing-thread.exp, starting with:

    FAIL: gdb.replay/missing-thread.exp: non_stop=on: record_initial_logfile: continue (timeout)

I tracked this down to a race condition in the test program, causing it
not to generate the expected SIGTRAP. This can be reproduced by adding
a sleep(1) just after the pthread_create.

The expected behavior is:

 - The main thread starts a second thread
 - The main thread blocks on pthread_cond_wait, waiting to be notified
   by the second thread (to make sure the second thread had time to
   start)
 - The second thread notifies the main thread using pthread_cond_signal
 - The main thread sends a SIGTRAP to the second thread

However, this can happen:

 - The main thread starts a second thread
 - The second thread calls pthread_cond_signal, while no one is waiting
   on the condvar
 - The main thread blocks on pthread_cond_wait forever

Fix it by introducing a separate boolean predicate protected by the
shared mutex, and looping until it is true. This is the way
pthread_cond_wait is meant to be used. From pthread_cond_wait(3):

    When using condition variables there is always a boolean predicate
    involving shared variables associated with each condition wait that
    is true if the thread should proceed. Spurious wakeups from the
    pthread_cond_wait() or pthread_cond_timedwait() functions may occur.
    Since the return from pthread_cond_wait() or
    pthread_cond_timedwait() does not imply anything about the value of
    this predicate, the predicate should be re-evaluated upon such
    return.

Finally, make things static, just out of principle, and fix minor
formatting issues.

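
As a standalone illustration of the predicate-loop pattern, here is a minimal sketch. The `sketch_*` names are hypothetical and the worker returns instead of sleeping forever as the test program's worker does. The `pthread_join` call forces the worst-case ordering, where the worker notifies before the main thread reaches the wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sketch_condvar = PTHREAD_COND_INITIALIZER;
static bool sketch_ready = false;

/* Hypothetical worker: record the notification in the predicate
   under the mutex, then signal.  */
static void *
sketch_worker (void *arg)
{
  (void) arg;
  pthread_mutex_lock (&sketch_mutex);
  sketch_ready = true;
  pthread_mutex_unlock (&sketch_mutex);
  pthread_cond_signal (&sketch_condvar);
  return NULL;
}

/* Start the worker and wait for its notification.  Return the value
   of the predicate observed once the wait is over.  */
static bool
sketch_wait_for_worker (void)
{
  pthread_t thread;
  int result = pthread_create (&thread, NULL, sketch_worker, NULL);
  assert (result == 0);

  /* Worst case: the worker signals and exits before we wait.  */
  result = pthread_join (thread, NULL);
  assert (result == 0);

  pthread_mutex_lock (&sketch_mutex);
  /* The predicate is already true here, so the loop body is skipped
     and the early signal is not lost.  */
  while (!sketch_ready)
    pthread_cond_wait (&sketch_condvar, &sketch_mutex);
  bool woken = sketch_ready;
  pthread_mutex_unlock (&sketch_mutex);

  return woken;
}
```

With the predicate in place, `sketch_wait_for_worker ()` returns true regardless of which thread runs first, because the waiter checks the flag before deciding to block.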
Change-Id: I62ba2085a2d506dc3d91e32cd5de48c43a3ff55e
---
gdb/testsuite/gdb.replay/missing-thread.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/gdb/testsuite/gdb.replay/missing-thread.c b/gdb/testsuite/gdb.replay/missing-thread.c
index 0a17be1c7dd1..aad6396f24c2 100644
--- a/gdb/testsuite/gdb.replay/missing-thread.c
+++ b/gdb/testsuite/gdb.replay/missing-thread.c
@@ -22,23 +22,28 @@
 #include <unistd.h>
 #include <stdbool.h>
 
-pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
-pthread_cond_t g_condvar = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t g_condvar = PTHREAD_COND_INITIALIZER;
+static bool g_ready = false;
 
-void *
+static void *
 worker_function (void *arg)
 {
   printf ("In worker, about to notify\n");
+
+  pthread_mutex_lock (&g_mutex);
+  g_ready = true;
+  pthread_mutex_unlock (&g_mutex);
   pthread_cond_signal (&g_condvar);
 
   while (true)
-    sleep(1);
+    sleep (1);
 
   return NULL;
 }
 
 int
-main()
+main ()
 {
   pthread_t my_thread;
 
@@ -46,7 +51,8 @@ main()
   assert (result == 0);
 
   pthread_mutex_lock (&g_mutex);
-  pthread_cond_wait (&g_condvar, &g_mutex);
+  while (!g_ready)
+    pthread_cond_wait (&g_condvar, &g_mutex);
 
   printf ("In main, have been woken.\n");
   pthread_mutex_unlock (&g_mutex);

base-commit: deb47060b5812c6c41ecc790cb28898fcbc45c93
--
2.53.0
* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
2026-02-06 17:57 [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c simon.marchi
@ 2026-02-20 15:24 ` Simon Marchi
2026-02-20 19:05 ` Tom Tromey
1 sibling, 0 replies; 4+ messages in thread
From: Simon Marchi @ 2026-02-20 15:24 UTC
To: gdb-patches
Ping.
* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
2026-02-06 17:57 [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c simon.marchi
2026-02-20 15:24 ` Simon Marchi
@ 2026-02-20 19:05 ` Tom Tromey
2026-02-20 19:44 ` Simon Marchi
1 sibling, 1 reply; 4+ messages in thread
From: Tom Tromey @ 2026-02-20 19:05 UTC
To: simon.marchi; +Cc: gdb-patches
>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
Simon> Fix it by introducing a separate boolean predicate protected by the
Simon> shared mutex, and looping until it is true. This is the way
Simon> pthread_cond_wait is meant to be used. From pthread_cond_wait(3):
Thanks.
Simon> + pthread_mutex_lock (&g_mutex);
Simon> + g_ready = true;
Simon> + pthread_mutex_unlock (&g_mutex);
Simon> pthread_cond_signal (&g_condvar);
I looked into this and TIL it's fine to pthread_cond_signal without
holding the lock.
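
For illustration, here is a minimal sketch comparing the two placements of the signal call. The `struct handshake`, `notify` and `run_handshake` names are hypothetical and not part of the patch. POSIX allows pthread_cond_signal to be called whether or not the caller owns the mutex, and recommends holding it only when predictable scheduling behavior is required. What matters for correctness in both variants is that the predicate is written under the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical bundle of the shared state for one handshake.  */
struct handshake
{
  pthread_mutex_t mutex;
  pthread_cond_t condvar;
  bool ready;
  bool signal_under_lock;
};

static void *
notify (void *arg)
{
  struct handshake *h = arg;

  pthread_mutex_lock (&h->mutex);
  h->ready = true;
  if (h->signal_under_lock)
    pthread_cond_signal (&h->condvar);
  pthread_mutex_unlock (&h->mutex);

  /* The variant used in the patch: signal after unlocking.  */
  if (!h->signal_under_lock)
    pthread_cond_signal (&h->condvar);

  return NULL;
}

/* Run one handshake and return the predicate seen by the waiter.  */
static bool
run_handshake (bool signal_under_lock)
{
  struct handshake h;
  pthread_mutex_init (&h.mutex, NULL);
  pthread_cond_init (&h.condvar, NULL);
  h.ready = false;
  h.signal_under_lock = signal_under_lock;

  pthread_t thread;
  int result = pthread_create (&thread, NULL, notify, &h);
  assert (result == 0);

  pthread_mutex_lock (&h.mutex);
  while (!h.ready)
    pthread_cond_wait (&h.condvar, &h.mutex);
  bool woken = h.ready;
  pthread_mutex_unlock (&h.mutex);

  /* Join before destroying, so the worker is done with the condvar.  */
  result = pthread_join (thread, NULL);
  assert (result == 0);
  pthread_cond_destroy (&h.condvar);
  pthread_mutex_destroy (&h.mutex);

  return woken;
}
```

Both `run_handshake (true)` and `run_handshake (false)` complete, since the waiter re-checks `ready` under the mutex either way.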
Approved-By: Tom Tromey <tom@tromey.com>
Tom
* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
2026-02-20 19:05 ` Tom Tromey
@ 2026-02-20 19:44 ` Simon Marchi
0 siblings, 0 replies; 4+ messages in thread
From: Simon Marchi @ 2026-02-20 19:44 UTC
To: Tom Tromey; +Cc: gdb-patches
On 2/20/26 2:05 PM, Tom Tromey wrote:
>>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
>
> Simon> Fix it by introducing a separate boolean predicate protected by the
> Simon> shared mutex, and looping until it is true. This is the way
> Simon> pthread_cond_wait is meant to be used. From pthread_cond_wait(3):
>
> Thanks.
>
> Simon> + pthread_mutex_lock (&g_mutex);
> Simon> + g_ready = true;
> Simon> + pthread_mutex_unlock (&g_mutex);
> Simon> pthread_cond_signal (&g_condvar);
>
> I looked into this and TIL it's fine to pthread_cond_signal without
> holding the lock.
>
> Approved-By: Tom Tromey <tom@tromey.com>
>
> Tom
Thanks, this will make my CI jobs much happier. Pushed.
Simon