Mirror of the gdb-patches mailing list
* [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
@ 2026-02-06 17:57 simon.marchi
  2026-02-20 15:24 ` Simon Marchi
  2026-02-20 19:05 ` Tom Tromey
  0 siblings, 2 replies; 4+ messages in thread
From: simon.marchi @ 2026-02-06 17:57 UTC
  To: gdb-patches; +Cc: Simon Marchi

From: Simon Marchi <simon.marchi@polymtl.ca>

I get frequent failures in gdb.replay/missing-thread.exp, starting with:

    FAIL: gdb.replay/missing-thread.exp: non_stop=on: record_initial_logfile: continue (timeout)

I tracked this down to a race condition in the test program that
prevents it from generating the expected SIGTRAP.  The race can be
reproduced by adding a sleep (1) in main, just after the pthread_create
call.

The expected behavior is:

 - The main thread starts a second thread
 - The main thread blocks on pthread_cond_wait, waiting to be notified
   by the second thread (to make sure the second thread had time to
   start)
 - The second thread notifies the main thread using pthread_cond_signal
 - The main thread sends a SIGTRAP to the second thread

However, this can happen:

 - The main thread starts a second thread
 - The second thread calls pthread_cond_signal while no one is waiting
   on the condvar, so the signal has no effect
 - The main thread blocks on pthread_cond_wait forever
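
The failing sequence depends on condition variable signals not being
remembered when no thread is waiting.  The following is a minimal sketch
of that behavior, not code from the test program: the names
signal_before_wait_is_lost, lost_mutex and lost_condvar are made up for
illustration, and a timed wait stands in for the blocking wait so that
the example terminates.

```c
/* Illustration only; none of these names exist in missing-thread.c.  */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t lost_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t lost_condvar = PTHREAD_COND_INITIALIZER;

/* Signal the condvar while nobody waits on it, then wait on it with a
   timeout.  Return true if the wait timed out, meaning the earlier
   signal was not delivered to this later wait.  A spurious wakeup
   could in principle make this return false.  */

static bool
signal_before_wait_is_lost (void)
{
  struct timespec deadline;
  int result;

  /* No thread is waiting at this point, so this has no effect.  */
  pthread_cond_signal (&lost_condvar);

  /* pthread_cond_timedwait takes an absolute deadline, measured
     against CLOCK_REALTIME for a default-initialized condvar.  Use
     200 ms from now.  */
  result = clock_gettime (CLOCK_REALTIME, &deadline);
  assert (result == 0);

  deadline.tv_nsec += 200 * 1000 * 1000;
  if (deadline.tv_nsec >= 1000 * 1000 * 1000)
    {
      deadline.tv_sec++;
      deadline.tv_nsec -= 1000 * 1000 * 1000;
    }

  pthread_mutex_lock (&lost_mutex);
  result = pthread_cond_timedwait (&lost_condvar, &lost_mutex, &deadline);
  pthread_mutex_unlock (&lost_mutex);

  return result == ETIMEDOUT;
}
```

Calling signal_before_wait_is_lost from a small main should normally
report true.  With an untimed pthread_cond_wait in place of the timed
wait, the caller would block indefinitely, which is the hang described
above.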

Fix it by introducing a separate boolean predicate protected by the
shared mutex, and looping until it is true.  This is the way
pthread_cond_wait is meant to be used. From pthread_cond_wait(3):

    When using condition variables there is always a boolean predicate
    involving shared variables associated with each condition wait that
    is true if the thread should proceed. Spurious wakeups from the
    pthread_cond_wait() or pthread_cond_timedwait() functions may occur.
    Since the return from pthread_cond_wait() or
    pthread_cond_timedwait() does not imply anything about the value of
    this predicate, the predicate should be re-evaluated upon such
    return.
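
As a self-contained sketch of the pattern this man page excerpt
describes (separate from the patch itself, with hypothetical names
demo_mutex, demo_condvar, demo_ready, demo_worker and
demo_wait_for_worker), the waiter below deliberately starts waiting
late, so the worker most likely signals first.  Because the predicate
is checked under the mutex before waiting, the waiter does not block in
that case.

```c
/* Illustration only; the demo_* names are not part of the patch.  */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_condvar = PTHREAD_COND_INITIALIZER;
static bool demo_ready = false;

static void *
demo_worker (void *arg)
{
  (void) arg;

  /* Publish the predicate under the mutex, then notify.  */
  pthread_mutex_lock (&demo_mutex);
  demo_ready = true;
  pthread_mutex_unlock (&demo_mutex);
  pthread_cond_signal (&demo_condvar);

  return NULL;
}

/* Start the worker, wait until it has set the predicate, and return
   the value of the predicate observed under the mutex.  */

static bool
demo_wait_for_worker (void)
{
  pthread_t thread;
  struct timespec delay = { 0, 200 * 1000 * 1000 };
  bool observed;
  int result;

  result = pthread_create (&thread, NULL, demo_worker, NULL);
  assert (result == 0);

  /* Give the worker time to signal before anyone waits.  This is the
     ordering that hangs when there is no predicate.  */
  nanosleep (&delay, NULL);

  pthread_mutex_lock (&demo_mutex);
  while (!demo_ready)
    pthread_cond_wait (&demo_condvar, &demo_mutex);
  observed = demo_ready;
  pthread_mutex_unlock (&demo_mutex);

  result = pthread_join (thread, NULL);
  assert (result == 0);

  return observed;
}
```

The loop also covers the opposite ordering and spurious wakeups: if the
waiter gets there first, it sleeps in pthread_cond_wait and re-checks
demo_ready each time it wakes up.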

Finally, make the globals and the worker function static, on principle,
and fix minor formatting issues.

Change-Id: I62ba2085a2d506dc3d91e32cd5de48c43a3ff55e
---
 gdb/testsuite/gdb.replay/missing-thread.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/gdb/testsuite/gdb.replay/missing-thread.c b/gdb/testsuite/gdb.replay/missing-thread.c
index 0a17be1c7dd1..aad6396f24c2 100644
--- a/gdb/testsuite/gdb.replay/missing-thread.c
+++ b/gdb/testsuite/gdb.replay/missing-thread.c
@@ -22,23 +22,28 @@
 #include <unistd.h>
 #include <stdbool.h>
 
-pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
-pthread_cond_t g_condvar = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t g_condvar = PTHREAD_COND_INITIALIZER;
+static bool g_ready = false;
 
-void *
+static void *
 worker_function (void *arg)
 {
   printf ("In worker, about to notify\n");
+
+  pthread_mutex_lock (&g_mutex);
+  g_ready = true;
+  pthread_mutex_unlock (&g_mutex);
   pthread_cond_signal (&g_condvar);
 
   while (true)
-    sleep(1);
+    sleep (1);
 
   return NULL;
 }
 
 int
-main()
+main ()
 {
   pthread_t my_thread;
 
@@ -46,7 +51,8 @@ main()
   assert (result == 0);
 
   pthread_mutex_lock (&g_mutex);
-  pthread_cond_wait (&g_condvar, &g_mutex);
+  while (!g_ready)
+    pthread_cond_wait (&g_condvar, &g_mutex);
 
   printf ("In main, have been woken.\n");
   pthread_mutex_unlock (&g_mutex);

base-commit: deb47060b5812c6c41ecc790cb28898fcbc45c93
-- 
2.53.0



* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
  2026-02-06 17:57 [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c simon.marchi
@ 2026-02-20 15:24 ` Simon Marchi
  2026-02-20 19:05 ` Tom Tromey
  1 sibling, 0 replies; 4+ messages in thread
From: Simon Marchi @ 2026-02-20 15:24 UTC
  To: gdb-patches

Ping.


* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
  2026-02-06 17:57 [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c simon.marchi
  2026-02-20 15:24 ` Simon Marchi
@ 2026-02-20 19:05 ` Tom Tromey
  2026-02-20 19:44   ` Simon Marchi
  1 sibling, 1 reply; 4+ messages in thread
From: Tom Tromey @ 2026-02-20 19:05 UTC
  To: simon.marchi; +Cc: gdb-patches

>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:

Simon> Fix it by introducing a separate boolean predicate protected by the
Simon> shared mutex, and looping until it is true.  This is the way
Simon> pthread_cond_wait is meant to be used. From pthread_cond_wait(3):

Thanks.

Simon> +  pthread_mutex_lock (&g_mutex);
Simon> +  g_ready = true;
Simon> +  pthread_mutex_unlock (&g_mutex);
Simon>    pthread_cond_signal (&g_condvar);

I looked into this and TIL it's fine to pthread_cond_signal without
holding the lock.

Approved-By: Tom Tromey <tom@tromey.com>

Tom


* Re: [PATCH] gdb/testsuite: fix race condition in gdb.replay/missing-thread.c
  2026-02-20 19:05 ` Tom Tromey
@ 2026-02-20 19:44   ` Simon Marchi
  0 siblings, 0 replies; 4+ messages in thread
From: Simon Marchi @ 2026-02-20 19:44 UTC
  To: Tom Tromey; +Cc: gdb-patches

Thanks, this will make my CI jobs much happier.  Pushed.

Simon
