From: Tom de Vries <tdevries@suse.de>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH] [gdb/tdep] Don't call WaitForSingleObject with INFINITE arg
Date: Fri, 6 Jun 2025 09:12:10 +0200
Message-ID: <4311a30a-51cc-4054-8219-48f895bd962f@suse.de>
In-Reply-To: <a6a1fb8b-4a4b-4fc0-a983-bc8d6cea5af8@palves.net>

On 6/5/25 18:15, Pedro Alves wrote:
> On 2025-06-05 16:03, Tom de Vries wrote:
>> I decided to try to build and test gdb on Windows.
>>
>> I found a page on the wiki ( https://sourceware.org/gdb/wiki/BuildingOnWindows
>> ) suggesting three ways of building gdb:
>> - MinGW,
>> - MinGW on Cygwin, and
>> - Cygwin.
>>
>> I picked Cygwin, because I've used it before (though not recently).
>>
>> I managed to install Cygwin and sufficient packages to build gdb and start the
>> testsuite.
>
> Cool!
>
> FYI, I cross build from Linux using this: https://github.com/palves/cygwin-cross
> and then test from Cygwin, by:
>
> - smb mounting the sources from linux
> - configuring only the testsuite, not the whole gdb (src/gdb/testsuite/configure)
> - copying over gdb.exe to the Windows box (via rsync)
> - pointing at the copied gdb with make check GDB=.../gdb.exe
>
> I do this because building on Linux is so much faster.
>

Hi Pedro,

yeah, things are slow indeed, and thanks for sharing this setup. For
now, I'll stick with the basic setup, and leave optimizations for later.

>> However, testsuite progress ground to a halt at gdb.base/branch-to-self.exp.
>> [ AFAICT, similar problems reported here [1]. ]
>
> The testsuite fails to kill gdb all the time for me, it may well be the same.
> I'll give it a try after you've merged it.
>
> Ecstatic that you found this!
>

Yeah, I hope that this helps :)

FWIW, I've run the testsuite again, and this time got stuck at
gdb.base/catch-fork-kill.exp. In this case, the problem didn't seem to
be gdb, but lingering test execs. After killing those manually using
"kill -9", the test-case finished.

The hang can be fixed by adding a missing "require supports_catch_syscall".

I was wondering, I saw you mention "I've got another series of patches
to improve such tests and skip others, and that's what I've been testing
with". Do you have similar fixes in that series? If so, can you submit
them, or share them in a branch?

>> I managed to reproduce this hang by running just the test-case.
>>
>> I attempted to kill the hanging processes by:
>> - first killing the inferior process, using the cygwin "kill -9" command, and
>> - then killing the gdb process, likewise.
>>
>> But the gdb process remained, and I had to point-and-click my way through task
>> manager to actually kill the gdb process.
>>
>> I investigated this by attaching to the hanging gdb process. Looking at the
>> main thread, I saw it was stopped in a call to WaitForSingleObject, with
>> the dwMilliseconds parameter set to INFINITE.
>>
>> The backtrace in more detail:
>> ...
>> (gdb) bt
>> #0 0x00007fff196fc044 in ntdll!ZwWaitForSingleObject () from
>> /cygdrive/c/windows/SYSTEM32/ntdll.dll
>> #1 0x00007fff16bbcdcf in WaitForSingleObjectEx () from
>> /cygdrive/c/windows/System32/KERNELBASE.dll
>> #2 0x0000000100998065 in wait_for_single (handle=0x1b8, howlong=4294967295) at
>> gdb/windows-nat.c:435
>> #3 0x0000000100999aa7 in
>> windows_nat_target::do_synchronously(gdb::function_view<bool ()>)
>> (this=this@entry=0xa001c6fe0, func=...) at gdb/windows-nat.c:487
>> #4 0x000000010099a7fb in windows_nat_target::wait_for_debug_event_main_thread
>> (event=<optimized out>, this=0xa001c6fe0)
>> at gdb/../gdbsupport/function-view.h:296
>> #5 windows_nat_target::kill (this=0xa001c6fe0) at gdb/windows-nat.c:2917
>> #6 0x00000001008f2f86 in target_kill () at gdb/target.c:901
>> #7 0x000000010091fc46 in kill_or_detach (from_tty=0, inf=0xa000577d0)
>> at gdb/top.c:1658
>> #8 quit_force (exit_arg=<optimized out>, from_tty=from_tty@entry=0)
>> at gdb/top.c:1759
>> #9 0x00000001004f9ea8 in quit_command (args=args@entry=0x0,
>> from_tty=from_tty@entry=0) at gdb/cli/cli-cmds.c:483
>> #10 0x000000010091c6d0 in quit_cover () at gdb/top.c:295
>> #11 0x00000001005e3d8a in async_disconnect (arg=<optimized out>)
>> at gdb/event-top.c:1496
>> #12 0x0000000100499c45 in invoke_async_signal_handlers ()
>> at gdb/async-event.c:233
>> #13 0x0000000100eb23d6 in gdb_do_one_event (mstimeout=mstimeout@entry=-1)
>> at gdbsupport/event-loop.cc:198
>> #14 0x00000001006df94a in interp::do_one_event (mstimeout=-1,
>> this=<optimized out>) at gdb/interps.h:87
>> #15 start_event_loop () at gdb/main.c:402
>> #16 captured_command_loop () at gdb/main.c:466
>> #17 0x00000001006e2865 in captured_main (data=0x7ffffcba0) at gdb/main.c:1346
>> #18 gdb_main (args=args@entry=0x7ffffcc10) at gdb/main.c:1365
>> #19 0x0000000100f98c70 in main (argc=10, argv=0xa000129f0) at gdb/gdb.c:38
>> ...
>>
>> In the docs [2], I read that using an INFINITE argument to WaitForSingleObject
>> might cause a system deadlock.
>
> Wow, unbelievable. Thanks a lot for going all the way and finding this.
>
>>
>> This prompted me to try this simple change in wait_for_single:
>> ...
>>    while (true)
>>      {
>> -      DWORD r = WaitForSingleObject (handle, howlong);
>> +      DWORD r = WaitForSingleObject (handle,
>> +                                     howlong == INFINITE ? 100 : howlong);
>> +      if (howlong == INFINITE && r == WAIT_TIMEOUT)
>> +        continue;
>> ...
>> with the timeout of 0.1 second estimated to be:
>> - small enough for gdb to feel reactive, and
>> - big enough not to consume too many CPU cycles with looping.
>>
>> And indeed, the test-case, while still failing, now finishes in ~50 seconds.
>>
>> While there may be an underlying bug that triggers this behaviour, the failure
>> mode is so severe that I consider it a bug in itself.
>>
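
As a standalone illustration of the change quoted above, here is a minimal
sketch of the same pattern: an unbounded wait is replaced by repeated 100 ms
waits, retrying on timeout. It is not the actual wait_for_single from
gdb/windows-nat.c. The names dword, k_infinite, k_wait_object_0,
k_wait_timeout, k_poll_ms and bounded_wait are hypothetical stand-ins, and
the wait primitive is passed in as a callback so that the sketch builds
without <windows.h>. The numeric values mirror the Win32 definitions of
INFINITE, WAIT_OBJECT_0 and WAIT_TIMEOUT.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Stand-ins for the Win32 types and constants (assumption: on Windows the
// real DWORD, INFINITE, WAIT_OBJECT_0 and WAIT_TIMEOUT come from <windows.h>).
using dword = std::uint32_t;
constexpr dword k_infinite = 0xFFFFFFFFu;  // INFINITE
constexpr dword k_wait_object_0 = 0;       // WAIT_OBJECT_0
constexpr dword k_wait_timeout = 258;      // WAIT_TIMEOUT

// Polling interval in milliseconds, the 0.1 second used in the patch.
constexpr dword k_poll_ms = 100;

// Hypothetical helper.  WAIT_FN plays the role of WaitForSingleObject with
// the handle already bound: it takes a timeout in milliseconds and returns
// the wait result.
dword
bounded_wait (const std::function<dword (dword)> &wait_fn, dword howlong)
{
  assert (wait_fn);  // An empty callback would throw when called.

  while (true)
    {
      // Never pass the "infinite" value down; wait in short slices instead.
      dword r = wait_fn (howlong == k_infinite ? k_poll_ms : howlong);

      // The caller asked for an unbounded wait, so a timeout of one slice
      // just means "not signaled yet": try again.
      if (howlong == k_infinite && r == k_wait_timeout)
        continue;

      // Signaled, failed, or a caller-supplied finite timeout expired.
      return r;
    }
}
```

On Windows, the callback could wrap the real call, for example
[handle] (dword ms) { return WaitForSingleObject (handle, ms); }. A finite
howlong is passed through unchanged, so only callers that asked for an
unbounded wait get the polling behaviour.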
>
> Yeah. The documentation talks about the case of the code spawning windows, which
> I don't think we do. But still. I've seen so many weird things on Windows...
> It's just one more.
>
>> Fix this by avoiding calling WaitForSingleObject with INFINITE argument.
>
> Approved-By: Pedro Alves <pedro@palves.net>
>
Thanks for the review.
- Tom