From: Sergey Zigachev via Gdb <gdb@sourceware.org>
To: gdb@sourceware.org
Subject: Multi-threading support on DragonFlyBSD
Date: Wed, 5 Mar 2025 15:22:33 +0500
Message-ID: <DB9PR05MB82353E274191A69996FF924D92CB2@DB9PR05MB8235.eurprd05.prod.outlook.com>
Hi,
I'm trying to enable debugging of multi-threaded programs on
DragonFlyBSD using gdb and have several questions.
I checked out the gdb 15.2 sources and applied out-of-tree patches that
implement the native-dependent code for the amd64 DragonFlyBSD target.
These patches provide basic support for enumerating registers, accessing
memory, and the other OS-specific pieces. Single-thread debugging works
well with these patches. However, multiprocess debugging and the async
and non-stop modes are not implemented, nor is any notion of multiple
threads in the inferior. My goal is to add the missing bits and pieces
both in userspace (gdb's nat-dep code) and in the kernel (mostly the
ptrace syscall).
I'm at the stage where I have the basics working. I can report events
from different threads, suspend/resume a single thread or all threads,
inspect TLS variables, etc. Schedule-locking mode is also supported.
My questions mostly concern the "step" command issued to different
threads in turn (not at once, of course), although I see approximately
the same behavior with the "next" command.
I wrote a little test C program in which the main thread spawns two
extra threads running two separate but mostly identical functions. Each
function prints a message, sleeps for several seconds, prints another
message, and then returns. The main function waits for both threads to
complete.
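The actual test program is not reproduced in this message. A minimal
sketch along the lines described, assuming POSIX threads, might look
like the following. The struct fields thr_num and sleep and the first
printf are taken from the listing quoted further down; the function
names, the second message, and run_demo are hypothetical.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Per-thread arguments; field names follow the quoted source line. */
struct thread_data {
    int thr_num;
    int sleep;              /* seconds to sleep */
};

/* First worker: print, sleep, print, return. */
static void *thread_func1(void *arg)
{
    struct thread_data *d = arg;
    printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
    sleep((unsigned int)d->sleep);
    printf("Finished thread thr=%d!\n", d->thr_num);   /* assumed text */
    return NULL;
}

/* Second worker: separate function, deliberately near-identical. */
static void *thread_func2(void *arg)
{
    struct thread_data *d = arg;
    printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
    sleep((unsigned int)d->sleep);
    printf("Finished thread thr=%d!\n", d->thr_num);   /* assumed text */
    return NULL;
}

/* Spawn both workers, wait for them, return how many were joined. */
static int run_demo(int sleep1, int sleep2)
{
    struct thread_data data[2] = { { 1, sleep1 }, { 2, sleep2 } };
    void *(*funcs[2])(void *) = { thread_func1, thread_func2 };
    pthread_t tids[2];
    int started = 0, joined = 0;

    for (int i = 0; i < 2; i++) {
        if (pthread_create(&tids[i], NULL, funcs[i], &data[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    return joined;
}
```

A main function that simply calls run_demo(3, 5) completes the program.
Built with debug info (for example cc -g -pthread), breakpoints on
thread_func1 and thread_func2 correspond to the two test runs below.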
First test run.
I put a breakpoint at the start of the first function, run the program,
and enter the "step" command until the process exits. Essentially I'm
debugging as if it were a single-threaded program. Everything works as
expected. By that I mean that after each "step" command gdb stops on the
next source line until the end of the function. It skips over the
individual instructions that make up a single source line, and steps
over functions that don't have debugging symbols. This is the baseline
I'm trying to replicate when stepping across multiple threads.
Second test run.
Same as the first run, but this time I put a breakpoint at the start of
both functions. This results in gdb stopping in different threads when I
enter the "step" command.
I observe the following issues:
- each source line gets hit several times, apparently once per machine
instruction that implements it
- gdb stops inside functions without debug symbols; sometimes it can
step out and continue; sometimes it can't, when it fails to find the end
of the function
Basically, gdb is acting as if it were running the "stepi" command.
Through trial and error and some printfs sprinkled around, I came up
with a theory as to why that could happen. Consider this part of the
function:
> 23 printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
> 0x0000000000400b2d <+29>: mov 0xc(%rdi),%edx
> 0x0000000000400b30 <+32>: mov $0x400cff,%edi
> 0x0000000000400b35 <+37>: xor %eax,%eax
> 0x0000000000400b37 <+39>: call 0x4007f0 <printf@plt>
>
> 24 sleep(d->sleep);
> 0x0000000000400b3c <+44>: mov 0xc(%rbx),%edi
When gdb stops at line 23 and I enter the "step" command, it calculates
the step range, thread_info->control.step_range_{start,end}. In this
case it will be [0x400b2d, 0x400b3c). It then uses this range to step
through the machine instructions, handling 4 stop events transparently
to the user.
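To make the range check concrete, here is a small standalone model in C.
It is only a sketch, not gdb's code: struct step_range and both
functions are hypothetical names, and the call to printf is simplified
to a single event that lands on the return address 0x400b3c, as if the
callee had been stepped over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a half-open step range [start, end). */
struct step_range {
    uint64_t start;
    uint64_t end;
};

/* True while pc still belongs to the source line being stepped. */
static bool pc_in_range(const struct step_range *r, uint64_t pc)
{
    assert(r->start <= r->end);
    return pc >= r->start && pc < r->end;
}

/*
 * Walk the reported stop addresses in order and return how many events
 * were consumed up to and including the first one outside the range.
 * Returns 0 if the pc never left the range.
 */
static size_t events_until_line_change(const struct step_range *r,
                                       const uint64_t *pcs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!pc_in_range(r, pcs[i]))
            return i + 1;
    return 0;
}
```

With the range { 0x400b2d, 0x400b3c } and the stop addresses 0x400b30,
0x400b35, 0x400b37, 0x400b3c from the listing above, the first three
events stay inside the range and the fourth leaves it, which is where
the user-visible stop at line 24 would occur.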
The happy call chain looks like this (infrun.c):
- clear_proceed_status_thread *for each thread*
- prepare_one_step (infcmd.c) -- this one updates the range
- proceed
- 4x fetch_inferior_event
- stop at next line waiting for input
This works for the single-thread case because gdb clears the thread_info
state and then loads the range for the *current* thread after clearing.
If I report an event for a different thread, that thread's info has been
cleared and gdb has not loaded a range for it. Gdb can't recognize the
signal and assumes it's a "random signal". It goes downhill from there;
I think most subsequent signals become "random" as well.
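The theory can be illustrated with a toy model. This is only a sketch of
the sequence as I understand it, not gdb's implementation and not a
confirmed diagnosis: struct toy_thread and the toy_* functions are
hypothetical, and 0x400c30 is a made-up address standing in for some
instruction in the second thread's function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread stepping state (toy version). */
struct toy_thread {
    uint64_t step_range_start;
    uint64_t step_range_end;
};

/* Models the "clear for each thread" step: every range becomes empty. */
static void toy_clear_all(struct toy_thread *threads, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        threads[i].step_range_start = 0;
        threads[i].step_range_end = 0;
    }
}

/* Models loading the step range, done for the current thread only. */
static void toy_prepare_step(struct toy_thread *t,
                             uint64_t start, uint64_t end)
{
    assert(start < end);
    t->step_range_start = start;
    t->step_range_end = end;
}

/*
 * An event is "expected" only if its pc falls inside the reporting
 * thread's range. An empty range [0, 0) never matches any pc.
 */
static bool toy_event_is_expected(const struct toy_thread *t, uint64_t pc)
{
    return pc >= t->step_range_start && pc < t->step_range_end;
}
```

With two threads, after toy_clear_all and toy_prepare_step on thread 0
only, an event at 0x400b30 from thread 0 is expected, while an event
from thread 1 at any address is not, because its range is empty.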
I don't want to track down heisenbugs caused by my dirty hands touching
gdb's thread state and accidentally breaking invariants. Hence this
message; any advice is appreciated.
My questions:
1. Does my theory make sense?
2. Is it the job of the native-dependent code to save/restore thread
state before reporting a signal?
3. If yes, what else should I store per thread, per process, etc.?
Thanks,
Sergey