From: Sergey Zigachev via Gdb <gdb@sourceware.org>
To: gdb@sourceware.org
Subject: Multi-threading support on DragonFlyBSD
Date: Wed, 5 Mar 2025 15:22:33 +0500
Message-ID: <DB9PR05MB82353E274191A69996FF924D92CB2@DB9PR05MB8235.eurprd05.prod.outlook.com>

Hi,
I'm trying to enable debugging of multi-threaded programs on 
DragonFlyBSD using gdb and have several questions.

I checked out the gdb 15.2 sources and applied out-of-tree patches that 
implement the native-dependent code for the amd64 DragonFlyBSD target. 
These patches provide basic support for enumerating registers, 
accessing memory, and other OS-specific things. Single-thread debugging 
works well with them. However, multiprocess debugging and the async and 
non-stop modes are not implemented, and neither is any notion of 
multiple threads in the inferior. My goal is to add the missing bits 
and pieces both in userspace (gdb's nat-dep code) and in kernelspace 
(mostly the ptrace syscall).

I'm at the stage where I have the basics working. I can report events 
from different threads, suspend/resume a single thread or all threads, 
and inspect TLS variables. Schedule-locking mode is also supported.

My questions mostly concern the "step" command issued to multiple 
threads (one at a time, of course), although I see approximately the 
same behavior with the "next" command.
I wrote a small test C program in which the main thread spawns two 
extra threads running two separate but mostly identical functions. Each 
function prints a message, sleeps for several seconds, prints another 
message, and then returns. The main function waits for both threads to 
complete.
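
A test program of this shape could look roughly like the sketch below. 
It is a reconstruction, not the original source: only the first printf 
format string and the field names thr_num and sleep appear in the 
listing quoted later in this message; the struct layout, the function 
names, and the second message are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Per-thread arguments. Field names follow the quoted source line;
   the layout itself is an assumption. */
struct thread_data {
    int thr_num;
    int sleep;   /* seconds to sleep between the two messages */
};

/* First thread function (hypothetical name). */
static void *thread_one(void *arg)
{
    struct thread_data *d = arg;
    assert(d != NULL);
    printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
    sleep((unsigned int)d->sleep);
    printf("Finished thread thr=%d\n", d->thr_num);
    return NULL;
}

/* Second thread function: separate symbol, mostly identical body,
   so that each one can carry its own breakpoint. */
static void *thread_two(void *arg)
{
    struct thread_data *d = arg;
    assert(d != NULL);
    printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
    sleep((unsigned int)d->sleep);
    printf("Finished thread thr=%d\n", d->thr_num);
    return NULL;
}

/* Spawn both threads and wait for them. Returns 0 on success,
   -1 if a thread could not be created. */
static int run_two_threads(int secs_one, int secs_two)
{
    struct thread_data d1 = { 1, secs_one };
    struct thread_data d2 = { 2, secs_two };
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, thread_one, &d1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, thread_two, &d2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Calling run_two_threads(3, 5) from main and building with -g -pthread 
gives two functions to break on, one per thread.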

First test run.
I put a breakpoint at the start of the first function, run the program, 
and enter the "step" command until the process exits. Essentially I'm 
debugging as if it were a single-thread program. Everything works as 
expected: after each "step" command gdb stops on the next source line, 
until the end of the function. It steps through all the instructions 
that belong to a single source line without stopping, and steps over 
functions that don't have debugging symbols. This is the baseline I'm 
trying to replicate when stepping across multiple threads.

Second test run.
Same as the first run, but this time I put a breakpoint at the start of 
both functions. As a result, gdb stops in different threads as I enter 
"step" commands.
I observe the following issues:
   - each source line gets hit several times, I think once per asm 
instruction that implements it
   - gdb stops in functions without debug symbols; sometimes it can step 
over them and continue; sometimes it can't, if it fails to find the end 
of the function
Basically, gdb acts as if it were running the "stepi" command.

Through trial and error and some printf calls sprinkled around, I came 
up with a theory of why that could happen. Consider this part of the 
function:

 > 23 printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
 >   0x0000000000400b2d <+29>:    mov    0xc(%rdi),%edx
 >   0x0000000000400b30 <+32>:    mov    $0x400cff,%edi
 >   0x0000000000400b35 <+37>:    xor    %eax,%eax
 >   0x0000000000400b37 <+39>:    call   0x4007f0 <printf@plt>
 >
 > 24 sleep(d->sleep);
 >   0x0000000000400b3c <+44>:    mov    0xc(%rbx),%edi

When gdb stops at line 23 and I enter the "step" command, it calculates 
the step range thread_info->control.step_range_{start,end}. In this 
case it will be [0x400b2d, 0x400b3c). gdb then uses this range to step 
through the asm instructions 4 times, transparently to the user.
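
The range check can be illustrated with a small stand-alone model. This 
is a sketch, not gdb code: struct step_range and both helpers are 
hypothetical names, and the call to printf@plt is assumed to count as a 
single step that lands on the return address 0x400b3c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread stepping range; gdb keeps
   the real one in thread_info->control.step_range_{start,end}. */
struct step_range {
    uint64_t start;  /* inclusive */
    uint64_t end;    /* exclusive */
};

/* True while pc is still inside the half-open range [start, end). */
static bool pc_in_range(const struct step_range *r, uint64_t pc)
{
    return pc >= r->start && pc < r->end;
}

/* pcs[i] is the address reached after step i+1. Returns the number of
   steps taken when pc first leaves the range, or 0 if it never does. */
static size_t steps_until_out(const struct step_range *r,
                              const uint64_t *pcs, size_t n)
{
    assert(r->start < r->end);
    for (size_t i = 0; i < n; i++)
        if (!pc_in_range(r, pcs[i]))
            return i + 1;
    return 0;
}
```

Fed with the addresses from the listing above (0x400b30, 0x400b35, 
0x400b37, 0x400b3c) and the range [0x400b2d, 0x400b3c), the first three 
stops are still inside the range and only the fourth falls outside it.
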
The happy call chain looks like this (infrun.c):
   - clear_proceed_status_thread *for each thread*
   - prepare_one_step (infcmd.c) -- this one updates the range
   - proceed
   - 4x fetch_inferior_event
   - stop at next line waiting for input
This works in the single-thread case because gdb clears the thread_info 
state of every thread and then loads the range for the *current* thread 
only. If I report an event for a different thread, that thread's state 
has been cleared and no range has been loaded for it. Gdb can't 
recognize this signal and assumes it's a "random signal". It goes 
downhill from there; I think most subsequent signals become "random".
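
The theory can be restated as a toy model. It only encodes the 
hypothesis described above and does not reproduce gdb's actual logic: 
struct toy_thread and the toy_* helpers are invented for illustration, 
and a zeroed range is assumed to mean "no step in progress".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-thread state; NOT gdb's thread_info. */
struct toy_thread {
    uint64_t step_start;  /* inclusive; 0/0 means no range loaded */
    uint64_t step_end;    /* exclusive */
};

/* Models the "clear state for each thread" step of the call chain. */
static void toy_clear_all(struct toy_thread *threads, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        threads[i].step_start = 0;
        threads[i].step_end = 0;
    }
}

/* Models loading a step range for the current thread only. */
static void toy_prepare_step(struct toy_thread *current,
                             uint64_t start, uint64_t end)
{
    assert(start < end);
    current->step_start = start;
    current->step_end = end;
}

/* True if a stop at pc in thread t would be recognized as part of an
   ongoing range step; false means it looks like an unrelated stop. */
static bool toy_event_continues_step(const struct toy_thread *t,
                                     uint64_t pc)
{
    return pc >= t->step_start && pc < t->step_end;
}
```

After toy_clear_all() on three threads and toy_prepare_step() on the 
second one only, a stop at 0x400b30 is recognized for the second thread 
but not for the other two, which is the situation suspected above.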

I don't want to track down heisenbugs caused by my dirty hands touching 
gdb thread state and accidentally breaking invariants. Hence this 
message; any advice is appreciated.

My questions:
   1. Does my theory make sense?
   2. Is it the job of the native-dependent code to save/restore thread 
state before reporting a signal?
   3. If yes, what else should I store (per-thread, per-process, etc.)?

Thanks,
Sergey
