Re: Multi-threading support on DragonFlyBSD

From: Simon Marchi via Gdb <gdb@sourceware.org>
To: Sergey Zigachev <s.zi@outlook.com>, gdb@sourceware.org
Subject: Re: Multi-threading support on DragonFlyBSD
Date: Mon, 10 Mar 2025 12:22:42 -0400
Message-ID: <ff4d4680-3d5f-42f0-8b74-4ac7e2dca8fb@simark.ca>
In-Reply-To: <DB9PR05MB82353E274191A69996FF924D92CB2@DB9PR05MB8235.eurprd05.prod.outlook.com>

On 3/5/25 5:22 AM, Sergey Zigachev via Gdb wrote:
> Hi,
> I'm trying to enable debugging of multi-threaded programs on DragonFlyBSD using gdb and have several questions.
> 
> I checked out the gdb 15.2 sources and applied out-of-tree patches with a native-dependent code implementation for the amd64 DragonFlyBSD target. These patches have basic support for enumerating registers, accessing memory etc. All the OS-specific things. Single-thread debugging works well with these patches. However, multiprocess debugging, async and non-stop modes are not implemented, nor is any notion of multiple threads in the inferior. My goal is to add the missing bits and pieces both in userspace (gdb's nat-dep) and kernelspace (mostly the ptrace syscall).
> 
> I'm at the stage where I have the basics working. I can report events from different threads, suspend/resume single or all threads, inspect TLS variables etc. Schedule-locking mode is also supported.
> 
> My questions mostly concern the step-over ("step") command issued to multiple threads, not at once of course. Although I see approximately the same behavior with the "next" command.
> I wrote a little test C program in which the main thread spawns two extra threads running two separate but mostly identical functions. Each function prints a message to output, sleeps for several seconds, prints another message, and then returns. The main function waits for both threads to complete.
> 
> First test run.
> I put a breakpoint at the start of the first function, run the program and enter the "step" command until the process exits. Essentially I'm debugging as if it were a single-threaded program. Everything works as expected. By that I mean that after each "step" command gdb steps to the next source line until the end of the function. It skips instructions that are part of a single source line, and skips functions that don't have debugging symbols. This is the baseline I'm trying to replicate when stepping across multiple threads.
> 
> Second test run.
> Same as the first run, but this time I put a breakpoint at the start of both functions. This results in gdb stopping in different threads when I enter the "step" command.
> I observe the following issues:
>   - each source line gets hit several times, I think by the number of asm instructions it implements
>   - gdb stops at functions without debug symbols; sometimes it can step over and continue; sometimes it doesn't if it can't find the end of the function
> Basically, gdb's acting as if it's running "stepi" command.
> 
> With trial and error and some printf sprinkled around I came up with a theory why that could happen. Consider this part of the function:
> 
>> 23 printf("Running thread thr=%d sleep=%d!\n", d->thr_num, d->sleep);
>>   0x0000000000400b2d <+29>:    mov    0xc(%rdi),%edx
>>   0x0000000000400b30 <+32>:    mov    $0x400cff,%edi
>>   0x0000000000400b35 <+37>:    xor    %eax,%eax
>>   0x0000000000400b37 <+39>:    call   0x4007f0 <printf@plt>
>>
>> 24 sleep(d->sleep);
>>   0x0000000000400b3c <+44>:    mov    0xc(%rbx),%edi
> 
> When gdb stops at line 23 and I enter the "step" command, it calculates the step range thread_info->control.step_range_{start,end}. In this case it will be [0x400b2d, 0x400b3c). Then it uses this info to step through the 4 asm instructions transparently to the user.
> The happy call chain looks like this (infrun.c):
>   - clear_proceed_status_thread *for each thread*
>   - prepare_one_step (infcmd.c) -- this one updates the range
>   - proceed
>   - 4x fetch_inferior_event
>   - stop at next line waiting for input
> This works for the single-thread case because gdb clears the thread_info state and then loads the range for the *current* thread after clearing.
> If I return an event for a different thread, then its info was cleared and hasn't been reloaded by gdb before that. Gdb can't recognize this signal and assumes it's a "random signal". It goes downhill from there; I think most subsequent signals become "random".
> 
> I don't want to track down heisenbugs caused by my dirty hands touching gdb thread state and accidentally breaking invariants. Hence this message; any advice is appreciated.
> 
> My questions:
>   1. Does my theory make sense?
>   2. Is it the job of native-dependent code to save/restore thread state before reporting a signal?
>   3. If yes, what else should I store, per-thread, per-process etc.?
> 
> Thanks,
> Sergey

Hi Sergey,

What I don't understand here is: what other event do you report for the
other threads?  When you step one thread, I assume that your resume
method gets called with inferior_ptid (the global variable) equal to the
thread being stepped, and scope_ptid equal to (pid, 0, 0)?  If so, the
threads other than the one being stepped should be resumed freely.  In
your scenario, I don't see what signal these other threads would
receive.
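
To make the resume semantics described above concrete, here is a minimal
sketch in C++.  It is a simplified model under stated assumptions, not
gdb's actual code: `Ptid`, `Action`, `matches_scope` and `resume_action`
are hypothetical names invented for illustration, and the matching rule
only covers a process-wide scope (lwp of 0) and a single-thread scope.

```cpp
// Hypothetical, simplified stand-in for gdb's ptid_t (pid + lwp only).
struct Ptid {
    int pid;
    long lwp;  // 0 means "any thread of this process"
};

enum class Action { Step, Continue, LeaveStopped };

// True if THREAD falls inside the resumption scope SCOPE.
bool matches_scope(const Ptid &thread, const Ptid &scope) {
    if (thread.pid != scope.pid)
        return false;
    return scope.lwp == 0 || scope.lwp == thread.lwp;
}

// Decide what one resume request means for one thread.
// CURRENT models inferior_ptid: the only thread the step flag applies to.
Action resume_action(const Ptid &thread, const Ptid &current,
                     const Ptid &scope, bool step) {
    if (!matches_scope(thread, scope))
        return Action::LeaveStopped;  // outside the scope, do not touch it
    bool is_current = thread.pid == current.pid && thread.lwp == current.lwp;
    return (step && is_current) ? Action::Step : Action::Continue;
}
```

With a scope of (pid, 0), the stepped thread gets `Action::Step` and
every other thread of the process gets `Action::Continue`, which is the
"resumed freely" case.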

Once your stepping thread is done stepping, it should generate a
SIGTRAP, after which your target should send some signal (SIGSTOP?) to
ask the other threads to stop and wait for these threads to report
"stopped by SIGSTOP" (since your target operates in non-stop mode).  But
these SIGSTOPs should not be reported to the core, as it's an
implementation detail of the target.  That's my understanding of how
things work on Linux, perhaps things are different on DragonFlyBSD.
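
As a rough illustration of "not reported to the core", the target can
remember which threads it stopped itself and swallow the matching stop
events.  The following C++ sketch assumes a hypothetical `StopEvent`
record and `StopFilter` helper; neither exists in gdb, and a real target
would track more state than this.

```cpp
#include <set>
#include <signal.h>

// Hypothetical model of a stop event coming back from the kernel.
struct StopEvent {
    long lwp;
    int signal;
};

// Tracks SIGSTOPs that the target sent on its own initiative.
class StopFilter {
public:
    // Record that the target itself asked LWP to stop.
    void expect_stop(long lwp) { pending_.insert(lwp); }

    // Return true if EV should be passed on to the core.
    bool should_report(const StopEvent &ev) {
        // A SIGSTOP we requested is an internal detail: consume it.
        if (ev.signal == SIGSTOP && pending_.erase(ev.lwp) > 0)
            return false;
        return true;  // anything else is a genuine event
    }

private:
    std::set<long> pending_;
};
```

A SIGTRAP from the stepping thread passes through unchanged, while the
SIGSTOP from a thread registered with `expect_stop` is dropped exactly
once; a later SIGSTOP from the same thread that the target did not
request would be reported.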

I assume that you have enabled and looked at the "set debug infrun"
logs?  This is how you generally debug those issues: look at what infrun
asks your target to do, look at what your target answers, and try to see
if that makes sense.

Finally, it's perhaps outside of the scope of your work, but "modern"
targets use non-stop (related to "maint set target-non-stop", not the
user-visible "set non-stop") and target-async.  I'm more used to reasoning
in terms of non-stop & target-async, and that code path of infrun is
likely more tested.  It's probably no small task, but switching to that
would be more future-proof.

Simon
