From: Simon Marchi via Gdb <gdb@sourceware.org>
To: Sergey Zigachev <s.zi@outlook.com>, gdb@sourceware.org
Subject: Re: Multi-threading support on DragonFlyBSD
Date: Tue, 11 Mar 2025 10:37:29 -0400
Message-ID: <a4b88146-b44b-4c08-98e5-01977f8dfa10@simark.ca>
In-Reply-To: <AS8PR05MB823010BFA1E808E716B3EE5392D12@AS8PR05MB8230.eurprd05.prod.outlook.com>
On 3/11/25 3:59 AM, Sergey Zigachev wrote:
> That's overall correct. Other threads can receive SIGTRAP if I put a
> breakpoint, for example. I thought that should work, shouldn't it? If GDB
> resumes all threads that means any number of threads can stop at a
> breakpoint or SIGTRAP caused by single-stepping a process.
If another thread (that is running freely) happens to hit a breakpoint
while single stepping the selected thread, then indeed you should report
that SIGTRAP. GDB will present that stop to the user, and the step will
be aborted. The stepping thread will be asked to stop, and will be
stopped at an arbitrary place.
> Here's what I think is happening:
> 1. put a breakpoint_1 at function_1 for thread_1
> 2. put a breakpoint_2 at function_2 for thread_2
> 3. -> "run" the process, main thread starts threads approx. at once
> 4. breakpoint_1 hit, current thread is thread_1
> 5. -> enter "step" command
> 6. GDB resumes thread_1, then all threads, every time with step=1
> 7. breakpoint_2 hit, GDB switches to thread_2
> 8. -> enter "step" command
> 9. GDB resumes thread_2, then all threads, every time with step=1
> 10. thread_1 generates SIGTRAP
> 11. GDB can't tell this SIGTRAP is the result of (5), so can't skip it
> 12. GDB reports it as a random signal
I think that the SIGTRAP at #10 is spurious. It is perhaps a leftover
from the step command at #5/#6?
When thread 2 hits breakpoint_2 at #7, then the step from thread 1
aborts and should be forgotten. When you step again at #8/#9, you start
from a blank slate.
> If GDB receives events for the current thread at steps (6) and (9),
> then it processes them just fine, skipping as necessary.
It's possible that the code path taken in that case clears something
that causes the spurious SIGTRAP to be forgotten.
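To make the sequence above concrete, here is a toy model of the step bookkeeping. It is only a sketch under assumptions: `ToyDebugger`, `step`, `report_stop` and `replay` are hypothetical names, and this is not how infrun is actually written. It only illustrates why a SIGTRAP arriving for thread_1 after its step was aborted at #7 would no longer match any step in progress.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDebugger:
    """Toy model of step bookkeeping; NOT GDB's actual infrun logic."""

    # Threads that currently have an active step request (assumption:
    # at most one, since a new step starts from a blank slate).
    stepping: set = field(default_factory=set)

    def step(self, thread):
        # A new "step" command forgets any earlier, aborted step.
        self.stepping = {thread}

    def report_stop(self, thread, reason):
        """Classify a stop event, then abort every step in progress."""
        if reason == "breakpoint":
            verdict = "breakpoint hit"
        elif reason == "SIGTRAP" and thread in self.stepping:
            verdict = "step finished"
        else:
            verdict = "random signal"
        # A stop presented to the user aborts the step in progress.
        self.stepping.clear()
        return verdict


def replay():
    """Replay items 4 to 12 of the list quoted above."""
    dbg = ToyDebugger()
    verdicts = [dbg.report_stop("thread_1", "breakpoint")]        # 4
    dbg.step("thread_1")                                          # 5, 6
    verdicts.append(dbg.report_stop("thread_2", "breakpoint"))    # 7
    dbg.step("thread_2")                                          # 8, 9
    verdicts.append(dbg.report_stop("thread_1", "SIGTRAP"))       # 10 to 12
    return verdicts
```

In this model the last event comes out as a random signal because thread_1's step was dropped when thread_2 stopped, which matches what is described at #11/#12.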
> WRT implementation details: I don't send SIGSTOP to stop all threads
> when SIGTRAP is generated and I don't use waitpid. Natdep target
> communicates with the kernel through ptrace syscall only. Ptrace
> request waits for an event, stops all threads and waits for them to
> actually stop before returning to userspace. Then in the natdep code I
> pull all pending events from all threads, stash them in a list, and
> return one event according to what GDB expects to get: if it resumed
> just a single thread, then I return an event strictly for that thread
> (if any); if it resumed all threads, then I return the next event from
> the list, which may or may not be for the current thread. In my test
> program, I only get SIGTRAPs triggered by single-stepping threads, no
> other signals are generated.
Ok, so if I understand correctly, you do some kind of "ptrace resume",
which is blocking, and it unblocks when something happens. When that
returns, the kernel has stopped all threads for you. In a sense that
simplifies some things, but it would make async/non-stop harder to
implement.
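For reference, the event selection described above could be modelled roughly as follows. This is a sketch of my understanding, not the actual natdep code: `EventStash`, `stash` and `next_event` are hypothetical names, and events are reduced to `(thread_id, signal)` pairs.

```python
from collections import deque


class EventStash:
    """Hypothetical model of the list of pending per-thread events."""

    def __init__(self):
        self._pending = deque()

    def stash(self, events):
        # Called once the blocking ptrace request has returned and all
        # threads are stopped: keep every pending event, in order.
        self._pending.extend(events)

    def next_event(self, resumed_thread=None):
        """Return one (thread_id, signal) event, or None.

        resumed_thread=None models "GDB resumed all threads": the next
        event in the list is returned, whatever thread it is for.
        Otherwise only an event for that specific thread is returned.
        """
        if resumed_thread is None:
            return self._pending.popleft() if self._pending else None
        for index, event in enumerate(self._pending):
            if event[0] == resumed_thread:
                del self._pending[index]
                return event
        return None
```

One thing this model makes visible: an event stashed for a thread that is not reported right away stays in the list until a later call picks it up.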
> Please see attached debug log output. It's rather verbose though.
I'll try to go through it later.
> Couple notes:
> - for DragonFly, I'm using ptid_t format as (pid, thread_id, 0),
> thread_id is unique for a process, but is NOT globally unique
I think that's fine. The core of GDB never assumes that ptid_t::lwp is
globally unique.
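As a small illustration of why that is enough, here is a Python stand-in for the (pid, lwp, tid) triple. `Ptid` is a hypothetical name and the pid values are made up; the point is only that the triple is compared as a whole, so the same thread id under two different pids still gives two distinct keys.

```python
from typing import NamedTuple


class Ptid(NamedTuple):
    """Stand-in for a (pid, lwp, tid) triple; illustrative only."""

    pid: int
    lwp: int
    tid: int = 0


# Same thread id (stored in the lwp field) in two different processes.
first = Ptid(pid=100, lwp=1)
second = Ptid(pid=200, lwp=1)

# The triples differ as a whole, so both can coexist in one table.
threads = {first: "thread 1 of process 100", second: "thread 1 of process 200"}
```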
> - you can ignore lines with [dfly-nat], that's my implementation's
> debug info, it mostly duplicates what infrun writes anyway
That's fine, it's very useful to add such logging.
> I'm sorry, my explanations are hard to understand, hopefully it'll be
> clearer with the log.
>
> Ideally, I would like to get GDB to transparently skip assembly
> instructions which are not relevant to report regardless of which thread
> is reporting an event, be it currently selected or not. Like part of
> the already reported source line, dynsym code, missing debug symbols
> or anything else. For that to work I think I need to save and restore
> GDB's thread "context" (in the thread_info struct) before reporting an
> event. If it's not possible or doesn't make sense, then I'll do
> what Linux is doing -- give currently selected thread higher priority
> and report it first.
This kind of logic is managed at a higher level (infrun), not by
targets. I don't think that your target_ops::wait method is the place
for such a decision. I'm not super familiar with this, I just know it's
managed by spaghetti-like functions like `handle_signal_stop`.
>> I assume that you have enabled and stared at the "set debug infrun"
>> logs? This is how you generally debug those issues, look at what infrun
>> asks your target to do, look at what your target answers, try to see if
>> that makes sense.
>
> Yes, I enabled infrun debug output and it was invaluable for gathering info.
>
>> Finally, it's perhaps outside of the scope of your work, but "modern"
>> targets use non-stop (related to "maint set target-non-stop", not the
>> user-visible "set non-stop") and target-async. I'm more used to reasoning
>> in terms of non-stop & target-async, and that code path of infrun is
>> likely more tested. It's probably no small task, but switching to that
>> would be more future-proof.
>
> I'll take a look at non-stop and async modes.
> Also, I wasn't aware of the maintenance command, thanks!
Given what you said above, about how your debug interface works,
non-stop/async is probably not a priority.
Simon