> What I don't understand here is: what other event do you report for the
> other threads?  When you step one thread, I assume that your resume
> method gets called with inferior_ptid (the global variable) equal to the
> thread being stepped, and scope_ptid equal to (pid, 0, 0)?  If so, the
> threads other than the one being stepped should be resumed freely.  In
> your scenario, I don't see what signal these other threads would
> receive.
> 
> Once your stepping thread is done stepping, it should generate a
> SIGTRAP, after which your target should send some signal (SIGSTOP?) to
> ask the other threads to stop and wait for these threads to report
> "stopped by SIGSTOP" (since your target operates in non-stop mode).  But
> these SIGSTOPs should not be reported to the core, as it's an
> implementation detail of the target.  That's my understanding of how
> things work on Linux, perhaps things are different on DragonflyBSD.
> 

That's correct overall. Other threads can receive SIGTRAP if I set a 
breakpoint, for example. I thought that should work, shouldn't it? If GDB 
resumes all threads, then any number of threads can stop at a 
breakpoint or report a SIGTRAP caused by single-stepping.

Here's what I think is happening:
   1. put breakpoint_1 at function_1 for thread_1
   2. put breakpoint_2 at function_2 for thread_2
   3. -> "run" the process, main thread starts threads approx. at once
   4. breakpoint_1 hit, current thread is thread_1
   5. -> enter "step" command
   6. GDB resumes thread_1, then all threads, every time with step=1
   7. breakpoint_2 hit, GDB switches to thread_2
   8. -> enter "step" command
   9. GDB resumes thread_2, then all threads, every time with step=1
   10. thread_1 generates SIGTRAP
   11. GDB can't tell this SIGTRAP is the result of (5), so can't skip it
   12. GDB reports it as a random signal

If GDB receives events for the current thread at steps (6) and (9), then 
it processes them just fine, skipping as necessary.

WRT implementation details: I don't send SIGSTOP to stop all threads 
when a SIGTRAP is generated, and I don't use waitpid. The natdep target 
communicates with the kernel through the ptrace syscall only. The ptrace 
request waits for an event, stops all threads and waits for them to 
actually stop before returning to userspace. Then, in the natdep code, I 
pull all pending events from all threads, stash them in a list, and 
return one event according to what GDB expects to get: if it resumed 
just a single thread, then I return an event strictly for that thread 
(if any); if it resumed all threads, then I return the next event from 
the list, which may or may not be for the current thread. In my test 
program, I only get SIGTRAPs triggered by single-stepping threads; no 
other signals are generated.
Please see the attached debug log output. It's rather verbose, though.
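
To make the selection rule concrete, here is a minimal sketch of it in Python. It is only a model for discussion, not the actual natdep code: the names Event, EventStash, drain and next_event are made up, ptids are plain (pid, thread_id, 0) tuples, and a resumed ptid of (pid, 0, 0) is assumed to mean "all threads of pid".

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    ptid: tuple   # (pid, thread_id, 0)
    signal: str   # e.g. "SIGTRAP"


class EventStash:
    """Hypothetical model of the target's list of pending events."""

    def __init__(self):
        self._pending = []

    def drain(self, events):
        # Stash every event pulled from the kernel after all threads stopped.
        self._pending.extend(events)

    def next_event(self, resumed_ptid) -> Optional[Event]:
        # Assumption: thread_id == 0 in resumed_ptid means the whole process
        # was resumed; otherwise only that one thread was.
        pid, thread_id, _ = resumed_ptid
        for i, ev in enumerate(self._pending):
            if ev.ptid[0] != pid:
                continue
            if thread_id == 0 or ev.ptid == resumed_ptid:
                del self._pending[i]
                return ev
        return None  # nothing pending for what was resumed


stash = EventStash()
stash.drain([Event((100, 2, 0), "SIGTRAP"), Event((100, 1, 0), "SIGTRAP")])
first = stash.next_event((100, 1, 0))   # only thread 1 was resumed
second = stash.next_event((100, 0, 0))  # all threads resumed: next in list
```

In this model, an event for a thread other than the current one is returned as soon as GDB resumes all threads, which is the situation in steps (10)-(12) above.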

A couple of notes:
   - for DragonFly, I'm using the ptid_t format (pid, thread_id, 0); 
thread_id is unique within a process, but is NOT globally unique
   - you can ignore lines with [dfly-nat]; that's my implementation's 
debug info, which mostly duplicates what infrun writes anyway
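
As a small illustration of the first note, here is a toy Python model of that ptid layout. Ptid is only a stand-in for GDB's ptid_t (which has pid, lwp and tid fields), and dfly_ptid is a hypothetical helper name, not something from the real code.

```python
from typing import NamedTuple


class Ptid(NamedTuple):
    """Toy stand-in for GDB's ptid_t (pid, lwp, tid)."""
    pid: int
    lwp: int = 0
    tid: int = 0


def dfly_ptid(pid: int, thread_id: int) -> Ptid:
    # Assumed mapping from the note above: thread_id goes in the second
    # slot, and the third slot stays 0.
    return Ptid(pid, thread_id, 0)


# The same thread_id can appear in two processes, so only the full
# (pid, thread_id, 0) triple identifies a thread.
a = dfly_ptid(100, 1)
b = dfly_ptid(200, 1)
threads = {a: "thread 1 of pid 100", b: "thread 1 of pid 200"}
```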

I'm sorry that my explanations are hard to understand; hopefully it'll 
be clearer with the log.

Ideally, I would like to get GDB to transparently skip assembly 
instructions that are not relevant to report, regardless of which thread 
is reporting the event, be it the currently selected one or not: for 
example, instructions that are part of an already reported source line, 
dynsym code, code with missing debug symbols, or anything else. For that 
to work, I think I need to save and restore GDB's thread "context" (in 
the thread_info struct) before reporting an event. If that's not 
possible or doesn't make sense, then I'll do what Linux is doing -- give 
the currently selected thread higher priority and report it first.
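
The fallback (report the currently selected thread first) could be sketched roughly as below. This is a Python model under my own assumptions, not the Linux target's actual code: pick_event is a made-up name, and pending is assumed to be a list of (ptid, signal) pairs in arrival order.

```python
def pick_event(pending, selected_ptid):
    """Pop and return the event to report, preferring the selected thread.

    Returns None when no event is pending.
    """
    for i, (ptid, _signal) in enumerate(pending):
        if ptid == selected_ptid:
            return pending.pop(i)
    # No event for the selected thread: fall back to arrival order.
    return pending.pop(0) if pending else None


pending = [((100, 2, 0), "SIGTRAP"), ((100, 1, 0), "SIGTRAP")]
reported = pick_event(pending, (100, 1, 0))  # thread 1 is selected
```

With this ordering, the stepping thread's own SIGTRAP reaches GDB before events from other threads, which stay in the list for a later call.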

> I assume that you have enabled and started at the "set debug infrun"
> logs?  This is how you generally debug those issues, look at what infrun
> asks your target to do, look at what your target answers, try to see if
> that makes sense.

Yes, I enabled infrun debug output and it was invaluable for gathering 
info.

> Finally, it's perhaps outside of the scope of your work, but "modern"
> targets use non-stop (related to "maint set target-non-stop", not the
> user-visible "set non-stop") and target-async.  I'm more used to reason
> in terms of non-stop & target-async, and that code path of infrun is
> likely more tested.  It's probably no small task, but switching to that
> would be more future-proof.

I'll take a look at non-stop and async modes.
Also, I wasn't aware of that maintenance command, thanks!

-Sergey