From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-bounces~public-inbox=simark.ca@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id sJmwFN1K0GfFKwwAWB0awg
	(envelope-from <gdb-bounces~public-inbox=simark.ca@sourceware.org>)
	for <public-inbox@simark.ca>; Tue, 11 Mar 2025 10:38:21 -0400
Authentication-Results: simark.ca;
	dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.a=rsa-sha256 header.s=default header.b=k8CHjiX3;
	dkim-atps=neutral
Received: by simark.ca (Postfix, from userid 112)
	id 524941E105; Tue, 11 Mar 2025 10:38:21 -0400 (EDT)
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-5.4 required=5.0 tests=ARC_SIGNED,ARC_VALID,BAYES_00,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_MED autolearn=unavailable autolearn_force=no
	version=4.0.0
Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (prime256v1) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id C322C1E08E
	for <public-inbox@simark.ca>; Tue, 11 Mar 2025 10:38:20 -0400 (EDT)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 5D2213857BAF
	for <public-inbox@simark.ca>; Tue, 11 Mar 2025 14:38:20 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5D2213857BAF
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org;
	s=default; t=1741703900;
	bh=UEFYe33dfZondJnWs6deTyyGhRzPvp2haVl+ODIyYkU=;
	h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
	 From;
	b=k8CHjiX3Kf4+iFlMRLo9Ahn5PRY2RDOpt6WpxxN+bQK5Ph3KKvCYoVQWZ5cSmulGA
	 A9/lJGK1SwPguiHKsYT5Zmlqh4+b+StUdVkw8Hmvd0j7WWouTaKawuGkInoL/YjAnw
	 PBCpXriU3SpYwLD1w5H2zVGYK8DIDVBCBERqEqUw=
Received: from simark.ca (simark.ca [158.69.221.121])
 by sourceware.org (Postfix) with ESMTPS id 12E3D3858435
 for <gdb@sourceware.org>; Tue, 11 Mar 2025 14:37:32 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 12E3D3858435
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 12E3D3858435
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1741703852; cv=none;
 b=oC6eWu9oz7h2c6Nq/7xSus6Hc98TdeaPeDqE4vtwYyPUi5/jC6KtH8xpWBkDJxkW+03K2Oku1BTP2D7OjbOyV58Tjb4nL6HLe+ly9kDDZ15MqUgXHWEQ4ksa0m7djaHyBw6293l/SoL5oG/yojOXhIwgoJhDOGV9I9k8vz0WSzc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1741703852; c=relaxed/simple;
 bh=FCe23dfd3+TE/3IPE3h8m05Ll+vAOrxMeg7JFIjThPk=;
 h=DKIM-Signature:DKIM-Signature:Message-ID:Date:MIME-Version:
 Subject:To:From;
 b=E2P9dCIERsaBdPScntwTv7QFNG5Q+ZRdEqaL6iho9HhiGXaDhYy01BCPf+FrmNwVLjlEys5YREaOs4hTjvKbOYCytU2Sz3cf0Kng/AOBKb88GXbgiuyoE/WcHP17pzkqaY+6JVHQVUrCfu6K9Nna3Zfwi3H4adbcDsc4YMGeqWw=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 12E3D3858435
Received: by simark.ca (Postfix, from userid 112)
 id C84691E10A; Tue, 11 Mar 2025 10:37:31 -0400 (EDT)
Received: from [172.16.0.192] (96-127-217-162.qc.cable.ebox.net
 [96.127.217.162])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature ECDSA (prime256v1) server-digest SHA256)
 (No client certificate requested)
 by simark.ca (Postfix) with ESMTPSA id 34BBC1E08E;
 Tue, 11 Mar 2025 10:37:30 -0400 (EDT)
Message-ID: <a4b88146-b44b-4c08-98e5-01977f8dfa10@simark.ca>
Date: Tue, 11 Mar 2025 10:37:29 -0400
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Multi-threading support on DragonFlyBSD
To: Sergey Zigachev <s.zi@outlook.com>, gdb@sourceware.org
References: <DB9PR05MB82353E274191A69996FF924D92CB2@DB9PR05MB8235.eurprd05.prod.outlook.com>
 <ff4d4680-3d5f-42f0-8b74-4ac7e2dca8fb@simark.ca>
 <AS8PR05MB823010BFA1E808E716B3EE5392D12@AS8PR05MB8230.eurprd05.prod.outlook.com>
Content-Language: fr
In-Reply-To: <AS8PR05MB823010BFA1E808E716B3EE5392D12@AS8PR05MB8230.eurprd05.prod.outlook.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-BeenThere: gdb@sourceware.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gdb mailing list <gdb.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb>,
 <mailto:gdb-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb>,
 <mailto:gdb-request@sourceware.org?subject=subscribe>
From: Simon Marchi via Gdb <gdb@sourceware.org>
Reply-To: Simon Marchi <simark@simark.ca>
Errors-To: gdb-bounces~public-inbox=simark.ca@sourceware.org
Sender: "Gdb" <gdb-bounces~public-inbox=simark.ca@sourceware.org>

On 3/11/25 3:59 AM, Sergey Zigachev wrote:
> That's overall correct. Other threads can receive SIGTRAP if I put a
> breakpoint, for example. I thought that should work, isn't it? If GDB
> resumes all threads that means any number of threads can stop at a
> breakpoint or SIGTRAP caused by single-stepping a process.

If another thread (one that is running freely) happens to hit a
breakpoint while the selected thread is single-stepping, then indeed you
should report that SIGTRAP.  GDB will present that stop to the user, and
the step will be aborted.  The stepping thread will be asked to stop,
and will be stopped at an arbitrary place.

> Here's what I think is happening:
>   1. put a breakpoint_1 at function_1 for thread_1
>   2. put a breakpoint_2 at function_2 for thread_2
>   3. -> "run" the process, main thread starts threads approx. at once
>   4. breakpoint_1 hit, current thread is thread_1
>   5. -> enter "step" command
>   6. GDB resumes thread_1, then all threads, every time with step=1
>   7. breakpoint_2 hit, GDB switches to thread_2
>   8. -> enter "step" command
>   9. GDB resumes thread_2, then all threads, every time with step=1
>   10. thread_1 generates SIGTRAP
>   11. GDB can't tell this SIGTRAP is the result of (5), so can't skip it
>   12. GDB reports it as a random signal

I think that the SIGTRAP at #10 is spurious.  It is perhaps a leftover
from the step command at #5/#6?

When thread 2 hits breakpoint_2 at #7, the step in thread 1 is aborted
and should be forgotten.  When you step again at #8/#9, you start from
a blank slate.

> If GDB receives events for the current thread at steps (6) and (9),
> then it processes them just fine, skipping as necessary.

It's possible that the code path taken in that case clears some state,
and that this is what causes the spurious SIGTRAP to be forgotten.

> WRT implementation details: I don't send SIGSTOP to stop all threads
> when SIGTRAP generated and I don't use waitpid. Natdep target
> communicates with the kernel through ptrace syscall only. Ptrace
> request waits for a event, stops all threads and waits for them to
> actually stop before returning to userspace. Then in the natdep code I
> pull all pending events from all threads, stash them in a list, and
> return one event according to what GDB expects to get: if it resumed
> just a single thread, then I return an event strictly for that thread
> (if any); if it resumed all threads, then I return the next event from
> the list, which may or may not be for the current thread. In my test
> program, I only get SIGTRAPs triggered by single-stepping threads, no
> other signals generated.

Ok, so if I understand correctly, you do some kind of "ptrace resume",
which is blocking, and it unblocks when something happens.  When that
returns, the kernel has stopped all threads for you.  In a sense that
simplifies some things, but it would make async/non-stop harder to
implement.
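
To make sure we are talking about the same thing, here is a minimal
sketch of the event selection as I understand it from your description.
All the names (pending_event, event_stash, push, take) are made up for
illustration, this is neither GDB code nor your port's code, and
"negative thread id means all threads were resumed" is just a convention
I picked for the sketch.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for one stop event pulled from the kernel.
struct pending_event
{
  int thread_id;   // thread that reported the event
  int signal;      // signal number, e.g. 5 for SIGTRAP
};

// Hypothetical stash, filled after the blocking ptrace request returns
// with all threads stopped.
struct event_stash
{
  std::list<pending_event> events;

  void push (const pending_event &ev)
  {
    // Negative ids are reserved to mean "all threads" in take ().
    assert (ev.thread_id >= 0);
    events.push_back (ev);
  }

  // Hand out one event.  If RESUMED_THREAD is negative, all threads
  // were resumed and the oldest event is returned, whichever thread it
  // belongs to.  Otherwise only an event for RESUMED_THREAD qualifies.
  // Returns false if no matching event is pending.
  bool take (int resumed_thread, pending_event *out)
  {
    for (auto it = events.begin (); it != events.end (); ++it)
      if (resumed_thread < 0 || it->thread_id == resumed_thread)
        {
          *out = *it;
          events.erase (it);
          return true;
        }

    return false;
  }
};
```

Note that in this model an event for a thread that was not the one
resumed stays in the stash until a later call picks it up.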

> Please see attached debug log output. It's rather verbose though.

I'll try to go through it later.

> Couple notes:
>   - for DragonFly, I'm using ptid_t format as (pid, thread_id, 0),
>   thread_id is unique for a process, but is NOT globally unique

I think that's fine.  The core of GDB never assumes that ptid_t::lwp is
globally unique.
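
As a rough illustration of why that works, here is a simplified
stand-in (toy_ptid is a made-up name, not the real ptid_t definition):
since the pid takes part in the comparison, two threads in different
processes can share the same lwp value and still be told apart.

```cpp
// Simplified, hypothetical stand-in for a (pid, lwp, tid) triple.
struct toy_ptid
{
  int pid;
  long lwp;
  long tid;

  // Two ids are equal only if all three components match.
  constexpr bool operator== (const toy_ptid &other) const
  {
    return pid == other.pid && lwp == other.lwp && tid == other.tid;
  }
};

// Same lwp value in two different processes...
constexpr toy_ptid a {100, 1, 0};
constexpr toy_ptid b {200, 1, 0};

// ...still yields two distinct ids.
static_assert (!(a == b), "same lwp, different pid: different threads");
```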

>   - you can ignore lines with [dfly-nat], that's my implementation's
>   debug info, it mostly duplicates what infrun writes anyway

That's fine, it's very useful to add such logging.

> I'm sorry, my explanations are hard to understand, hopefully it'll be
> clearer with the log.
> 
> Ideally, I would like to get GDB to transparently skip assembly
> instructions which are not relevant to report regardless which thread
> is reporting an event, be it currently selected or not. Like part of
> the already reported source line, dynsym code, missing debug symbols
> or anything else. For that to work I think I need to save and restore
> GDB's thread "context" (in the thread_info struct) before reporting an
> event. If it's not possible or doesn't make sense, then I'll to do
> what Linux is doing -- give currently selected thread higher priority
> and report it first.

This kind of logic is managed at a higher level (infrun), not by
targets.  I don't think that your target_ops::wait method is the place
for such a decision.  I'm not super familiar with this part; I just know
it's managed by spaghetti-like functions such as `handle_signal_stop`.

>> I assume that you have enabled and started at the "set debug infrun"
>> logs?  This is how you generally debug those issues, look at what infrun
>> asks your target to do, look at what your target answers, try to see if
>> that makes sense.
> 
> Yes, I enabled infrun debug output and it was invaluable for gathering info.
> 
>> Finally, it's perhaps outside of the scope of your work, but "modern"
>> targets use non-stop (related to "maint set target-non-stop", not the
>> user-visible "set non-stop") and target-async.  I'm more used to reason
>> in terms of non-stop & target-async, and that code path of infrun is
>> likely more tested.  It's probably no small task, but switching to that
>> would be more future-proof.
> 
> I'll take a look at non-stop and async modes.
> Also, I wasn't aware of maintenance command, thanks!

Given what you said above, about how your debug interface works,
non-stop/async is probably not a priority.

Simon