From: John Baldwin <jhb@FreeBSD.org>
To: Pedro Alves <palves@redhat.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 00/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412)
Date: Tue, 7 Jul 2020 16:16:19 -0700 [thread overview]
Message-ID: <935bd7c8-f107-f2d1-ade2-f6259dc1297c@FreeBSD.org> (raw)
In-Reply-To: <20200414175434.8047-1-palves@redhat.com>
On 4/14/20 10:54 AM, Pedro Alves via Gdb-patches wrote:
> In PR/25412, Simon noticed that after the multi-target series, the
> tid-reuse.exp testcase manages to create a duplicate thread in the
> thread list. Or rather, two threads with the same PTID.
>
> This in turn exposes a design problem in GDB. The inferior_thread()
> function looks up the current thread based on inferior_ptid:
>
> struct thread_info*
> inferior_thread (void)
> {
> struct thread_info *tp = find_thread_ptid (current_inferior (), inferior_ptid);
> gdb_assert (tp);
> return tp;
> }
>
> But if there can be more than one thread in the thread list with the
> same ptid_t, inferior_thread() may well return the wrong thread.
>
> This series fixes this by making the current thread be a global
> thread_info pointer that is written to directly by switch_to_thread,
> etc., and making inferior_thread() return that pointer, instead of
> having inferior_thread() lookup up the inferior_ptid thread, by
> ptid_t. You can look at this as a continuation of the effort of using
> more thread_info pointers instead of ptids when possible.
>
> This change required auditing the whole codebase for places where we
> were writing to inferior_ptid directly to change the current thread,
> and change them to use switch_to_thread instead or one of its
> siblings, because otherwise inferior_thread() and inferior_ptid would
> get out of sync and inferior_thread() would return a thread unrelated
> to the new inferior_ptid we wanted to switch to. That was all
> (hopefully) done in all the patches leading up to the last one.
>
> After this, inferior_ptid still exists, but it is mostly read-only and
> mainly used by target backend code. It is also relied on by a number
> of target methods as a global input argument. E.g., the target_resume
> interface and the memory reading routines -- we still need it there
> because we need to be able to access memory off of processes for which
> we don't have a corresponding inferior/thread object, like when
> handling forks. Maybe we could pass down a context explicitly to
> target_read_memory, etc.
>
> Most of the host-/native-specific code here is untested. I did my
> best, but I won't be surprised if more tweaking is necessary.
>
> Testing on native x86_64 GNU/Linux is regression free for me. Testing
> against gdbserver has regressed significantly in the past months and
> is becoming difficult to run with a high number of long timeout
> sequences; really looks like people aren't paying much attention to
> that. I think this series doesn't regress gdbserver, but it's getting
> hard to tell. :-/
This appears to have broken native debugging on FreeBSD for me in that
when I run a process to completion it triggers an assertion failure:
(gdb) r
Starting program: /bin/echo
Child process unexpectedly missing: No child processes.
/home/john/work/git/gdb/gdb/inferior.c:293: internal-error: struct inferior *find_inferior_pid(process_stratum_target *, int): Assertion `pid != 0' failed.
I tracked this down to this code in inf_ptrace::wait():
/* Ignore terminated detached child processes. */
if (!WIFSTOPPED (status) && pid != inferior_ptid.pid ())
pid = -1;
At this point, inferior_ptid() is all zeroes and the process
has reported a non-stopped exit status (WIFEXITED) so this
ignores the exit event and loops back around to call waitpid()
again which then fails with ECHILD.
Looks like we always clear the inferior thread (and thus
inferior_ptid) in do_target_wait_1:
/* We know that we are looking for an event in the target of inferior
INF, but we don't know which thread the event might come from. As
such we want to make sure that INFERIOR_PTID is reset so that none of
the wait code relies on it - doing so is always a mistake. */
switch_to_inferior_no_thread (inf);
Commenting out the check in inf_ptrace::wait() "fixes" the issue for
me on FreeBSD, but I'm not sure that is the right fix.
It seems to me that multiprocess probably needs to return events for
not just the current inferior pid but for any valid pid for example,
and though multiprocess is still broken for me on FreeBSD if I comment
out the check against inferior_ptid, I can now see that I was getting
an event for the "wrong" inferior that was previously discarded but
with the check commented out now gets reported to the core.
(The reason I get an event for the wrong process is that for some
reason the core asks the native target to resume the wrong process:
> ./gdb /bin/ls
(gdb) set debug fbsd-nat
(gdb) set debug fbsd-lwp
(gdb) start
Temporary breakpoint 1 at 0x20430a: file /usr/src/bin/ls/ls.c, line 236.
Starting program: /bin/ls
FNAT: stop for LWP 101518 event 1 flags 0x18
FLWP: using LWP 101518 for first thread
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x18
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 5 si_code 1
FNAT: sw breakpoint trap for LWP 101518
FLWP: fbsd_resume for ptid (70453, 101518, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 5 si_code 2
FNAT: trace trap for LWP 101518
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101518 event 1 flags 0x20
FNAT: si_signo 5 si_code 1
FNAT: sw breakpoint trap for LWP 101518
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe710)
at /usr/src/bin/ls/ls.c:236
236 const char *errstr = NULL;
(gdb) add-inferior
[New inferior 2]
Added inferior 2 on connection 1 (native)
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) file /bin/ls
Reading symbols from /bin/ls...
Reading symbols from /usr/lib/debug//bin/ls.debug...
(gdb) start
Temporary breakpoint 2 at 0x20430a: -qualified main. (2 locations)
Starting program: /bin/ls
FNAT: stop for LWP 101641 event 1 flags 0x18
FLWP: using LWP 101641 for first thread
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x24
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x24
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x20
FNAT: si_signo 20 si_code 1
FLWP: fbsd_resume for ptid (-1, 0, 0)
FNAT: stop for LWP 101641 event 1 flags 0x18
FLWP: fbsd_resume for ptid (70453, 101518, 0)
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
Here you can see that the last call to fbsd_resume() used the ptid from
inferior 1 instead of inferior 2, and inferior 1 didn't discard it's
pending SIGTRAP but instead was killed by it.)
--
John Baldwin
next prev parent reply other threads:[~2020-07-07 23:17 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-14 17:54 Pedro Alves
2020-04-14 17:54 ` [PATCH 01/28] Don't write to inferior_ptid in linux_get_siginfo_data Pedro Alves
2020-04-14 17:54 ` [PATCH 02/28] gcore, handle exited threads better Pedro Alves
2020-04-14 17:54 ` [PATCH 03/28] Refactor delete_program_space as a destructor Pedro Alves
2020-04-15 15:54 ` Simon Marchi
2020-04-16 14:47 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 04/28] Don't write to inferior_ptid in gdbarch-selftests.c, mock address_space too Pedro Alves
2020-04-14 17:54 ` [PATCH 05/28] Don't write to inferior_ptid in inf-ptrace.c Pedro Alves
2020-04-14 17:54 ` [PATCH 06/28] Don't write to inferior_ptid in target.c Pedro Alves
2020-04-14 17:54 ` [PATCH 07/28] Don't write to inferior_ptid in infrun.c Pedro Alves
2020-04-14 17:54 ` [PATCH 08/28] Don't write to inferior_ptid in procfs.c Pedro Alves
2020-04-14 17:54 ` [PATCH 09/28] Don't write to inferior_ptid in tracefile-tfile.c Pedro Alves
2020-04-14 17:54 ` [PATCH 10/28] Don't write to inferior_ptid in tracectf.c Pedro Alves
2020-04-14 17:54 ` [PATCH 11/28] Don't write to inferior_ptid in remote.c Pedro Alves
2020-04-14 17:54 ` [PATCH 12/28] Don't write to inferior_ptid in remote-sim.c Pedro Alves
2020-04-14 17:54 ` [PATCH 13/28] Don't write to inferior_ptid in nto-procfs.c Pedro Alves
2020-04-14 17:54 ` [PATCH 14/28] Don't write to inferior_ptid in go32-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 15/28] Don't write to inferior_ptid in gnu-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 16/28] Don't write to inferior_ptid in darwin-nat.c Pedro Alves
2020-04-16 1:33 ` Simon Marchi
2020-04-16 19:23 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 17/28] Don't write to inferior_ptid in corelow.c Pedro Alves
2020-04-14 17:54 ` [PATCH 18/28] Don't write to inferior_ptid in bsd-kvm.c Pedro Alves
2020-04-14 17:54 ` [PATCH 19/28] Don't write to inferior_ptid in btrace_fetch Pedro Alves
2020-04-15 4:52 ` Metzger, Markus T
2020-04-15 14:13 ` Pedro Alves
2020-04-15 15:17 ` Metzger, Markus T
2020-04-14 17:54 ` [PATCH 20/28] Don't write to inferior_ptid in bsd-kvm.c Pedro Alves
2020-04-14 17:54 ` [PATCH 21/28] Don't write to inferior_ptid in fork-child.c Pedro Alves
2020-04-14 17:54 ` [PATCH 22/28] Don't write to inferior_ptid in go32-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 23/28] Don't write to inferior_ptid in remote-sim.c Pedro Alves
2020-04-16 0:53 ` Simon Marchi
2020-04-16 14:58 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 24/28] Don't write to inferior_ptid in windows-nat.c, part I Pedro Alves
2020-04-14 17:54 ` [PATCH 25/28] Don't write to inferior_ptid in windows-nat.c, part II Pedro Alves
2020-04-14 22:41 ` Hannes Domani
2020-04-15 15:08 ` Pedro Alves
2020-04-15 15:32 ` Hannes Domani
2020-04-14 17:54 ` [PATCH 26/28] Don't write to inferior_ptid in ravenscar-thread.c Pedro Alves
2020-04-17 18:45 ` Tom Tromey
2020-06-18 20:00 ` Pedro Alves
2020-06-18 21:38 ` Tom Tromey
2020-04-14 17:54 ` [PATCH 27/28] Don't write to inferior_ptid in aix-thread.c Pedro Alves
2020-04-14 17:54 ` [PATCH 28/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412) Pedro Alves
2020-04-16 19:39 ` Simon Marchi
2020-04-16 20:12 ` Pedro Alves
2020-04-16 20:38 ` Simon Marchi
2020-04-17 10:29 ` Pedro Alves
2020-04-17 14:06 ` Simon Marchi
2020-04-17 16:46 ` Pedro Alves
2020-04-17 18:53 ` Tom Tromey
2020-06-18 19:59 ` Pedro Alves
2020-06-23 13:37 ` Andrew Burgess
2020-06-23 14:26 ` Pedro Alves
2020-06-23 15:38 ` [PATCH] Fix "maint selftest" regression, add struct, scoped_mock_context Pedro Alves
2020-06-23 16:34 ` Andrew Burgess
2020-06-23 17:58 ` Pedro Alves
2020-04-14 18:46 ` [PATCH 00/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412) Hannes Domani
2020-04-14 19:24 ` Pedro Alves
2020-04-15 15:04 ` Simon Marchi
2020-04-16 13:41 ` Pedro Alves
2020-04-15 14:46 ` Simon Marchi
2020-04-15 15:33 ` Pedro Alves
2020-04-15 15:42 ` Simon Marchi
2020-04-17 20:20 ` Tom Tromey
2020-06-18 20:00 ` Pedro Alves
2020-06-18 22:30 ` Pedro Alves
2020-07-07 23:16 ` John Baldwin [this message]
2020-07-07 23:53 ` Pedro Alves
2020-07-08 0:19 ` John Baldwin
2020-07-08 0:10 ` Multiprocess on FreeBSD John Baldwin
2020-07-08 0:34 ` John Baldwin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=935bd7c8-f107-f2d1-ade2-f6259dc1297c@FreeBSD.org \
--to=jhb@freebsd.org \
--cc=gdb-patches@sourceware.org \
--cc=palves@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox