From: Pedro Alves <pedro@palves.net>
To: John Baldwin <jhb@FreeBSD.org>, gdb-patches@sourceware.org
Subject: Re: [PATCH 00/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412)
Date: Wed, 8 Jul 2020 00:53:41 +0100 [thread overview]
Message-ID: <1000f354-aa00-9cf3-7417-57b2efe59216@palves.net> (raw)
In-Reply-To: <935bd7c8-f107-f2d1-ade2-f6259dc1297c@FreeBSD.org>
On 7/8/20 12:16 AM, John Baldwin wrote:
> This appears to have broken native debugging on FreeBSD for me in that
> when I run a process to completion it triggers an assertion failure:
>
> (gdb) r
> Starting program: /bin/echo
>
> Child process unexpectedly missing: No child processes.
> /home/john/work/git/gdb/gdb/inferior.c:293: internal-error: struct inferior *find_inferior_pid(process_stratum_target *, int): Assertion `pid != 0' failed.
>
> I tracked this down to this code in inf_ptrace::wait():
>
> /* Ignore terminated detached child processes. */
> if (!WIFSTOPPED (status) && pid != inferior_ptid.pid ())
> pid = -1;
>
> At this point, inferior_ptid() is all zeroes and the process
> has reported a non-stopped exit status (WIFEXITED) so this
> ignores the exit event and loops back around to call waitpid()
> again which then fails with ECHILD.
>
> Looks like we always clear the inferior thread (and thus
> inferior_ptid) in do_target_wait_1:
>
> /* We know that we are looking for an event in the target of inferior
> INF, but we don't know which thread the event might come from. As
> such we want to make sure that INFERIOR_PTID is reset so that none of
> the wait code relies on it - doing so is always a mistake. */
> switch_to_inferior_no_thread (inf);
>
> Commenting out the check in inf_ptrace::wait() "fixes" the issue for
> me on FreeBSD, but I'm not sure that is the right fix.
linux-nat.c has something logic, here in linux_nat_filter_event:
/* Make sure we don't report an event for the exit of an LWP not in
our list, i.e. not part of the current process. This can happen
if we detach from a program we originally forked and then it
exits. */
if (!WIFSTOPPED (status) && !lp)
return NULL;
For inf-ptrace.c, the right fix should be something around this:
diff --git c/gdb/inf-ptrace.c w/gdb/inf-ptrace.c
index d25d226abb..ae0b0f7ff0 100644
--- c/gdb/inf-ptrace.c
+++ w/gdb/inf-ptrace.c
@@ -347,7 +347,7 @@ inf_ptrace_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
}
/* Ignore terminated detached child processes. */
- if (!WIFSTOPPED (status) && pid != inferior_ptid.pid ())
+ if (!WIFSTOPPED (status) && find_inferior_pid (this, pid) == nullptr)
pid = -1;
}
while (pid == -1);
The inferior_ptid reference in the error path above that code, where it
reads:
"Claim it exited with unknown signal"
is also wrong. I'm not sure what to do about that one.
Not clear if that path can really happen. linux-nat.c calls
perror_with_name in some places if waitpid returns -1,
and in the main waitpid call, just assumes that error -1 never
happens...
>
> It seems to me that multiprocess probably needs to return events for
> not just the current inferior pid but for any valid pid for example,
Yes.
> and though multiprocess is still broken for me on FreeBSD if I comment
> out the check against inferior_ptid, I can now see that I was getting
> an event for the "wrong" inferior that was previously discarded but
> with the check commented out now gets reported to the core.
>
> (The reason I get an event for the wrong process is that for some
> reason the core asks the native target to resume the wrong process:
>
>> ./gdb /bin/ls
> (gdb) set debug fbsd-nat
> (gdb) set debug fbsd-lwp
> (gdb) start
> Temporary breakpoint 1 at 0x20430a: file /usr/src/bin/ls/ls.c, line 236.
> Starting program: /bin/ls
> FNAT: stop for LWP 101518 event 1 flags 0x18
> FLWP: using LWP 101518 for first thread
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x18
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 5 si_code 1
> FNAT: sw breakpoint trap for LWP 101518
> FLWP: fbsd_resume for ptid (70453, 101518, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 5 si_code 2
> FNAT: trace trap for LWP 101518
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101518 event 1 flags 0x20
> FNAT: si_signo 5 si_code 1
> FNAT: sw breakpoint trap for LWP 101518
>
> Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe710)
> at /usr/src/bin/ls/ls.c:236
> 236 const char *errstr = NULL;
> (gdb) add-inferior
> [New inferior 2]
> Added inferior 2 on connection 1 (native)
> (gdb) inferior 2
> [Switching to inferior 2 [<null>] (<noexec>)]
> (gdb) file /bin/ls
> Reading symbols from /bin/ls...
> Reading symbols from /usr/lib/debug//bin/ls.debug...
> (gdb) start
> Temporary breakpoint 2 at 0x20430a: -qualified main. (2 locations)
> Starting program: /bin/ls
> FNAT: stop for LWP 101641 event 1 flags 0x18
> FLWP: using LWP 101641 for first thread
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x24
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x24
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x20
> FNAT: si_signo 20 si_code 1
> FLWP: fbsd_resume for ptid (-1, 0, 0)
> FNAT: stop for LWP 101641 event 1 flags 0x18
> FLWP: fbsd_resume for ptid (70453, 101518, 0)
>
> Program terminated with signal SIGTRAP, Trace/breakpoint trap.
> The program no longer exists.
>
> Here you can see that the last call to fbsd_resume() used the ptid from
> inferior 1 instead of inferior 2, and inferior 1 didn't discard it's
> pending SIGTRAP but instead was killed by it.)
"set debug infrun 1" will probably reveal what is happening.
Thanks,
Pedro Alves
next prev parent reply other threads:[~2020-07-07 23:53 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-14 17:54 Pedro Alves
2020-04-14 17:54 ` [PATCH 01/28] Don't write to inferior_ptid in linux_get_siginfo_data Pedro Alves
2020-04-14 17:54 ` [PATCH 02/28] gcore, handle exited threads better Pedro Alves
2020-04-14 17:54 ` [PATCH 03/28] Refactor delete_program_space as a destructor Pedro Alves
2020-04-15 15:54 ` Simon Marchi
2020-04-16 14:47 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 04/28] Don't write to inferior_ptid in gdbarch-selftests.c, mock address_space too Pedro Alves
2020-04-14 17:54 ` [PATCH 05/28] Don't write to inferior_ptid in inf-ptrace.c Pedro Alves
2020-04-14 17:54 ` [PATCH 06/28] Don't write to inferior_ptid in target.c Pedro Alves
2020-04-14 17:54 ` [PATCH 07/28] Don't write to inferior_ptid in infrun.c Pedro Alves
2020-04-14 17:54 ` [PATCH 08/28] Don't write to inferior_ptid in procfs.c Pedro Alves
2020-04-14 17:54 ` [PATCH 09/28] Don't write to inferior_ptid in tracefile-tfile.c Pedro Alves
2020-04-14 17:54 ` [PATCH 10/28] Don't write to inferior_ptid in tracectf.c Pedro Alves
2020-04-14 17:54 ` [PATCH 11/28] Don't write to inferior_ptid in remote.c Pedro Alves
2020-04-14 17:54 ` [PATCH 12/28] Don't write to inferior_ptid in remote-sim.c Pedro Alves
2020-04-14 17:54 ` [PATCH 13/28] Don't write to inferior_ptid in nto-procfs.c Pedro Alves
2020-04-14 17:54 ` [PATCH 14/28] Don't write to inferior_ptid in go32-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 15/28] Don't write to inferior_ptid in gnu-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 16/28] Don't write to inferior_ptid in darwin-nat.c Pedro Alves
2020-04-16 1:33 ` Simon Marchi
2020-04-16 19:23 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 17/28] Don't write to inferior_ptid in corelow.c Pedro Alves
2020-04-14 17:54 ` [PATCH 18/28] Don't write to inferior_ptid in bsd-kvm.c Pedro Alves
2020-04-14 17:54 ` [PATCH 19/28] Don't write to inferior_ptid in btrace_fetch Pedro Alves
2020-04-15 4:52 ` Metzger, Markus T
2020-04-15 14:13 ` Pedro Alves
2020-04-15 15:17 ` Metzger, Markus T
2020-04-14 17:54 ` [PATCH 20/28] Don't write to inferior_ptid in bsd-kvm.c Pedro Alves
2020-04-14 17:54 ` [PATCH 21/28] Don't write to inferior_ptid in fork-child.c Pedro Alves
2020-04-14 17:54 ` [PATCH 22/28] Don't write to inferior_ptid in go32-nat.c Pedro Alves
2020-04-14 17:54 ` [PATCH 23/28] Don't write to inferior_ptid in remote-sim.c Pedro Alves
2020-04-16 0:53 ` Simon Marchi
2020-04-16 14:58 ` Pedro Alves
2020-04-14 17:54 ` [PATCH 24/28] Don't write to inferior_ptid in windows-nat.c, part I Pedro Alves
2020-04-14 17:54 ` [PATCH 25/28] Don't write to inferior_ptid in windows-nat.c, part II Pedro Alves
2020-04-14 22:41 ` Hannes Domani
2020-04-15 15:08 ` Pedro Alves
2020-04-15 15:32 ` Hannes Domani
2020-04-14 17:54 ` [PATCH 26/28] Don't write to inferior_ptid in ravenscar-thread.c Pedro Alves
2020-04-17 18:45 ` Tom Tromey
2020-06-18 20:00 ` Pedro Alves
2020-06-18 21:38 ` Tom Tromey
2020-04-14 17:54 ` [PATCH 27/28] Don't write to inferior_ptid in aix-thread.c Pedro Alves
2020-04-14 17:54 ` [PATCH 28/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412) Pedro Alves
2020-04-16 19:39 ` Simon Marchi
2020-04-16 20:12 ` Pedro Alves
2020-04-16 20:38 ` Simon Marchi
2020-04-17 10:29 ` Pedro Alves
2020-04-17 14:06 ` Simon Marchi
2020-04-17 16:46 ` Pedro Alves
2020-04-17 18:53 ` Tom Tromey
2020-06-18 19:59 ` Pedro Alves
2020-06-23 13:37 ` Andrew Burgess
2020-06-23 14:26 ` Pedro Alves
2020-06-23 15:38 ` [PATCH] Fix "maint selftest" regression, add struct, scoped_mock_context Pedro Alves
2020-06-23 16:34 ` Andrew Burgess
2020-06-23 17:58 ` Pedro Alves
2020-04-14 18:46 ` [PATCH 00/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412) Hannes Domani
2020-04-14 19:24 ` Pedro Alves
2020-04-15 15:04 ` Simon Marchi
2020-04-16 13:41 ` Pedro Alves
2020-04-15 14:46 ` Simon Marchi
2020-04-15 15:33 ` Pedro Alves
2020-04-15 15:42 ` Simon Marchi
2020-04-17 20:20 ` Tom Tromey
2020-06-18 20:00 ` Pedro Alves
2020-06-18 22:30 ` Pedro Alves
2020-07-07 23:16 ` John Baldwin
2020-07-07 23:53 ` Pedro Alves [this message]
2020-07-08 0:19 ` John Baldwin
2020-07-08 0:10 ` Multiprocess on FreeBSD John Baldwin
2020-07-08 0:34 ` John Baldwin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1000f354-aa00-9cf3-7417-57b2efe59216@palves.net \
--to=pedro@palves.net \
--cc=gdb-patches@sourceware.org \
--cc=jhb@FreeBSD.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox