From: William Tambe <tambewilliam@gmail.com>
To: Andrew Burgess <andrew.burgess@embecosm.com>
Cc: gdb@sourceware.org
Subject: Re: sim_engine_halt() broken in binutils 2.34
Date: Tue, 04 Feb 2020 02:59:00 -0000 [thread overview]
Message-ID: <CAF8i9mMdLvQHWuYh=wHxXVZtJLnDmZA6evSkbJv-vjNpH8szMQ@mail.gmail.com> (raw)
In-Reply-To: <20200203100549.GB20838@embecosm.com>
On Mon, Feb 3, 2020 at 5:05 AM Andrew Burgess
<andrew.burgess@embecosm.com> wrote:
>
> * William Tambe <tambewilliam@gmail.com> [2020-02-02 21:12:00 -0500]:
>
> > On Sun, Feb 2, 2020 at 4:47 PM Andrew Burgess
> > <andrew.burgess@embecosm.com> wrote:
> > >
> > > * William Tambe <tambewilliam@gmail.com> [2020-02-02 00:03:40 -0500]:
> > >
> > > > I am running GDB with target sim.
> > > >
> > > > Calling sim_engine_halt() triggers the following error:
> > > > ../../binutils-gdb/gdb/inferior.c:286: internal-error: inferior*
> > > > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0'
> > > > failed.
> > > > A problem internal to GDB has been detected,
> > > > further debugging may prove unreliable.
> > > > Quit this debugging session? (y or n)
> > > >
> > > > Any idea idea what changed ?
> > >
> > > I suspect that this is a similar issue (but not the exact same fix) as
> > > something I already posted here:
> > >
> > > https://sourceware.org/ml/gdb-patches/2020-01/msg00980.html
> >
> > I tried using above fix, but it did not resolve the issue.
> >
> > >
> > > You could double check if you like, I believe if you check out the
> > > commit just before 5b6d1e4fa4fc682 then you'll find your issue goes
> > > away.
> >
> > I could not quickly test above as I need to do rebase adjustments for
> > the custom port that I am using.
> >
> > >
> > > The problem is caused, I suspect, by a use of `inferior_ptid' at a
> > > place where this used to not be the null_ptid, but now is null_ptid.
> > >
> > > I had a little nose around for places related to the simulator where
> > > this bug might be hiding, but couldn't see anything obvious.
> > >
> > > If you could let us know which simulator you're using, and what you do
> > > to trigger the call to sim_engine_halt then we might be able to
> > > reproduce the issue.
> >
> > The simulator I am using is a custom port that is not upstream.
> >
> > sim_engine_halt() is used by simulators to do things like SIM_SIGTRAP,
> > SIGABRT, SIGILL or SIM_SIGSEGV;
> > The fact that sim_engine_halt() is broken makes gdb target sim useless
> > because breakpoints cannot be set.
>
> Indeed, but the problem here isn't that sim_engine_halt is broken,
> it's that something else is broken, and stopping the simulator reveals
> the issue.
>
> So in this case the bug is about pids not being 0 when GDB tries to
> lookup the inferior in its internal database. It will do this lookup
> when an inferior stops, for example, when the simulator stops by
> calling sim_engine_halt.
>
> However, there's a clear separation of responsibilities here, the
> simulator code simulates a target, while wrapping the simulator up as
> a "fake" process, with a pid is done in gdb/remote-sim.c, so one thing
> we can be reasonably certain of is that the issue is not with
> sim_engine_halt, but with remote-sim.
>
> Further, I've tested using a different simulator, and I don't see this
> same problem, nor looking at the code do I see how this problem could
> occur, so my instinct right now is that this is something that has
> maybe been broken in your out of tree port.
>
> >
> > >
> > > You could also try to collect a backtrace from GDB at the point the
> > > assertion triggers, I believe GDB offers to dump core on an assertion,
> > > so you can use that to obtain the backtrace. We might get lucky, and
> > > something in the backtrace could be the offender.
> >
> > See below backtrace from GDB when assert triggers due to
> > sim_engine_halt() getting called for a breakpoint using SIM_SIGTRAP:
> >
> > ../../binutils-gdb/gdb/inferior.c:279: internal-error: inferior*
> > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0'
> > failed.
> > A problem internal to GDB has been detected,
> > further debugging may prove unreliable.
> > Quit this debugging session? (y or n)
> > (gdb) bt
> > #0 0x00007ffff66c9bf9 in __GI___poll (fds=0x555555fd36e0, nfds=4,
> > timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
> > #1 0x00005555556ecf90 in poll (__timeout=-1, __nfds=<optimized out>,
> > __fds=<optimized out>) at
> > /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> > #2 gdb_wait_for_event (block=block@entry=1) at
> > ../../binutils-gdb/gdb/event-loop.c:769
> > #3 0x00005555556ed152 in gdb_do_one_event () at
> > ../../binutils-gdb/gdb/event-loop.c:346
> > #4 0x00005555556ed286 in gdb_do_one_event () at
> > ../../binutils-gdb/gdb/event-loop.c:352
> > #5 0x000055555585d60f in gdb_readline_wrapper (prompt=<optimized
> > out>) at ../../binutils-gdb/gdb/top.c:1078
> > #6 0x000055555588db16 in defaulted_query(const char *, char, typedef
> > __va_list_tag __va_list_tag *) (ctlstr=<optimized out>,
> > defchar=defchar@entry=0 '\000', args=args@entry=0x7fffffffd740)
> > at ../../binutils-gdb/gdb/utils.c:867
> > #7 0x000055555588de5b in query (ctlstr=<optimized out>) at
> > ../../binutils-gdb/gdb/utils.c:959
> > #8 0x000055555588e3a6 in internal_vproblem(internal_problem *, const
> > char *, int, const char *, typedef __va_list_tag __va_list_tag *) (
> > problem=problem@entry=0x555555d59880 <internal_error_problem>,
> > file=<optimized out>, line=279, fmt=<optimized out>,
> > ap=ap@entry=0x7fffffffd900) at ../../binutils-gdb/gdb/utils.c:372
> > #9 0x000055555588e4eb in internal_verror (file=<optimized out>,
> > line=<optimized out>, fmt=<optimized out>, ap=ap@entry=0x7fffffffd900)
> > at ../../binutils-gdb/gdb/utils.c:438
> > #10 0x00005555559590df in internal_error
> > (file=file@entry=0x5555559f2cd0 "../../binutils-gdb/gdb/inferior.c",
> > line=line@entry=279, fmt=<optimized out>)
> > at ../../binutils-gdb/gdbsupport/errors.c:55
> > #11 0x000055555572e72f in find_inferior_pid (targ=<optimized out>,
> > pid=<optimized out>) at ../../binutils-gdb/gdb/inferior.c:279
> > #12 0x0000555555856905 in find_thread_ptid
> > (targ=targ@entry=0x555555d58f60 <gdbsim_ops>, ptid=...) at
> > ../../binutils-gdb/gdb/thread.c:527
> > #13 0x000055555573f42d in handle_inferior_event
> > (ecs=ecs@entry=0x7fffffffdd30) at ../../binutils-gdb/gdb/infrun.c:5062
> > #14 0x000055555574096f in fetch_inferior_event (client_data=<optimized
> > out>) at ../../binutils-gdb/gdb/infrun.c:4002
> > #15 0x00005555556ed19c in check_async_event_handlers () at
> > ../../binutils-gdb/gdb/event-loop.c:1062
> > #16 gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:325
> > #17 0x00005555556ed2be in gdb_do_one_event () at
> > ../../binutils-gdb/gdb/event-loop.c:303
> > #18 start_event_loop () at ../../binutils-gdb/gdb/event-loop.c:370
> > #19 0x0000555555765018 in captured_command_loop () at
> > ../../binutils-gdb/gdb/main.c:360
> > #20 0x0000555555766f75 in captured_main (data=<optimized out>) at
> > ../../binutils-gdb/gdb/main.c:1203
> > #21 gdb_main (args=<optimized out>) at ../../binutils-gdb/gdb/main.c:1218
> > #22 0x00005555555bff1b in main (argc=<optimized out>, argv=<optimized
> > out>) at ../../binutils-gdb/gdb/gdb.c:32
>
> What this shows us is as follows:
>
> 1. The null_ptid (with the 0 value) is present in
> handle_inferior_event at line 5062 (frame 13). Now your line
> numbers don't match up exactly with top of tree, but the closest
> line looks like this:
>
> ecs->event_thread = find_thread_ptid (ecs->target, ecs->ptid);
>
> So we're interested in where ecs->ptid came from.
>
> 2. Follow this back into frame 14, then look back and ecs is
> populated with this call:
>
> if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
> ....
>
> 3. In do_target_wait ecs->ptid is initialised with:
>
> ecs->ptid = do_target_wait_1 (inf, wait_ptid, &ecs->ws, options);
>
> 4. The return value of do_target_wait_1 comes from this code:
>
> if (deprecated_target_wait_hook)
> event_ptid = deprecated_target_wait_hook (ptid, status, options);
> else
> event_ptid = target_wait (ptid, status, options);
>
> return event_ptid;
>
> And we're almost certainly following the 'target_wait' path unless
> your target is doing something vert weird, and 'target_wait' just
> calls the 'wait' method on the current target class, which should
> get us into 'gdbsim_target::wait'.
>
> 5. In gdbsim_target::wait we always return
> sim_data->remote_sim_ptid, so the question becomes how is this
> initialised.
No, I see that in 'gdbsim_target::wait' , inferior_ptid is used as the
return value; its value is set only if ptid is different from
minus_one_ptid.
In my target when I break at 'gdbsim_target::wait', I see the following
(gdb) p minus_one_ptid
$7 = {m_pid = -1, m_lwp = 0, m_tid = 0}
(gdb) p ptid
$8 = {m_pid = -1, m_lwp = 0, m_tid = 0}
(gdb) p inferior_ptid
$9 = {m_pid = 0, m_lwp = 0, m_tid = 0}
hence the return value of 'gdbsim_target::wait' become {m_pid = 0,
m_lwp = 0, m_tid = 0};
When I watch for what modifies the value of inferior_ptid, I find it
getting first set in gdbsim_target::create_inferior() to {m_pid =
42000, m_lwp = 0, m_tid = 42000}; however, before calling
do_target_wait_1(), I see it being set to null by a call to
switch_to_no_thread() with the following backtrace:
Old value = {m_pid = 42000, m_lwp = 0, m_tid = 42000}
New value = {m_pid = 0, m_lwp = 0, m_tid = 0}
switch_to_no_thread () at ../../binutils-gdb/gdb/thread.c:1332
1332 reinit_frame_cache ();
=> 0x0000555555857d3b <switch_to_no_thread()+59>: e9 a0 6e ea ff
jmpq 0x5555556febe0 <reinit_frame_cache()>
(gdb) bt
#0 switch_to_no_thread () at ../../binutils-gdb/gdb/thread.c:1332
#1 0x000055555572ea0e in switch_to_inferior_no_thread
(inf=0x555555e4fb00) at ../../binutils-gdb/gdb/inferior.c:623
#2 0x0000555555735b64 in <lambda(inferior*)>::operator()
(inf=0x555555e4fb00, __closure=<synthetic pointer>) at
../../binutils-gdb/gdb/infrun.c:3666
#3 do_target_wait (wait_ptid=..., ecs=0x7fffffffdd30,
options=<optimized out>) at ../../binutils-gdb/gdb/infrun.c:3681
#4 0x00005555557408ca in fetch_inferior_event (client_data=<optimized
out>) at ../../binutils-gdb/gdb/infrun.c:3971
#5 0x00005555556ed19c in check_async_event_handlers () at
../../binutils-gdb/gdb/event-loop.c:1062
#6 gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:325
#7 0x00005555556ed2be in gdb_do_one_event () at
../../binutils-gdb/gdb/event-loop.c:303
#8 start_event_loop () at ../../binutils-gdb/gdb/event-loop.c:370
#9 0x0000555555765018 in captured_command_loop () at
../../binutils-gdb/gdb/main.c:360
#10 0x0000555555766f75 in captured_main (data=<optimized out>) at
../../binutils-gdb/gdb/main.c:1203
#11 gdb_main (args=<optimized out>) at ../../binutils-gdb/gdb/main.c:1218
#12 0x00005555555bff1b in main (argc=<optimized out>, argv=<optimized
out>) at ../../binutils-gdb/gdb/gdb.c:32
From there on, because inferior_ptid is now {m_pid = 0, m_lwp = 0,
m_tid = 0}, it becomes the return value of 'gdbsim_target::wait',
which in turn becomes the return value of target_wait() and
do_target_wait_1() and finally get set in ecs->ptid.
>
> 6. The only initialisation I can see for this variable is this line:
>
> remote_sim_ptid (next_pid, 0, next_pid)
>
> So then we ask, what is next_pid? At this point I would add the
> following assertion to gdb:
>
> ## START ##
>
> diff --git a/gdb/remote-sim.c b/gdb/remote-sim.c
> index 281232cc4e5..b7ce4672a1a 100644
> --- a/gdb/remote-sim.c
> +++ b/gdb/remote-sim.c
> @@ -86,6 +86,7 @@ struct sim_inferior_data {
> : gdbsim_desc (desc),
> remote_sim_ptid (next_pid, 0, next_pid)
> {
> + gdb_assert (remote_sim_ptid != null_ptid);
> ++next_pid;
> }
I tried the above change; the gdb_assert never triggers.
>
> ## END ##
>
> 7. My guess is that in your port this will fail, but for other ports
> this doesn't fail. For other ports next_pid is initialised to 42000
> in gdbsim_target_open - is it possible that this initialisation has
> been removed in your tree somehow?
I am using commit d7f734bc7e9e5fb6c33b433973b57e1eed3a7e9f (tag: binutils-2_34)
next_pid is set at remote-sim.c:758 .
>
> Thanks,
> Andrew
next prev parent reply other threads:[~2020-02-04 2:59 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-02 5:03 William Tambe
2020-02-02 21:47 ` Andrew Burgess
2020-02-03 2:12 ` William Tambe
2020-02-03 10:05 ` Andrew Burgess
2020-02-04 2:59 ` William Tambe [this message]
2020-02-04 10:32 ` Andrew Burgess
2020-02-04 10:41 ` William Tambe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAF8i9mMdLvQHWuYh=wHxXVZtJLnDmZA6evSkbJv-vjNpH8szMQ@mail.gmail.com' \
--to=tambewilliam@gmail.com \
--cc=andrew.burgess@embecosm.com \
--cc=gdb@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox