From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 116319 invoked by alias); 4 Feb 2020 02:59:58 -0000 Mailing-List: contact gdb-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sourceware.org Received: (qmail 116305 invoked by uid 89); 4 Feb 2020 02:59:57 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-14.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_NUMSUBJECT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=3681, 3971, 1332, synthetic X-HELO: mail-lf1-f68.google.com Received: from mail-lf1-f68.google.com (HELO mail-lf1-f68.google.com) (209.85.167.68) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 04 Feb 2020 02:59:54 +0000 Received: by mail-lf1-f68.google.com with SMTP id n25so11204906lfl.0 for ; Mon, 03 Feb 2020 18:59:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=xq3uQyXHvmbBacnYF67aCBcFK1NXzbWgqSKoXdyh148=; b=hBMp8Ja4i/GILZQdSVWG4GqrQmaRnBxM67KByBuOnDBVHUzIJEue8+pJ9UJogbxDz1 c/EUGn1oQ0/R2g8A9ACJmyuOlCZyQjYP4BipjaKzhsjeFUGTFFcdzmF5tMjNsygdLtBJ Xa8gaaEXCTwF5DtU7VlW0GkwEDqFtDrpkuL7oEyFtMdNLw2jrW0Tlcmc1djB8d1fR7Yn 1tpysjzKtkEpSJApbVmJ6tgPoEaSZjM7gN/+kZFem46puYsHFYBcFZzX1dRLJJG7kIOY WwKbnUDjwn2wc4qYmTg/hy2ny3jyf4neP1rn0omoKOLXmWeWa4SmshB93gJnIAJaBCCY qfCA== MIME-Version: 1.0 References: <20200202214715.GA20838@embecosm.com> <20200203100549.GB20838@embecosm.com> In-Reply-To: <20200203100549.GB20838@embecosm.com> From: William Tambe Date: Tue, 04 Feb 2020 02:59:00 -0000 Message-ID: Subject: Re: sim_engine_halt() broken in binutils 2.34 To: Andrew Burgess Cc: gdb@sourceware.org Content-Type: text/plain; charset="UTF-8" X-SW-Source: 2020-02/txt/msg00007.txt.bz2 On Mon, Feb 3, 2020 at 5:05 AM Andrew Burgess wrote: > > * William Tambe [2020-02-02 21:12:00 -0500]: > > > On Sun, Feb 2, 2020 at 4:47 PM Andrew Burgess > > wrote: > > > > > > * William Tambe [2020-02-02 00:03:40 -0500]: > > > > > > > I am running GDB with target sim. > > > > > > > > Calling sim_engine_halt() triggers the following error: > > > > ../../binutils-gdb/gdb/inferior.c:286: internal-error: inferior* > > > > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' > > > > failed. > > > > A problem internal to GDB has been detected, > > > > further debugging may prove unreliable. > > > > Quit this debugging session? (y or n) > > > > > > > > Any idea idea what changed ? > > > > > > I suspect that this is a similar issue (but not the exact same fix) as > > > something I already posted here: > > > > > > https://sourceware.org/ml/gdb-patches/2020-01/msg00980.html > > > > I tried using above fix, but it did not resolve the issue. > > > > > > > > You could double check if you like, I believe if you check out the > > > commit just before 5b6d1e4fa4fc682 then you'll find your issue goes > > > away. > > > > I could not quickly test above as I need to do rebase adjustments for > > the custom port that I am using. > > > > > > > > The problem is caused, I suspect, by a use of `inferior_ptid' at a > > > place where this used to not be the null_ptid, but now is null_ptid. > > > > > > I had a little nose around for places related to the simulator where > > > this bug might be hiding, but couldn't see anything obvious. > > > > > > If you could let us know which simulator you're using, and what you do > > > to trigger the call to sim_engine_halt then we might be able to > > > reproduce the issue. > > > > The simulator I am using is a custom port that is not upstream. > > > > sim_engine_halt() is used by simulators to do things like SIM_SIGTRAP, > > SIGABRT, SIGILL or SIM_SIGSEGV; > > The fact that sim_engine_halt() is broken makes gdb target sim useless > > because breakpoints cannot be set. > > Indeed, but the problem here isn't that sim_engine_halt is broken, > it's that something else is broken, and stopping the simulator reveals > the issue. > > So in this case the bug is about pids not being 0 when GDB tries to > lookup the inferior in its internal database. It will do this lookup > when an inferior stops, for example, when the simulator stops by > calling sim_engine_halt. > > However, there's a clear separation of responsibilities here, the > simulator code simulates a target, while wrapping the simulator up as > a "fake" process, with a pid is done in gdb/remote-sim.c, so one thing > we can be reasonably certain of is that the issue is not with > sim_engine_halt, but with remote-sim. > > Further, I've tested using a different simulator, and I don't see this > same problem, nor looking at the code do I see how this problem could > occur, so my instinct right now is that this is something that has > maybe been broken in your out of tree port. > > > > > > > > > You could also try to collect a backtrace from GDB at the point the > > > assertion triggers, I believe GDB offers to dump core on an assertion, > > > so you can use that to obtain the backtrace. We might get lucky, and > > > something in the backtrace could be the offender. > > > > See below backtrace from GDB when assert triggers due to > > sim_engine_halt() getting called for a breakpoint using SIM_SIGTRAP: > > > > ../../binutils-gdb/gdb/inferior.c:279: internal-error: inferior* > > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' > > failed. > > A problem internal to GDB has been detected, > > further debugging may prove unreliable. > > Quit this debugging session? (y or n) > > (gdb) bt > > #0 0x00007ffff66c9bf9 in __GI___poll (fds=0x555555fd36e0, nfds=4, > > timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29 > > #1 0x00005555556ecf90 in poll (__timeout=-1, __nfds=, > > __fds=) at > > /usr/include/x86_64-linux-gnu/bits/poll2.h:46 > > #2 gdb_wait_for_event (block=block@entry=1) at > > ../../binutils-gdb/gdb/event-loop.c:769 > > #3 0x00005555556ed152 in gdb_do_one_event () at > > ../../binutils-gdb/gdb/event-loop.c:346 > > #4 0x00005555556ed286 in gdb_do_one_event () at > > ../../binutils-gdb/gdb/event-loop.c:352 > > #5 0x000055555585d60f in gdb_readline_wrapper (prompt= > out>) at ../../binutils-gdb/gdb/top.c:1078 > > #6 0x000055555588db16 in defaulted_query(const char *, char, typedef > > __va_list_tag __va_list_tag *) (ctlstr=, > > defchar=defchar@entry=0 '\000', args=args@entry=0x7fffffffd740) > > at ../../binutils-gdb/gdb/utils.c:867 > > #7 0x000055555588de5b in query (ctlstr=) at > > ../../binutils-gdb/gdb/utils.c:959 > > #8 0x000055555588e3a6 in internal_vproblem(internal_problem *, const > > char *, int, const char *, typedef __va_list_tag __va_list_tag *) ( > > problem=problem@entry=0x555555d59880 , > > file=, line=279, fmt=, > > ap=ap@entry=0x7fffffffd900) at ../../binutils-gdb/gdb/utils.c:372 > > #9 0x000055555588e4eb in internal_verror (file=, > > line=, fmt=, ap=ap@entry=0x7fffffffd900) > > at ../../binutils-gdb/gdb/utils.c:438 > > #10 0x00005555559590df in internal_error > > (file=file@entry=0x5555559f2cd0 "../../binutils-gdb/gdb/inferior.c", > > line=line@entry=279, fmt=) > > at ../../binutils-gdb/gdbsupport/errors.c:55 > > #11 0x000055555572e72f in find_inferior_pid (targ=, > > pid=) at ../../binutils-gdb/gdb/inferior.c:279 > > #12 0x0000555555856905 in find_thread_ptid > > (targ=targ@entry=0x555555d58f60 , ptid=...) at > > ../../binutils-gdb/gdb/thread.c:527 > > #13 0x000055555573f42d in handle_inferior_event > > (ecs=ecs@entry=0x7fffffffdd30) at ../../binutils-gdb/gdb/infrun.c:5062 > > #14 0x000055555574096f in fetch_inferior_event (client_data= > out>) at ../../binutils-gdb/gdb/infrun.c:4002 > > #15 0x00005555556ed19c in check_async_event_handlers () at > > ../../binutils-gdb/gdb/event-loop.c:1062 > > #16 gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:325 > > #17 0x00005555556ed2be in gdb_do_one_event () at > > ../../binutils-gdb/gdb/event-loop.c:303 > > #18 start_event_loop () at ../../binutils-gdb/gdb/event-loop.c:370 > > #19 0x0000555555765018 in captured_command_loop () at > > ../../binutils-gdb/gdb/main.c:360 > > #20 0x0000555555766f75 in captured_main (data=) at > > ../../binutils-gdb/gdb/main.c:1203 > > #21 gdb_main (args=) at ../../binutils-gdb/gdb/main.c:1218 > > #22 0x00005555555bff1b in main (argc=, argv= > out>) at ../../binutils-gdb/gdb/gdb.c:32 > > What this shows us is as follows: > > 1. The null_ptid (with the 0 value) is present in > handle_inferior_event at line 5062 (frame 13). Now your line > numbers don't match up exactly with top of tree, but the closest > line looks like this: > > ecs->event_thread = find_thread_ptid (ecs->target, ecs->ptid); > > So we're interested in where ecs->ptid came from. > > 2. Follow this back into frame 14, then look back and ecs is > populated with this call: > > if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG)) > .... > > 3. In do_target_wait ecs->ptid is initialised with: > > ecs->ptid = do_target_wait_1 (inf, wait_ptid, &ecs->ws, options); > > 4. The return value of do_target_wait_1 comes from this code: > > if (deprecated_target_wait_hook) > event_ptid = deprecated_target_wait_hook (ptid, status, options); > else > event_ptid = target_wait (ptid, status, options); > > return event_ptid; > > And we're almost certainly following the 'target_wait' path unless > your target is doing something vert weird, and 'target_wait' just > calls the 'wait' method on the current target class, which should > get us into 'gdbsim_target::wait'. > > 5. In gdbsim_target::wait we always return > sim_data->remote_sim_ptid, so the question becomes how is this > initialised. No, I see that in 'gdbsim_target::wait' , inferior_ptid is used as the return value; its value is set only if ptid is different from minus_one_ptid. In my target when I break at 'gdbsim_target::wait', I see the following (gdb) p minus_one_ptid $7 = {m_pid = -1, m_lwp = 0, m_tid = 0} (gdb) p ptid $8 = {m_pid = -1, m_lwp = 0, m_tid = 0} (gdb) p inferior_ptid $9 = {m_pid = 0, m_lwp = 0, m_tid = 0} hence the return value of 'gdbsim_target::wait' become {m_pid = 0, m_lwp = 0, m_tid = 0}; When I watch for what modifies the value of inferior_ptid, I find it getting first set in gdbsim_target::create_inferior() to {m_pid = 42000, m_lwp = 0, m_tid = 42000}; however, before calling do_target_wait_1(), I see it being set to null by a call to switch_to_no_thread() with the following backtrace: Old value = {m_pid = 42000, m_lwp = 0, m_tid = 42000} New value = {m_pid = 0, m_lwp = 0, m_tid = 0} switch_to_no_thread () at ../../binutils-gdb/gdb/thread.c:1332 1332 reinit_frame_cache (); => 0x0000555555857d3b : e9 a0 6e ea ff jmpq 0x5555556febe0 (gdb) bt #0 switch_to_no_thread () at ../../binutils-gdb/gdb/thread.c:1332 #1 0x000055555572ea0e in switch_to_inferior_no_thread (inf=0x555555e4fb00) at ../../binutils-gdb/gdb/inferior.c:623 #2 0x0000555555735b64 in ::operator() (inf=0x555555e4fb00, __closure=) at ../../binutils-gdb/gdb/infrun.c:3666 #3 do_target_wait (wait_ptid=..., ecs=0x7fffffffdd30, options=) at ../../binutils-gdb/gdb/infrun.c:3681 #4 0x00005555557408ca in fetch_inferior_event (client_data=) at ../../binutils-gdb/gdb/infrun.c:3971 #5 0x00005555556ed19c in check_async_event_handlers () at ../../binutils-gdb/gdb/event-loop.c:1062 #6 gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:325 #7 0x00005555556ed2be in gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:303 #8 start_event_loop () at ../../binutils-gdb/gdb/event-loop.c:370 #9 0x0000555555765018 in captured_command_loop () at ../../binutils-gdb/gdb/main.c:360 #10 0x0000555555766f75 in captured_main (data=) at ../../binutils-gdb/gdb/main.c:1203 #11 gdb_main (args=) at ../../binutils-gdb/gdb/main.c:1218 #12 0x00005555555bff1b in main (argc=, argv=) at ../../binutils-gdb/gdb/gdb.c:32 >From there on, because inferior_ptid is now {m_pid = 0, m_lwp = 0, m_tid = 0}, it becomes the return value of 'gdbsim_target::wait', which in turn becomes the return value of target_wait() and do_target_wait_1() and finally get set in ecs->ptid. > > 6. The only initialisation I can see for this variable is this line: > > remote_sim_ptid (next_pid, 0, next_pid) > > So then we ask, what is next_pid? At this point I would add the > following assertion to gdb: > > ## START ## > > diff --git a/gdb/remote-sim.c b/gdb/remote-sim.c > index 281232cc4e5..b7ce4672a1a 100644 > --- a/gdb/remote-sim.c > +++ b/gdb/remote-sim.c > @@ -86,6 +86,7 @@ struct sim_inferior_data { > : gdbsim_desc (desc), > remote_sim_ptid (next_pid, 0, next_pid) > { > + gdb_assert (remote_sim_ptid != null_ptid); > ++next_pid; > } I tried the above change; the gdb_assert never triggers. > > ## END ## > > 7. My guess is that in your port this will fail, but for other ports > this doesn't fail. For other ports next_pid is initialised to 42000 > in gdbsim_target_open - is it possible that this initialisation has > been removed in your tree somehow? I am using commit d7f734bc7e9e5fb6c33b433973b57e1eed3a7e9f (tag: binutils-2_34) next_pid is set at remote-sim.c:758 . > > Thanks, > Andrew