From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12396 invoked by alias); 3 Feb 2020 10:05:57 -0000 Mailing-List: contact gdb-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sourceware.org Received: (qmail 12384 invoked by uid 89); 3 Feb 2020 10:05:56 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-20.6 required=5.0 tests=AWL,BAYES_50,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_NUMSUBJECT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=reveals, simulator, nose, 5b6d1e4fa4fc682 X-HELO: mail-wr1-f67.google.com Received: from mail-wr1-f67.google.com (HELO mail-wr1-f67.google.com) (209.85.221.67) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 03 Feb 2020 10:05:54 +0000 Received: by mail-wr1-f67.google.com with SMTP id z9so4932368wrs.10 for ; Mon, 03 Feb 2020 02:05:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=+Z1pX7tcU0qTM5NMl3BFTu5mMHWhpK6s/grshSIIjpg=; b=BVPnQy5ZKSDHIIMz+EyKZiOYY96hMljMUSFdtE8lIsx1dsyYx9nZv8yQm0ed/D2JsI mJa6mJOS82NRgYP3yxteqLFspLbv+JzDk888ibVNNfFGLAAmdH4kCBrNLuo36GiLntOJ l9uXYHx3eb/iCpKJAVp92vRVf9/eFujWnyC/Gj0Yko02i49v4qMMXGChvCYWNFDAMZZL NBcHJZTyjUw7VUpQexcvkoZ+FBF1H+L3eeOlWgKt2XqGU6vHz5k24joAP/ExqoTZMjZ7 Ic5M2cydXM6+DZ8NwChScGoJ1erU4305n7w7rHCPp5SLX7L/ayrYHSStx3JwD0SKHr8V A7gQ== Return-Path: Received: from localhost (host86-191-239-73.range86-191.btcentralplus.com. [86.191.239.73]) by smtp.gmail.com with ESMTPSA id f12sm21559335wmj.10.2020.02.03.02.05.50 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 03 Feb 2020 02:05:50 -0800 (PST) Date: Mon, 03 Feb 2020 10:05:00 -0000 From: Andrew Burgess To: William Tambe Cc: gdb@sourceware.org Subject: Re: sim_engine_halt() broken in binutils 2.34 Message-ID: <20200203100549.GB20838@embecosm.com> References: <20200202214715.GA20838@embecosm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Fortune: Byte your tongue. X-Editor: GNU Emacs [ http://www.gnu.org/software/emacs ] User-Agent: Mutt/1.9.2 (2017-12-15) X-IsSubscribed: yes X-SW-Source: 2020-02/txt/msg00005.txt.bz2 * William Tambe [2020-02-02 21:12:00 -0500]: > On Sun, Feb 2, 2020 at 4:47 PM Andrew Burgess > wrote: > > > > * William Tambe [2020-02-02 00:03:40 -0500]: > > > > > I am running GDB with target sim. > > > > > > Calling sim_engine_halt() triggers the following error: > > > ../../binutils-gdb/gdb/inferior.c:286: internal-error: inferior* > > > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' > > > failed. > > > A problem internal to GDB has been detected, > > > further debugging may prove unreliable. > > > Quit this debugging session? (y or n) > > > > > > Any idea idea what changed ? > > > > I suspect that this is a similar issue (but not the exact same fix) as > > something I already posted here: > > > > https://sourceware.org/ml/gdb-patches/2020-01/msg00980.html > > I tried using above fix, but it did not resolve the issue. > > > > > You could double check if you like, I believe if you check out the > > commit just before 5b6d1e4fa4fc682 then you'll find your issue goes > > away. > > I could not quickly test above as I need to do rebase adjustments for > the custom port that I am using. > > > > > The problem is caused, I suspect, by a use of `inferior_ptid' at a > > place where this used to not be the null_ptid, but now is null_ptid. > > > > I had a little nose around for places related to the simulator where > > this bug might be hiding, but couldn't see anything obvious. > > > > If you could let us know which simulator you're using, and what you do > > to trigger the call to sim_engine_halt then we might be able to > > reproduce the issue. > > The simulator I am using is a custom port that is not upstream. > > sim_engine_halt() is used by simulators to do things like SIM_SIGTRAP, > SIGABRT, SIGILL or SIM_SIGSEGV; > The fact that sim_engine_halt() is broken makes gdb target sim useless > because breakpoints cannot be set. Indeed, but the problem here isn't that sim_engine_halt is broken, it's that something else is broken, and stopping the simulator reveals the issue. So in this case the bug is about pids not being 0 when GDB tries to lookup the inferior in its internal database. It will do this lookup when an inferior stops, for example, when the simulator stops by calling sim_engine_halt. However, there's a clear separation of responsibilities here, the simulator code simulates a target, while wrapping the simulator up as a "fake" process, with a pid is done in gdb/remote-sim.c, so one thing we can be reasonably certain of is that the issue is not with sim_engine_halt, but with remote-sim. Further, I've tested using a different simulator, and I don't see this same problem, nor looking at the code do I see how this problem could occur, so my instinct right now is that this is something that has maybe been broken in your out of tree port. > > > > > You could also try to collect a backtrace from GDB at the point the > > assertion triggers, I believe GDB offers to dump core on an assertion, > > so you can use that to obtain the backtrace. We might get lucky, and > > something in the backtrace could be the offender. > > See below backtrace from GDB when assert triggers due to > sim_engine_halt() getting called for a breakpoint using SIM_SIGTRAP: > > ../../binutils-gdb/gdb/inferior.c:279: internal-error: inferior* > find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' > failed. > A problem internal to GDB has been detected, > further debugging may prove unreliable. > Quit this debugging session? (y or n) > (gdb) bt > #0 0x00007ffff66c9bf9 in __GI___poll (fds=0x555555fd36e0, nfds=4, > timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29 > #1 0x00005555556ecf90 in poll (__timeout=-1, __nfds=, > __fds=) at > /usr/include/x86_64-linux-gnu/bits/poll2.h:46 > #2 gdb_wait_for_event (block=block@entry=1) at > ../../binutils-gdb/gdb/event-loop.c:769 > #3 0x00005555556ed152 in gdb_do_one_event () at > ../../binutils-gdb/gdb/event-loop.c:346 > #4 0x00005555556ed286 in gdb_do_one_event () at > ../../binutils-gdb/gdb/event-loop.c:352 > #5 0x000055555585d60f in gdb_readline_wrapper (prompt= out>) at ../../binutils-gdb/gdb/top.c:1078 > #6 0x000055555588db16 in defaulted_query(const char *, char, typedef > __va_list_tag __va_list_tag *) (ctlstr=, > defchar=defchar@entry=0 '\000', args=args@entry=0x7fffffffd740) > at ../../binutils-gdb/gdb/utils.c:867 > #7 0x000055555588de5b in query (ctlstr=) at > ../../binutils-gdb/gdb/utils.c:959 > #8 0x000055555588e3a6 in internal_vproblem(internal_problem *, const > char *, int, const char *, typedef __va_list_tag __va_list_tag *) ( > problem=problem@entry=0x555555d59880 , > file=, line=279, fmt=, > ap=ap@entry=0x7fffffffd900) at ../../binutils-gdb/gdb/utils.c:372 > #9 0x000055555588e4eb in internal_verror (file=, > line=, fmt=, ap=ap@entry=0x7fffffffd900) > at ../../binutils-gdb/gdb/utils.c:438 > #10 0x00005555559590df in internal_error > (file=file@entry=0x5555559f2cd0 "../../binutils-gdb/gdb/inferior.c", > line=line@entry=279, fmt=) > at ../../binutils-gdb/gdbsupport/errors.c:55 > #11 0x000055555572e72f in find_inferior_pid (targ=, > pid=) at ../../binutils-gdb/gdb/inferior.c:279 > #12 0x0000555555856905 in find_thread_ptid > (targ=targ@entry=0x555555d58f60 , ptid=...) at > ../../binutils-gdb/gdb/thread.c:527 > #13 0x000055555573f42d in handle_inferior_event > (ecs=ecs@entry=0x7fffffffdd30) at ../../binutils-gdb/gdb/infrun.c:5062 > #14 0x000055555574096f in fetch_inferior_event (client_data= out>) at ../../binutils-gdb/gdb/infrun.c:4002 > #15 0x00005555556ed19c in check_async_event_handlers () at > ../../binutils-gdb/gdb/event-loop.c:1062 > #16 gdb_do_one_event () at ../../binutils-gdb/gdb/event-loop.c:325 > #17 0x00005555556ed2be in gdb_do_one_event () at > ../../binutils-gdb/gdb/event-loop.c:303 > #18 start_event_loop () at ../../binutils-gdb/gdb/event-loop.c:370 > #19 0x0000555555765018 in captured_command_loop () at > ../../binutils-gdb/gdb/main.c:360 > #20 0x0000555555766f75 in captured_main (data=) at > ../../binutils-gdb/gdb/main.c:1203 > #21 gdb_main (args=) at ../../binutils-gdb/gdb/main.c:1218 > #22 0x00005555555bff1b in main (argc=, argv= out>) at ../../binutils-gdb/gdb/gdb.c:32 What this shows us is as follows: 1. The null_ptid (with the 0 value) is present in handle_inferior_event at line 5062 (frame 13). Now your line numbers don't match up exactly with top of tree, but the closest line looks like this: ecs->event_thread = find_thread_ptid (ecs->target, ecs->ptid); So we're interested in where ecs->ptid came from. 2. Follow this back into frame 14, then look back and ecs is populated with this call: if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG)) .... 3. In do_target_wait ecs->ptid is initialised with: ecs->ptid = do_target_wait_1 (inf, wait_ptid, &ecs->ws, options); 4. The return value of do_target_wait_1 comes from this code: if (deprecated_target_wait_hook) event_ptid = deprecated_target_wait_hook (ptid, status, options); else event_ptid = target_wait (ptid, status, options); return event_ptid; And we're almost certainly following the 'target_wait' path unless your target is doing something vert weird, and 'target_wait' just calls the 'wait' method on the current target class, which should get us into 'gdbsim_target::wait'. 5. In gdbsim_target::wait we always return sim_data->remote_sim_ptid, so the question becomes how is this initialised. 6. The only initialisation I can see for this variable is this line: remote_sim_ptid (next_pid, 0, next_pid) So then we ask, what is next_pid? At this point I would add the following assertion to gdb: ## START ## diff --git a/gdb/remote-sim.c b/gdb/remote-sim.c index 281232cc4e5..b7ce4672a1a 100644 --- a/gdb/remote-sim.c +++ b/gdb/remote-sim.c @@ -86,6 +86,7 @@ struct sim_inferior_data { : gdbsim_desc (desc), remote_sim_ptid (next_pid, 0, next_pid) { + gdb_assert (remote_sim_ptid != null_ptid); ++next_pid; } ## END ## 7. My guess is that in your port this will fail, but for other ports this doesn't fail. For other ports next_pid is initialised to 42000 in gdbsim_target_open - is it possible that this initialisation has been removed in your tree somehow? Thanks, Andrew