From: "John David Anglin"
To: pedro@codesourcery.com (Pedro Alves)
Cc: gdb-patches@sourceware.org
Subject: Re: ttrace: Protocal error
Date: Fri, 08 Aug 2008 20:49:00 -0000
In-Reply-To: <200808082102.11828.pedro@codesourcery.com> from "Pedro Alves" at Aug 8, 2008 09:02:11 pm
Message-Id: <20080808204838.DD43A4DCB@hiauly1.hia.nrc.ca>

> Note, I know nothing about ttrace and HP-UX.

That makes us equal.

> On Friday 08 August 2008 19:33:06, John David Anglin wrote:
> > While we're on the subject of threads, it seems we are still not in
> > a position to debug the vla6.f90 failure:
>
> What's this test doing different?

It's not entirely clear.  However, it is using emulated TLS support and
multiple lwp threads.  This support may be initialized by a constructor
run directly by the dynamic loader.  There's a timing or some other
random effect associated with the failure (could be some variable is
being randomly initialized).

> > #4  0x000a3390 in target_resume (ptid=
> >       {pid = 1953788513, lwp = 1667563520, tid = 774778670}, step=0,
> >     signal=TARGET_SIGNAL_0) at ../../src/gdb/target.c:1789
> >       ^^^^^^^^^^ ^^^^^^^^^^ ^^^^^^^^^
>
> I assume this ptid is GDB getting bogus info, right?

That's pretty common for optimized code.

> This should be setting the dying flag on the thread, but
> it is still listed in gdb's thread table.

Yes.

>     case TTEVT_LWP_EXIT:
>       if (print_thread_events)
>         printf_unfiltered (_("[%s exited]\n"), target_pid_to_str (ptid));
>       ti = find_thread_pid (ptid);
>       gdb_assert (ti != NULL);
>       ((struct inf_ttrace_private_thread_info *)ti->private)->dying = 1;
>       inf_ttrace_num_lwps--;
>       ttrace (TT_LWP_CONTINUE, ptid_get_pid (ptid),
>               ptid_get_lwp (ptid), TT_NOPC, 0, 0);
>       /* If we don't return -1 here, core GDB will re-add the thread.  */
>       ptid = minus_one_ptid;
>       break;

The dying flag is set when the resume is attempted.

> inf_ttrace_resume:
>
>   if (ptid_equal (ptid, minus_one_ptid))
>     {
>       /* Let all the other threads run too.  */
>       iterate_over_threads (inf_ttrace_resume_callback, NULL);
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
>     }
>
> Is this the first resume after that "exit" notification?
> Any chance we're trying to resume a dead thread here then?

Yes.  That's what I think is happening.

> What happens when you delete the dying threads before resuming?
>
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
>       iterate_over_threads (inf_ttrace_resume_callback, NULL);
>       iterate_over_threads (inf_ttrace_delete_dying_threads_callback, NULL);
>
> Hmmm, I assume not, if my sources match yours, your program is stopped
> at a syscall event:
>
>       /* Be careful not to try to gather much state about a thread
>          that's in a syscall.  It's frequently a losing proposition.  */
>     case TARGET_WAITKIND_SYSCALL_ENTRY:
>       if (debug_infrun)
>         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_SYSCALL_ENTRY\n");
>       resume (0, TARGET_SIGNAL_0);
>       prepare_to_wait (ecs);
>       return;
>
> So, there should have already been a resume in between.
>
> Could you check which thread got the syscall event?  Is it the same
> thread we fail to resume?  Is it possible to disable syscall events,
> just for checking if it is related?

I don't know how to disable syscall events.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
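
For reference, the two orderings discussed above (resume then delete the
dying threads, versus delete, resume, delete) can be compared in isolation
with a toy thread table.  This is only a sketch and not GDB code: struct
toy_thread and every toy_* function are hypothetical names, and it assumes
the resume pass does not look at the dying flag, which may not match the
real inf_ttrace_resume_callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a thread-table entry.  Hypothetical; this is not
   GDB's struct thread_info or inf_ttrace_private_thread_info.  */
struct toy_thread
{
  int lwp;
  bool dying;    /* An exit event was reported for this lwp.  */
  bool present;  /* Entry is still in the toy thread table.  */
};

/* Remove every entry flagged as dying.  Returns the number removed.  */
int
toy_delete_dying (struct toy_thread *t, size_t n)
{
  int removed = 0;

  assert (t != NULL);
  for (size_t i = 0; i < n; i++)
    if (t[i].present && t[i].dying)
      {
        t[i].present = false;
        removed++;
      }
  return removed;
}

/* Walk the table as a resume pass would.  Assumption: the pass does not
   check the dying flag, so every listed entry would get a request.
   Returns how many of those requests would go to a dying thread.  */
int
toy_resume_all (const struct toy_thread *t, size_t n)
{
  int stale = 0;

  assert (t != NULL);
  for (size_t i = 0; i < n; i++)
    if (t[i].present && t[i].dying)
      stale++;
  return stale;
}

/* Ordering in the quoted inf_ttrace_resume: resume, then delete.  */
int
toy_resume_then_reap (struct toy_thread *t, size_t n)
{
  int stale = toy_resume_all (t, n);

  toy_delete_dying (t, n);
  return stale;
}

/* Ordering suggested in the thread: delete, resume, delete.  */
int
toy_reap_then_resume (struct toy_thread *t, size_t n)
{
  toy_delete_dying (t, n);
  int stale = toy_resume_all (t, n);

  toy_delete_dying (t, n);
  return stale;
}
```

With a three-entry table in which one lwp is flagged dying,
toy_resume_then_reap reports one stale resume and toy_reap_then_resume
reports none.  The sketch says nothing about whether an intervening resume
from the syscall-entry path already happened, which is the open question
above.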