From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3110 invoked by alias); 24 Nov 2007 05:33:54 -0000 Received: (qmail 3102 invoked by uid 22791); 24 Nov 2007 05:33:52 -0000 X-Spam-Check-By: sourceware.org Received: from pool-70-20-17-24.bstnma.fios.verizon.net (HELO ednor.cgf.cx) (70.20.17.24) by sourceware.org (qpsmtpd/0.31) with ESMTP; Sat, 24 Nov 2007 05:33:43 +0000 Received: by ednor.cgf.cx (Postfix, from userid 201) id D818B2B352; Sat, 24 Nov 2007 00:33:41 -0500 (EST) Date: Sat, 24 Nov 2007 05:33:00 -0000 From: Christopher Faylor To: gdb-patches@sourceware.org Subject: Re: [win32] Fix suspend count handling Message-ID: <20071124053341.GA4214@ednor.casa.cgf.cx> Mail-Followup-To: gdb-patches@sourceware.org References: <47437D3A.3000107@portugalmail.pt> <000001c82c31$4a57b220$df071660$@u-strasbg.fr> <4053daab0711210543w4b241e1ek2371e887f3c4f7d2@mail.gmail.com> <000401c82c48$a450df10$ecf29d30$@u-strasbg.fr> <4053daab0711210708o607018b9n8b63147a8498a207@mail.gmail.com> <4053daab0711211019r15f3a862g677080b65b4d8e71@mail.gmail.com> <4744BCCE.60705@portugalmail.pt> <20071123010744.GA31180@ednor.casa.cgf.cx> <4746A922.30404@champenstudios.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4746A922.30404@champenstudios.com> User-Agent: Mutt/1.5.16 (2007-06-09) Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2007-11/txt/msg00445.txt.bz2 On Fri, Nov 23, 2007 at 11:19:14AM +0100, Lerele wrote: > Christopher Faylor escribi?: >> On Wed, Nov 21, 2007 at 11:18:38PM +0000, Pedro Alves wrote: >> >>> Pedro Alves wrote: >>> >>>> Pedro Alves wrote: >>>> >>>>> On Nov 21, 2007 2:13 PM, Pierre Muller wrote: >>>>> That's not what I see here. Can you show me a run where you get 4 >>>>> only this patch applied? >>>>> >>>>> >>>> I did try that, but posted a log of not doing it :-). >>>> I've just tried about 30 times, and only once I did see >>>> a 4 coming out ... oh, well, one of those things. >>>> >>> OK. Back at my home laptop, I can reproduce that with >>> no problems. Let me clarify what the 4 problem really is. >>> It's a race between gdb and the inferior. >>> >>> Take this slightly changed test case. The only difference >>> to the original version is the extra Sleep call. >>> >>> #include >>> >>> HANDLE started; >>> HANDLE stop; >>> >>> DWORD WINAPI >>> thread_start (void *arg) >>> { >>> SetEvent (started); >>> WaitForSingleObject (stop, INFINITE); >>> return 0; >>> } >>> >>> int >>> main (int argc, char **argv) >>> { >>> int i; >>> DWORD suspend_count; >>> started = CreateEvent (NULL, TRUE, FALSE, NULL); >>> stop = CreateEvent (NULL, TRUE, FALSE, NULL); >>> >>> HANDLE h = CreateThread (NULL, 0, thread_start, NULL, >>> 0, NULL); >>> >>> WaitForSingleObject (started, INFINITE); >>> >>> for (i = 0; i < 3; i++) >>> if (SuspendThread (h) == (DWORD) -1) >>> { >>> printf ("SuspendThreadFailed\n"); >>> return 1; >>> } >>> >>> Sleep (300); >>> >>> suspend_count = ResumeThread (h); /* set breakpoint here */ >>> >>> printf ("%lu\n", suspend_count); /* should be 3 */ >>> >>> while ((suspend_count = ResumeThread (h)) != 0 >>> && suspend_count != -1) >>> ; >>> SetEvent (stop); >>> WaitForSingleObject (h, INFINITE); >>> CloseHandle (h); >>> CloseHandle (started); >>> CloseHandle (stop); >>> return 0; >>> } >>> >>> If you do the "break at ...", "run", "thread 3", "continue" >>> sequence, and "..." is the "Sleep" line, you'll get 3, >>> but if you put the break at the /* set breakpoint here */ >>> line, you'll get 4 (if you're (un)lucky). >>> >>> The race happens due to the fact that gdb is >>> doing something similar to this: >>> >>> win32_continue() >>> { >>> ContinueDebugEvent (...); /* Resumes all non suspended >>> threads of the process. */ >>> >>> /* At this point, gdb is running concurrently with >>> the inferior threads that were not suspended - which >>> included the main thread of the testcase. */ >>> foreach t in threads do >>> if t is suspended >>> ResumeThread t >>> fi >>> done >>> } >>> >>> If you break at the Sleep call, when we resume, gdb will >>> have a bit of time to call ResumeThread on the suspended >>> thread of the testcase. If you instead break at the >>> ResumeThread line, you'll have a good chance that the >>> inferior wins the race, hence the "4" result (remember >>> that ResumeThread returns the previous suspend count). >>> If we put something like this after the ResumeThread call: >>> >>> (...) >>> suspend_count = ResumeThread (h); /* set breakpoint here */ >>> >>> + Sleep (300); >>> + SuspendThread (h); >>> + suspend_count = ResumeThread (h); >>> >>> printf ("%lu\n", suspend_count); /* should be 3 */ >>> (...) >>> >>> ... you'll see that eventually gdb will bring the >>> suspend count back to 3. (A SuspendThread, ResumeThread >>> pair is the way to get at the suspend count.) >>> >>> >>>> Since the watchpoint patch should fix this, what shall I do? Shall I >>>> merge >>>> the two and resubmit, or leave it at that ? They've already been tested >>>> together without regressions. >>>> >>> Here is the merge from the patch I posted at the start of the >>> thread with this patch: >>> [win32] Fix watchpoint support >>> http://sourceware.org/ml/gdb-patches/2007-11/msg00390.html >>> >>> This patch fixes both the suspend_count >>> handling, and the watchpoint support. >>> >>> Thanks Pierre, for looking at it. >>> >>> OK ? >>> >>> -- >>> Pedro Alves >>> >> >> >>> 2007-11-21 Pedro Alves >>> >>> * win32-nat.c (thread_info_struct): Rename suspend_count to >>> suspended, to be used as a flag. >>> (thread_rec): Only suspend the thread if it wasn't suspended by >>> gdb before. Warn if suspending failed. >>> (win32_add_thread): Set Dr6 to 0xffff0ff0. >>> (win32_resume): Likewise. >>> (win32_continue): Set Dr6 to 0xffff0ff0. Update usage of the >>> `suspended' flag. Do ContinueDebugEvent after resuming the >>> suspended threads, not before. Set threads' contexts before >>> resuming them, not after. >>> >> >> >> >>> --- >>> gdb/win32-nat.c | 80 >>> +++++++++++++++++++++++++++++++------------------------- >>> 1 file changed, 45 insertions(+), 35 deletions(-) >>> >>> Index: src/gdb/win32-nat.c >>> =================================================================== >>> --- src.orig/gdb/win32-nat.c 2007-11-11 23:13:04.000000000 +0000 >>> +++ src/gdb/win32-nat.c 2007-11-21 22:39:56.000000000 +0000 >>> @@ -112,14 +112,14 @@ static enum target_signal last_sig = TAR >>> /* Set if a signal was received from the debugged process */ >>> >>> /* Thread information structure used to track information that is >>> - not available in gdb's thread structure. */ >>> + not available in gdb's thread structure. */ >>> typedef struct thread_info_struct >>> { >>> struct thread_info_struct *next; >>> DWORD id; >>> HANDLE h; >>> char *name; >>> - int suspend_count; >>> + int suspended; >>> int reload_context; >>> CONTEXT context; >>> STACKFRAME sf; >>> @@ -244,9 +244,10 @@ check (BOOL ok, const char *file, int li >>> GetLastError ()); >>> } >>> >>> -/* Find a thread record given a thread id. >>> - If get_context then also retrieve the context for this >>> - thread. */ >>> +/* Find a thread record given a thread id passed in ID. If >>> + GET_CONTEXT is not 0, then also retrieve the context for this >>> + thread. If GET_CONTEXT is negative, then don't suspend the >>> + thread. */ >>> >> >> I don't see any reason to capitalize get_context in the comment. >> >> >>> static thread_info * >>> thread_rec (DWORD id, int get_context) >>> { >>> @@ -255,12 +256,21 @@ thread_rec (DWORD id, int get_context) >>> for (th = &thread_head; (th = th->next) != NULL;) >>> if (th->id == id) >>> { >>> - if (!th->suspend_count && get_context) >>> + if (!th->suspended && get_context) >>> { >>> if (get_context > 0 && id != current_event.dwThreadId) >>> - th->suspend_count = SuspendThread (th->h) + 1; >>> + { >>> + if (SuspendThread (th->h) == (DWORD) -1) >>> + { >>> + DWORD err = GetLastError (); >>> + warning (_("SuspendThread failed. (winerr %d)"), >>> + (int) err); >>> + return NULL; >>> + } >>> + th->suspended = 1; >>> + } >>> else if (get_context < 0) >>> - th->suspend_count = -1; >>> + th->suspended = -1; >>> th->reload_context = 1; >>> } >>> return th; >>> @@ -294,8 +304,7 @@ win32_add_thread (DWORD id, HANDLE h) >>> th->context.Dr1 = dr[1]; >>> th->context.Dr2 = dr[2]; >>> th->context.Dr3 = dr[3]; >>> - /* th->context.Dr6 = dr[6]; >>> - FIXME: should we set dr6 also ?? */ >>> + th->context.Dr6 = 0xffff0ff0; >>> >> >> This, and similar cases, needs to use a #define with an explanatory >> comment. >> >> With the above minor changes, this looks ok. >> >> I have to ask, however, if the SuspendThread's are even needed at all. >> When I was writing this code, I wasn't entirely sure that gdb needed to >> do anything like this but I erred on the side of caution. So, I'm >> wondering if things would still work ok if the >> SuspendThread/ResumeThread stuff was gone. > >I think they are needed. They were anyway with the new gdbserver based >(vs. Win32 API based) interrupt code I sent several days ago, and that >so very kindly Pedro prepared for commitment, but that I still haven't >found the time to sit down and look at them (however I'm absolutely >sure they're just fine), I guess his changes must be similar to what I >sent in the first place. *Why* did you think you needed them in gdbserver? Did you actually try not using them or did you just copy my code from win32-nat.c? > Apart from this, there's also the case where (at least for gdbserver) > socket data is received asynchronously while the child is running. This > socket data could indicate gdbserver to set/enable/disable a breakpoint, > read thread registers, etc., and this kind of things may require to stop > the child using SuspendThread. > Right? I don't know exactly what you are referring to but if you are proposing using SuspendThread/ResumeThread to work around non UNIX-like handling of signals, I've previously mentioned that the use of SuspendThread is dangerous since, the last I checked, Windows wasn't very good about not avoiding reentracy problems if you suspend a thread and then use another funcion from the Windows API. This used to cause strange random behavior and even blue screens in older versions of Windows NT. I guess it's possible that it's better these days but there is no way to really know for sure. cgf