From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 05 Dec 2013 10:54:00 -0000
From: Joel Brobecker
To: Pedro Alves
Cc: gdb-patches@sourceware.org
Subject: Re: [RFA] nameless LOAD_DLL_DEBUG_EVENT causes ntdll.dll to be missing
Message-ID: <20131205105437.GE3175@adacore.com>
References: <1386070185-8020-1-git-send-email-brobecker@adacore.com> <529E361B.7070807@redhat.com>
In-Reply-To: <529E361B.7070807@redhat.com>

> > % gnatmake -g a -bargs -shared
> > % gdb a
> > (gdb) start
[...]
> Does this happen only on "attach", or also with "run"?  I ask
> because of this:

It only happens on "run". I think I have a pretty good handle on
what is going on now, thanks to your inquisitive questions.

First, something that was not obvious, and yet gave me an "aha!"
moment. At the time the LOAD_DLL_DEBUG_EVENT happens, it turns out
that the modules-array snapshot returned by EnumProcessModules does
not yet include the DLL being loaded. When I started instrumenting
the code in get_module_name, I looked at the result of dividing
cbNeeded by sizeof (HMODULE): on the first call (first DLL) it fails,
then it returns 2, 3, 4, etc., and the base address of the new
element at the end of the array always corresponds to the base
address of the _previous_ LOAD_DLL_DEBUG_EVENT, not the one being
processed. The reason I did not catch this the first time I looked
at it is that the first element, at index 0, must be for the main
executable. Certainly the base address corresponds.

With that in mind, we can see that during the inferior startup phase
(when we "run"), calling get_module_name is hopeless. But it is also
unnecessary since, most of the time, the event gives us an address
in inferior memory where the path to the relevant DLL is stored
(event->lpImageName). All we have to do is read that address, which
we do, but only after having tried the first method of iterating
over all process modules.

During "attach", on the other hand, the program has had time to
correctly set the entire array, so we get a full array each time we
process the LOAD_DLL_DEBUG_EVENT, allowing us to resolve the DLL
name that way.

Now, back to the original problem. The difference between 2012 and
older versions is that, in older versions, event->lpImageName is set
in the LOAD_DLL_DEBUG_EVENT data for ntdll.dll, and we can read the
DLL's path that way. But with 2012, it's null for this DLL. MSDN
says it may happen:

    This member is strictly optional.
    Debuggers must be prepared to handle the case where lpImageName
    is NULL or *lpImageName (in the address space of the process
    being debugged) is NULL

And I just reacted to the following bit:

    ... and it will __NOT__ likely pass an image name for the
    __first__ DLL event ...

Sounds familiar?

So, back to the suggestion I made for the future (post 7.7), I think
we should ignore the LOAD_DLL_DEBUG_EVENT events during
do_initial_windows_stuff, and just rely on EnumProcessModules at the
end of that function, once we're in control of the inferior. But
that's a major change, and there are many versions of Windows to
cover, multiplied by 2 for 32-bit vs 64-bit, which is why I do not
suggest it for now. But I can work on that sometime next year, to
explore how well we would do. This is modulo the issue of
EnumProcessModules' availability.

> EnumProcessModules used to be exported by psapi.dll in older Windows
> versions (I think prior to Windows 7), I don't think we can assume
> that API/dll is always around.

We don't. We load the DLL explicitly in GDB, and then set things up
so that EnumProcessModules points either to the function in that
DLL, or else to bad_EnumProcessModules. If the function is not
available, the fallback returns a failure, which is correctly
handled. In our case, the fallback will NOT be used, and in older
versions of Windows, the issue should not exist.

> > +  int i;
> ...
> > +  for (i = 0; i < (int) (cb_needed / sizeof (HMODULE)); i++)
>
> Use size_t, and then you don't need the cast.

It's a copy/paste of the existing code. I will fix this one, and
create a followup patch for the rest.

> Does gdbserver need this too?  I'd almost bet it doesn't,
> due to the extra fallback on toolhelp (despite
> the 32 in the toolhelp API names, it works on 64-bit):

I hadn't checked, but it looks like we have the very same issue.
I just built gdbserver (we don't use it on that platform), and after
connecting to GDBserver, I don't see "ntdll.dll" in the output of
"info shared" :-(.
Even after setting a breakpoint inside the program and continuing to
that breakpoint, new DLLs appear, but not ntdll.dll.

I think the same long-term treatment would apply to GDBserver, and
solve the problem as a result. For the immediate problem, I could
attempt a fix, but I am not sure how well I could test it.

In the meantime, does the GDB-side fix look OK to commit to you
(modulo the small issue you raised)?

-- 
Joel
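
For readers following the thread, the lookup order discussed in this
message can be modelled in portable C. This is only a sketch under
stated assumptions, not the code in GDB: struct module_entry, struct
load_dll_event, lookup_in_snapshot and resolve_dll_name are
hypothetical names, the array stands in for the snapshot that
EnumProcessModules returns, and image_name stands in for the string
read through event->lpImageName.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one element of the module snapshot.  */
struct module_entry
{
  uintptr_t base;    /* Load address of the module.  */
  const char *path;  /* Path of the module.  */
};

/* Hypothetical stand-in for the parts of a LOAD_DLL_DEBUG_EVENT used
   here.  image_name models the string read through lpImageName; it
   is NULL when the event does not provide a name.  */
struct load_dll_event
{
  uintptr_t base;
  const char *image_name;
};

/* Search the snapshot for BASE.  Return the module path, or NULL if
   the snapshot does not contain that module (yet).  */
static const char *
lookup_in_snapshot (const struct module_entry *mods, size_t count,
                    uintptr_t base)
{
  size_t i;  /* size_t index, so no cast is needed in the loop.  */

  for (i = 0; i < count; i++)
    if (mods[i].base == base)
      return mods[i].path;
  return NULL;
}

/* Resolve the DLL name for EV: try the module snapshot first, then
   fall back to the name carried by the event.  Return NULL if
   neither source provides a name.  */
static const char *
resolve_dll_name (const struct module_entry *mods, size_t count,
                  const struct load_dll_event *ev)
{
  const char *name;

  assert (ev != NULL);
  name = lookup_in_snapshot (mods, count, ev->base);
  if (name == NULL)
    name = ev->image_name;
  return name;
}
```

With a startup-time snapshot that holds only the main executable and
an event whose image_name is NULL, resolve_dll_name returns NULL,
which is the ntdll.dll case described above. With a complete
snapshot, as on "attach", the same event resolves through the first
lookup.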