From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Jacobowitz
To: Andrew Cagney, Kimball Thurston, gdb@sources.redhat.com
Subject: Re: gdb and dlopen
Date: Tue, 16 Oct 2001 22:19:00 -0000
Message-id: <20011017011923.A27536@nevyn.them.org>
References: <20011016161525.A1241@nevyn.them.org> <20011016213252.A8694@nevyn.them.org> <20011016220353.A9538@nevyn.them.org> <3BCCF83F.8010401@cygnus.com> <20011017010849.A23345@nevyn.them.org>
X-SW-Source: 2001-10/msg00169.html

On Wed, Oct 17, 2001 at 01:08:49AM -0400, Daniel Jacobowitz wrote:
> On Tue, Oct 16, 2001 at 11:17:19PM -0400, Andrew Cagney wrote:
> > Thread support was given a serious overhaul in 5.0 (it became
> > maintainable and fixable).
> >
> > Can you try this with/without the thread library linked in? Every time
> > GDB sees a shared library being loaded it goes frobbing around to see if
> > it contains some thread support code. That could be the problem.
>
> I can verify that this is the problem. It takes negligible time (still
> more ptraces than it should, maybe, but not by too much) for a
> non-threaded testcase. Link in -lpthread, and the time skyrockets.
>
> thread_db is, plain and simple, horribly slow. We could speed it up
> tremendously if we cached memory reads from the child across periods
> where we knew it was safe to do so; I'll have to think about how to do
> this. Meanwhile, the real speed penalty seems to be:
>
>   /* FIXME: This seems to be necessary to make sure breakpoints
>      are removed.  */
>   if (!target_thread_alive (inferior_ptid))
>     inferior_ptid = pid_to_ptid (GET_PID (inferior_ptid));
>   else
>     inferior_ptid = lwp_from_thread (inferior_ptid);
>
> thread_db_thread_alive is EXPENSIVE! And we do it on every attempt to
> read the child's memory, of which we appear to have several hundred in
> a call to current_sos ().

(and lwp_from_thread is a little expensive too...)

In the case I'm looking at, where I don't need to mess with either
breakpoints or multiple threads (:P), I can safely comment out that whole
check. I get an interesting result:

Without thread library: loading 50 DSOs takes about 0.09 - 0.11 sec
With thread library but without that chunk: 1.47 - 1.56 sec
With thread library as it currently stands: 7.24 - 7.36 sec

We've definitely got some room for improvement here.

Amusingly, there are something like eight million calls to ptid_get_pid.
I'll send along a trivial patch to shrink the worst offenders. I
understand the opacity that functions over macros is going for here, but a
function that does 'return a.b;' and gets called eight MILLION times is a
little bit absurd, don't you think? Absurd enough that it shows up as the
second highest item on the profile.

--
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer
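
For illustration only (this is not the patch mentioned above): a minimal
sketch of the trade-off between an opaque accessor function and cheaper
alternatives. The struct and every name prefixed with example_ or EXAMPLE_
are hypothetical stand-ins, not GDB's actual definitions; the only thing
taken from the message is that the accessor amounts to 'return a.b;'.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for GDB's ptid type; the field
   names are assumptions made for this illustration.  */
struct example_ptid
{
  int pid;
  long lwp;
  long tid;
};

/* Opaque-accessor style: in a real tree this would typically be
   defined in a separate .c file, so each use costs a function call.  */
int
example_ptid_get_pid (struct example_ptid ptid)
{
  return ptid.pid;
}

/* Header-style alternative: the body is visible at each call site, so
   the compiler is free to reduce it to a plain field load while the
   argument is still type-checked.  */
static inline int
example_ptid_pid_inline (struct example_ptid ptid)
{
  return ptid.pid;
}

/* Macro alternative in the style of GET_PID; no type checking.  */
#define EXAMPLE_GET_PID(ptid) ((ptid).pid)

/* A hot loop of the kind that makes a trivial accessor show up in a
   profile: count the entries whose pid matches.  */
size_t
example_count_pid (const struct example_ptid *ptids, size_t n, int pid)
{
  size_t i, count = 0;

  for (i = 0; i < n; i++)
    if (example_ptid_pid_inline (ptids[i]) == pid)
      count++;
  return count;
}
```

All three forms return the same value; they differ only in whether a call
is emitted per use and in how much of the type's layout the caller sees.
Whether the inline form actually helps depends on the compiler and the
optimization level, so it would need to be measured against the profile.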