* performance of multithreading gets gradually worse under gdb
@ 2011-02-02 20:16 Markus Alber
2011-02-02 20:28 ` Michael Snyder
0 siblings, 1 reply; 18+ messages in thread
From: Markus Alber @ 2011-02-02 20:16 UTC (permalink / raw)
To: gdb
Hello,

I have experienced the following problem:

I'm debugging a number-crunching application which spawns a lot of (36)
small worker threads per iteration. The system typically does on the
order of 200 iterations. Although each of them should take about the
same amount of time, the performance gets worse with every iteration
and becomes excruciatingly slow.

A system monitor reveals that gdb allocates more memory with every
iteration, i.e. with every 36 threads started and finished. The CPU
load of gdb goes up, too. The CPU usage of the application goes down.
Compared to the solo performance, it gets slower by a factor of 20 and
more, if run long enough.

The application behaves perfectly when run by itself. The
multi-threaded part is not compiled with debugging information when
this behaviour occurs.

The distribution is SuSE 11.3 / gdb 7.1.

Is there anything I can change about this behaviour, any options of gdb
that need to be set in these circumstances?

Thank you very much for your help in advance,
Markus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: performance of multithreading gets gradually worse under gdb
2011-02-02 20:16 performance of multithreading gets gradually worse under gdb Markus Alber
@ 2011-02-02 20:28 ` Michael Snyder
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
0 siblings, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-02 20:28 UTC (permalink / raw)
To: Markus Alber; +Cc: gdb
Markus Alber wrote:
> Hello,
>
> I have experienced the following problem:
>
> I'm debugging a number-crunching application which spawns a lot of (36)
> small worker threads per iteration. The system typically does on the
> order of 200 iterations. Although each of them should take about the
> same amount of time, the performance gets worse with every iteration
> and becomes excruciatingly slow.
>
> A system monitor reveals that gdb allocates more memory with every
> iteration, i.e. with every 36 threads started and finished. The CPU
> load of gdb goes up, too. The CPU usage of the application goes down.
> Compared to the solo performance, it gets slower by a factor of 20 and
> more, if run long enough.
>
> The application behaves perfectly when run by itself. The
> multi-threaded part is not compiled with debugging information when
> this behaviour occurs.
>
> The distribution is SuSE 11.3 / gdb 7.1.
>
> Is there anything I can change about this behaviour, any options of gdb
> that need to be set in these circumstances?
Interesting.

By how much does gdb's memory allocation increase?
In total or, if possible, per iteration? This might
give us a clue as to where to look.

Do you think you could write a simple sample program that
allocates threads in a manner similar to your application?

Thanks,
* Re: performance of multithreading gets gradually worse under gdb
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
@ 2011-02-02 21:43 ` Michael Snyder
2011-02-03 7:03 ` Markus Alber
0 siblings, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-02 21:43 UTC (permalink / raw)
To: Markus Alber, gdb
Markus Alber wrote:
> Thank you for the quick response.
You're welcome, but let's keep the discussion on the public list, so
that other maintainers may jump in if so inclined.
> It allocates about 100kB per iteration.
Hmmm, and that's for roughly 36 thread start/stops, so it could be
losing roughly 3000 bytes per thread. That's much bigger than my
first guess would have been (a "struct thread_info" is only 336 bytes).
> One interesting finding might also be:
>
> I terminated the process when an iteration took about 3 min (instead
> of 1 sec) and gdb had about 115MB allocated.
I assume that at this point, your system still had plenty of ram to
spare? It wasn't simply swapping?
> On starting the application again, it ran alright for a while and the
> gdb memory allocation stayed constant. When it finally started to grow
> again, the application slowed down, and became slower with every
> iteration - the usual picture.
>
> I attached a sample file from the application where the computation
> bifurcates into the worker threads. This is one of three instances per
> iteration, but they all follow the same pattern.
I was really hoping for a stripped-down sample that we could compile and
run.
> The machine has 2x6 cores x 3 instances per iteration =
> 36 worker threads per iteration.
x86 architecture?
> On another note, I tried to compile gdb-6.5 on my machine (because it
> was the release I used to work with before, without problems) and
> configure comes back with an error that it cannot find a termcap lib.
> There is none on the SuSE. Which package would I need to install?
That would be libncurses, I think.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-02 21:43 ` Michael Snyder
@ 2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:57 ` Tom Tromey
0 siblings, 2 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-03 7:03 UTC (permalink / raw)
To: Michael Snyder; +Cc: gdb
[-- Attachment #1: Type: text/plain, Size: 3355 bytes --]
On Wed, 02 Feb 2011 13:43:50 -0800, Michael Snyder wrote:
>> It allocates about 100kB per iteration.
>
> Hmmm, and that's for roughly 36 thread start/stops, so it could be
> losing roughly 3000 bytes per thread. That's much bigger than my
> first guess would have been (a "struct thread_info" is only 336
> bytes).
>
>> One interesting finding might also be:
>> I terminated the process when an iteration took about 3 min (instead
>> of 1 sec) and gdb had about 115MB allocated.
>
> I assume that at this point, your system still had plenty of ram to
> spare? It wasn't simply swapping?
The system had about 20 GB to spare.
>> On starting the application again, it ran alright for a while and
>> the gdb memory allocation stayed constant. When it finally started
>> to grow again, the application slowed down, and became slower with
>> every iteration - the usual picture.
>> I attached a sample file from the application where the computation
>> bifurcates into the worker threads. This is one of three instances
>> per iteration, but they all follow the same pattern.
>
> I was really hoping for a stripped-down sample that we could compile
> and run.
See the attached file. It shows a similar behaviour, although it only
allocates 8kB per iteration.
You have to wait some time before this happens.
>> The machine has 2x6 cores x 3 instances per iteration = 36 worker
>> threads per iteration.
>
> x86 architecture?
>
>> On another note, I tried to compile gdb-6.5 on my machine (because
>> it was the release I used to work with before, without problems) and
>> configure comes back with an error that it cannot find a termcap
>> lib. There is none on the SuSE. Which package would I need to
>> install?
>
> That would be libncurses, I think.
I compiled gdb-6.5 alright and it performs well as usual, without this
problem.
[-- Attachment #2: mt_test.cpp --]
[-- Type: text/x-c++src; name=mt_test.cpp, Size: 1555 bytes --]
#include <stdlib.h>
#include <iostream>
#include <math.h>
#include <vector>
#include <boost/thread/thread.hpp>
#include <boost/bind.hpp>

using namespace std;

// Compile with: g++ -g -lboost_thread mt_test.cpp

struct ThreadData {
    ThreadData(const vector<double>& vOne,
               const double factor) :
        m_vOne(vOne),
        m_factor(factor),
        m_vTwo(vOne.size(), m_factor)
    {
        // empty
    }

    const vector<double>& m_vOne;
    double m_factor;
    vector<double> m_vTwo;
};

static void* doAdd(void* data)
{
    ThreadData* pData = static_cast<ThreadData*>(data);
    for(unsigned int nAt = 0; nAt < pData->m_vOne.size(); ++nAt)
        pData->m_vTwo[nAt] += pData->m_factor * pData->m_vOne[nAt];
    return NULL;
}

int main(void) {
    const int nThreads = 12;
    const int nRepetitions = 200;
    vector<double> vOne(1<<24, 1.0);

    for(int nAtRepetition = 0; nAtRepetition < nRepetitions; ++nAtRepetition) {
        cout << "Repetition no. " << nAtRepetition << endl;
        vector<ThreadData*> vpData(nThreads, static_cast<ThreadData*>(NULL));
        boost::thread_group threads;
        for (int nAtThread = 0; nAtThread < nThreads; nAtThread++) {
            vpData[nAtThread] = new ThreadData(vOne, static_cast<double>(nAtThread+1));
            if( threads.create_thread( boost::bind(&doAdd, static_cast<void*>(vpData[nAtThread])) ) == 0 )
                throw std::exception();
        }
        threads.join_all();
        for (int nAtThread = 0; nAtThread < nThreads; nAtThread++)
            delete vpData[nAtThread];
    }
    return 0;
}
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 7:03 ` Markus Alber
@ 2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:52 ` Markus Alber
2011-02-03 20:57 ` Tom Tromey
1 sibling, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-03 20:26 UTC (permalink / raw)
To: Markus Alber; +Cc: gdb
Markus Alber wrote:
>
> I compiled gdb-6.5 alright and it performs well as usual, without this
> problem.
Good info. What version of gdb were you using when you detected the
problem?

I ran your program stand-alone, then under gdb 7.0, then gdb 7.2 CVS.
On my not-so-new system I got the following times:

  stand-alone:  15m33s
  gdb 7.0:      16m
  gdb 7.2 CVS:  16m10s

I could not perceive any change in the cycle time from beginning to end.

GDB 7.0 did increase its memory footprint by about 3 MB during the run
(3 percent). So that could indicate some leakage, around 15 kB per
cycle. But I can't see that as the main reason for the slow-down you're
reporting, considering that your system is not memory bound.

GDB 7.2 CVS increased its memory footprint by only about 500 kB during
the whole run, so you might consider giving that a try. Instructions
for getting the latest development sources are here:

http://sourceware.org/gdb/current/
If you'd be willing to contribute your little sample program, we might
be able to use it for a thread debugging stress test or something.
Michael
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:26 ` Michael Snyder
@ 2011-02-03 20:52 ` Markus Alber
0 siblings, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-03 20:52 UTC (permalink / raw)
To: Michael Snyder; +Cc: gdb
On Thu, 03 Feb 2011 12:26:25 -0800, Michael Snyder wrote:
> Markus Alber wrote:
>> I compiled gdb-6.5 alright and it performs well as usual, without
>> this problem.
>
> Good info. What version of gdb were you using when you detected the
> problem?
It was gdb 7.1 as it comes with OpenSuSE 11.3.

Meanwhile I tried every release between 6.5 and 7.2. All the 7's showed
the problem, but it was also perceivable with 6.8.91.

I was a bit too quick to assert that 6.5 was working well... it wasn't
possible to set break points, and some 6.8 versions crashed. There
seems to be some incompatibility between the compiler on OpenSuSE 11.3
and the older gdbs (I got them as sources from the GNU repository).
> I ran your program stand-alone, then under gdb 7.0, then gdb 7.2CVS.
> On my not-so-new system I got the following times:
> stand alone: 15m33s
> gdb 7.0 16m
> gdb 7.2CVS 16m10s
So I presume the stripping-down removed something that causes the
problem in the larger application. Or is it a possibility that it has
something to do with the kernel or compiler of OpenSuSE 11.3?

Meanwhile I've gone back to my older 4-core machine running gdb 6.5 on
OpenSuSE 10.2. Performance is constant throughout. I did notice that
gdb allocates memory for each iteration.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
@ 2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
1 sibling, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 20:57 UTC (permalink / raw)
To: Markus Alber; +Cc: Michael Snyder, gdb
>>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
Markus> See the attached file. It shows a similar behaviour, although it only
Markus> allocates 8kB per iteration.
Markus> You have to wait some time before this happens.
Thanks.
I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
under massif.
This part is interesting:
->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
| ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
| ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
| ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
| | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
I debugged gdb a little and it does indeed seem to be leaking here.
I don't understand why registers_changed_ptid unconditionally clears
current_regcache. I suspect that may be the source of the problem.
Perhaps someone who knows this code better could take a look.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:57 ` Tom Tromey
@ 2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
1 sibling, 0 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 21:00 UTC (permalink / raw)
To: Markus Alber; +Cc: Michael Snyder, gdb
>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:
Tom> I debugged gdb a little and it does indeed seem to be leaking here.
Also, I didn't notice a gdb slowdown. It would be interesting to see a
profile.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
@ 2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
2011-02-04 14:55 ` Pedro Alves
1 sibling, 2 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-03 21:40 UTC (permalink / raw)
To: Tom Tromey; +Cc: Markus Alber, Michael Snyder, gdb, pedro
Tom Tromey wrote:
> >>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
>
> Markus> See the attached file. It shows a similar behaviour, although it only
> Markus> allocates 8kB per iteration.
> Markus> You have to wait some time before this happens.
>
> Thanks.
>
> I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
> under massif.
>
> This part is interesting:
>
> ->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
> | ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
> | ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
> | ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
> | | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
>
>
> I debugged gdb a little and it does indeed seem to be leaking here.
>
> I don't understand why registers_changed_ptid unconditionally clears
> current_regcache. I suspect that may be the source of the problem.
>
> Perhaps someone who knows this code better could take a look.
It seems this leak was introduced by Pedro's patch here:
http://sourceware.org/ml/gdb-patches/2010-04/msg00960.html
The function used to free all regcaches in the list and then
reset current_regcache. The new code now takes care to selectively
free only a subset of regcaches -- and then resets current_regcache
anyway ...
I guess we should just remove the current_regcache = NULL line now.
(Actually, now that every thread always has a thread_info, the
best thing would probably be anyway to hang each thread's regcaches
off the thread_info, and do away with the global list completely.)
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 21:40 ` Ulrich Weigand
@ 2011-02-03 22:04 ` Tom Tromey
2011-02-04 13:49 ` Ulrich Weigand
2011-02-04 14:55 ` Pedro Alves
1 sibling, 1 reply; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 22:04 UTC (permalink / raw)
To: Ulrich Weigand; +Cc: Markus Alber, Michael Snyder, gdb, pedro
>>>>> "Ulrich" == Ulrich Weigand <uweigand@de.ibm.com> writes:
Ulrich> I guess we should just remove the current_regcache = NULL line now.
Yes, it seems so to me as well.
Ulrich> (Actually, now that every thread always has a thread_info, the
Ulrich> best thing would probably be anyway to hang each thread's regcaches
Ulrich> off the thread_info, and do away with the global list completely.)
I suppose in this case, invalidating could simply clear a regcache
instead of deleting (and subsequently recreating) it. Is that correct?
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 22:04 ` Tom Tromey
@ 2011-02-04 13:49 ` Ulrich Weigand
0 siblings, 0 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-04 13:49 UTC (permalink / raw)
To: Tom Tromey; +Cc: Markus Alber, Michael Snyder, gdb, pedro
Tom Tromey wrote:
> >>>>> "Ulrich" == Ulrich Weigand <uweigand@de.ibm.com> writes:
> Ulrich> (Actually, now that every thread always has a thread_info, the
> Ulrich> best thing would probably be anyway to hang each thread's regcaches
> Ulrich> off the thread_info, and do away with the global list completely.)
>
> I suppose in this case, invalidating could simply clear a regcache
> instead of deleting (and subsequently recreating) it. Is that correct?
Yes, I think so. Note that we still need to (potentially) keep more than
one regcache per thread for multi-arch support. We may also need to reset
the regcache's address space before reusing it ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
@ 2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
1 sibling, 2 replies; 18+ messages in thread
From: Pedro Alves @ 2011-02-04 14:55 UTC (permalink / raw)
To: Ulrich Weigand; +Cc: Tom Tromey, Markus Alber, Michael Snyder, gdb
On Friday 04 February 2011 21:40:08, Ulrich Weigand wrote:
> Tom Tromey wrote:
> > >>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
> >
> > Markus> See the attached file. It shows a similar behaviour, although it only
> > Markus> allocates 8kB per iteration.
> > Markus> You have to wait some time before this happens.
> >
> > Thanks.
> >
> > I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
> > under massif.
> >
> > This part is interesting:
> >
> > ->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
> > | ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
> > | ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
> > | ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
> > | | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
> >
> >
> > I debugged gdb a little and it does indeed seem to be leaking here.
> >
> > I don't understand why registers_changed_ptid unconditionally clears
> > current_regcache. I suspect that may be the source of the problem.
> >
> > Perhaps someone who knows this code better could take a look.
>
> It seems this leak was introduced by Pedro's patch here:
> http://sourceware.org/ml/gdb-patches/2010-04/msg00960.html
>
> The function used to free all regcaches in the list and then
> reset current_regcache. The new code now takes care to selectively
> free only a subset of regcaches -- and then resets current_regcache
> anyway ...
Egads!
> I guess we should just remove the current_regcache = NULL line now.
Yeah. And only clear current_regcache_ptid if it was deleted in the
first place; and only reinit the frame cache if we deleted the
regcache of inferior_ptid ? Like the patch below.
> (Actually, now that every thread always has a thread_info, the
> best thing would probably be anyway to hang each thread's regcaches
> off the thread_info, and do away with the global list completely.)
Not sure we can do that yet, at least as sole mechanism to
keep track of regcache pointers.
We have targets that do thread<->lwp ptid translation between
thread/proc stratum layers, and random places that do
get_thread_regcache (ptid) behind the core's back -- it appears
aix-thread.c could be one of those.
Another example: linux-nat.c:cancel_breakpoint builds
regcaches for lwps before linux-thread-db.c (if active at all)
has had a chance of telling the core about new
threads (in all-stop mode).
--
Pedro Alves
2011-02-04 Pedro Alves <pedro@codesourcery.com>
gdb/
* regcache.c (registers_changed_ptid): Don't explicitly always
clear `current_regcache'. Only clear current_thread_ptid and
current_thread_arch when PTID matches. Only reinit the frame
cache if PTID matches the current inferior_ptid. Move alloca(0)
call to ...
(registers_changed): ... here.
---
gdb/regcache.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
Index: src/gdb/regcache.c
===================================================================
--- src.orig/gdb/regcache.c 2011-02-01 15:27:37.000000000 +0000
+++ src/gdb/regcache.c 2011-02-04 11:56:49.273328004 +0000
@@ -530,6 +530,7 @@ void
registers_changed_ptid (ptid_t ptid)
{
struct regcache_list *list, **list_link;
+ int wildcard = ptid_equal (ptid, minus_one_ptid);
list = current_regcache;
list_link = &current_regcache;
@@ -550,13 +551,24 @@ registers_changed_ptid (ptid_t ptid)
list = *list_link;
}
- current_regcache = NULL;
+ if (wildcard || ptid_equal (ptid, current_thread_ptid))
+ {
+ current_thread_ptid = null_ptid;
+ current_thread_arch = NULL;
+ }
- current_thread_ptid = null_ptid;
- current_thread_arch = NULL;
+ if (wildcard || ptid_equal (ptid, inferior_ptid))
+ {
+ /* We just deleted the regcache of the current thread. Need to
+ forget about any frames we have cached, too. */
+ reinit_frame_cache ();
+ }
+}
- /* Need to forget about any frames we have cached, too. */
- reinit_frame_cache ();
+void
+registers_changed (void)
+{
+ registers_changed_ptid (minus_one_ptid);
/* Force cleanup of any alloca areas if using C alloca instead of
a builtin alloca. This particular call is used to clean up
@@ -567,12 +579,6 @@ registers_changed_ptid (ptid_t ptid)
}
void
-registers_changed (void)
-{
- registers_changed_ptid (minus_one_ptid);
-}
-
-void
regcache_raw_read (struct regcache *regcache, int regnum, gdb_byte *buf)
{
gdb_assert (regcache != NULL && buf != NULL);
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 14:55 ` Pedro Alves
@ 2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
1 sibling, 0 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-04 15:13 UTC (permalink / raw)
To: Pedro Alves; +Cc: Tom Tromey, Markus Alber, Michael Snyder, gdb
Pedro Alves wrote:
> > (Actually, now that every thread always has a thread_info, the
> > best thing would probably be anyway to hang each thread's regcaches
> > off the thread_info, and do away with the global list completely.)
>
> Not sure we can do that yet, at least as sole mechanism to
> keep track of regcache pointers.
> We have targets that do thread<->lwp ptid translation between
> thread/proc stratum layers, and random places that do
> get_thread_regcache (ptid) behind the core's back -- it appears
> aix-thread.c could be one of those.
> Another example: linux-nat.c:cancel_breakpoint builds
> regcaches for lwps before linux-thread-db.c (if active at all)
> has had a chance of telling the core about new
> threads (in all-stop mode).
Ah, I see. Yes, this looks like some more work is needed ...
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
@ 2011-02-04 15:26 ` Tom Tromey
2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
1 sibling, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-04 15:26 UTC (permalink / raw)
To: Pedro Alves; +Cc: Ulrich Weigand, Markus Alber, Michael Snyder, gdb
Pedro> Yeah. And only clear current_regcache_ptid if it was deleted in the
Pedro> first place; and only reinit the frame cache if we deleted the
Pedro> regcache of inferior_ptid ? Like the patch below.
Yeah, this looks reasonable to me.
Pedro> We have targets that do thread<->lwp ptid translation between
Pedro> thread/proc stratum layers, and random places that do
Pedro> get_thread_regcache (ptid) behind the core's back -- it appears
Pedro> aix-thread.c could be one of those.
Pedro> Another example: linux-nat.c:cancel_breakpoint builds
Pedro> regcaches for lwps before linux-thread-db.c (if active at all)
Pedro> has had a chance of telling the core about new
Pedro> threads (in all-stop mode).
Thanks for pointing this out. I have a lot to learn in this area.
One somewhat distressing thing is that my patch did not cause any
regressions.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 15:26 ` Tom Tromey
@ 2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
1 sibling, 0 replies; 18+ messages in thread
From: Pedro Alves @ 2011-02-04 15:56 UTC (permalink / raw)
To: gdb; +Cc: Tom Tromey, Ulrich Weigand, Markus Alber, Michael Snyder, gdb-patches
[Adding gdb-patches@]
On Friday 04 February 2011 15:25:54, Tom Tromey wrote:
> Pedro> Yeah. And only clear current_regcache_ptid if it was deleted in the
> Pedro> first place; and only reinit the frame cache if we deleted the
> Pedro> regcache of inferior_ptid ? Like the patch below.
>
> Yeah, this looks reasonable to me.
Okay, thanks, I've applied it.
> One somewhat distressing thing is that my patch did not cause any
> regressions.
Indeed.
--
Pedro Alves
* Re: performance of multithreading gets gradually worse under gdb
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
@ 2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber
0 siblings, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-04 17:02 UTC (permalink / raw)
To: Pedro Alves
Cc: gdb, Ulrich Weigand, Markus Alber, Michael Snyder, gdb-patches
Tom> Yeah, this looks reasonable to me.
Pedro> Okay, thanks, I've applied it.
Thanks
Markus -- any way you could try gdb CVS HEAD to see if it fixes your
problem? If it doesn't, then there are more bugs to find...
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 17:02 ` Tom Tromey
@ 2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber
1 sibling, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-05 9:34 UTC (permalink / raw)
To: Tom Tromey; +Cc: Pedro Alves, gdb, Ulrich Weigand, Michael Snyder, gdb-patches
On Fri, 04 Feb 2011 10:01:57 -0700, Tom Tromey wrote:
> Markus -- any way you could try gdb CVS HEAD to see if it fixes your
> problem? If it doesn't, then there are more bugs to find...
>
> Tom
Will try on Monday, thanks.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
@ 2011-02-07 14:05 ` Markus Alber
1 sibling, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-07 14:05 UTC (permalink / raw)
To: Tom Tromey; +Cc: Pedro Alves, gdb, Ulrich Weigand, Michael Snyder, gdb-patches
On Fri, 04 Feb 2011 10:01:57 -0700, Tom Tromey wrote:
> Tom> Yeah, this looks reasonable to me.
>
> Pedro> Okay, thanks, I've applied it.
>
> Thanks
>
> Markus -- any way you could try gdb CVS HEAD to see if it fixes your
> problem? If it doesn't, then there are more bugs to find...
>
> Tom
Hi, I did as told and it does indeed make the situation a lot better.
The application runs to the end (for the first time!) and it no longer
slows down.
Thanks for that!
The application is still a lot slower under gdb than stand-alone,
though. I think the reason is this: during each iteration, it branches
into the worker threads and then returns to a part of the computation
that is performed by only one CPU. With 12 cores, the time per
iteration is essentially spent there, and it is there that gdb competes
with the application, because both appear to be running on the same
core. The average load drops from 66% to about 20% when run under gdb.
Just for your information/in case you are interested.
I did an oprofile as well, and here is what it says:
samples  %        image name                   app name                     symbol name
1113     19.3464  libthread_db-1.0.so          libthread_db-1.0.so          _td_locate_field
1054     18.3209  libthread_db-1.0.so          libthread_db-1.0.so          td_thr_get_info
847      14.7228  libthread_db-1.0.so          libthread_db-1.0.so          td_ta_event_getmsg
804      13.9753  libthread_db-1.0.so          libthread_db-1.0.so          _td_fetch_value_local
568       9.8731  libthread_db-1.0.so          libthread_db-1.0.so          td_ta_map_lwp2thr
519       9.0214  libthread_db-1.0.so          libthread_db-1.0.so          _td_fetch_value
484       8.4130  libthread_db-1.0.so          libthread_db-1.0.so          __td_ta_lookup_th_unique
169       2.9376  libthread_db-1.0.so          libthread_db-1.0.so          _td_store_value
167       2.9028  libthread_db-1.0.so          libthread_db-1.0.so          td_thr_event_enable
3         0.0521  iconv                        iconv                        /usr/bin/iconv
3         0.0521  libstreamanalyzer.so.0.7.2   libstreamanalyzer.so.0.7.2   /usr/lib64/libstreamanalyzer.so.0.7.2
2         0.0348  [vdso] (tgid:2539 range:0x7fffe81ff000-0x7fffe8200000)    auditd    [vdso] (tgid:2539 range:0x7fffe81ff000-0x7fffe8200000)
2         0.0348  libpcre.so.0.0.1             libpcre.so.0.0.1             /lib64/libpcre.so.0.0.1
2         0.0348  ophelp                       ophelp                       /usr/bin/ophelp
1         0.0174  [vdso] (tgid:7680 range:0x7fff267c6000-0x7fff267c7000)    kdeinit4  [vdso] (tgid:7680 range:0x7fff267c6000-0x7fff267c7000)
1         0.0174  [vdso] (tgid:8945 range:0x7fffbbbff000-0x7fffbbc00000)    gconfd-2  [vdso] (tgid:8945 range:0x7fffbbbff000-0x7fffbbc00000)
1         0.0174  auditd                       auditd                       /sbin/auditd
1         0.0174  basename                     basename                     /bin/basename
1         0.0174  kded_networkstatus.so        kded_networkstatus.so        /usr/lib64/kde4/kded_networkstatus.so
1         0.0174  libXau.so.6.0.0              libXau.so.6.0.0              /usr/lib64/libXau.so.6.0.0
1         0.0174  libXi.so.6.1.0               libXi.so.6.1.0               /usr/lib64/libXi.so.6.1.0
1         0.0174  libkdeinit4_kglobalaccel.so  libkdeinit4_kglobalaccel.so  /usr/lib64/libkdeinit4_kglobalaccel.so
1         0.0174  libstreams.so.0.7.2          libstreams.so.0.7.2          /usr/lib64/libstreams.so.0.7.2
1         0.0174  libthread_db-1.0.so          libthread_db-1.0.so          __do_global_ctors_aux
1         0.0174  pickup                       pickup                       /usr/lib/postfix/pickup
1         0.0174  plasma_applet_icon.so        plasma_applet_icon.so        /usr/lib64/kde4/plasma_applet_icon.so
1         0.0174  qmgr                         qmgr                         /usr/lib/postfix/qmgr
1         0.0174  rpcbind                      rpcbind                      /sbin/rpcbind
1         0.0174  sleep                        sleep                        /bin/sleep
1         0.0174  touch                        touch                        /bin/touch
If you want it in more detail, please tell me which options you'd like
me to enable.
regards,
Markus
end of thread, other threads:[~2011-02-07 14:05 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-02 20:16 performance of multithreading gets gradually worse under gdb Markus Alber
2011-02-02 20:28 ` Michael Snyder
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
2011-02-02 21:43 ` Michael Snyder
2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:52 ` Markus Alber
2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
2011-02-04 13:49 ` Ulrich Weigand
2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber