* performance of multithreading gets gradually worse under gdb
@ 2011-02-02 20:16 Markus Alber
2011-02-02 20:28 ` Michael Snyder
0 siblings, 1 reply; 18+ messages in thread
From: Markus Alber @ 2011-02-02 20:16 UTC (permalink / raw)
To: gdb
Hello,

I have experienced the following problem:

I'm debugging a number-crunching application which spawns a lot of (36)
small worker threads per iteration. The system typically does on the
order of 200 iterations. Although each of them should take about the
same amount of time, the performance gets worse with every iteration
and becomes excruciatingly slow.

A system monitor reveals that gdb allocates more memory with every
iteration, i.e. with every 36 threads started and finished. The CPU
load of gdb goes up, too. The CPU usage of the application goes down.
Compared to the solo performance, it gets slower by a factor of 20 and
more, if run long enough.

The application behaves perfectly when run by itself. The
multi-threaded part is not compiled with debugging information when
this behaviour occurs.

The distribution is SuSE 11.3 / gdb 7.1.

Is there anything I can change about this behaviour, any options of gdb
that need to be set in these circumstances?

Thank you very much for your help in advance,
Markus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: performance of multithreading gets gradually worse under gdb
2011-02-02 20:16 performance of multithreading gets gradually worse under gdb Markus Alber
@ 2011-02-02 20:28 ` Michael Snyder
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
0 siblings, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-02 20:28 UTC (permalink / raw)
To: Markus Alber; +Cc: gdb
Markus Alber wrote:
> Hello,
>
> I have experienced the following problem:
>
> I'm debugging a number-crunching application which spawns a lot of (36)
> small worker threads per iteration. The system typically does on the
> order of 200 iterations. Although each of them should take about the
> same amount of time, the performance gets worse with every iteration
> and becomes excruciatingly slow.
>
> A system monitor reveals that gdb allocates more memory with every
> iteration, i.e. with every 36 threads started and finished. The CPU
> load of gdb goes up, too. The CPU usage of the application goes down.
> Compared to the solo performance, it gets slower by a factor of 20 and
> more, if run long enough.
>
> The application behaves perfectly when run by itself. The
> multi-threaded part is not compiled with debugging information when
> this behaviour occurs.
>
> The distribution is SuSE 11.3 / gdb 7.1.
>
> Is there anything I can change about this behaviour, any options of gdb
> that need to be set in these circumstances?
Interesting.

By how much does gdb's memory allocation increase?
In total or, if possible, per iteration? This might
give us a clue as to where to look.

Do you think you could write a simple sample program that
allocates threads in a manner similar to your application?

Thanks,
* Re: performance of multithreading gets gradually worse under gdb
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
@ 2011-02-02 21:43 ` Michael Snyder
2011-02-03 7:03 ` Markus Alber
0 siblings, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-02 21:43 UTC (permalink / raw)
To: Markus Alber, gdb
Markus Alber wrote:
> Thank you for the quick response.
You're welcome, but let's keep the discussion on the public list, so
that other maintainers may jump in if so inclined.
> It allocates about 100kB per iteration.
Hmmm, and that's for roughly 36 thread start/stops, so it could be
losing roughly 3000 bytes per thread. That's much bigger than my
first guess would have been (a "struct thread_info" is only 336 bytes).
> One interesting finding might also be:
>
> I terminated the process when an iteration took about 3 min (instead
> of 1 sec) and gdb had about 115MB allocated.
I assume that at this point, your system still had plenty of ram to
spare? It wasn't simply swapping?
> On starting the application again, it ran alright for a while and the
> gdb memory allocation stayed constant. When it finally started to grow
> again, the application slowed down, and became slower with every
> iteration - the usual picture.
>
> I attached a sample file from the application where the computation
> bifurcates into the worker threads. This is one of three instances per
> iteration, but they all follow the same pattern.
I was really hoping for a stripped-down sample that we could compile and
run.
> The machine has 2x6 cores x 3 instances per iteration =
> 36 worker threads per iteration.
x86 architecture?
> On another note, I tried to compile gdb-6.5 on my machine (because it
> was the release I used to work with before, without problems) and
> configure comes back with an error that it cannot find a termcap lib.
> There is none on the SuSE. Which package would I need to install?
That would be libncurses, I think.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-02 21:43 ` Michael Snyder
@ 2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:57 ` Tom Tromey
0 siblings, 2 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-03 7:03 UTC (permalink / raw)
To: Michael Snyder; +Cc: gdb
[-- Attachment #1: Type: text/plain, Size: 3355 bytes --]
On Wed, 02 Feb 2011 13:43:50 -0800, Michael Snyder wrote:
>> It allocates about 100kB per iteration.
>
> Hmmm, and that's for roughly 36 thread start/stops, so it could be
> losing roughly 3000 bytes per thread. That's much bigger than my
> first guess would have been (a "struct thread_info" is only 336
> bytes).
>
>> One interesting finding might also be:
>> I terminated the process when an iteration took about 3 min (instead
>> of 1 sec) and gdb had about 115MB allocated.
>
> I assume that at this point, your system still had plenty of ram to
> spare? It wasn't simply swapping?
The system had about 20 GB to spare.
>> On starting the application again, it ran alright for a while and
>> the gdb memory allocation stayed constant. When it finally started
>> to grow again, the application slowed down, and became slower with
>> every iteration - the usual picture.
>> I attached a sample file from the application where the computation
>> bifurcates into the worker threads. This is one of three instances
>> per iteration, but they all follow the same pattern.
>
> I was really hoping for a stripped-down sample that we could compile
> and run.
See the attached file. It shows a similar behaviour, although it only
allocates 8kB per iteration.
You have to wait some time before this happens.
>> The machine has 2x6 cores x 3 instances per iteration = 36 worker
>> threads per iteration.
>
> x86 architecture?
>
>> On another note, I tried to compile gdb-6.5 on my machine (because
>> it was the release I used to work with before, without problems) and
>> configure comes back with an error that it cannot find a termcap
>> lib. There is none on the SuSE. Which package would I need to
>> install?
>
> That would be libncurses, I think.
I compiled gdb-6.5 alright and it performs well as usual, without this
problem.
[-- Attachment #2: mt_test.cpp --]
[-- Type: text/x-c++src; name=mt_test.cpp, Size: 1555 bytes --]
#include <stdlib.h>
#include <iostream>
#include <math.h>
#include <vector>
#include <boost/thread/thread.hpp>
#include <boost/bind.hpp>

using namespace std;

// Compile with: g++ -g -lboost_thread mt_test.cpp

struct ThreadData {
    ThreadData(const vector<double>& vOne,
               const double factor) :
        m_vOne(vOne),
        m_factor(factor),
        m_vTwo(vOne.size(), m_factor)
    {
        // empty
    }

    const vector<double>& m_vOne;
    double m_factor;
    vector<double> m_vTwo;
};

static void* doAdd(void* data)
{
    ThreadData* pData = static_cast<ThreadData*>(data);
    for(unsigned int nAt = 0; nAt < pData->m_vOne.size(); ++nAt)
        pData->m_vTwo[nAt] += pData->m_factor * pData->m_vOne[nAt];
    return NULL;
}

int main(void) {
    const int nThreads = 12;
    const int nRepetitions = 200;
    vector<double> vOne(1<<24, 1.0);

    for(int nAtRepetition = 0; nAtRepetition < nRepetitions; ++nAtRepetition) {
        cout << "Repetition no. " << nAtRepetition << endl;
        vector<ThreadData*> vpData(nThreads, static_cast<ThreadData*>(NULL));
        boost::thread_group threads;
        for (int nAtThread = 0; nAtThread < nThreads; nAtThread++) {
            vpData[nAtThread] = new ThreadData(vOne, static_cast<double>(nAtThread+1));
            if( threads.create_thread( boost::bind(&doAdd, static_cast<void*>(vpData[nAtThread])) ) == 0 )
                throw std::exception();
        }
        threads.join_all();
        for (int nAtThread = 0; nAtThread < nThreads; nAtThread++)
            delete vpData[nAtThread];
    }
    return 0;
}
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 7:03 ` Markus Alber
@ 2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:52 ` Markus Alber
2011-02-03 20:57 ` Tom Tromey
1 sibling, 1 reply; 18+ messages in thread
From: Michael Snyder @ 2011-02-03 20:26 UTC (permalink / raw)
To: Markus Alber; +Cc: gdb
Markus Alber wrote:
>
> I compiled gdb-6.5 alright and it performs well as usual, without this
> problem.
Good info. What version of gdb were you using when you detected the
problem?

I ran your program stand-alone, then under gdb 7.0, then gdb 7.2 CVS.
On my not-so-new system I got the following times:

  stand-alone:  15m33s
  gdb 7.0:      16m
  gdb 7.2 CVS:  16m10s

I could not perceive any change in the cycle time from beginning to end.

GDB 7.0 did increase its memory footprint by about 3 MB during the run
(3 percent). So that could indicate some leakage, around 15 kB per
cycle. But I can't see that as the main reason for the slow-down you're
reporting, considering that your system is not memory bound.

GDB 7.2 CVS increased its memory footprint by only about 500 kB during
the whole run, so you might consider giving that a try. Instructions
for getting the latest development sources are here:

http://sourceware.org/gdb/current/
If you'd be willing to contribute your little sample program, we might
be able to use it for a thread debugging stress test or something.
Michael
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:26 ` Michael Snyder
@ 2011-02-03 20:52 ` Markus Alber
0 siblings, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-03 20:52 UTC (permalink / raw)
To: Michael Snyder; +Cc: gdb
On Thu, 03 Feb 2011 12:26:25 -0800, Michael Snyder wrote:
> Markus Alber wrote:
>> I compiled gdb-6.5 alright and it performs well as usual, without
>> this problem.
>
> Good info. What version of gdb were you using when you detected the
> problem?
It was gdb 7.1 as it comes with OpenSuSE 11.3.

Meanwhile I tried every release between 6.5 and 7.2. All the 7's showed
the problem, but it was also perceivable with 6.8.91.

I was a bit too quick to assert that 6.5 was working well... it wasn't
possible to set break points, and some 6.8 versions crashed. There
seems to be some incompatibility between the compiler on OpenSuSE 11.3
and the older gdbs (I got them as sources from the GNU repository).
> I ran your program stand-alone, then under gdb 7.0, then gdb 7.2CVS.
> On my not-so-new system I got the following times:
> stand alone: 15m33s
> gdb 7.0 16m
> gdb 7.2CVS 16m10s
So I presume the stripping-down removed something that causes the
problem in the larger application. Or is it a possibility that it has
something to do with the kernel or compiler of OpenSuSE 11.3?

Meanwhile I've gone back to my older 4-core machine running gdb 6.5 on
OpenSuSE 10.2. Performance is constant throughout. I did notice that
gdb allocates memory for each iteration.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
@ 2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
1 sibling, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 20:57 UTC (permalink / raw)
To: Markus Alber; +Cc: Michael Snyder, gdb
>>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
Markus> See the attached file. It shows a similar behaviour, although it only
Markus> allocates 8kB per iteration.
Markus> You have to wait some time before this happens.
Thanks.
I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
under massif.
This part is interesting:
->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
| ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
| ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
| ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
| | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
I debugged gdb a little and it does indeed seem to be leaking here.
I don't understand why registers_changed_ptid unconditionally clears
current_regcache. I suspect that may be the source of the problem.
Perhaps someone who knows this code better could take a look.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:57 ` Tom Tromey
@ 2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
1 sibling, 0 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 21:00 UTC (permalink / raw)
To: Markus Alber; +Cc: Michael Snyder, gdb
>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:
Tom> I debugged gdb a little and it does indeed seem to be leaking here.
Also, I didn't notice a gdb slowdown. It would be interesting to see a
profile.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
@ 2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
2011-02-04 14:55 ` Pedro Alves
1 sibling, 2 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-03 21:40 UTC (permalink / raw)
To: Tom Tromey; +Cc: Markus Alber, Michael Snyder, gdb, pedro
Tom Tromey wrote:
> >>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
>
> Markus> See the attached file. It shows a similar behaviour, although it only
> Markus> allocates 8kB per iteration.
> Markus> You have to wait some time before this happens.
>
> Thanks.
>
> I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
> under massif.
>
> This part is interesting:
>
> ->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
> | ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
> | ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
> | ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
> | | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
>
>
> I debugged gdb a little and it does indeed seem to be leaking here.
>
> I don't understand why registers_changed_ptid unconditionally clears
> current_regcache. I suspect that may be the source of the problem.
>
> Perhaps someone who knows this code better could take a look.
It seems this leak was introduced by Pedro's patch here:
http://sourceware.org/ml/gdb-patches/2010-04/msg00960.html
The function used to free all regcaches in the list and then
reset current_regcache. The new code now takes care to selectively
free only a subset of regcaches -- and then resets current_regcache
anyway ...
I guess we should just remove the current_regcache = NULL line now.
(Actually, now that every thread always has a thread_info, the
best thing would probably be anyway to hang each thread's regcaches
off the thread_info, and do away with the global list completely.)
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 21:40 ` Ulrich Weigand
@ 2011-02-03 22:04 ` Tom Tromey
2011-02-04 13:49 ` Ulrich Weigand
2011-02-04 14:55 ` Pedro Alves
1 sibling, 1 reply; 18+ messages in thread
From: Tom Tromey @ 2011-02-03 22:04 UTC (permalink / raw)
To: Ulrich Weigand; +Cc: Markus Alber, Michael Snyder, gdb, pedro
>>>>> "Ulrich" == Ulrich Weigand <uweigand@de.ibm.com> writes:
Ulrich> I guess we should just remove the current_regcache = NULL line now.
Yes, it seems so to me as well.
Ulrich> (Actually, now that every thread always has a thread_info, the
Ulrich> best thing would probably be anyway to hang each thread's regcaches
Ulrich> off the thread_info, and do away with the global list completely.)
I suppose in this case, invalidating could simply clear a regcache
instead of deleting (and subsequently recreating) it. Is that correct?
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 22:04 ` Tom Tromey
@ 2011-02-04 13:49 ` Ulrich Weigand
0 siblings, 0 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-04 13:49 UTC (permalink / raw)
To: Tom Tromey; +Cc: Markus Alber, Michael Snyder, gdb, pedro
Tom Tromey wrote:
> >>>>> "Ulrich" == Ulrich Weigand <uweigand@de.ibm.com> writes:
> Ulrich> (Actually, now that every thread always has a thread_info, the
> Ulrich> best thing would probably be anyway to hang each thread's regcaches
> Ulrich> off the thread_info, and do away with the global list completely.)
>
> I suppose in this case, invalidating could simply clear a regcache
> instead of deleting (and subsequently recreating) it. Is that correct?
Yes, I think so. Note that we still need to (potentially) keep more than
one regcache per thread for multi-arch support. We may also need to reset
the regcache's address space before reusing it ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
@ 2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
1 sibling, 2 replies; 18+ messages in thread
From: Pedro Alves @ 2011-02-04 14:55 UTC (permalink / raw)
To: Ulrich Weigand; +Cc: Tom Tromey, Markus Alber, Michael Snyder, gdb
On Friday 04 February 2011 21:40:08, Ulrich Weigand wrote:
> Tom Tromey wrote:
> > >>>>> "Markus" == Markus Alber <markus@hyperion-imrt.org> writes:
> >
> > Markus> See the attached file. It shows a similar behaviour, although it only
> > Markus> allocates 8kB per iteration.
> > Markus> You have to wait some time before this happens.
> >
> > Thanks.
> >
> > I changed 1<<24 to 1<<15, to spare my underpowered machine, and ran gdb
> > under massif.
> >
> > This part is interesting:
> >
> > ->20.78% (2,954,016B) 0x8253683: regcache_xmalloc_1 (regcache.c:232)
> > | ->20.78% (2,954,016B) 0x8253F85: get_thread_arch_regcache (regcache.c:463)
> > | ->20.78% (2,954,016B) 0x82540B5: get_thread_regcache (regcache.c:488)
> > | ->20.78% (2,954,016B) 0x81D1579: i386_linux_resume (i386-linux-nat.c:861)
> > | | ->20.78% (2,954,016B) 0x81D7D32: linux_nat_resume (linux-nat.c:1983)
> >
> >
> > I debugged gdb a little and it does indeed seem to be leaking here.
> >
> > I don't understand why registers_changed_ptid unconditionally clears
> > current_regcache. I suspect that may be the source of the problem.
> >
> > Perhaps someone who knows this code better could take a look.
>
> It seems this leak was introduced by Pedro's patch here:
> http://sourceware.org/ml/gdb-patches/2010-04/msg00960.html
>
> The function used to free all regcaches in the list and then
> reset current_regcache. The new code now takes care to selectively
> free only a subset of regcaches -- and then resets current_regcache
> anyway ...
Egads!
> I guess we should just remove the current_regcache = NULL line now.
Yeah. And only clear current_regcache_ptid if it was deleted in the
first place; and only reinit the frame cache if we deleted the
regcache of inferior_ptid ? Like the patch below.
> (Actually, now that every thread always has a thread_info, the
> best thing would probably be anyway to hang each thread's regcaches
> off the thread_info, and do away with the global list completely.)
Not sure we can do that yet, at least as sole mechanism to
keep track of regcache pointers.
We have targets that do thread<->lwp ptid translation between
thread/proc stratum layers, and random places that do
get_thread_regcache (ptid) behind the core's back -- it appears
aix-thread.c could be one of those.
Another example: linux-nat.c:cancel_breakpoint builds
regcaches for lwps before linux-thread-db.c (if active at all)
has had a chance of telling the core about new
threads (in all-stop mode).
--
Pedro Alves
2011-02-04 Pedro Alves <pedro@codesourcery.com>
gdb/
* regcache.c (registers_changed_ptid): Don't explicitly always
clear `current_regcache'. Only clear current_thread_ptid and
current_thread_arch when PTID matches. Only reinit the frame
cache if PTID matches the current inferior_ptid. Move alloca(0)
call to ...
(registers_changed): ... here.
---
gdb/regcache.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
Index: src/gdb/regcache.c
===================================================================
--- src.orig/gdb/regcache.c 2011-02-01 15:27:37.000000000 +0000
+++ src/gdb/regcache.c 2011-02-04 11:56:49.273328004 +0000
@@ -530,6 +530,7 @@ void
registers_changed_ptid (ptid_t ptid)
{
struct regcache_list *list, **list_link;
+ int wildcard = ptid_equal (ptid, minus_one_ptid);
list = current_regcache;
list_link = &current_regcache;
@@ -550,13 +551,24 @@ registers_changed_ptid (ptid_t ptid)
list = *list_link;
}
- current_regcache = NULL;
+ if (wildcard || ptid_equal (ptid, current_thread_ptid))
+ {
+ current_thread_ptid = null_ptid;
+ current_thread_arch = NULL;
+ }
- current_thread_ptid = null_ptid;
- current_thread_arch = NULL;
+ if (wildcard || ptid_equal (ptid, inferior_ptid))
+ {
+ /* We just deleted the regcache of the current thread. Need to
+ forget about any frames we have cached, too. */
+ reinit_frame_cache ();
+ }
+}
- /* Need to forget about any frames we have cached, too. */
- reinit_frame_cache ();
+void
+registers_changed (void)
+{
+ registers_changed_ptid (minus_one_ptid);
/* Force cleanup of any alloca areas if using C alloca instead of
a builtin alloca. This particular call is used to clean up
@@ -567,12 +579,6 @@ registers_changed_ptid (ptid_t ptid)
}
void
-registers_changed (void)
-{
- registers_changed_ptid (minus_one_ptid);
-}
-
-void
regcache_raw_read (struct regcache *regcache, int regnum, gdb_byte *buf)
{
gdb_assert (regcache != NULL && buf != NULL);
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 14:55 ` Pedro Alves
@ 2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
1 sibling, 0 replies; 18+ messages in thread
From: Ulrich Weigand @ 2011-02-04 15:13 UTC (permalink / raw)
To: Pedro Alves; +Cc: Tom Tromey, Markus Alber, Michael Snyder, gdb
Pedro Alves wrote:
> > (Actually, now that every thread always has a thread_info, the
> > best thing would probably be anyway to hang each thread's regcaches
> > off the thread_info, and do away with the global list completely.)
>
> Not sure we can do that yet, at least as sole mechanism to
> keep track of regcache pointers.
> We have targets that do thread<->lwp ptid translation between
> thread/proc stratum layers, and random places that do
> get_thread_regcache (ptid) behind the core's back -- it appears
> aix-thread.c could be one of those.
> Another example: linux-nat.c:cancel_breakpoint builds
> regcaches for lwps before linux-thread-db.c (if active at all)
> has had a chance of telling the core about new
> threads (in all-stop mode).
Ah, I see. Yes, this looks like some more work is needed ...
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
@ 2011-02-04 15:26 ` Tom Tromey
2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
1 sibling, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-04 15:26 UTC (permalink / raw)
To: Pedro Alves; +Cc: Ulrich Weigand, Markus Alber, Michael Snyder, gdb
Pedro> Yeah. And only clear current_regcache_ptid if it was deleted in the
Pedro> first place; and only reinit the frame cache if we deleted the
Pedro> regcache of inferior_ptid ? Like the patch below.
Yeah, this looks reasonable to me.
Pedro> We have targets that do thread<->lwp ptid translation between
Pedro> thread/proc stratum layers, and random places that do
Pedro> get_thread_regcache (ptid) behind the core's back -- it appears
Pedro> aix-thread.c could be one of those.
Pedro> Another example: linux-nat.c:cancel_breakpoint builds
Pedro> regcaches for lwps before linux-thread-db.c (if active at all)
Pedro> has had a chance of telling the core about new
Pedro> threads (in all-stop mode).
Thanks for pointing this out. I have a lot to learn in this area.
One somewhat distressing thing is that my patch did not cause any
regressions.
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 15:26 ` Tom Tromey
@ 2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
1 sibling, 0 replies; 18+ messages in thread
From: Pedro Alves @ 2011-02-04 15:56 UTC (permalink / raw)
To: gdb; +Cc: Tom Tromey, Ulrich Weigand, Markus Alber, Michael Snyder, gdb-patches
[Adding gdb-patches@]
On Friday 04 February 2011 15:25:54, Tom Tromey wrote:
> Pedro> Yeah. And only clear current_regcache_ptid if it was deleted in the
> Pedro> first place; and only reinit the frame cache if we deleted the
> Pedro> regcache of inferior_ptid ? Like the patch below.
>
> Yeah, this looks reasonable to me.
Okay, thanks, I've applied it.
> One somewhat distressing thing is that my patch did not cause any
> regressions.
Indeed.
--
Pedro Alves
* Re: performance of multithreading gets gradually worse under gdb
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
@ 2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber
0 siblings, 2 replies; 18+ messages in thread
From: Tom Tromey @ 2011-02-04 17:02 UTC (permalink / raw)
To: Pedro Alves
Cc: gdb, Ulrich Weigand, Markus Alber, Michael Snyder, gdb-patches
Tom> Yeah, this looks reasonable to me.
Pedro> Okay, thanks, I've applied it.
Thanks
Markus -- any way you could try gdb CVS HEAD to see if it fixes your
problem? If it doesn't, then there are more bugs to find...
Tom
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 17:02 ` Tom Tromey
@ 2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber
1 sibling, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-05 9:34 UTC (permalink / raw)
To: Tom Tromey; +Cc: Pedro Alves, gdb, Ulrich Weigand, Michael Snyder, gdb-patches
On Fri, 04 Feb 2011 10:01:57 -0700, Tom Tromey wrote:
> Markus -- any way you could try gdb CVS HEAD to see if it fixes your
> problem? If it doesn't, then there are more bugs to find...
>
> Tom
Will try on Monday, thanks.
* Re: performance of multithreading gets gradually worse under gdb
2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
@ 2011-02-07 14:05 ` Markus Alber
1 sibling, 0 replies; 18+ messages in thread
From: Markus Alber @ 2011-02-07 14:05 UTC (permalink / raw)
To: Tom Tromey; +Cc: Pedro Alves, gdb, Ulrich Weigand, Michael Snyder, gdb-patches
On Fri, 04 Feb 2011 10:01:57 -0700, Tom Tromey wrote:
> Tom> Yeah, this looks reasonable to me.
>
> Pedro> Okay, thanks, I've applied it.
>
> Thanks
>
> Markus -- any way you could try gdb CVS HEAD to see if it fixes your
> problem? If it doesn't, then there are more bugs to find...
>
> Tom
Hi, I did as told and it does indeed make the situation a lot better.
The application runs to the end (for the first time!) and it no longer
slows down.
Thanks for that!
The application is still a lot slower under gdb than stand-alone,
though. I think the reason is this: during each iteration, it branches
into the worker threads and then returns to a part of the computation
that is performed by only one CPU. With 12 cores, the time per
iteration is essentially spent there, and it is there that gdb competes
with the application, because both appear to be running on the same
core. The average load drops from 66% to about 20% when run under gdb.
Just for your information/in case you are interested.
I did an oprofile as well, and here is what it says:
samples  %        image name                   app name                     symbol name
1113     19.3464  libthread_db-1.0.so          libthread_db-1.0.so          _td_locate_field
1054     18.3209  libthread_db-1.0.so          libthread_db-1.0.so          td_thr_get_info
847      14.7228  libthread_db-1.0.so          libthread_db-1.0.so          td_ta_event_getmsg
804      13.9753  libthread_db-1.0.so          libthread_db-1.0.so          _td_fetch_value_local
568       9.8731  libthread_db-1.0.so          libthread_db-1.0.so          td_ta_map_lwp2thr
519       9.0214  libthread_db-1.0.so          libthread_db-1.0.so          _td_fetch_value
484       8.4130  libthread_db-1.0.so          libthread_db-1.0.so          __td_ta_lookup_th_unique
169       2.9376  libthread_db-1.0.so          libthread_db-1.0.so          _td_store_value
167       2.9028  libthread_db-1.0.so          libthread_db-1.0.so          td_thr_event_enable
3         0.0521  iconv                        iconv                        /usr/bin/iconv
3         0.0521  libstreamanalyzer.so.0.7.2   libstreamanalyzer.so.0.7.2   /usr/lib64/libstreamanalyzer.so.0.7.2
2         0.0348  [vdso] (tgid:2539 range:0x7fffe81ff000-0x7fffe8200000)    auditd    [vdso] (tgid:2539 range:0x7fffe81ff000-0x7fffe8200000)
2         0.0348  libpcre.so.0.0.1             libpcre.so.0.0.1             /lib64/libpcre.so.0.0.1
2         0.0348  ophelp                       ophelp                       /usr/bin/ophelp
1         0.0174  [vdso] (tgid:7680 range:0x7fff267c6000-0x7fff267c7000)    kdeinit4  [vdso] (tgid:7680 range:0x7fff267c6000-0x7fff267c7000)
1         0.0174  [vdso] (tgid:8945 range:0x7fffbbbff000-0x7fffbbc00000)    gconfd-2  [vdso] (tgid:8945 range:0x7fffbbbff000-0x7fffbbc00000)
1         0.0174  auditd                       auditd                       /sbin/auditd
1         0.0174  basename                     basename                     /bin/basename
1         0.0174  kded_networkstatus.so        kded_networkstatus.so        /usr/lib64/kde4/kded_networkstatus.so
1         0.0174  libXau.so.6.0.0              libXau.so.6.0.0              /usr/lib64/libXau.so.6.0.0
1         0.0174  libXi.so.6.1.0               libXi.so.6.1.0               /usr/lib64/libXi.so.6.1.0
1         0.0174  libkdeinit4_kglobalaccel.so  libkdeinit4_kglobalaccel.so  /usr/lib64/libkdeinit4_kglobalaccel.so
1         0.0174  libstreams.so.0.7.2          libstreams.so.0.7.2          /usr/lib64/libstreams.so.0.7.2
1         0.0174  libthread_db-1.0.so          libthread_db-1.0.so          __do_global_ctors_aux
1         0.0174  pickup                       pickup                       /usr/lib/postfix/pickup
1         0.0174  plasma_applet_icon.so        plasma_applet_icon.so        /usr/lib64/kde4/plasma_applet_icon.so
1         0.0174  qmgr                         qmgr                         /usr/lib/postfix/qmgr
1         0.0174  rpcbind                      rpcbind                      /sbin/rpcbind
1         0.0174  sleep                        sleep                        /bin/sleep
1         0.0174  touch                        touch                        /bin/touch
If you want it in more detail, please tell me which options you'd like
me to enable.
regards,
Markus
end of thread, other threads:[~2011-02-07 14:05 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-02 20:16 performance of multithreading gets gradually worse under gdb Markus Alber
2011-02-02 20:28 ` Michael Snyder
[not found] ` <76bccf1875854ebc69b6a892fb84a976@hyperion-imrt.org>
2011-02-02 21:43 ` Michael Snyder
2011-02-03 7:03 ` Markus Alber
2011-02-03 20:26 ` Michael Snyder
2011-02-03 20:52 ` Markus Alber
2011-02-03 20:57 ` Tom Tromey
2011-02-03 21:00 ` Tom Tromey
2011-02-03 21:40 ` Ulrich Weigand
2011-02-03 22:04 ` Tom Tromey
2011-02-04 13:49 ` Ulrich Weigand
2011-02-04 14:55 ` Pedro Alves
2011-02-04 15:13 ` Ulrich Weigand
2011-02-04 15:26 ` Tom Tromey
2011-02-04 15:56 ` Pedro Alves
[not found] ` <201102041555.52179.pedro__21913.9744448059$1296834976$gmane$org@codesourcery.com>
2011-02-04 17:02 ` Tom Tromey
2011-02-05 9:34 ` Markus Alber
2011-02-07 14:05 ` Markus Alber