* JIT interface slowness
@ 2010-12-31 19:43 Vyacheslav Egorov
  2010-12-31 20:10 ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 19:43 UTC (permalink / raw)
To: gdb

Hi,

I've implemented[1] a basic GDB JIT client for V8[2], and when people started testing it with node.js[3] we noticed that it makes execution under GDB slower, especially when the JIT generates and registers an in-memory ELF object for every single code object it emits.

Apparently GDB is just not optimized to handle tons of small ELF objects being registered (and unregistered). When I simply comment out the calls to __jit_debug_register_code (leaving ELF generation intact), everything becomes fast again.

Are there any known bottlenecks in the JIT interface, or any known workarounds for them? Any hints would be highly appreciated.

Thanks.

[1] http://codereview.chromium.org/5965011/
[2] http://v8.googlecode.com/
[3] http://nodejs.org

--
Vyacheslav Egorov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 19:43 JIT interface slowness Vyacheslav Egorov
@ 2010-12-31 20:10 ` Pedro Alves
  2010-12-31 21:39   ` Vyacheslav Egorov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 20:10 UTC (permalink / raw)
To: gdb; +Cc: Vyacheslav Egorov

On Friday 31 December 2010 19:43:12, Vyacheslav Egorov wrote:
> Hi,
>
> I've implemented[1] a basic GDB JIT client for V8[2], and when people
> started testing it with node.js[3] we noticed that it makes execution
> under GDB slower, especially when the JIT generates and registers an
> in-memory ELF object for every single code object it emits.
>
> Apparently GDB is just not optimized to handle tons of small ELF
> objects being registered (and unregistered). When I simply comment out
> the calls to __jit_debug_register_code (leaving ELF generation intact),
> everything becomes fast again.
>
> Are there any known bottlenecks in the JIT interface, or any known workarounds for them?

Your description of the problem is a bit ambiguous: are you saying that in your use case the JIT client is constantly registering and unregistering debug info with gdb, or that you generate a ton of small ELF objects in one go, and then gdb becomes slow with all of those loaded?

If the former, I would attribute it to the fact that whenever your client calls __jit_debug_register_code, your process hits an (internal) breakpoint, which stops the world; GDB then reacts to that breakpoint, reads the debug info object that the client has generated (or unloaded), and re-resumes the program (read: it's a synchronous, blocking process). It's easy to imagine a lot of those in quick and constant succession slowing down execution significantly. Logs would give a better hint (e.g., "set debug timestamp on; set debug infrun 1", then look for bp_jit_event).

If the latter, how many is "tons"?

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 20:10 ` Pedro Alves
@ 2010-12-31 21:39   ` Vyacheslav Egorov
  2010-12-31 22:23     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 21:39 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb

Hi Pedro,

Sorry for not being clear. It is the former. V8 continuously generates code objects while the application runs (and it generates more than it reclaims, at least on this particular node.js sample, which I am using to evaluate the GDB JIT client implementation).

During startup of the aforementioned sample (in the worst case, when an ELF object is emitted for each generated code object), 1118 code objects are generated.

Also, it seems that with each new ELF object, registering a single new entry costs more and more:

registered new entry, total 1101 entries [took 324 ms]
registered new entry, total 1102 entries [took 324 ms]
registered new entry, total 1103 entries [took 325 ms]
registered new entry, total 1104 entries [took 326 ms]
registered new entry, total 1105 entries [took 326 ms]
registered new entry, total 1106 entries [took 327 ms]
registered new entry, total 1107 entries [took 327 ms]
registered new entry, total 1108 entries [took 328 ms]
registered new entry, total 1109 entries [took 329 ms]
registered new entry, total 1110 entries [took 329 ms]
registered new entry, total 1111 entries [took 330 ms]
registered new entry, total 1112 entries [took 331 ms]
registered new entry, total 1113 entries [took 332 ms]
registered new entry, total 1114 entries [took 333 ms]
registered new entry, total 1115 entries [took 333 ms]
registered new entry, total 1116 entries [took 334 ms]
registered new entry, total 1117 entries [took 335 ms]
registered new entry, total 1118 entries [took 336 ms]

// The value in brackets is not a sum; it is the time spent registering a single entry.

Happy New Year!
--
Vyacheslav Egorov

On Fri, Dec 31, 2010 at 9:10 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> [...]

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 21:39 ` Vyacheslav Egorov
@ 2010-12-31 22:23   ` Pedro Alves
  2010-12-31 23:12     ` Vyacheslav Egorov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 22:23 UTC (permalink / raw)
To: gdb; +Cc: Vyacheslav Egorov

On Friday 31 December 2010 21:38:49, Vyacheslav Egorov wrote:
> Sorry for not being clear. It is the former. V8 continuously generates
> code objects while the application runs (and it generates more than it
> reclaims, at least on this particular node.js sample, which I am using
> to evaluate the GDB JIT client implementation).
>
> During startup of the aforementioned sample (in the worst case, when an
> ELF object is emitted for each generated code object), 1118 code
> objects are generated.

Yeah. It may be useful to have a way for the client to batch informing gdb of the ELF objects becoming available. IIRC, you currently only have the option of letting gdb know of one entry at a time. It's not clear whether that would be a clear gain without a better understanding of where the time is spent. Or maybe you could parallelize on your end somehow, or change the design so that you have fewer objects.

If your JIT runs on a separate thread, and pausing just that thread doesn't block all others immediately, you could try running gdb in non-stop mode. It may speed up interactive startup in this use case, since in that mode the JIT notification breakpoint only stops the JIT thread (instead of all threads), though it sounds like the debug info for the JIT-compiled code would still take a bit to become available anyway.
> Also, it seems that with each new ELF object, registering a single
> new entry costs more and more:
>
> registered new entry, total 1101 entries [took 324 ms]
> registered new entry, total 1102 entries [took 324 ms]
> registered new entry, total 1103 entries [took 325 ms]
> registered new entry, total 1104 entries [took 326 ms]
> registered new entry, total 1105 entries [took 326 ms]
> registered new entry, total 1106 entries [took 327 ms]
> registered new entry, total 1107 entries [took 327 ms]
> registered new entry, total 1108 entries [took 328 ms]
> registered new entry, total 1109 entries [took 329 ms]
> registered new entry, total 1110 entries [took 329 ms]
> registered new entry, total 1111 entries [took 330 ms]
> registered new entry, total 1112 entries [took 331 ms]
> registered new entry, total 1113 entries [took 332 ms]
> registered new entry, total 1114 entries [took 333 ms]
> registered new entry, total 1115 entries [took 333 ms]
> registered new entry, total 1116 entries [took 334 ms]
> registered new entry, total 1117 entries [took 335 ms]
> registered new entry, total 1118 entries [took 336 ms]
>
> // The value in brackets is not a sum; it is the time spent registering
> a single entry.

Yeah. It's quite difficult to put a finger on a single culprit without digging deeper. It might be easy to fix; it might not. What was the cost for the first registrations?

If you're interested in investigating deeper, and perhaps addressing the problem, here are a few questions that could help establish where most of the time is spent: is this native debugging or remote debugging? If remote debugging, a significant chunk of time may be getting spent on reading the debug info from memory. What's the ELF object's or debug info's size? (If most of the time is spent on parsing the debug info rather than extracting the file from memory, this may be something that could benefit from Tromey's multi-threaded debug info reading patches.)

> Happy New Year!

Ditto!
-- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 22:23 ` Pedro Alves
@ 2010-12-31 23:12   ` Vyacheslav Egorov
  2010-12-31 23:36     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 23:12 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb

On Fri, Dec 31, 2010 at 11:22 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> or change design to make it so you have fewer objects.

Yep, this is the most obvious optimization to pursue, and I have already tried to figure out some way of merging the ELF objects for different code objects together. Unfortunately it is not easy, due to the lazy nature of JIT compilation.

> If your JIT runs on a separate thread, and pausing just that
> thread doesn't block all others immediately, you could try
> running gdb in non-stop mode.

I thought you said that hitting __jit_debug_register_code stops the world, i.e. stops all threads. If this is not the case, I can just form a queue on my side and let a separate thread poll it and register incoming ELF objects (in V8, compilation and execution are performed in the same thread).

> What was the cost for the first registrations?

Up to 88 it is < 2ms
Up to 276 --- < 10ms
Up to 535 --- < 50ms

> is this native debugging, or remote debugging?

Native.

> What's the elf object's or debug info's size?

For ia32, below 1k; the average seems to be 600 bytes. Contents are (in addition to the standard ELF header, section table, and string table for the section table):

.text section: type = NOBITS, address = start of the JIT-emitted code object, size = size of the JIT-emitted code object

.symtab section: 2 symbols, one of type FILE with local binding, one of type FUNC with global binding and a name equal to the name of the JIT-emitted code object.

Plus, for code objects with available line information:

.debug_info section with one compilation unit,
.debug_abbrev section with one abbreviation corresponding to the compilation unit from .debug_info,
.debug_line section with DWARF2-encoded line information.
-- Vyacheslav Egorov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 23:12 ` Vyacheslav Egorov
@ 2010-12-31 23:36   ` Pedro Alves
  2011-01-02  7:54     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 23:36 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: gdb

On Friday 31 December 2010 23:11:55, Vyacheslav Egorov wrote:
> > If your JIT runs on a separate thread, and pausing just that
> > thread doesn't block all others immediately, you could try
> > running gdb in non-stop mode.
>
> I thought you said that hitting __jit_debug_register_code stops the
> world, i.e. stops all threads.

I did, and it does, in all-stop mode, which is the gdb default mode. There's a new-ish mode (called non-stop mode), where gdb does _not_ stop all your threads whenever a breakpoint is hit --- only the particular thread that hit the breakpoint. (There's a chapter about it in the manual.) Not all your users will want to enable this mode, and most frontends don't know about it either, so it's not really a "fix" for everyone, I guess.

> > What was the cost for the first registrations?
>
> Up to 88 it is < 2ms
> Up to 276 --- < 10ms
> Up to 535 --- < 50ms

> registered new entry, total 1115 entries [took 333 ms]
> registered new entry, total 1116 entries [took 334 ms]
> registered new entry, total 1117 entries [took 335 ms]
> registered new entry, total 1118 entries [took 336 ms]

It would be quite interesting to know what causes this. You should also try a recent snapshot (or cvs head), and 7.2, if you aren't already.

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2010-12-31 23:36 ` Pedro Alves @ 2011-01-02 7:54 ` Paul Pluzhnikov 2011-01-03 10:44 ` Vyacheslav Egorov 0 siblings, 1 reply; 14+ messages in thread From: Paul Pluzhnikov @ 2011-01-02 7:54 UTC (permalink / raw) To: Pedro Alves; +Cc: Vyacheslav Egorov, gdb On Fri, Dec 31, 2010 at 3:36 PM, Pedro Alves <pedro@codesourcery.com> wrote: > On Friday 31 December 2010 23:11:55, Vyacheslav Egorov wrote: >> > If your JIT runs on a separate thread, and pausing just that >> > thread doesn't block all others immediately, you could try >> > running gdb in non-stop mode. >> > >> >> I thought you said that hitting __jit_debug_register_code stops the >> world i.e. stops all threads. > > I did, and it does, in all-stop mode, which is the gdb default > mode. There's a new-ish mode (called the non-stop mode), where > gdb does _not_ stop all your threads whenever a breakpoint is > hit --- only the particular thread that hit the breakpoint. > (There's a chapter about it in the manual). Not all your users > will want to enable this mode. And most frontends don't know > about it either, so, it's not really a "fix" for everyone, I > guess. > >> > What was the cost for a first registrations? >> >> Up to 88 it is < 2ms >> Up to 276 --- < 10ms >> Up to 535 --- < 50ms > >> registered new entry, total 1115 entries [took 333 ms] >> registered new entry, total 1116 entries [took 334 ms] >> registered new entry, total 1117 entries [took 335 ms] >> registered new entry, total 1118 entries [took 336 ms] > > It would be quite interesting to know what causes this. > You should also try a recent snapshot (or cvs head), and > 7.2, if you aren't already. FWIW, when Reid K. was implementing this in GSoC, his use case was Unladen Swallow (http://en.wikipedia.org/wiki/Unladen_Swallow), and I don't think he expected 1K+ separate JITted ELF objects. 
In jit.c, I see:

  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);

And in symfile.c (new_symfile_objfile):

  else if ((add_flags & SYMFILE_DEFER_BP_RESET) == 0)
    {
      breakpoint_re_set ();
    }

I think this is going to repeatedly remove and re-insert the JIT breakpoint into every added JIT ELF, giving O(N*N) runtime, wouldn't it?

--
Paul Pluzhnikov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-02  7:54 ` Paul Pluzhnikov
@ 2011-01-03 10:44   ` Vyacheslav Egorov
  2011-01-03 17:32     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2011-01-03 10:44 UTC (permalink / raw)
To: Paul Pluzhnikov; +Cc: Pedro Alves, gdb

On Fri, Dec 31, 2010 at 3:36 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> It would be quite interesting to know what causes this.
> You should also try a recent snapshot (or cvs head), and 7.2, if you aren't already.

Yep, it would be interesting to figure it out. I am not sure that I will be able to invest my own time into investigating this, though. I am running gdb 7.2.

On Sun, Jan 2, 2011 at 8:07 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> I don't think he expected 1K+ separate JITted ELF objects.
> I think this is going to repeatedly remove and re-insert the JIT
> breakpoint into every added JIT ELF, giving O(N*N) runtime, wouldn't
> it?

1K is just the startup of a sample. I would expect any realistic node.js-based application to generate more code objects during its lifetime.

Currently I delay the manifestation of the problem by not emitting ELF objects for code objects that do not have line info attached. This reduces 1K to ~500. But this will not work as a long-term solution: I will have to emit debug information (namely .debug_frame) for all code objects, otherwise gdb is not able to unwind the stack properly on x64.

I thought that setting a breakpoint requires O(1), not O(N). It's a bit disappointing.

--
Vyacheslav Egorov

On Sun, Jan 2, 2011 at 8:07 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> [...]

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 10:44 ` Vyacheslav Egorov
@ 2011-01-03 17:32   ` Paul Pluzhnikov
  2011-01-03 22:01     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Pluzhnikov @ 2011-01-03 17:32 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: Pedro Alves, gdb

On Mon, Jan 3, 2011 at 2:43 AM, Vyacheslav Egorov <vegorov@chromium.org> wrote:
> I thought that setting a breakpoint requires O(1), not O(N). It's a bit
> disappointing.

I think I've mis-diagnosed the problem slightly -- there is no breakpoint inserted into every JITted ELF. But there is still a quadratic loop in jit.c:

  /* Look up the objfile with this code entry address.  */
  static struct objfile *
  jit_find_objf_with_entry_addr (CORE_ADDR entry_addr)
  {
    struct objfile *objf;
    CORE_ADDR *objf_entry_addr;

    ALL_OBJFILES (objf)
      {
        objf_entry_addr = (CORE_ADDR *) objfile_data (objf, jit_objfile_data);
        if (objf_entry_addr != NULL && *objf_entry_addr == entry_addr)
          return objf;
      }
    return NULL;
  }

  ...

  static void
  jit_inferior_init (struct gdbarch *gdbarch)
  ...
    /* If we've attached to a running program, we need to check the
       descriptor to register any functions that were already
       generated.  */
    for (cur_entry_addr = descriptor.first_entry;
         cur_entry_addr != 0;
         cur_entry_addr = cur_entry.next_entry)
      {
        jit_read_code_entry (gdbarch, cur_entry_addr, &cur_entry);

        /* This hook may be called many times during setup, so make sure
           we don't add the same symbol file twice.  */
        if (jit_find_objf_with_entry_addr (cur_entry_addr) != NULL)
          continue;

        jit_register_code (gdbarch, cur_entry_addr, &cur_entry);
      }

This is quadratic if you insert all 1000 JITted ELFs and call __jit_debug_register_code() once. If you call __jit_debug_register_code() after adding every new ELF, you get N*N*N instead :-(

My gut feeling, though, is that the really slow operation here is jit_read_code_entry (which reads target memory) and everything else is noise in comparison (but I haven't measured it).
Assuming this is so, you get O(N) if you call __jit_debug_register_code once, and O(N*N) if you call it after every new ELF. I think we could do much better if we keep a "shadow" copy of the list of JITted ELFs in GDB, instead of re-reading it from inferior all the time. Let me try to come up with a patch ... > I will have to emit debug information (namely .debug_frame) > for all code object otherwise gdb is not able to unwind stack properly > on x64. This is slightly incorrect: you must emit .eh_frame{,_hdr} for any x86_64 code if you want stack unwinder to work, but you don't need to emit any debug info (IOW, .debug_frame is not required). Cheers, -- Paul Pluzhnikov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 17:32 ` Paul Pluzhnikov
@ 2011-01-03 22:01   ` Paul Pluzhnikov
  2011-01-03 23:32     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Pluzhnikov @ 2011-01-03 22:01 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: Pedro Alves, gdb

On Mon, Jan 3, 2011 at 9:32 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> My gut feeling though is that the really slow operation here is
> jit_read_code_entry (which reads target memory) and everything else is
> noise in comparison (but I haven't measured it).

I've reproduced this with current CVS head. I'd like to submit a test case for this, but where? gdb.base/, gdb.gdb/, elsewhere?

I was also wrong about where the time is spent :-( The quadratic/cubic loop does not in fact happen (it is protected against with 'registering_code' in jit.c). But I wasn't completely off base -- breakpoints do have something to do with this. The profile (from 'perf') is:

# Events: 19K cycles
#
# Overhead   Command    Shared Object    Symbol
# ........  .........  ...............  ........................................
#
    67.30%  gdb64-cvs  gdb              [.] lookup_minimal_symbol_text
    29.59%  gdb64-cvs  gdb              [.] lookup_minimal_symbol
     0.81%  gdb64-cvs  gdb              [.] find_pc_section
     0.19%  gdb64-cvs  libc-2.11.1.so   [.] __ctype_b_loc

And lookup_minimal_symbol_text is called from breakpoint_re_set:

10523     create_overlay_event_breakpoint ("_ovly_debug_event");
10524     create_longjmp_master_breakpoint ("longjmp");
10525     create_longjmp_master_breakpoint ("_longjmp");
10526     create_longjmp_master_breakpoint ("siglongjmp");
10527     create_longjmp_master_breakpoint ("_siglongjmp");

The following patch:

Index: jit.c
===================================================================
RCS file: /cvs/src/src/gdb/jit.c,v
retrieving revision 1.8
diff -u -r1.8 jit.c
--- jit.c	1 Jan 2011 15:33:09 -0000	1.8
+++ jit.c	3 Jan 2011 21:34:17 -0000
@@ -263,7 +263,7 @@
   my_cleanups = make_cleanup (clear_int, &registering_code);
 
   /* This call takes ownership of sai.  */
-  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);
+  objfile = symbol_file_add_from_bfd (nbfd, SYMFILE_DEFER_BP_RESET, sai, OBJF_SHARED);
 
   /* Clear the registering_code flag.  */
   do_cleanups (my_cleanups);

makes a 100x difference (0.64s vs. 61.5s) for 1000 simulated JIT ELFs.

I think it's pretty safe to assume that we don't need to search JITted objfiles for the above functions, as JITs do not participate in normal symbol resolution at all.

Do we need a new OBJF_JIT flag, or is the above patch good enough?

Thanks,
--
Paul Pluzhnikov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 22:01 ` Paul Pluzhnikov
@ 2011-01-03 23:32   ` Pedro Alves
  2011-01-03 23:40     ` Pedro Alves
  2011-01-03 23:47     ` Paul Pluzhnikov
  0 siblings, 2 replies; 14+ messages in thread
From: Pedro Alves @ 2011-01-03 23:32 UTC (permalink / raw)
To: Paul Pluzhnikov; +Cc: Vyacheslav Egorov, gdb

Hi Paul, thanks for the tests, stats, and for looking into this.

On Monday 03 January 2011 22:00:48, Paul Pluzhnikov wrote:
> I'd like to submit a test case for this, but where?
> gdb.base/, gdb.gdb/, elsewhere?

gdb.base/ should be fine. gdb.gdb/ is for tests that load gdb itself as an inferior.

> diff -u -r1.8 jit.c

Please use cvs diff -up. -p works wonders at making patches easier to read. (I just put "diff -up" in ~/.cvsup).

> --- jit.c	1 Jan 2011 15:33:09 -0000	1.8
> +++ jit.c	3 Jan 2011 21:34:17 -0000
> @@ -263,7 +263,7 @@
>    my_cleanups = make_cleanup (clear_int, &registering_code);
>
>    /* This call takes ownership of sai.  */
> -  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);
> +  objfile = symbol_file_add_from_bfd (nbfd, SYMFILE_DEFER_BP_RESET, sai, OBJF_SHARED);
>
>    /* Clear the registering_code flag.  */
>    do_cleanups (my_cleanups);
>
> makes a 100x difference (0.64s vs. 61.5s) for 1000 simulated JIT ELFs.
>
> I think it's pretty safe to assume that we don't need to search JITted
> objfiles for the above functions, as JITs do not participate in normal
> symbol resolution at all.

But doesn't that mean that pending breakpoints on JITed functions won't resolve anymore?

> Do we need a new OBJF_JIT flag, or is the above patch good enough?

What if we record a per-objfile flag or cache storing whether a given objfile contains "longjmp"-related symbols, so that we only look up those symbols at least once per DSO? Quite similar in spirit to your objc_objfile_data change. Do we still get a significant slowdown from breakpoint_re_set with that change?

(I've also noticed before that lookup_minimal_symbol_text iterates over all objfiles even if given an objfile to work with (worst case; but then again, new objfiles are appended at the end of the objfile list). Probably contributes to the noise, but these things add up.)

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:32 ` Pedro Alves @ 2011-01-03 23:40 ` Pedro Alves 2011-01-03 23:47 ` Paul Pluzhnikov 1 sibling, 0 replies; 14+ messages in thread From: Pedro Alves @ 2011-01-03 23:40 UTC (permalink / raw) To: gdb; +Cc: Paul Pluzhnikov, Vyacheslav Egorov On Monday 03 January 2011 23:32:18, Pedro Alves wrote: > patches easier to read. (I just put "diff -up" in ~/.cvsup). Err, I meant ~/.cvsrc. -- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:32 ` Pedro Alves 2011-01-03 23:40 ` Pedro Alves @ 2011-01-03 23:47 ` Paul Pluzhnikov 2011-01-04 0:13 ` Pedro Alves 1 sibling, 1 reply; 14+ messages in thread From: Paul Pluzhnikov @ 2011-01-03 23:47 UTC (permalink / raw) To: Pedro Alves; +Cc: Vyacheslav Egorov, gdb On Mon, Jan 3, 2011 at 3:32 PM, Pedro Alves <pedro@codesourcery.com> wrote: > gdb.base/ should be fine. > gdb.gdb/ is for tests that load gdb itself as an inferior. Ah, thanks. >> diff -u -r1.8 jit.c > > Please use cvs diff -up. -p makes wonders at making > patches easier to read. Sorry about that. It's been a long time since I sent patches, got rusty :-( >> I think it's pretty safe to assume that we don't need to search JITted >> objfiles for above functions, as JITs do not participate in normal symbol >> resolution at all. > > But doesn't that mean that pending breakpoints on JITed functions > won't resolve anymore? Right. >> Do we need a new OBJF_JIT flag, or is above patch good enough? > > What if we record a per-objfile flag or cache storing whether a given > objfile contains "longjmp" related symbols, so that we only lookup > those symbols at least once per DSO? Did you mean "at most once per DSO"? > Quite similar in spirit to > your objc_objfile_data change. Do we still get a significant > slowdown from breakpoint_re_set with that change? We probably wouldn't. I'll make a proper patch, measure, and send to gdb-patches. > (I've also noticed before that lookup_minimal_symbol_text iterates > over all objfiles even if given an objfile to work with (worse > case, but then again, new objfiles are appended at the end > of the objfile list). Probably contributes to the noise, but > these things add up.) I'll fix that as well. Thanks for comments! -- Paul Pluzhnikov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:47 ` Paul Pluzhnikov @ 2011-01-04 0:13 ` Pedro Alves 0 siblings, 0 replies; 14+ messages in thread From: Pedro Alves @ 2011-01-04 0:13 UTC (permalink / raw) To: Paul Pluzhnikov; +Cc: Vyacheslav Egorov, gdb On Monday 03 January 2011 23:46:50, Paul Pluzhnikov wrote: > >> Do we need a new OBJF_JIT flag, or is above patch good enough? > > > > What if we record a per-objfile flag or cache storing whether a given > > objfile contains "longjmp" related symbols, so that we only lookup > > those symbols at least once per DSO? > > Did you mean "at most once per DSO"? Yep. -- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads: [~2011-01-04  0:13 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-31 19:43 JIT interface slowness Vyacheslav Egorov
2010-12-31 20:10 ` Pedro Alves
2010-12-31 21:39 ` Vyacheslav Egorov
2010-12-31 22:23 ` Pedro Alves
2010-12-31 23:12 ` Vyacheslav Egorov
2010-12-31 23:36 ` Pedro Alves
2011-01-02  7:54 ` Paul Pluzhnikov
2011-01-03 10:44 ` Vyacheslav Egorov
2011-01-03 17:32 ` Paul Pluzhnikov
2011-01-03 22:01 ` Paul Pluzhnikov
2011-01-03 23:32 ` Pedro Alves
2011-01-03 23:40 ` Pedro Alves
2011-01-03 23:47 ` Paul Pluzhnikov
2011-01-04  0:13 ` Pedro Alves
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox