* JIT interface slowness
@ 2010-12-31 19:43 Vyacheslav Egorov
  2010-12-31 20:10 ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 19:43 UTC (permalink / raw)
To: gdb

Hi,

I've implemented[1] a basic GDB JIT client for V8[2], and when people started testing it with node.js[3] we noticed that it makes execution under GDB slower, especially when the JIT generates and registers an in-memory ELF object for every single code object it emits.

Apparently GDB is just not optimized to handle tons of small ELF objects being registered (and unregistered). When I simply comment out the calls to __jit_debug_register_code (leaving ELF generation intact), everything becomes fast again.

Are there any known bottlenecks in the JIT interface, or any known workarounds for them? Any hints would be highly appreciated.

Thanks.

[1] http://codereview.chromium.org/5965011/
[2] http://v8.googlecode.com/
[3] http://nodejs.org

--
Vyacheslav Egorov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 19:43 JIT interface slowness Vyacheslav Egorov
@ 2010-12-31 20:10 ` Pedro Alves
  2010-12-31 21:39   ` Vyacheslav Egorov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 20:10 UTC (permalink / raw)
To: gdb; +Cc: Vyacheslav Egorov

On Friday 31 December 2010 19:43:12, Vyacheslav Egorov wrote:
> Hi,
>
> I've implemented[1] a basic GDB JIT client for V8[2], and when people
> started testing it with node.js[3] we noticed that it makes execution
> under GDB slower, especially when the JIT generates and registers an
> in-memory ELF object for every single code object it emits.
>
> Apparently GDB is just not optimized to handle tons of small ELF
> objects being registered (and unregistered). When I simply comment out
> the calls to __jit_debug_register_code (leaving ELF generation intact),
> everything becomes fast again.
>
> Are there any known bottlenecks in the JIT interface, or any known workarounds for them?

Your description of the problem is a bit ambiguous: are you saying that in your use case the JIT client is constantly registering and unregistering debug info with gdb, or that you generate a ton of small ELF objects in one go, and then gdb becomes slow with all of those loaded?

If the former, I would attribute it to the fact that whenever your client calls __jit_debug_register_code, your process hits an (internal) breakpoint, which stops the world; GDB then reacts to that breakpoint, reads the debug info object that the client has generated (or unloaded), and re-resumes the program (read: it's a synchronous, blocking process). It's easy to imagine a lot of those in quick and constant succession slowing down execution significantly. Logs would give a better hint (e.g., "set debug timestamp on; set debug infrun 1", then look for bp_jit_event).

If the latter, how many is "tons"?

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 20:10 ` Pedro Alves
@ 2010-12-31 21:39   ` Vyacheslav Egorov
  2010-12-31 22:23     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 21:39 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb

Hi Pedro,

Sorry for not being clear. It is the former. V8 continuously generates code objects while the application runs (and it generates more than it reclaims, at least on this particular node.js sample, which I am using to evaluate the GDB JIT client implementation).

During startup of the aforementioned sample (in the worst case, when an ELF object is emitted for each generated code object), 1118 code objects are generated.

Also, it seems that with each new ELF object, registering a single new entry costs more and more:

registered new entry, total 1101 entries [took 324 ms]
registered new entry, total 1102 entries [took 324 ms]
registered new entry, total 1103 entries [took 325 ms]
registered new entry, total 1104 entries [took 326 ms]
registered new entry, total 1105 entries [took 326 ms]
registered new entry, total 1106 entries [took 327 ms]
registered new entry, total 1107 entries [took 327 ms]
registered new entry, total 1108 entries [took 328 ms]
registered new entry, total 1109 entries [took 329 ms]
registered new entry, total 1110 entries [took 329 ms]
registered new entry, total 1111 entries [took 330 ms]
registered new entry, total 1112 entries [took 331 ms]
registered new entry, total 1113 entries [took 332 ms]
registered new entry, total 1114 entries [took 333 ms]
registered new entry, total 1115 entries [took 333 ms]
registered new entry, total 1116 entries [took 334 ms]
registered new entry, total 1117 entries [took 335 ms]
registered new entry, total 1118 entries [took 336 ms]

// The value in brackets is not a sum; it is the time spent registering a single entry.

Happy New Year!
--
Vyacheslav Egorov

On Fri, Dec 31, 2010 at 9:10 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> [...]

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 21:39 ` Vyacheslav Egorov
@ 2010-12-31 22:23   ` Pedro Alves
  2010-12-31 23:12     ` Vyacheslav Egorov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 22:23 UTC (permalink / raw)
To: gdb; +Cc: Vyacheslav Egorov

On Friday 31 December 2010 21:38:49, Vyacheslav Egorov wrote:
> Sorry for not being clear. It is the former. V8 continuously generates
> code objects while the application runs (and it generates more than it
> reclaims, at least on this particular node.js sample, which I am using
> to evaluate the GDB JIT client implementation).
>
> During startup of the aforementioned sample (in the worst case, when an
> ELF object is emitted for each generated code object), 1118 code
> objects are generated.

Yeah. It may be useful to have a way for the client to batch informing gdb of the ELF objects becoming available. IIRC, you currently only have the option of letting gdb know of one entry at a time. It's not clear whether that would be a clear gain without a better understanding of where the time is spent. Or maybe you could parallelize on your end somehow, or change the design so that you have fewer objects.

If your JIT runs on a separate thread, and pausing just that thread doesn't block all others immediately, you could try running gdb in non-stop mode. It may speed up interactive startup in this use case, since in that mode the JIT notification breakpoint only stops the JIT thread (instead of all threads), though it sounds like the debug info for the JIT-compiled code would still take a bit to become available anyway.
> Also, it seems that with each new ELF object, registering a single
> new entry costs more and more:
>
> registered new entry, total 1101 entries [took 324 ms]
> registered new entry, total 1102 entries [took 324 ms]
> registered new entry, total 1103 entries [took 325 ms]
> registered new entry, total 1104 entries [took 326 ms]
> registered new entry, total 1105 entries [took 326 ms]
> registered new entry, total 1106 entries [took 327 ms]
> registered new entry, total 1107 entries [took 327 ms]
> registered new entry, total 1108 entries [took 328 ms]
> registered new entry, total 1109 entries [took 329 ms]
> registered new entry, total 1110 entries [took 329 ms]
> registered new entry, total 1111 entries [took 330 ms]
> registered new entry, total 1112 entries [took 331 ms]
> registered new entry, total 1113 entries [took 332 ms]
> registered new entry, total 1114 entries [took 333 ms]
> registered new entry, total 1115 entries [took 333 ms]
> registered new entry, total 1116 entries [took 334 ms]
> registered new entry, total 1117 entries [took 335 ms]
> registered new entry, total 1118 entries [took 336 ms]
>
> // The value in brackets is not a sum; it is the time spent registering
> a single entry.

Yeah. It's quite difficult to put a finger on a single culprit without digging deeper. It might be easy to fix; it might not. What was the cost for the first registrations?

If you're interested in investigating deeper, and perhaps addressing the problem, here are a few questions that could help establish where most of the time is spent: is this native debugging or remote debugging? If remote debugging, a significant chunk of time may be getting spent on reading the debug info from memory. What's the ELF object's or debug info's size? (If most of the time is spent on parsing the debug info rather than extracting the file from memory, this may be something that could benefit from Tromey's multi-threaded debug info reading patches.)

> Happy New Year!

Ditto!
-- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 22:23 ` Pedro Alves
@ 2010-12-31 23:12   ` Vyacheslav Egorov
  2010-12-31 23:36     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2010-12-31 23:12 UTC (permalink / raw)
To: Pedro Alves; +Cc: gdb

On Fri, Dec 31, 2010 at 11:22 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> or change design to make it so you have fewer objects.

Yep, this is the most obvious optimization to pursue, and I have already tried to figure out some way of merging the ELF objects for different code objects together. Unfortunately it is not easy, due to the lazy nature of JIT compilation.

> If your JIT runs on a separate thread, and pausing just that
> thread doesn't block all others immediately, you could try
> running gdb in non-stop mode.

I thought you said that hitting __jit_debug_register_code stops the world, i.e. stops all threads. If this is not the case, I can just form a queue on my side and let a separate thread poll it and register incoming ELF objects (in V8, compilation and execution are performed in the same thread).

> What was the cost for the first registrations?

Up to 88 it is < 2ms
Up to 276 --- < 10ms
Up to 535 --- < 50ms

> is this native debugging, or remote debugging?

Native.

> What's the elf object's or debug info's size?

For ia32, below 1k; the average seems to be 600 bytes. Contents are (in addition to the standard ELF header, section table, and string table for the section table):

.text section: type = NOBITS, address = start of the JIT-emitted code object, size = size of the JIT-emitted code object

.symtab section: 2 symbols, one of type FILE with local binding, one of type FUNC with global binding and a name equal to the name of the JIT-emitted code object.

Plus, for code objects with available line information:

.debug_info section with one compilation unit,
.debug_abbrev section with one abbreviation corresponding to the compilation unit from .debug_info,
.debug_line section with DWARF2-encoded line information.
-- Vyacheslav Egorov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2010-12-31 23:12 ` Vyacheslav Egorov
@ 2010-12-31 23:36   ` Pedro Alves
  2011-01-02  7:54     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Pedro Alves @ 2010-12-31 23:36 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: gdb

On Friday 31 December 2010 23:11:55, Vyacheslav Egorov wrote:
> > If your JIT runs on a separate thread, and pausing just that
> > thread doesn't block all others immediately, you could try
> > running gdb in non-stop mode.
>
> I thought you said that hitting __jit_debug_register_code stops the
> world, i.e. stops all threads.

I did, and it does, in all-stop mode, which is the gdb default mode. There's a new-ish mode (called non-stop mode), where gdb does _not_ stop all your threads whenever a breakpoint is hit --- only the particular thread that hit the breakpoint. (There's a chapter about it in the manual.) Not all your users will want to enable this mode, and most frontends don't know about it either, so it's not really a "fix" for everyone, I guess.

> > What was the cost for the first registrations?
>
> Up to 88 it is < 2ms
> Up to 276 --- < 10ms
> Up to 535 --- < 50ms

> registered new entry, total 1115 entries [took 333 ms]
> registered new entry, total 1116 entries [took 334 ms]
> registered new entry, total 1117 entries [took 335 ms]
> registered new entry, total 1118 entries [took 336 ms]

It would be quite interesting to know what causes this. You should also try a recent snapshot (or cvs head), and 7.2, if you aren't already.

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2010-12-31 23:36 ` Pedro Alves @ 2011-01-02 7:54 ` Paul Pluzhnikov 2011-01-03 10:44 ` Vyacheslav Egorov 0 siblings, 1 reply; 14+ messages in thread From: Paul Pluzhnikov @ 2011-01-02 7:54 UTC (permalink / raw) To: Pedro Alves; +Cc: Vyacheslav Egorov, gdb On Fri, Dec 31, 2010 at 3:36 PM, Pedro Alves <pedro@codesourcery.com> wrote: > On Friday 31 December 2010 23:11:55, Vyacheslav Egorov wrote: >> > If your JIT runs on a separate thread, and pausing just that >> > thread doesn't block all others immediately, you could try >> > running gdb in non-stop mode. >> > >> >> I thought you said that hitting __jit_debug_register_code stops the >> world i.e. stops all threads. > > I did, and it does, in all-stop mode, which is the gdb default > mode. There's a new-ish mode (called the non-stop mode), where > gdb does _not_ stop all your threads whenever a breakpoint is > hit --- only the particular thread that hit the breakpoint. > (There's a chapter about it in the manual). Not all your users > will want to enable this mode. And most frontends don't know > about it either, so, it's not really a "fix" for everyone, I > guess. > >> > What was the cost for a first registrations? >> >> Up to 88 it is < 2ms >> Up to 276 --- < 10ms >> Up to 535 --- < 50ms > >> registered new entry, total 1115 entries [took 333 ms] >> registered new entry, total 1116 entries [took 334 ms] >> registered new entry, total 1117 entries [took 335 ms] >> registered new entry, total 1118 entries [took 336 ms] > > It would be quite interesting to know what causes this. > You should also try a recent snapshot (or cvs head), and > 7.2, if you aren't already. FWIW, when Reid K. was implementing this in GSoC, his use case was Unladen Swallow (http://en.wikipedia.org/wiki/Unladen_Swallow), and I don't think he expected 1K+ separate JITted ELF objects. 
In jit.c, I see:

  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);

And in symfile.c (new_symfile_objfile):

  else if ((add_flags & SYMFILE_DEFER_BP_RESET) == 0)
    {
      breakpoint_re_set ();
    }

I think this is going to repeatedly remove and re-insert the JIT breakpoint into every added JIT ELF, giving O(N*N) runtime, wouldn't it?

--
Paul Pluzhnikov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-02  7:54 ` Paul Pluzhnikov
@ 2011-01-03 10:44   ` Vyacheslav Egorov
  2011-01-03 17:32     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Egorov @ 2011-01-03 10:44 UTC (permalink / raw)
To: Paul Pluzhnikov; +Cc: Pedro Alves, gdb

On Fri, Dec 31, 2010 at 3:36 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> It would be quite interesting to know what causes this.
> You should also try a recent snapshot (or cvs head), and 7.2, if you aren't already.

Yep, it would be interesting to figure it out. I am not sure that I will be able to invest my own time into investigating this, though. I am running gdb 7.2.

On Sun, Jan 2, 2011 at 8:07 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> I don't think he expected 1K+ separate JITted ELF objects.
> I think this is going to repeatedly remove and re-insert the JIT
> breakpoint into every added JIT ELF, giving O(N*N) runtime, wouldn't
> it?

1K is just the startup of a sample. I would expect any realistic node.js-based application to generate more code objects during its lifetime.

Currently I delay the manifestation of the problem by not emitting ELF objects for code objects that do not have line info attached. This reduces 1K to ~500. But this will not work as a long-term solution: I will have to emit debug information (namely .debug_frame) for all code objects, otherwise gdb is not able to unwind the stack properly on x64.

I thought that setting a breakpoint requires O(1), not O(N). It's a bit disappointing.

--
Vyacheslav Egorov

On Sun, Jan 2, 2011 at 8:07 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> [...]

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 10:44 ` Vyacheslav Egorov
@ 2011-01-03 17:32   ` Paul Pluzhnikov
  2011-01-03 22:01     ` Paul Pluzhnikov
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Pluzhnikov @ 2011-01-03 17:32 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: Pedro Alves, gdb

On Mon, Jan 3, 2011 at 2:43 AM, Vyacheslav Egorov <vegorov@chromium.org> wrote:
> I thought that setting a breakpoint requires O(1), not O(N). It's a bit
> disappointing.

I think I've mis-diagnosed the problem slightly -- there is no breakpoint inserted into every JITted ELF. But there is still a quadratic loop in jit.c:

  /* Look up the objfile with this code entry address.  */
  static struct objfile *
  jit_find_objf_with_entry_addr (CORE_ADDR entry_addr)
  {
    struct objfile *objf;
    CORE_ADDR *objf_entry_addr;

    ALL_OBJFILES (objf)
      {
        objf_entry_addr = (CORE_ADDR *) objfile_data (objf, jit_objfile_data);
        if (objf_entry_addr != NULL && *objf_entry_addr == entry_addr)
          return objf;
      }
    return NULL;
  }

  ...

  static void
  jit_inferior_init (struct gdbarch *gdbarch)
  ...
    /* If we've attached to a running program, we need to check the
       descriptor to register any functions that were already
       generated.  */
    for (cur_entry_addr = descriptor.first_entry;
         cur_entry_addr != 0;
         cur_entry_addr = cur_entry.next_entry)
      {
        jit_read_code_entry (gdbarch, cur_entry_addr, &cur_entry);

        /* This hook may be called many times during setup, so make sure
           we don't add the same symbol file twice.  */
        if (jit_find_objf_with_entry_addr (cur_entry_addr) != NULL)
          continue;

        jit_register_code (gdbarch, cur_entry_addr, &cur_entry);
      }

This is quadratic if you insert all 1000 JITted ELFs and call __jit_debug_register_code() once. If you call __jit_debug_register_code() after adding every new ELF, you get N*N*N instead :-(

My gut feeling, though, is that the really slow operation here is jit_read_code_entry (which reads target memory) and everything else is noise in comparison (but I haven't measured it).
Assuming this is so, you get O(N) if you call __jit_debug_register_code once, and O(N*N) if you call it after every new ELF. I think we could do much better if we keep a "shadow" copy of the list of JITted ELFs in GDB, instead of re-reading it from inferior all the time. Let me try to come up with a patch ... > I will have to emit debug information (namely .debug_frame) > for all code object otherwise gdb is not able to unwind stack properly > on x64. This is slightly incorrect: you must emit .eh_frame{,_hdr} for any x86_64 code if you want stack unwinder to work, but you don't need to emit any debug info (IOW, .debug_frame is not required). Cheers, -- Paul Pluzhnikov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 17:32 ` Paul Pluzhnikov
@ 2011-01-03 22:01   ` Paul Pluzhnikov
  2011-01-03 23:32     ` Pedro Alves
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Pluzhnikov @ 2011-01-03 22:01 UTC (permalink / raw)
To: Vyacheslav Egorov; +Cc: Pedro Alves, gdb

On Mon, Jan 3, 2011 at 9:32 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> My gut feeling though is that the really slow operation here is
> jit_read_code_entry (which reads target memory) and everything else is
> noise in comparison (but I haven't measured it).

I've reproduced this with current CVS head. I'd like to submit a test case for this, but where? gdb.base/, gdb.gdb/, elsewhere?

I was also wrong about where the time is spent :-( The quadratic/cubic loop does not in fact happen (it is protected against with 'registering_code' in jit.c). But I wasn't completely off base -- breakpoints do have something to do with this. The profile (from 'perf') is:

# Events: 19K cycles
#
# Overhead   Command    Shared Object    Symbol
# ........  .........  ...............  ........................................
#
    67.30%  gdb64-cvs  gdb              [.] lookup_minimal_symbol_text
    29.59%  gdb64-cvs  gdb              [.] lookup_minimal_symbol
     0.81%  gdb64-cvs  gdb              [.] find_pc_section
     0.19%  gdb64-cvs  libc-2.11.1.so   [.] __ctype_b_loc

And lookup_minimal_symbol_text is called from breakpoint_re_set:

10523     create_overlay_event_breakpoint ("_ovly_debug_event");
10524     create_longjmp_master_breakpoint ("longjmp");
10525     create_longjmp_master_breakpoint ("_longjmp");
10526     create_longjmp_master_breakpoint ("siglongjmp");
10527     create_longjmp_master_breakpoint ("_siglongjmp");

The following patch:

Index: jit.c
===================================================================
RCS file: /cvs/src/src/gdb/jit.c,v
retrieving revision 1.8
diff -u -r1.8 jit.c
--- jit.c	1 Jan 2011 15:33:09 -0000	1.8
+++ jit.c	3 Jan 2011 21:34:17 -0000
@@ -263,7 +263,7 @@
   my_cleanups = make_cleanup (clear_int, &registering_code);
 
   /* This call takes ownership of sai.  */
-  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);
+  objfile = symbol_file_add_from_bfd (nbfd, SYMFILE_DEFER_BP_RESET, sai, OBJF_SHARED);
 
   /* Clear the registering_code flag.  */
   do_cleanups (my_cleanups);

makes a 100x difference (0.64s vs. 61.5s) for 1000 simulated JIT ELFs.

I think it's pretty safe to assume that we don't need to search JITted objfiles for the above functions, as JITs do not participate in normal symbol resolution at all.

Do we need a new OBJF_JIT flag, or is the above patch good enough?

Thanks,
--
Paul Pluzhnikov

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness
  2011-01-03 22:01 ` Paul Pluzhnikov
@ 2011-01-03 23:32   ` Pedro Alves
  2011-01-03 23:40     ` Pedro Alves
  2011-01-03 23:47     ` Paul Pluzhnikov
  0 siblings, 2 replies; 14+ messages in thread
From: Pedro Alves @ 2011-01-03 23:32 UTC (permalink / raw)
To: Paul Pluzhnikov; +Cc: Vyacheslav Egorov, gdb

Hi Paul, thanks for the tests, stats, and for looking into this.

On Monday 03 January 2011 22:00:48, Paul Pluzhnikov wrote:
> I'd like to submit a test case for this, but where?
> gdb.base/, gdb.gdb/, elsewhere?

gdb.base/ should be fine. gdb.gdb/ is for tests that load gdb itself as an inferior.

> diff -u -r1.8 jit.c

Please use cvs diff -up. -p works wonders at making patches easier to read. (I just put "diff -up" in ~/.cvsup).

> --- jit.c	1 Jan 2011 15:33:09 -0000	1.8
> +++ jit.c	3 Jan 2011 21:34:17 -0000
> @@ -263,7 +263,7 @@
>    my_cleanups = make_cleanup (clear_int, &registering_code);
>
>    /* This call takes ownership of sai.  */
> -  objfile = symbol_file_add_from_bfd (nbfd, 0, sai, OBJF_SHARED);
> +  objfile = symbol_file_add_from_bfd (nbfd, SYMFILE_DEFER_BP_RESET, sai, OBJF_SHARED);
>
>    /* Clear the registering_code flag.  */
>    do_cleanups (my_cleanups);
>
> makes a 100x difference (0.64s vs. 61.5s) for 1000 simulated JIT ELFs.
>
> I think it's pretty safe to assume that we don't need to search JITted
> objfiles for the above functions, as JITs do not participate in normal
> symbol resolution at all.

But doesn't that mean that pending breakpoints on JITed functions won't resolve anymore?

> Do we need a new OBJF_JIT flag, or is the above patch good enough?

What if we record a per-objfile flag or cache storing whether a given objfile contains "longjmp"-related symbols, so that we only look up those symbols at least once per DSO? Quite similar in spirit to your objc_objfile_data change. Do we still get a significant slowdown from breakpoint_re_set with that change?

(I've also noticed before that lookup_minimal_symbol_text iterates over all objfiles even if given an objfile to work with (worst case; but then again, new objfiles are appended at the end of the objfile list). Probably contributes to the noise, but these things add up.)

--
Pedro Alves

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:32 ` Pedro Alves @ 2011-01-03 23:40 ` Pedro Alves 2011-01-03 23:47 ` Paul Pluzhnikov 1 sibling, 0 replies; 14+ messages in thread From: Pedro Alves @ 2011-01-03 23:40 UTC (permalink / raw) To: gdb; +Cc: Paul Pluzhnikov, Vyacheslav Egorov On Monday 03 January 2011 23:32:18, Pedro Alves wrote: > patches easier to read. (I just put "diff -up" in ~/.cvsup). Err, I meant ~/.cvsrc. -- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:32 ` Pedro Alves 2011-01-03 23:40 ` Pedro Alves @ 2011-01-03 23:47 ` Paul Pluzhnikov 2011-01-04 0:13 ` Pedro Alves 1 sibling, 1 reply; 14+ messages in thread From: Paul Pluzhnikov @ 2011-01-03 23:47 UTC (permalink / raw) To: Pedro Alves; +Cc: Vyacheslav Egorov, gdb On Mon, Jan 3, 2011 at 3:32 PM, Pedro Alves <pedro@codesourcery.com> wrote: > gdb.base/ should be fine. > gdb.gdb/ is for tests that load gdb itself as an inferior. Ah, thanks. >> diff -u -r1.8 jit.c > > Please use cvs diff -up. -p makes wonders at making > patches easier to read. Sorry about that. It's been a long time since I sent patches, got rusty :-( >> I think it's pretty safe to assume that we don't need to search JITted >> objfiles for above functions, as JITs do not participate in normal symbol >> resolution at all. > > But doesn't that mean that pending breakpoints on JITed functions > won't resolve anymore? Right. >> Do we need a new OBJF_JIT flag, or is above patch good enough? > > What if we record a per-objfile flag or cache storing whether a given > objfile contains "longjmp" related symbols, so that we only lookup > those symbols at least once per DSO? Did you mean "at most once per DSO"? > Quite similar in spirit to > your objc_objfile_data change. Do we still get a significant > slowdown from breakpoint_re_set with that change? We probably wouldn't. I'll make a proper patch, measure, and send to gdb-patches. > (I've also noticed before that lookup_minimal_symbol_text iterates > over all objfiles even if given an objfile to work with (worse > case, but then again, new objfiles are appended at the end > of the objfile list). Probably contributes to the noise, but > these things add up.) I'll fix that as well. Thanks for comments! -- Paul Pluzhnikov ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: JIT interface slowness 2011-01-03 23:47 ` Paul Pluzhnikov @ 2011-01-04 0:13 ` Pedro Alves 0 siblings, 0 replies; 14+ messages in thread From: Pedro Alves @ 2011-01-04 0:13 UTC (permalink / raw) To: Paul Pluzhnikov; +Cc: Vyacheslav Egorov, gdb On Monday 03 January 2011 23:46:50, Paul Pluzhnikov wrote: > >> Do we need a new OBJF_JIT flag, or is above patch good enough? > > > > What if we record a per-objfile flag or cache storing whether a given > > objfile contains "longjmp" related symbols, so that we only lookup > > those symbols at least once per DSO? > > Did you mean "at most once per DSO"? Yep. -- Pedro Alves ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads: [~2011-01-04  0:13 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-31 19:43 JIT interface slowness Vyacheslav Egorov
2010-12-31 20:10 ` Pedro Alves
2010-12-31 21:39 ` Vyacheslav Egorov
2010-12-31 22:23 ` Pedro Alves
2010-12-31 23:12 ` Vyacheslav Egorov
2010-12-31 23:36 ` Pedro Alves
2011-01-02  7:54 ` Paul Pluzhnikov
2011-01-03 10:44 ` Vyacheslav Egorov
2011-01-03 17:32 ` Paul Pluzhnikov
2011-01-03 22:01 ` Paul Pluzhnikov
2011-01-03 23:32 ` Pedro Alves
2011-01-03 23:40 ` Pedro Alves
2011-01-03 23:47 ` Paul Pluzhnikov
2011-01-04  0:13 ` Pedro Alves
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox