* [RFA] bug in symtab.c:lookup_block_symbol()'s search method
@ 2001-09-09 7:48 Jason Molenda
2001-09-10 11:24 ` Michael Snyder
0 siblings, 1 reply; 34+ messages in thread
From: Jason Molenda @ 2001-09-09 7:48 UTC (permalink / raw)
To: gdb-patches
This patch fixes a bug introduced in October, 2000. Discussion and history
are here:
http://sources.redhat.com/ml/gdb-patches/2001-09/msg00076.html
Analysis of performance impact is here:
http://sources.redhat.com/ml/gdb-patches/2001-09/msg00084.html
This patch should be approved for both the mainline and the 5.1 branch.
It adds no new testsuite failures.
Jason
2001-09-07 Jason Molenda (jmolenda@apple.com)
* symtab.c (lookup_block_symbol): Break out of linear search
if we're past the range of possible matches.
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.42
diff -u -p -r1.42 symtab.c
--- symtab.c 2001/07/07 17:19:50 1.42
+++ symtab.c 2001/09/09 14:17:25
@@ -1249,6 +1249,10 @@ lookup_block_symbol (register const stru
{
return sym;
}
+ if (SYMBOL_SOURCE_NAME (sym)[0] > name[0])
+ {
+ break;
+ }
bot++;
}
}
^ permalink raw reply [flat|nested] 34+ messages in thread* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-09 7:48 [RFA] bug in symtab.c:lookup_block_symbol()'s search method Jason Molenda @ 2001-09-10 11:24 ` Michael Snyder 2001-09-10 11:32 ` Jason Molenda 0 siblings, 1 reply; 34+ messages in thread From: Michael Snyder @ 2001-09-10 11:24 UTC (permalink / raw) To: Jason Molenda; +Cc: gdb-patches Jason Molenda wrote: > > This patch fixes a bug introduced in October, 2000. Discussion and history > are here: > http://sources.redhat.com/ml/gdb-patches/2001-09/msg00076.html > > Analysis of performance impact is here: > http://sources.redhat.com/ml/gdb-patches/2001-09/msg00084.html > > This patch should be approved for both the mainline and the 5.1 branch. > It adds no new testsuite failures. > > Jason > > 2001-09-07 Jason Molenda (jmolenda@apple.com) > > * symtab.c (lookup_block_symbol): Break out of linear search > if we're past the range of possible matches. > > Index: symtab.c > =================================================================== > RCS file: /cvs/src/src/gdb/symtab.c,v > retrieving revision 1.42 > diff -u -p -r1.42 symtab.c > --- symtab.c 2001/07/07 17:19:50 1.42 > +++ symtab.c 2001/09/09 14:17:25 > @@ -1249,6 +1249,10 @@ lookup_block_symbol (register const stru > { > return sym; > } > + if (SYMBOL_SOURCE_NAME (sym)[0] > name[0]) > + { > + break; > + } If this test works, then wouldn't some sort of strcmp test work too? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-10 11:24 ` Michael Snyder @ 2001-09-10 11:32 ` Jason Molenda 2001-09-10 11:50 ` Daniel Berlin 0 siblings, 1 reply; 34+ messages in thread From: Jason Molenda @ 2001-09-10 11:32 UTC (permalink / raw) To: Michael Snyder; +Cc: gdb-patches On Mon, Sep 10, 2001 at 11:24:10AM -0700, Michael Snyder wrote: > > + if (SYMBOL_SOURCE_NAME (sym)[0] > name[0]) > > + { > > + break; > > + } > > If this test works, then wouldn't some sort of strcmp test work too? It is comparing unmangled names, so there may be space chars which are not significant. I don't know enough about mangling to be confident that I could put a strcmp in here safely. I'll try it out later today and see if the testsuite happens to trip up on it. The SYMBOL_MATCHES_NAME() call used for the actual comparison uses util.c:strcmp_iw, which skips over whitespace. (On the good news side, when I first started working on this loop, I found that using strcmp() to detect matches produced a number of testsuite regressions, helping to Show Me The Light.) Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-10 11:32 ` Jason Molenda @ 2001-09-10 11:50 ` Daniel Berlin 2001-09-10 11:52 ` Daniel Berlin [not found] ` <20010910130347.A5628@shell17.ba.best.com> 0 siblings, 2 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-10 11:50 UTC (permalink / raw) To: Jason Molenda; +Cc: Michael Snyder, gdb-patches Jason Molenda <jason-swarelist@molenda.com> writes: > On Mon, Sep 10, 2001 at 11:24:10AM -0700, Michael Snyder wrote: > >> > + if (SYMBOL_SOURCE_NAME (sym)[0] > name[0]) >> > + { >> > + break; >> > + } >> >> If this test works, then wouldn't some sort of strcmp test work too? > > It is comparing unmangled names, so there may be space chars which > are not significant. I don't know enough about mangling to be > confident that I could put a strcmp in here safely. You could if they were always mangled, but they aren't. The unmangled names need to be strcmp_iw'd. We use demangled names when doing symbol lookups because it is easier to demangle whatever you type in, and look that up, than it is to take what you type in, mangle it, and look that up. *Much* easier. > out later today and see if the testsuite happens to trip up on it. > The SYMBOL_MATCHES_NAME() call used for the actual comparison uses > util.c:strcmp_iw, which skips over whitespace. Right. > > (On the good news side, when I first started working on this loop, > I found that using strcmp() to detect matches produced a number of > testsuite regressions, helping to Show Me The Light.) Your test above isn't quite correct, actually. There is a corner case of the first character being a space. I don't think this can ever occur in c or C++, no clue about other languages. > > Jason -- "One time the power went out in my house and I had to use the flash on my camera to see my way around. I made a sandwich and took fifty pictures of my face. The neighbors thought there was lightning in my house. "-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-10 11:50 ` Daniel Berlin @ 2001-09-10 11:52 ` Daniel Berlin [not found] ` <20010910130347.A5628@shell17.ba.best.com> 1 sibling, 0 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-10 11:52 UTC (permalink / raw) To: Daniel Berlin; +Cc: Jason Molenda, Michael Snyder, gdb-patches Daniel Berlin <dan@cgsoftware.com> writes: > Jason Molenda <jason-swarelist@molenda.com> writes: > >> On Mon, Sep 10, 2001 at 11:24:10AM -0700, Michael Snyder wrote: >> >>> > + if (SYMBOL_SOURCE_NAME (sym)[0] > name[0]) >>> > + { >>> > + break; >>> > + } >>> >>> If this test works, then wouldn't some sort of strcmp test work too? >> >> It is comparing unmangled names, so there may be space chars which >> are not significant. I don't know enough about mangling to be >> confident that I could put a strcmp in here safely. > You could if they were always mangled, but they aren't. > The unmangled names need to be strcmp_iw'd. > > We use demangled names when doing symbol lookups because it is easier > to demangle whatever you type in, and look that up, than it is to take > what you type in, mangle it, and look that up. > > *Much* easier. > >> out later today and see if the testsuite happens to trip up on it. >> The SYMBOL_MATCHES_NAME() call used for the actual comparison uses >> util.c:strcmp_iw, which skips over whitespace. > Right. > >> >> (On the good news side, when I first started working on this loop, >> I found that using strcmp() to detect matches produced a number of >> testsuite regressions, helping to Show Me The Light.) > > Your test above isn't quite correct, actually. > There is a corner case of the first character being a space. I don't > think this can ever occur in c or C++, no clue about other > languages. Oh, and in case it's not clear, the problem with the first char being space is that in strcmp_iw, we'd ignore that. > >> >> Jason > > -- > "One time the power went out in my house and I had to use the > flash on my camera to see my way around. I made a sandwich and > took fifty pictures of my face. The neighbors thought there was > lightning in my house. > "-Steven Wright -- ""-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <20010910130347.A5628@shell17.ba.best.com>]
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method [not found] ` <20010910130347.A5628@shell17.ba.best.com> @ 2001-09-10 14:17 ` Daniel Berlin 2001-09-14 7:53 ` Andrew Cagney 0 siblings, 1 reply; 34+ messages in thread From: Daniel Berlin @ 2001-09-10 14:17 UTC (permalink / raw) To: Jason Molenda; +Cc: gdb-patches Jason Molenda <jason-swarelist@molenda.com> writes: > On Mon, Sep 10, 2001 at 02:50:29PM -0400, Daniel Berlin wrote: > >> Your test above isn't quite correct, actually. >> There is a corner case of the first character being a space. I don't >> think this can ever occur in c or C++, no clue about other languages. > > The initial binary search in lookup_block_symbol() isn't correct > for the same reason - it uses individual character comparisons to > determine less-than/greater-than relationships on the first character > of the symbol, and then uses the stock strcmp to narrow in on the > rest of the symbol. Nevertheless, it all generally works. Yet I remember just a few messages ago someone complaining that my patches don't handle the corner cases, which strangely, aren't handled anyway. It's like speed was being put ahead of correctness. > The binary search gets us to the beginning of the valid matches; the > follow-up linear search finds the right symbol. > > My instinct is to change strcmp_iw() to streq_iw() (as well as all > the callers) and add a new strcmp_iw() function which returns a > less-than/greater-than relationship instead of just match/not-match. > That would be the best way to deal with all of this. The best way to deal with this is to get rid of most of it. > > > IMNSHO gdb 5.1 can not be released with the symbol binary search > lookup broken as it has been for the last year. Broken? You mean slower. It works *correctly*, just not as *efficiently* as it could. Big difference. > The patch I sent > is a conservative one which brings the lookup cost near the original > gdb search time. Errr, realize I did *extensive* performance testing on that patch before submitting it. On files ranging from 1's to 100's of megs of debug info. The lookup cost in general was a fraction of what it used to be, even with the bug. In no case was the lookup time greater in any programs i profiled gdb on. > the behavior of strcmp_iw() so that it can be used here, but that's > a slightly more pervasive change. As I pointed out to you in private email, a better solution is to avoid the basic need to always binary search altogether. It's quite easy to implement, taking somewhere around 5 hours to throw together something that worked, and passed all the regression tests. Assign each block a unique id (easy, since blocks are only created once each). Create a mapping of symbol names to block id's they appear in (easy, since we can just add the symbol names to the map as we see them while creating the blocks containing them.), storing the block id's for a particular symbol in a sparse bitmap. For symbol lookups: Lookup the name in the mapping once at the beginning of the symbol lookup. Where we call lookup_block_symbol now, just test to see if the id'd bit is set in the sparse bitmap for the symbol. If so, call lookup block symbol. Otherwise, continue on. This moves us from O(blocks log n) worst case to O(blocks) worst case (where n is the number of symbols in a given block) When you have blocks with 100k symbols, it makes a large difference (a factor of 16, in fact), since we'll never look in blocks which won't contain the symbol. We already have minsym_hash_iw, which gives you the hash function to use for the mapping. I went down the "make lookup_block_symbol itself faster" road myself, fixing not only that bug, but replacing the linear array in the blocks with a hashtable. It's still slower than just avoiding the block lookup in the first place. > > Jason -- "I have an answering machine in my car. It says, "I'm home now. But leave a message and I'll call when I'm out." "-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-10 14:17 ` Daniel Berlin @ 2001-09-14 7:53 ` Andrew Cagney 2001-09-14 8:53 ` Daniel Berlin 2001-09-14 9:06 ` Eli Zaretskii 0 siblings, 2 replies; 34+ messages in thread From: Andrew Cagney @ 2001-09-14 7:53 UTC (permalink / raw) To: Daniel Berlin; +Cc: Jason Molenda, gdb-patches >> IMNSHO gdb 5.1 can not be released with the symbol binary search >> lookup broken as it has been for the last year. > > Broken? > You mean slower. > It works *correctly*, just not as *efficiently* as it could. > Big difference. gdb is measured against many criteria, one is performance. if gdb's performance drops, gdb has regressed. some would describe it as broken. in some some situtations - replacing a macro by a function say - such a regression is considered acceptable. in other cases - such as an algorithm change that is ment to improve performance - it is not. andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 7:53 ` Andrew Cagney @ 2001-09-14 8:53 ` Daniel Berlin 2001-09-14 9:06 ` Eli Zaretskii 1 sibling, 0 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-14 8:53 UTC (permalink / raw) To: Andrew Cagney; +Cc: Daniel Berlin, Jason Molenda, gdb-patches Andrew Cagney <ac131313@cygnus.com> writes: >>> IMNSHO gdb 5.1 can not be released with the symbol binary search >>> lookup broken as it has been for the last year. >> Broken? >> You mean slower. >> It works *correctly*, just not as *efficiently* as it could. >> Big difference. > > gdb is measured against many criteria, one is performance. if gdb's > performance drops, gdb has regressed. some would describe it as > broken. By that measure, gdb has been broken for a *long* time, well before my patch. > > in some some situtations - replacing a macro by a function say - such > a regression is considered acceptable. in other cases - such as an > algorithm change that is ment to improve performance - it is not. Pardone? As you will note, my patch improved correctness in corner cases, at the expense of some performance. Jason's patch will remove that correctness, on the assumption that these corner cases never occur, which nobody has answered definitively either way for all the languages GDB supports. If someone would have told me that other languages besides C and C++ will never cause symbol lookups to start with something strcmp_iw ignores, i would have not removed that portion of the code. There was no obvious correctness to the code that existed before, in fact, it was obviously broken in cases. The question was whether these cases can occur or not. Removing the code was not unintentional. Nobody ever answered the emails, so i did the safe thing. I've been yelled at before for valuing performance over correctness. So I value correctness over performance, and get yelled at again. Is it any wonder I don't participate in GDB development anymore? > > andrew > -- "If toast always lands butter-side down, and cats always land on their feet, what happen if you strap toast on the back of a cat and drop it? "-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 7:53 ` Andrew Cagney 2001-09-14 8:53 ` Daniel Berlin @ 2001-09-14 9:06 ` Eli Zaretskii 2001-09-14 9:13 ` Jason Molenda 1 sibling, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2001-09-14 9:06 UTC (permalink / raw) To: ac131313; +Cc: dan, jason-swarelist, gdb-patches > Date: Fri, 14 Sep 2001 10:53:35 -0400 > From: Andrew Cagney <ac131313@cygnus.com> > > >> IMNSHO gdb 5.1 can not be released with the symbol binary search > >> lookup broken as it has been for the last year. > > > > Broken? > > You mean slower. > > It works *correctly*, just not as *efficiently* as it could. > > Big difference. > > gdb is measured against many criteria, one is performance. if gdb's > performance drops, gdb has regressed. some would describe it as broken. Some would describe this a broken, but most reasonable people (including you, Andrew ;-) probably won't. I agree with Dan here: I don't think this specific issue can be a valid reason for saying that GDB is ``broken'' and that ``gdb 5.1 can not be released'' in its current shape. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 9:06 ` Eli Zaretskii @ 2001-09-14 9:13 ` Jason Molenda 2001-09-14 9:58 ` Daniel Berlin 2001-09-14 10:52 ` Eli Zaretskii 0 siblings, 2 replies; 34+ messages in thread From: Jason Molenda @ 2001-09-14 9:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb-patches On Fri, Sep 14, 2001 at 07:02:24PM +0300, Eli Zaretskii wrote: > > I agree with Dan here: I don't think this specific issue can be a > valid reason for saying that GDB is ``broken'' and that ``gdb 5.1 can > not be released'' in its current shape. Unfortunately, this performance issue becomes "breakage" on some platforms. Why, MacOS X just happens to be one. In some of our libraries, there are a lot of these opaque struct pointers in use. Looking up one of those variables requires gdb to sit-and-spin for 5-10 seconds, before realizing that it has no definition. Why does it take 5-10 seconds? Because lookup_block_symbol was changed from a binary search to a linear search for symbols not defined in a given block. After restoring the search to O-log time, the symbol lookup returns to being unmeasurably fast. If you're using GDB in under an IDE and you have a Locals window open, and one of those locals is an opaque structure, whenever you step into our out of that frame, you'll have this 5-10 second delay. Or if you have a stack trace window open and you have some opaque structures in that stack trace, every time that's recomputed you'll see these huge delays. I am hard pressed to not define this as "broken". Particularly when it is easy to fix, despite vague hand-waving of how this is not correct. Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 9:13 ` Jason Molenda @ 2001-09-14 9:58 ` Daniel Berlin 2001-09-14 10:55 ` Eli Zaretskii 2001-09-14 10:52 ` Eli Zaretskii 1 sibling, 1 reply; 34+ messages in thread From: Daniel Berlin @ 2001-09-14 9:58 UTC (permalink / raw) To: Jason Molenda; +Cc: Eli Zaretskii, gdb-patches Jason Molenda <jason-swarelist@molenda.com> writes: > On Fri, Sep 14, 2001 at 07:02:24PM +0300, Eli Zaretskii wrote: > >> >> I agree with Dan here: I don't think this specific issue can be a >> valid reason for saying that GDB is ``broken'' and that ``gdb 5.1 can >> not be released'' in its current shape. > > Unfortunately, this performance issue becomes "breakage" on some > platforms. Why, MacOS X just happens to be one. In some of our > libraries, there are a lot of these opaque struct pointers in use. > Looking up one of those variables requires gdb to sit-and-spin for > 5-10 seconds, before realizing that it has no definition. Why does > it take 5-10 seconds? Because lookup_block_symbol was changed from > a binary search to a linear search for symbols not defined in a > given block. After restoring the search to O-log time, the symbol > lookup returns to being unmeasurably fast. Returns to being unmeasurably fast? Try reverting the patch entirely, and see how fast it is. It was a complete linear search for objective c, C++, and java before. No binary search at all. The search was not O-log time before I made the patch, so please don't pretend it's a regression from what was previously there. > > If you're using GDB in under an IDE and you have a Locals window > open, and one of those locals is an opaque structure, whenever you > step into our out of that frame, you'll have this 5-10 second delay. > Or if you have a stack trace window open and you have some opaque > structures in that stack trace, every time that's recomputed you'll > see these huge delays. > > I am hard pressed to not define this as "broken". Particularly > when it is easy to fix, despite vague hand-waving of how this is > not correct. Vague hand-waving? Fine Jason. Whatever you say. I'm unsubscribing from the list, I'm tired of trying to be helpful. > > Jason -- "They say we're 98% water. We're that close to drowning... (Picks up his glass of water from the stool...) I like to live on the edge... "-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 9:58 ` Daniel Berlin @ 2001-09-14 10:55 ` Eli Zaretskii 0 siblings, 0 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-14 10:55 UTC (permalink / raw) To: dan; +Cc: jason-swarelist, gdb-patches > From: Daniel Berlin <dan@cgsoftware.com> > Date: Fri, 14 Sep 2001 12:58:56 -0400 > > I'm unsubscribing from the list, I'm tired of trying to be helpful. I urge you not to do that. I'm sure Jason didn't mean that to sound so harsh. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 9:13 ` Jason Molenda 2001-09-14 9:58 ` Daniel Berlin @ 2001-09-14 10:52 ` Eli Zaretskii 2001-09-14 10:59 ` Daniel Jacobowitz 2001-09-15 0:54 ` Jason Molenda 1 sibling, 2 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-14 10:52 UTC (permalink / raw) To: jason-swarelist; +Cc: gdb-patches > Date: Fri, 14 Sep 2001 09:12:41 -0700 > From: Jason Molenda <jason-swarelist@molenda.com> > > > I agree with Dan here: I don't think this specific issue can be a > > valid reason for saying that GDB is ``broken'' and that ``gdb 5.1 can > > not be released'' in its current shape. > > Unfortunately, this performance issue becomes "breakage" on some > platforms. Even if the performance hit is significant, I fail to understand how can someone say the entire program is broken, or that it cannot be released. Can we please get things back into their proportion? > Why, MacOS X just happens to be one. In some of our > libraries, there are a lot of these opaque struct pointers in use. > Looking up one of those variables requires gdb to sit-and-spin for > 5-10 seconds, before realizing that it has no definition. On what machine? Timing data is useless without saying on what kind of CPU and with what clock speed did you see that. Anyway, I don't consider 5-10 seconds such a long time. We still have in GDB operations that take much more, and we don't consider it ``broken'' because of that. > If you're using GDB in under an IDE and you have a Locals window > open, and one of those locals is an opaque structure, whenever you > step into our out of that frame, you'll have this 5-10 second delay. So display the hourglass for 10 seconds and be done with that. No one will really notice, except you and me. The world is full with good software that sometimes has 10-sec delays, to say nothing of bad software. > I am hard pressed to not define this as "broken". Particularly > when it is easy to fix, despite vague hand-waving of how this is > not correct. I'd urge everyone to please stay calm and technical, and avoid offending others by careless wording. Please! Can we disagree but still stay friends? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 10:52 ` Eli Zaretskii @ 2001-09-14 10:59 ` Daniel Jacobowitz 2001-09-14 11:57 ` Eli Zaretskii 2001-09-15 0:54 ` Jason Molenda 1 sibling, 1 reply; 34+ messages in thread From: Daniel Jacobowitz @ 2001-09-14 10:59 UTC (permalink / raw) To: gdb-patches On Fri, Sep 14, 2001 at 08:49:28PM +0300, Eli Zaretskii wrote: > > Date: Fri, 14 Sep 2001 09:12:41 -0700 > > From: Jason Molenda <jason-swarelist@molenda.com> > > > > > I agree with Dan here: I don't think this specific issue can be a > > > valid reason for saying that GDB is ``broken'' and that ``gdb 5.1 can > > > not be released'' in its current shape. > > > > Unfortunately, this performance issue becomes "breakage" on some > > platforms. > > Even if the performance hit is significant, I fail to understand how > can someone say the entire program is broken, or that it cannot be > released. Can we please get things back into their proportion? This much I can mostly agree with, but... > Anyway, I don't consider 5-10 seconds such a long time. We still have > in GDB operations that take much more, and we don't consider it > ``broken'' because of that. Maybe you don't. I'm sure I'm not the only one who would like to start eliminating them. I'm guessing that Jason is another. A lot of operations take frustratingly long that shouldn't. For instance, Jason is probably testing on a machine capable of displaying a GUI IDE and running MacOS X. That means a PowerPC, presumably, and at least in the 300MHz-500MHz range. I do much of my native GDB testing on a 50MHz MIPS R5432 board, which is probably on the order of thirty or forty times slower. His ten second delays become my five minute coffee breaks. I'm not exagerating; hitting tab accidentally while typing locks gdb solid for five minutes. I intend to do some work on fixing that. If he has a patch which can remove one such delay, I'm vastly in favor of it. > > If you're using GDB in under an IDE and you have a Locals window > > open, and one of those locals is an opaque structure, whenever you > > step into our out of that frame, you'll have this 5-10 second delay. > > So display the hourglass for 10 seconds and be done with that. No one > will really notice, except you and me. The world is full with good > software that sometimes has 10-sec delays, to say nothing of bad > software. Not if every Step instruction takes ten seconds! That makes debugging practically infeasible. -- Daniel Jacobowitz Carnegie Mellon University MontaVista Software Debian GNU/Linux Developer ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 10:59 ` Daniel Jacobowitz @ 2001-09-14 11:57 ` Eli Zaretskii 0 siblings, 0 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-14 11:57 UTC (permalink / raw) To: drow; +Cc: gdb-patches > Date: Fri, 14 Sep 2001 14:00:51 -0400 > From: Daniel Jacobowitz <drow@mvista.com> > > > > Even if the performance hit is significant, I fail to understand how > > can someone say the entire program is broken, or that it cannot be > > released. Can we please get things back into their proportion? > > This much I can mostly agree with, but... > > > Anyway, I don't consider 5-10 seconds such a long time. We still have > > in GDB operations that take much more, and we don't consider it > > ``broken'' because of that. > > Maybe you don't. I'm sure I'm not the only one who would like to start > eliminating them. You are not the only one: I also would like them to be eliminated, as I'm sure is Dan and everyone else on this list. The issue is not whether to eliminate them, but whether their presence makes GDB ``broken'' and unsuitable for a release, and whether it justifies the unfortunate wording seen in this thread that made Dan feel he is unwanted here. I think the answer to the last two issues is a qualified NO. > A lot of operations take frustratingly long that shouldn't. Yes, we have quite a few operations in GDB that are slow, some of them painfully slow. But please do not attack people who wrote the slow code. Please let's assume that we all write the best code we can come up with, and even if someone errs (and we all do), that someone doesn't deserve to be bashed. Dan, in particular, has a long history of eliminating slow code from GDB, and I think he doesn't deserve to be accused in the opposite. I'm sorry to sound as if I were preaching, but it disturbs me deeply to see technical discussions to turn into personal flamewar. > For instance, Jason is probably testing on a machine capable of > displaying a GUI IDE and running MacOS X. That means a PowerPC, > presumably, and at least in the 300MHz-500MHz range. I do much of my > native GDB testing on a 50MHz MIPS R5432 board, which is probably on > the order of thirty or forty times slower. His ten second delays > become my five minute coffee breaks. Every modern program will be slow in a 50MHz machine. For example, Emacs will not be able to keep up with the keyboard auto-repeat rate and scroll the display if you press and hold C-v. Again, I'm not against eliminating delays, if they keep the code correct. > Not if every Step instruction takes ten seconds! That makes debugging > practically infeasible. I disagree. Many of my colleagues on my day-time job use debuggers (not GDB) which does take on the order of 5 seconds to step on a fast machine (a 3-CPU 250MHz SGI box), and none of them thinks it's infeasible. Again, please don't misinterpret me: I'm all for eliminating delays. But taking things out of proportion, and then attacking people based on those out-of-proportion impressions is something that I cannot say I care about. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-14 10:52 ` Eli Zaretskii 2001-09-14 10:59 ` Daniel Jacobowitz @ 2001-09-15 0:54 ` Jason Molenda 2001-09-15 3:43 ` Eli Zaretskii 2001-09-15 7:54 ` Daniel Berlin 1 sibling, 2 replies; 34+ messages in thread From: Jason Molenda @ 2001-09-15 0:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb-patches On Fri, Sep 14, 2001 at 08:49:28PM +0300, Eli Zaretskii wrote: > Even if the performance hit is significant, I fail to understand how > can someone say the entire program is broken, or that it cannot be > released. Can we please get things back into their proportion? I'll stand what I said earlier. > > Why, MacOS X just happens to be one. In some of our > > libraries, there are a lot of these opaque struct pointers in use. > > Looking up one of those variables requires gdb to sit-and-spin for > > 5-10 seconds, before realizing that it has no definition. > > On what machine? Timing data is useless without saying on what kind > of CPU and with what clock speed did you see that. I re-ran the timings to verify on a 733 MHz G4 PPC Mac OS X machine. My test program is called "SimpleText", it is 80k of C source, it is a little notepad GUI program that is included with the Apple development tools. It links in two of our fundamental shared libraries. As I mentioned earlier, our Carbon library makes heavy use of these "opaque struct pointers" - in the Carbon header files you'll see things like "typedef struct OpaqueWindowPtr *WindowPtr", and people will pass variables of type "WindowPtr" around to various system calls. If you have a variable of this type, every time you examine it, gdb will try to find a definition of struct OpaqueWindowPtr. With the gdb 5.1 symtab.c, typing "p aWindowPtr" in SimpleText will take 3.5 seconds to complete (as reported by 'maint time 1'). With the patches (or with gdb 5.0), typing "p aWindowPtr" in SimpleText takes between 0.01 and 0.00 seconds (as reported by 'maint time 1' -- it's at the limits of maint time's granularity. Hence my earlier characterization as "unmeasurable".) I've already written this more than once, but I'll cover the technical details again. lookup_block_symbol is given a symbol name to find, and a pointer to a block to search--the block has a list of symbol names present there. l_b_s() does a binary search to get to the bottom range of possible matches of the symbol, then uses a simple linear search to step over a few symbols. Once it is beyond the range of plausible matches, it exits out of the linear search and returns a no-match result. Dan's patch dropped the "exits out of linear search" -- now it binary searches to the beginning of plausible ranges, and steps through to the end of the block. It's now an O(1/2*N) algorithm for worst-case, i.e. non-matches. More concretely. With the current gdb 5.1 sources, in SimpleText, running "p aWindowPtr" will cause gdb to loop 6,785,175 times in that linear-search after the binary search. With the patch I submitted, or the gdb 5.0 sources, running "p aWindowPtr" will run this loop 4,390 times. > Anyway, I don't consider 5-10 seconds such a long time. We still have > in GDB operations that take much more, and we don't consider it > ``broken'' because of that. It used to run in a hundredth of a second. Now it's taking over three seconds, on one of our fastest machines. When you hit the 'next' button in a GUI and it seizes up for four - ten seconds, and it didn't used to do that. How else can it be described? This behavior is not an act of god, an inevitable consequence of making gdb better, or an unavoidable tradeoff. It's a mistake. The argument against this this change is that there _might_ be a language which has space chars at the beginning of a demangled string. lookup_block_symbol() is already very, very broken if that's the case. The initial binary search uses a combination of first-letter comparisons and strcmp() -- the latter of which is clearly out of the question for making unmangled comparisons. > > If you're using GDB in under an IDE and you have a Locals window > > open, and one of those locals is an opaque structure, whenever you > > step into our out of that frame, you'll have this 5-10 second delay. > > So display the hourglass for 10 seconds and be done with that. No one > will really notice, except you and me. The world is full with good > software that sometimes has 10-sec delays, to say nothing of bad > software. I strongly disagree. A word processor that delays for two seconds every time you add a new character is not acceptable. A debugger that can take ten seconds to complete a "next" is not acceptable. If we're paging in a big shared library, there's nothing we can do about it. If we're doing a linear search instead of a binary seach, there _is_ something we can do about it. I'm not just making things up here. Our IDE at Apple currently has "next" performance of around 0.5 - 0.7 seconds -- and we hear no end of complaints about how it is too slow. There is a very tangible speed at which users want a GUI debugger to respond to a simple Next, and we're not meeting that yet. If 0.7 seconds is too slow, you can sure bet that 5 seconds will gain their ire. Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 0:54 ` Jason Molenda @ 2001-09-15 3:43 ` Eli Zaretskii 2001-09-15 8:01 ` Daniel Berlin 2001-09-15 12:52 ` Jason Molenda 2001-09-15 7:54 ` Daniel Berlin 1 sibling, 2 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-15 3:43 UTC (permalink / raw) To: jason-swarelist; +Cc: gdb-patches > Date: Sat, 15 Sep 2001 00:52:56 -0700 > From: Jason Molenda <jason-swarelist@molenda.com> > > With the gdb 5.1 symtab.c, typing "p aWindowPtr" in SimpleText will > take 3.5 seconds to complete (as reported by 'maint time 1'). > > With the patches (or with gdb 5.0), typing "p aWindowPtr" in > SimpleText takes between 0.01 and 0.00 seconds (as reported by > 'maint time 1' -- it's at the limits of maint time's granularity. > Hence my earlier characterization as "unmeasurable".) [...] > Dan's patch dropped the "exits out of linear search" -- now it > binary searches to the beginning of plausible ranges, and steps > through to the end of the block. It's now an O(1/2*N) algorithm > for worst-case, i.e. non-matches. Are you saying that Dan's change was a gratuitous one, i.e. there was no reason whatsoever to make that change? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 3:43 ` Eli Zaretskii @ 2001-09-15 8:01 ` Daniel Berlin 2001-09-15 9:09 ` Eli Zaretskii 2001-09-15 12:36 ` Daniel Jacobowitz 2001-09-15 12:52 ` Jason Molenda 1 sibling, 2 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-15 8:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: jason-swarelist, gdb-patches "Eli Zaretskii" <eliz@is.elta.co.il> writes: >> Date: Sat, 15 Sep 2001 00:52:56 -0700 >> From: Jason Molenda <jason-swarelist@molenda.com> >> >> With the gdb 5.1 symtab.c, typing "p aWindowPtr" in SimpleText will >> take 3.5 seconds to complete (as reported by 'maint time 1'). >> >> With the patches (or with gdb 5.0), typing "p aWindowPtr" in >> SimpleText takes between 0.01 and 0.00 seconds (as reported by >> 'maint time 1' -- it's at the limits of maint time's granularity. >> Hence my earlier characterization as "unmeasurable".) > [...] > > Dan's patch dropped the "exits out of linear search" -- now it >> binary searches to the beginning of plausible ranges, and steps >> through to the end of the block. It's now an O(1/2*N) algorithm >> for worst-case, i.e. non-matches. > > Are you saying that Dan's change was a gratuitous one, i.e. there was > no reason whatsoever to make that change? He would be correct, but only if he can prove that non of the languages GDB supports, could ever end up calling lookup_block_symbol (directly or indirectly), or contain symbols, that start with a character strcmp_iw ignores. His argument is that if this was true, it would mean a bunch of other things are broken too. This is not a good argument, it's quite possible they are broken. Tons of stuff related to support for more than just C in GDB are broken, or not in a good state. It's quite possible I missed removing some other instances of the same assumption in that routine. Which, BTW, would slow him down even more. All i'm asking is that he *prove*, besides using the GDB testsuite (which isn't a good test here, since it's tests for languages more than just C are pretty lacking), that his change is correct. I.E. That such a case would never happen. This is not easy. I couldn't do it, which is why i started concentrating on alternate methods of speeding it up. His time would be better spent just implementing one of the methods that doesn't require dealing with this issue at all, then it would trying to characterize my concern as "vague hand-waving". I'd estimate it would take him about the same time he's spent arguing on this mailing list, to implement them. This is based of course, on how long it took me. I'd also happily send the patches to him, so he could clean them up and submit them, or whatever, which would take an absolutely trivial amount of time. --Dan -- "What do batteries run on? "-Steven Wright ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 8:01 ` Daniel Berlin @ 2001-09-15 9:09 ` Eli Zaretskii 2001-09-15 12:36 ` Daniel Jacobowitz 1 sibling, 0 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-15 9:09 UTC (permalink / raw) To: dan; +Cc: jason-swarelist, gdb-patches > From: Daniel Berlin <dan@cgsoftware.com> > Date: Sat, 15 Sep 2001 11:01:24 -0400 > > All i'm asking is that he *prove*, besides using the GDB testsuite > (which isn't a good test here, since it's tests for languages more > than just C are pretty lacking), that his change is correct. I tend to agree. If we aren't sure such cases don't exist, we might be introducing a subtle bug. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 8:01 ` Daniel Berlin 2001-09-15 9:09 ` Eli Zaretskii @ 2001-09-15 12:36 ` Daniel Jacobowitz 1 sibling, 0 replies; 34+ messages in thread From: Daniel Jacobowitz @ 2001-09-15 12:36 UTC (permalink / raw) To: Daniel Berlin; +Cc: Eli Zaretskii, jason-swarelist, gdb-patches On Sat, Sep 15, 2001 at 11:01:24AM -0400, Daniel Berlin wrote: > I'd estimate it would take him about the same time he's spent arguing > on this mailing list, to implement them. This is based of course, on > how long it took me. > I'd also happily send the patches to him, so he could clean them up > and submit them, or whatever, which would take an absolutely trivial > amount of time. In that case, please do that. Or send them to me, and I'll clean them up as fast as I can wrap my head around them. -- Daniel Jacobowitz Carnegie Mellon University MontaVista Software Debian GNU/Linux Developer ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 3:43 ` Eli Zaretskii 2001-09-15 8:01 ` Daniel Berlin @ 2001-09-15 12:52 ` Jason Molenda 1 sibling, 0 replies; 34+ messages in thread From: Jason Molenda @ 2001-09-15 12:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb-patches On Sat, Sep 15, 2001 at 01:38:18PM +0300, Eli Zaretskii wrote: > > Dan's patch dropped the "exits out of linear search" -- now it > > binary searches to the beginning of plausible ranges, and steps > > through to the end of the block. It's now an O(1/2*N) algorithm > > for worst-case, i.e. non-matches. > > Are you saying that Dan's change was a gratuitous one, i.e. there was > no reason whatsoever to make that change? Yes. The lookup_block_symbol() function already depends on the first character being a meaningful, comparable character (the binary search is based on this, and on strcmp() of all things), so ending the search based on this criteria will not further weaken this algorithm. If there is real concern that this method of comparison is invalid for some languages, then the entire patch Dan submitted needs to be reworked or we need to stick with comparing only mangled/native names. It can't be had both ways -- either the whole conversion to umangled comparisons, as it is currently written, is invalid, or my correction of the linear search is correct. The current lookup_block_symbol search code is as incorrect as my patch. The only benefits of the current lookup_block_symbol code is that it linear searches over half of the symbols in the block, so it stands a 50% chance of hitting one of these names that we're talking about. I'd not characterize that as 'correct'. Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 0:54 ` Jason Molenda 2001-09-15 3:43 ` Eli Zaretskii @ 2001-09-15 7:54 ` Daniel Berlin 2001-09-15 13:08 ` Jason Molenda 1 sibling, 1 reply; 34+ messages in thread From: Daniel Berlin @ 2001-09-15 7:54 UTC (permalink / raw) To: Jason Molenda; +Cc: Eli Zaretskii, gdb-patches > > > >> Anyway, I don't consider 5-10 seconds such a long time. We still have >> in GDB operations that take much more, and we don't consider it >> ``broken'' because of that. > > It used to run in a hundredth of a second. Now it's taking over > three seconds, on one of our fastest machines. When you hit the > 'next' button in a GUI and it seizes up for four - ten seconds, > and it didn't used to do that. How else can it be described? This > behavior is not an act of god, an inevitable consequence of making > gdb better, or an unavoidable tradeoff. It's a mistake. > > The argument against this this change is that there _might_ be a > language which has space chars at the beginning of a demangled > string. Not just demangled. Non-demangled names as well, since you are using SYMBOL_SOURCE_NAME. Not just space chars, either Anything strcmp_iw ignores, which is more than just space chars. Why don't you just try to find out if Fortan, Java, Objective C, C, and C++, as they are supported in GDB, can produce lookup_symbol's with characters that strcmp_iw, as the first character. In other words, why don't you just prove the patch works, before claiming it does? That's all i'm asking. I couldn't achieve this, which is why i removed that code. The fact that it slowed down C a lot for a specific case of opaque pointers is unfortunate, but it was done in the name of correctness. Don't pretend it's a bug. You seem to continually miss the fact that my only objection to the patch is that you haven't proven it to be correct. You are simply using "vague handwaving" to say it is, such as the next sentence. The GDB testsuite is not complete enough to be able to say something is correct simply because it passes the testsuite. > lookup_block_symbol() is already very, very broken if > that's the case. So? Wouldn't be the first time. > The initial binary search uses a combination of > first-letter comparisons and strcmp() -- the latter of which is > clearly out of the question for making unmangled comparisons. >> > If you're using GDB in under an IDE and you have a Locals window >> > open, and one of those locals is an opaque structure, whenever you >> > step into our out of that frame, you'll have this 5-10 second delay. >> >> So display the hourglass for 10 seconds and be done with that. No one >> will really notice, except you and me. The world is full with good >> software that sometimes has 10-sec delays, to say nothing of bad >> software. > > I strongly disagree. A word processor that delays for two seconds > every time you add a new character is not acceptable. A debugger > that can take ten seconds to complete a "next" is not acceptable. > If we're paging in a big shared library, there's nothing we can do > about it. If we're doing a linear search instead of a binary seach, > there _is_ something we can do about it. You only hit such bad performance in very odd cases, such as a ton of opaque pointers. Note that combining the solutions I later attempted avoids the problem of having to show that all the languages gdb supports can't do symbol lookups on something that starts with a character strcmp_iw ignores, altogether. This would be using a hash table for the block array, which we already have a hash function ideal for (minsym_hash_iw or whatever it's called), and the not actually searching blocks that don't contain the symbol. This would give you O(1) symbol lookup time, effectively (A given symbol only searches a constant number of blocks). This would certainly make your debugger much faster to respond. -- "I wrote a few children's books... Not on purpose. "-Steven Wright In-Reply-To: < 20010915005255.A2369@shell17.ba.best.com > (Jason Molenda's message of "Sat, 15 Sep 2001 00:52:56 -0700") Message-ID: <87sndopksw.fsf@cgsoftware.com> Lines: 92 User-Agent: Gnus/5.090004 (Oort Gnus v0.04) XEmacs/21.5 (artichoke) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 7:54 ` Daniel Berlin @ 2001-09-15 13:08 ` Jason Molenda 2001-09-15 13:33 ` Daniel Berlin 0 siblings, 1 reply; 34+ messages in thread From: Jason Molenda @ 2001-09-15 13:08 UTC (permalink / raw) To: gdb-patches On Sat, Sep 15, 2001 at 10:54:06AM -0400, Daniel Berlin wrote: > Note that combining the solutions I later attempted avoids the problem > of having > to show that all the languages gdb supports can't do symbol lookups > on something that starts with a character strcmp_iw ignores, > altogether. I guess you didn't read either of my initial notes in this discusssion. I provided pointers to them in the patch submission. > This would be using a hash table for the block array, which we already > have a hash function ideal for (minsym_hash_iw or whatever it's > called), and the not actually searching blocks that don't contain the > symbol. > > This would give you O(1) symbol lookup time, effectively (A given > symbol only searches a constant number of blocks). > This would certainly make your debugger much faster to respond. What you're proposing is to ignore the inefficiencies of lookup_block_symbol by reducing the number of calls to it. Instead, why don't we fix the inefficiencies in lookup_block_symbol *and* reduce the number of calls to it? Imagine how the gdb users will be dancing in the street, thanking buddha for their great fortune of having such a wonderful debugger. Concretely. GDB has a number of symtabs, call it 'i', and a number of _unique_ global blocks, call it 'j'. Every symtab has a link to a global block, but few of these global blocks are unique. In gdb itself, i is an order of magnitude larger than j. I expect some programs will have i two orders of magnitude larger than j. Each global block has a number of symbols in it, call the average of these 'n'. In gdb 5.0, looking for an undefined symbol will happen on order of O(i * lg (n)). In gdb 5.1, that search will be on the order of O(i * 1/2 * n). Dan's proposal is to change this to O(j * 1/2 * n). I am looking for at _least_ O(i * lg (n)), but I'd really like to see us get both of these eventually for O(j * lg (n)). If we define conservatively as i = j * 10, these become gdb 5.0 O (j * 10 * lg (n)) gdb 5.1 O (j * 10 * 1/2 * n) My patch O (j * 10 * lg (n)) What I'd like to see O (j * lg (n)) Dan's block patch alone O (j * 1/2 * n)) Debugging gdb itself, "i" is 330, and "j" is 23. The average number of symbols was small - I don't have the number here right now, but I think it was around 10 symbols. All of these numbers are a lot larger for a C++ program, obviously. Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 13:08 ` Jason Molenda @ 2001-09-15 13:33 ` Daniel Berlin 2001-09-15 13:52 ` Daniel Berlin 0 siblings, 1 reply; 34+ messages in thread From: Daniel Berlin @ 2001-09-15 13:33 UTC (permalink / raw) To: Jason Molenda; +Cc: gdb-patches On Saturday, September 15, 2001, at 04:08 PM, Jason Molenda wrote: > On Sat, Sep 15, 2001 at 10:54:06AM -0400, Daniel Berlin wrote: > >> Note that combining the solutions I later attempted avoids the problem >> of having >> to show that all the languages gdb supports can't do symbol lookups >> on something that starts with a character strcmp_iw ignores, >> altogether. > > I guess you didn't read either of my initial notes in this discusssion. > I provided pointers to them in the patch submission. Actually, I did. You are talking about a hack i had added to avoid repeated lookups, not the later attempts to make lookup_block_symbol faster. > >> This would be using a hash table for the block array, which we already >> have a hash function ideal for (minsym_hash_iw or whatever it's >> called), and the not actually searching blocks that don't contain the >> symbol. >> >> This would give you O(1) symbol lookup time, effectively (A given >> symbol only searches a constant number of blocks). >> This would certainly make your debugger much faster to respond. > > > What you're proposing is to ignore the inefficiencies of > lookup_block_symbol by reducing the number of calls to it. No, I'm not. Please *read* what i am proposing. > Instead, > why don't we fix the inefficiencies in lookup_block_symbol *and* > reduce the number of calls to it? Which is exactly what i propose doing, and in fact, *have done before*. You are simply going by what appears on gdb-patches years ago, as submitted patches. > Imagine how the gdb users will > be dancing in the street, thanking buddha for their great fortune of > having such a wonderful debugger. > > Concretely. GDB has a number of symtabs, call it 'i', and a number of > _unique_ global blocks, call it 'j'. Every symtab has a link to a > global > block, but few of these global blocks are unique. In gdb itself, i is > an order of magnitude larger than j. I expect some programs will have > i two orders of magnitude larger than j. Each global block has a number > of symbols in it, call the average of these 'n'. > > In gdb 5.0, looking for an undefined symbol will happen on order of > O(i * lg (n)). In gdb 5.1, that search will be on the order of > O(i * 1/2 * n). Dan's proposal is to change this to O(j * 1/2 * n). > I am looking for at _least_ O(i * lg (n)), but I'd really like to see > us get both of these eventually for O(j * lg (n)). No, my proposal is not going to make it O(j * 1/2 * n). It's going to make it O(j) with the patch i sent you a few hours ago, and O(log j) if you combined it with an approach to not look in blocks unless the symbol is in there (which can be done in O(1) time. > If we define conservatively as i = j * 10, these become > > gdb 5.0 O (j * 10 * lg (n)) > gdb 5.1 O (j * 10 * 1/2 * n) > My patch O (j * 10 * lg (n)) > > What I'd like to see O (j * lg (n)) > Dan's block patch alone O (j * 1/2 * n)) No, actually, it would be O(j). Each block lookup takes O(1) time, if the block is a hash table. If you had to search every globally unique block, you'd search j blocks, at O(1) time, for a total of O (j). Combining the patch I sent you a few hours ago, with a patch to not even search blocks where the symbol won't be in, makes it O(log j). Please stop assuming the patch i have sent you is something i had submitted already. It is not, AFAIK. It was talked about, if you look at the archives. I have *been down this road before*. I have *sped up gdb symbol lookups to the point they take effectively constant time (O (number of active scopes))*. The patches to do this are *not* on gdb-patches. > > Debugging gdb itself, "i" is 330, and "j" is 23. The average number > of symbols was small - I don't have the number here right now, but > I think it was around 10 symbols. All of these numbers are a lot > larger for a C++ program, obviously. > > Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 13:33 ` Daniel Berlin @ 2001-09-15 13:52 ` Daniel Berlin 2001-09-15 14:02 ` Jason Molenda 0 siblings, 1 reply; 34+ messages in thread From: Daniel Berlin @ 2001-09-15 13:52 UTC (permalink / raw) To: Daniel Berlin; +Cc: Jason Molenda, gdb-patches Since you seem to be having trouble either reading, or understanding, what I have proposed, and implemented, let me break it down for you, in simple english, so there is no confusion as to what you *think* i am proposing (based on god knows what, probably some old email with a patch from gdb-patches), and what I have proposed, and implemented: First, you can speed up lookup_block_symbol. I sent you a patch hours ago that does this. It turns the linear array of symbols in a block into buckets of a chained hashtable, adding a "hash_next" member to struct symbol to chain them together. The hash function is msymbol_hash_iw. The hashed key is SYMBOL_SOURCE_NAME. This turns block lookups, on non-function argument lists (function argument lists aren't sorted, they have to be kept in the original order), into O(1). This makes your max symbol lookup time, O (j), where j is the number of globally unique blocks. Second, you can take lookup_symbol, and modify it to only call lookup_block_symbol once. This is done by providing a mapping from names to blocks they exist in, using a hash table with SYMBOL_SOURCE_NAME keys, and sparse bitmaps as the data, with a bit set per block it exists in. Now, for undefined symbols, the cost is O(1), total. It won't exist in the mapping, and thus, we know the symbol exists nowhere in the blocks. For defined symbols, the cost is O(number of scopes to search), where number of scopes to search <= all the blocks in the program, worst case (in reality, it's usually < 20). We simply test one bit per scope (since each scope is represented by a single block), and as soon as we get a hit, we call lookup_block_symbol. This will be O (number of scopes to search). It will only call lookup_block_symbol *one* time. Theoretically, number of scopes to search can be J, but this would require every scope in your program be visible at once, which, unless i'm mistaken, is pretty much impossible. number of scopes to search is usually between 7 and 30, and is only dependent on the nesting structure of your program, not it's size. Now will you stop mischaracterizing what I have proposed? It's pretty hard to do better than the above, but possible. It would require a significant rework of the symbol table structure, while both of the things above required a few hours each (In fact, I redid the first one this morning, since I didn't have the source tree around anymore). ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 13:52 ` Daniel Berlin @ 2001-09-15 14:02 ` Jason Molenda 2001-09-15 14:21 ` Daniel Berlin 2001-09-16 0:15 ` Eli Zaretskii 0 siblings, 2 replies; 34+ messages in thread From: Jason Molenda @ 2001-09-15 14:02 UTC (permalink / raw) To: Daniel Berlin; +Cc: gdb-patches On Sat, Sep 15, 2001 at 04:51:35PM -0400, Daniel Berlin wrote: > > This turns block lookups, on non-function argument lists (function > argument lists aren't sorted, they have to be kept in the original > order), into O(1). > This makes your max symbol lookup time, O (j), where j is the number of > globally unique blocks. Pretty cool. I look forward to seeing this submitted, approved, and integrated with the gdb sources. I'd like to stay focused on the topic on hand for now. This discussion is about gdb currently experiencing a serious performance regression wrt the last release of gdb, and I'm submitting a patch to fix that. I'd like to see this problem addressed before 5.1 goes out. Maybe I'm wasting my time, and in three months this work you're doing will make symbol searching vastly faster than it currently is. But in the mean time, I want to fix the problem at hand. I could certainly understand if you're not interested in this particluar problem - the work you're doing could represent a major step beyond any of these existing algorithms. Do you have anything to add regarding this note? http://sources.redhat.com/ml/gdb-patches/2001-09/msg00195.html Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 14:02 ` Jason Molenda @ 2001-09-15 14:21 ` Daniel Berlin 2001-09-16 0:15 ` Eli Zaretskii 1 sibling, 0 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-15 14:21 UTC (permalink / raw) To: Jason Molenda; +Cc: gdb-patches On Saturday, September 15, 2001, at 05:02 PM, Jason Molenda wrote: > On Sat, Sep 15, 2001 at 04:51:35PM -0400, Daniel Berlin wrote: > >> >> This turns block lookups, on non-function argument lists (function >> argument lists aren't sorted, they have to be kept in the original >> order), into O(1). >> This makes your max symbol lookup time, O (j), where j is the number of >> globally unique blocks. > > Pretty cool. I look forward to seeing this submitted, approved, > and integrated with the gdb sources. I sent it to you, actually. I don't have time to submit it, unfortunately, i'm busy with gcc work and law school. It would take an insignificant amount of time to write a changelog entry. It gives no regressions on x86 or powerpc, and has been used for people debugging very large C++ programs with no change other than much faster lookups. This was months ago, however. > > I'd like to stay focused on the topic on hand for now. This > discussion is about gdb currently experiencing a serious performance > regression wrt the last release of gdb, and I'm submitting a patch > to fix that. I'd like to see this problem addressed before 5.1 goes > out. > > Maybe I'm wasting my time, and in three months this work you're > doing will make symbol searching vastly faster than it currently > is. > But in the mean time, I want to fix the problem at hand. > I could certainly understand if you're not interested in this > particluar problem - the work you're doing could represent a major > step beyond any of these existing algorithms. > Do you have anything to add regarding this note? > http://sources.redhat.com/ml/gdb-patches/2001-09/msg00195.html > Yes. The reality is that it's likely the search needs to be rewritten. This is one of the non-performance reasons i hashtable'd the blocks. You can just use msymbol_hash_iw on the key, and thus, don't need to worry about it at all. In this case, however, we could just abort if the first character of a lookup is not strcmp_iw significant,and add your patch as well. That way, if it ever occurs, someone will bitch that their program aborts, and we can fix the problem entirely in the meanwhile. > Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-15 14:02 ` Jason Molenda 2001-09-15 14:21 ` Daniel Berlin @ 2001-09-16 0:15 ` Eli Zaretskii 2001-09-17 22:56 ` Andrew Cagney 1 sibling, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2001-09-16 0:15 UTC (permalink / raw) To: jason-swarelist; +Cc: dan, gdb-patches > Date: Sat, 15 Sep 2001 14:02:35 -0700 > From: Jason Molenda <jason-swarelist@molenda.com> > > I'd like to stay focused on the topic on hand for now. This > discussion is about gdb currently experiencing a serious performance > regression wrt the last release of gdb, and I'm submitting a patch > to fix that. I'd like to see this problem addressed before 5.1 goes > out. Why do you think it needs to be addressed in 5.1? AFAIK, the fact that the release branch was cut means that only relatively safe bugfixes are accepted on the branch. This change doesn't seem safe enough IMHO, since we still continue arguing whether there are or aren't cases where Dan's change matters. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-16 0:15 ` Eli Zaretskii @ 2001-09-17 22:56 ` Andrew Cagney 2001-09-17 23:12 ` Jason Molenda 2001-09-17 23:18 ` Daniel Jacobowitz 0 siblings, 2 replies; 34+ messages in thread From: Andrew Cagney @ 2001-09-17 22:56 UTC (permalink / raw) To: Eli Zaretskii, jason-swarelist; +Cc: dan, gdb-patches >> Date: Sat, 15 Sep 2001 14:02:35 -0700 >> From: Jason Molenda <jason-swarelist@molenda.com> >> >> I'd like to stay focused on the topic on hand for now. This >> discussion is about gdb currently experiencing a serious performance >> regression wrt the last release of gdb, and I'm submitting a patch >> to fix that. I'd like to see this problem addressed before 5.1 goes >> out. > > > Why do you think it needs to be addressed in 5.1? AFAIK, the fact > that the release branch was cut means that only relatively safe > bugfixes are accepted on the branch. This change doesn't seem safe > enough IMHO, since we still continue arguing whether there are or > aren't cases where Dan's change matters. The suggested criteria for committing something to the 5.1 branch are: o does it build (and if not did it build in 5.0/4.18) o does ``break main; run'' work Jason, you've indicate that this seriously hurts Apple's GDB branch. Since (FSF) GDB has never worked on MacOS X, I don't think hurting Apple's GDB branch is really qualifies as a reason for getting something into 5.1 of GDB (The trunk yes definitly, just not the branch). The reason 5.1 has got hung up is because people have found platforms that suffered more basic bit rot :-( Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-17 22:56 ` Andrew Cagney @ 2001-09-17 23:12 ` Jason Molenda 2001-09-18 6:21 ` Daniel Berlin 2001-09-18 7:32 ` Andrew Cagney 2001-09-17 23:18 ` Daniel Jacobowitz 1 sibling, 2 replies; 34+ messages in thread From: Jason Molenda @ 2001-09-17 23:12 UTC (permalink / raw) To: Andrew Cagney; +Cc: gdb-patches On Tue, Sep 18, 2001 at 01:56:36AM -0400, Andrew Cagney wrote: > The suggested criteria for committing something to the 5.1 branch are: > > o does it build (and if not did it build in 5.0/4.18) > > o does ``break main; run'' work If the bar is that high for patches on the release branch, then this change does not meet it. We'll still need to address this on the trunk. > Jason, you've indicate that this seriously hurts Apple's GDB branch. Fear not, it's already patched in the Apple GDB branch. :-) My concern is for (a) environments where pointers to undefined structures are used, and (b) lookups of undefined symbols. Obviously it's not a great travisty if (b) is slow, but (a) will hit anyone who has libraries that try to hide the implementation from the interface in the same way that the Appple Carbon libraries do. I probably won't have time until next weekend, but I'll be sending modified versions of the profiling and symtab patches to incorporate many of the suggestions on the list. Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-17 23:12 ` Jason Molenda @ 2001-09-18 6:21 ` Daniel Berlin 2001-09-18 7:32 ` Andrew Cagney 1 sibling, 0 replies; 34+ messages in thread From: Daniel Berlin @ 2001-09-18 6:21 UTC (permalink / raw) To: Jason Molenda; +Cc: Andrew Cagney, gdb-patches On Tuesday, September 18, 2001, at 02:11 AM, Jason Molenda wrote: > On Tue, Sep 18, 2001 at 01:56:36AM -0400, Andrew Cagney wrote: > >> The suggested criteria for committing something to the 5.1 branch are: >> >> o does it build (and if not did it build in 5.0/4.18) >> >> o does ``break main; run'' work > > If the bar is that high for patches on the release branch, then > this change does not meet it. We'll still need to address this on > the trunk. Yes, of course. > >> Jason, you've indicate that this seriously hurts Apple's GDB branch. > > Fear not, it's already patched in the Apple GDB branch. :-) > > My concern is for (a) environments where pointers to undefined > structures are used, and (b) lookups of undefined symbols. Obviously > it's not a great travisty if (b) is slow, but (a) will hit anyone > who has libraries that try to hide the implementation from the > interface in the same way that the Appple Carbon libraries do. This doesn't quite happen as often as you think. Undefined symbol lookups have always been the absolute worst case, even ignoring the lookup_block_symbol slowness, since it can force readin of all psymtabs, and require us to search minsyms. > > I probably won't have time until next weekend, but I'll be sending > modified versions of the profiling and symtab patches to incorporate > many of the suggestions on the list. > I hope you take a look at the patch I sent you, as it's a good deal faster than the binary search. > Jason ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-17 23:12 ` Jason Molenda 2001-09-18 6:21 ` Daniel Berlin @ 2001-09-18 7:32 ` Andrew Cagney 1 sibling, 0 replies; 34+ messages in thread From: Andrew Cagney @ 2001-09-18 7:32 UTC (permalink / raw) To: Jason Molenda; +Cc: gdb-patches >> The suggested criteria for committing something to the 5.1 branch are: >> >> o does it build (and if not did it build in 5.0/4.18) >> >> o does ``break main; run'' work > > > If the bar is that high for patches on the release branch, then > this change does not meet it. We'll still need to address this on > the trunk. There is no bar, it is a suggestion :-) See: http://sources.redhat.com/ml/gdb/2001-07/msg00418.html Andrew ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-17 22:56 ` Andrew Cagney 2001-09-17 23:12 ` Jason Molenda @ 2001-09-17 23:18 ` Daniel Jacobowitz 2001-09-18 4:51 ` Eli Zaretskii 1 sibling, 1 reply; 34+ messages in thread From: Daniel Jacobowitz @ 2001-09-17 23:18 UTC (permalink / raw) To: Andrew Cagney; +Cc: Eli Zaretskii, jason-swarelist, dan, gdb-patches On Tue, Sep 18, 2001 at 01:56:36AM -0400, Andrew Cagney wrote: > >> Date: Sat, 15 Sep 2001 14:02:35 -0700 > >> From: Jason Molenda <jason-swarelist@molenda.com> > >> > >> I'd like to stay focused on the topic on hand for now. This > >> discussion is about gdb currently experiencing a serious performance > >> regression wrt the last release of gdb, and I'm submitting a patch > >> to fix that. I'd like to see this problem addressed before 5.1 goes > >> out. > > > > > > Why do you think it needs to be addressed in 5.1? AFAIK, the fact > > that the release branch was cut means that only relatively safe > > bugfixes are accepted on the branch. This change doesn't seem safe > > enough IMHO, since we still continue arguing whether there are or > > aren't cases where Dan's change matters. > > The suggested criteria for committing something to the 5.1 branch are: > > o does it build (and if not did it build in 5.0/4.18) > > o does ``break main; run'' work > > Jason, you've indicate that this seriously hurts Apple's GDB branch. > Since (FSF) GDB has never worked on MacOS X, I don't think hurting > Apple's GDB branch is really qualifies as a reason for getting something > into 5.1 of GDB (The trunk yes definitly, just not the branch). These two paragraphs seem to be in direct opposition. Which do you mean? It helps performance on more than just OSX; it's an issue I've run in to more than a few times before. It doesn't break "break main; run". Did you mean "criteria for committing something to the trunk"? -- Daniel Jacobowitz Carnegie Mellon University MontaVista Software Debian GNU/Linux Developer ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method 2001-09-17 23:18 ` Daniel Jacobowitz @ 2001-09-18 4:51 ` Eli Zaretskii 0 siblings, 0 replies; 34+ messages in thread From: Eli Zaretskii @ 2001-09-18 4:51 UTC (permalink / raw) To: drow; +Cc: ac131313, jason-swarelist, dan, gdb-patches > Date: Tue, 18 Sep 2001 02:15:38 -0400 > From: Daniel Jacobowitz <drow@mvista.com> > > > > The suggested criteria for committing something to the 5.1 branch are: > > > > o does it build (and if not did it build in 5.0/4.18) > > > > o does ``break main; run'' work > > > > Jason, you've indicate that this seriously hurts Apple's GDB branch. > > Since (FSF) GDB has never worked on MacOS X, I don't think hurting > > Apple's GDB branch is really qualifies as a reason for getting something > > into 5.1 of GDB (The trunk yes definitly, just not the branch). > > These two paragraphs seem to be in direct opposition. Which do you > mean? It helps performance on more than just OSX; it's an issue I've > run in to more than a few times before. It doesn't break "break main; > run". I think Andrew meant to say that if a GDB port for some platform is broken to the degree that it either doesn't build or cannot do a ``break main; run'', then it is justified to fix that platform on the release branch. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2001-09-18 7:32 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-09-09 7:48 [RFA] bug in symtab.c:lookup_block_symbol()'s search method Jason Molenda
2001-09-10 11:24 ` Michael Snyder
2001-09-10 11:32 ` Jason Molenda
2001-09-10 11:50 ` Daniel Berlin
2001-09-10 11:52 ` Daniel Berlin
[not found] ` <20010910130347.A5628@shell17.ba.best.com>
2001-09-10 14:17 ` Daniel Berlin
2001-09-14 7:53 ` Andrew Cagney
2001-09-14 8:53 ` Daniel Berlin
2001-09-14 9:06 ` Eli Zaretskii
2001-09-14 9:13 ` Jason Molenda
2001-09-14 9:58 ` Daniel Berlin
2001-09-14 10:55 ` Eli Zaretskii
2001-09-14 10:52 ` Eli Zaretskii
2001-09-14 10:59 ` Daniel Jacobowitz
2001-09-14 11:57 ` Eli Zaretskii
2001-09-15 0:54 ` Jason Molenda
2001-09-15 3:43 ` Eli Zaretskii
2001-09-15 8:01 ` Daniel Berlin
2001-09-15 9:09 ` Eli Zaretskii
2001-09-15 12:36 ` Daniel Jacobowitz
2001-09-15 12:52 ` Jason Molenda
2001-09-15 7:54 ` Daniel Berlin
2001-09-15 13:08 ` Jason Molenda
2001-09-15 13:33 ` Daniel Berlin
2001-09-15 13:52 ` Daniel Berlin
2001-09-15 14:02 ` Jason Molenda
2001-09-15 14:21 ` Daniel Berlin
2001-09-16 0:15 ` Eli Zaretskii
2001-09-17 22:56 ` Andrew Cagney
2001-09-17 23:12 ` Jason Molenda
2001-09-18 6:21 ` Daniel Berlin
2001-09-18 7:32 ` Andrew Cagney
2001-09-17 23:18 ` Daniel Jacobowitz
2001-09-18 4:51 ` Eli Zaretskii
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox