From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 25875 invoked by alias); 21 Feb 2003 19:15:00 -0000 Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sources.redhat.com Received: (qmail 25864 invoked from network); 21 Feb 2003 19:15:00 -0000 Received: from unknown (HELO mx1.redhat.com) (172.16.49.200) by 172.16.49.205 with SMTP; 21 Feb 2003 19:15:00 -0000 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJF0N06686 for ; Fri, 21 Feb 2003 14:15:00 -0500 Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [172.16.52.156]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJF0f01789 for ; Fri, 21 Feb 2003 14:15:00 -0500 Received: from localhost.redhat.com (romulus-int.sfbay.redhat.com [172.16.27.46]) by pobox.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJEwt32017; Fri, 21 Feb 2003 14:14:58 -0500 Received: by localhost.redhat.com (Postfix, from userid 469) id 63819FF79; Fri, 21 Feb 2003 14:19:02 -0500 (EST) From: Elena Zannoni MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15958.31653.994225.799759@localhost.redhat.com> Date: Fri, 21 Feb 2003 19:15:00 -0000 To: David Carlton Cc: Daniel Jacobowitz , gdb-patches@sources.redhat.com, Elena Zannoni , Jim Blandy Subject: Re: RFA/symtab: Let search_symbols find exact matches In-Reply-To: References: <20030210160107.GA587@nevyn.them.org> X-SW-Source: 2003-02/txt/msg00530.txt.bz2 David Carlton writes: > On Mon, 10 Feb 2003 11:01:07 -0500, Daniel Jacobowitz said: > > > This patch renames search_symbols to search_symbols_aux, and lets it take an > > argument that specified regex or exact (well, strcmp_iw) matching. Then > > search_symbols becomes a wrapper. > > > The new function is used by make_symbol_overload_list, which shaves 50% time > > off of some tests in the C++ directory if your system libc has debugging > > symbols; it removes the bogusity where all psymtabs were converted to > > symtabs during overload resolution. Whew. This cuts memory for > > namespace.exp from 70MB to 7MB, allowing some of my hardware to actually run > > the test without crashing. > > Here's take two on my comments on this. First, I should say that I > totally agree that the aforementioned bogusity in > make_symbol_overload_list should go; as discussed in > , > though, there are ways of doing that that involve much less drastic > changes. > > So the question is: do search_symbols and make_symbol_overload_list > have enough of a common core such that extracting that common core > into a single function would clean up the code, reducing maintenance > overhead, or would that common code be artificial, increasing > maintenance overhead? > > Here's an analysis of the two functions. It may well be incorrect; I > understand GDB's symbol mechanisms pretty well (better than I'd like, > I sometimes think...), and I understand make_symbol_overload_list > pretty well, but I haven't spent too much time with search_symbols > yet. > BTW, make_symbol_overload_list came in via the HP merge. Roughly at the same time (in parallel) the search_symbols function was introduced. I.e. HP didn't have search_symbols available. Actually the same functionality of search_symbol was provided by another function called list_symbols. I suspect that list_symbols was broken down into several functions, to be used in gdbtk, when search_symbols was introduced. So, if you change search_symbols we should make sure gdbtk still works. I agree it could be a good idea to clean up search_symbols, but keep the history above in mind. elena > > Loosely speaking, this is what search_symbols does: > > 1) If the regexp is non-NULL, it cleans it up a bit and compiles it. > > 2) It looks at every partial symtab; in each one, it looks at every > partial symbol until it finds one that matches, at which point it > reads in that partial symtab. > > 3) It (usually) looks at every single minimal symbol, to see if thre's > a corresponding symbol. > > 4) It then looks at every single global and static symbol, to find the > ones that match. > > 5) If there were minimal symbols without corresponding symbols, then > it looks at every single minimal symbol again. > > Here, on the other hand, is what make_symbol_overload_list does. > > 1) It looks at every partial symtab, and reads them all in (in an > amusingly verbose fashion). (My e-mail referenced above contains a > patch fixing this.) > > 2) It looks at local variables, to find the ones that match. > > 3) It looks at every single global and static symbol, to find the ones > that match. > > > So: what's the same, what's different. > > Things that only search_symbols does: > > * It does some regexp-related stuff at the beginning. > > * It looks at every single minimal symbol at least once, possibly > twice. > > Things that only make_symbol_overload_list does: > > * It looks at local variables. > > Things that both functions do: > > * Determine which psymtabs to read in. > > * Look at every single global and static symbol, to find the ones that > match. > > > So already I'm suspicious: those differences seem potentially > significant to me. Some of them may well be due to accidents in the > way the two functions were written (which would be exactly the sort of > accidents that patches like Daniel's would help prevent!); but I think > that, at the least, we need more analysis to see if that's the case. > When doing this sort of refactoring, I think the proper order is to > first minimize the differences and only then to combine them. > (Shrinking the functions first will help, too: see below.) > > But even the similarities aren't as close as they seem. In both > cases, make_symbol_overload_list can potentially use a much faster > algorithm than search_symbols can, one that avoids looking at the vast > majority of partial or regular symbols. For the psymtab case, it can > just do one call to lookup_partial_symbol per psymtab, rather than > look at (almost) every partial symbol. For the symbols, all the > matching symbols will be inside the same hash bucket, so we could > conceivably just look in one bucket per symtab (well, two, one for the > globals and one for the statics), instead of looking at every symbol. > (We don't currently have an appropriate iterator to do that; I do have > one in my branch, that I plan to try to move to mainline in a few > weeks.) > > > So. I like the basic idea of combining the functions as a longer-term > goal, but I'm pretty sure it's not a good idea right now. I think a > better thing to do would be to get a good handle on the difference > between search_symbol and make_symbol_overload_list. (And > make_symbol_completion_list: it may well be easier to combine > make_symbol_overload_list and make_symbol_completion_list as a first > step.) > > One possible concrete recommendation as a step towards this > refactoring would be to extract the body of search_symbol into 5 > separate functions along the lines of my analysis above, to extract > the body of make_symbol_overload_list into 3 separate functions along > the lines of my analysis above, and similarly with > make_symbol_completion_list. (This is like what I did with > lookup_symbol_aux.) That will enable us to do two things: > > * By looking at the resulting much shorter versions of those > functions, it'll be much clearer what each one does in broad terms. > We can then think about whether the differences between them are > important or not. > > * When an extracted function from, say, make_symbol_overload_list is > playing a similar role to an extracted function from, say, > make_symbol_completion_list, then examine those functions to see if > they can be profitably unified into a single function. > > That may sound like a lot of work, but having gone through it a few > times, it's really not so bad. And it's a great way to understand and > clean up functions. It doesn't interact well with GDB's approval > process, though; I have some ideas about that, too, but I don't think > this is the place for them. > > David Carlton > carlton@math.stanford.edu