From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-22672-listarch-gdb-patches=sources.redhat.com@sources.redhat.com>
Received: (qmail 25875 invoked by alias); 21 Feb 2003 19:15:00 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 25864 invoked from network); 21 Feb 2003 19:15:00 -0000
Received: from unknown (HELO mx1.redhat.com) (172.16.49.200)
  by 172.16.49.205 with SMTP; 21 Feb 2003 19:15:00 -0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
	by mx1.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJF0N06686
	for <gdb-patches@sources.redhat.com>; Fri, 21 Feb 2003 14:15:00 -0500
Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [172.16.52.156])
	by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJF0f01789
	for <gdb-patches@sources.redhat.com>; Fri, 21 Feb 2003 14:15:00 -0500
Received: from localhost.redhat.com (romulus-int.sfbay.redhat.com [172.16.27.46])
	by pobox.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h1LJEwt32017;
	Fri, 21 Feb 2003 14:14:58 -0500
Received: by localhost.redhat.com (Postfix, from userid 469)
	id 63819FF79; Fri, 21 Feb 2003 14:19:02 -0500 (EST)
From: Elena Zannoni <ezannoni@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15958.31653.994225.799759@localhost.redhat.com>
Date: Fri, 21 Feb 2003 19:15:00 -0000
To: David Carlton <carlton@math.stanford.edu>
Cc: Daniel Jacobowitz <drow@mvista.com>, gdb-patches@sources.redhat.com,
   Elena Zannoni <ezannoni@redhat.com>, Jim Blandy <jimb@redhat.com>
Subject: Re: RFA/symtab: Let search_symbols find exact matches
In-Reply-To: <ro1k7fu4ltd.fsf@jackfruit.Stanford.EDU>
References: <20030210160107.GA587@nevyn.them.org>
	<ro1k7fu4ltd.fsf@jackfruit.Stanford.EDU>
X-SW-Source: 2003-02/txt/msg00530.txt.bz2

David Carlton writes:
 > On Mon, 10 Feb 2003 11:01:07 -0500, Daniel Jacobowitz <drow@mvista.com> said:
 > 
 > > This patch renames search_symbols to search_symbols_aux, and lets it take an
 > > argument that specified regex or exact (well, strcmp_iw) matching.  Then
 > > search_symbols becomes a wrapper.
 > 
 > > The new function is used by make_symbol_overload_list, which shaves 50% time
 > > off of some tests in the C++ directory if your system libc has debugging
 > > symbols; it removes the bogusity where all psymtabs were converted to
 > > symtabs during overload resolution.  Whew.  This cuts memory for
 > > namespace.exp from 70MB to 7MB, allowing some of my hardware to actually run
 > > the test without crashing.
 > 
 > Here's take two on my comments on this.  First, I should say that I
 > totally agree that the aforementioned bogusity in
 > make_symbol_overload_list should go; as discussed in
 > <http://sources.redhat.com/ml/gdb-patches/2003-02/msg00261.html>,
 > though, there are ways of doing that that involve much less drastic
 > changes.
 > 
 > So the question is: do search_symbols and make_symbol_overload_list
 > have enough of a common core such that extracting that common core
 > into a single function would clean up the code, reducing maintenance
 > overhead, or would that common code be artificial, increasing
 > maintenance overhead?
 > 
 > Here's an analysis of the two functions.  It may well be incorrect; I
 > understand GDB's symbol mechanisms pretty well (better than I'd like,
 > I sometimes think...), and I understand make_symbol_overload_list
 > pretty well, but I haven't spent too much time with search_symbols
 > yet.
 > 

<archeology>

BTW, make_symbol_overload_list came in via the HP merge.  Roughly at
the same time (in parallel) the search_symbols function was
introduced. I.e. HP didn't have search_symbols available.  Actually
the same functionality of search_symbol was provided by another
function called list_symbols. I suspect that list_symbols was broken
down into several functions, to be used in gdbtk, when search_symbols
was introduced.

</archeology>

So, if you change search_symbols we should make sure gdbtk still
works.

I agree it could be a good idea to clean up search_symbols, but keep
the history above in mind.

elena


 > 
 > Loosely speaking, this is what search_symbols does:
 > 
 > 1) If the regexp is non-NULL, it cleans it up a bit and compiles it.
 > 
 > 2) It looks at every partial symtab; in each one, it looks at every
 >    partial symbol until it finds one that matches, at which point it
 >    reads in that partial symtab.
 > 
 > 3) It (usually) looks at every single minimal symbol, to see if thre's
 >    a corresponding symbol.
 > 
 > 4) It then looks at every single global and static symbol, to find the
 >    ones that match.
 > 
 > 5) If there were minimal symbols without corresponding symbols, then
 >    it looks at every single minimal symbol again.
 > 
 > Here, on the other hand, is what make_symbol_overload_list does.
 > 
 > 1) It looks at every partial symtab, and reads them all in (in an
 >    amusingly verbose fashion).  (My e-mail referenced above contains a
 >    patch fixing this.)
 > 
 > 2) It looks at local variables, to find the ones that match.
 > 
 > 3) It looks at every single global and static symbol, to find the ones
 >    that match.
 > 
 > 
 > So: what's the same, what's different.
 > 
 > Things that only search_symbols does:
 > 
 > * It does some regexp-related stuff at the beginning.
 > 
 > * It looks at every single minimal symbol at least once, possibly
 >   twice.
 > 
 > Things that only make_symbol_overload_list does:
 > 
 > * It looks at local variables.
 > 
 > Things that both functions do:
 > 
 > * Determine which psymtabs to read in.
 > 
 > * Look at every single global and static symbol, to find the ones that
 >   match.
 > 
 > 
 > So already I'm suspicious: those differences seem potentially
 > significant to me.  Some of them may well be due to accidents in the
 > way the two functions were written (which would be exactly the sort of
 > accidents that patches like Daniel's would help prevent!); but I think
 > that, at the least, we need more analysis to see if that's the case.
 > When doing this sort of refactoring, I think the proper order is to
 > first minimize the differences and only then to combine them.
 > (Shrinking the functions first will help, too: see below.)
 > 
 > But even the similarities aren't as close as they seem.  In both
 > cases, make_symbol_overload_list can potentially use a much faster
 > algorithm than search_symbols can, one that avoids looking at the vast
 > majority of partial or regular symbols.  For the psymtab case, it can
 > just do one call to lookup_partial_symbol per psymtab, rather than
 > look at (almost) every partial symbol.  For the symbols, all the
 > matching symbols will be inside the same hash bucket, so we could
 > conceivably just look in one bucket per symtab (well, two, one for the
 > globals and one for the statics), instead of looking at every symbol.
 > (We don't currently have an appropriate iterator to do that; I do have
 > one in my branch, that I plan to try to move to mainline in a few
 > weeks.)
 > 
 > 
 > So.  I like the basic idea of combining the functions as a longer-term
 > goal, but I'm pretty sure it's not a good idea right now.  I think a
 > better thing to do would be to get a good handle on the difference
 > between search_symbol and make_symbol_overload_list.  (And
 > make_symbol_completion_list: it may well be easier to combine
 > make_symbol_overload_list and make_symbol_completion_list as a first
 > step.)
 > 
 > One possible concrete recommendation as a step towards this
 > refactoring would be to extract the body of search_symbol into 5
 > separate functions along the lines of my analysis above, to extract
 > the body of make_symbol_overload_list into 3 separate functions along
 > the lines of my analysis above, and similarly with
 > make_symbol_completion_list.  (This is like what I did with
 > lookup_symbol_aux.)  That will enable us to do two things:
 > 
 > * By looking at the resulting much shorter versions of those
 >   functions, it'll be much clearer what each one does in broad terms.
 >   We can then think about whether the differences between them are
 >   important or not.
 > 
 > * When an extracted function from, say, make_symbol_overload_list is
 >   playing a similar role to an extracted function from, say,
 >   make_symbol_completion_list, then examine those functions to see if
 >   they can be profitably unified into a single function.
 > 
 > That may sound like a lot of work, but having gone through it a few
 > times, it's really not so bad.  And it's a great way to understand and
 > clean up functions.  It doesn't interact well with GDB's approval
 > process, though; I have some ideas about that, too, but I don't think
 > this is the place for them.
 > 
 > David Carlton
 > carlton@math.stanford.edu