From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jason Molenda <jason-swarelist@molenda.com>
To: gdb-patches@sources.redhat.com
Subject: Re: [RFA] bug in symtab.c:lookup_block_symbol()'s search method
Date: Sat, 15 Sep 2001 13:08:00 -0000
Message-id: <20010915130802.A12003@shell17.ba.best.com>
References: <20010910113226.A23487@shell17.ba.best.com> <87zo82swwa.fsf@cgsoftware.com> <20010910130347.A5628@shell17.ba.best.com> <8766aq7nki.fsf@cgsoftware.com> <3BA219EF.3000300@cygnus.com> <9003-Fri14Sep2001190223+0300-eliz@is.elta.co.il> <20010914091241.A28921@shell17.ba.best.com> <1659-Fri14Sep2001204927+0300-eliz@is.elta.co.il> <20010915005255.A2369@shell17.ba.best.com> <87ofocpks1.fsf@cgsoftware.com>
X-SW-Source: 2001-09/msg00196.html

On Sat, Sep 15, 2001 at 10:54:06AM -0400, Daniel Berlin wrote:

> Note that combining the solutions I later attempted avoids the problem
> of having 
> to show that all the languages gdb supports can't do symbol lookups
> on something that starts with a character strcmp_iw ignores,
> altogether.

I guess you didn't read either of my initial notes in this discusssion.
I provided pointers to them in the patch submission.  

> This would be using a hash table for the block array, which we already
> have a hash function ideal for (minsym_hash_iw or whatever it's
> called), and the not actually searching blocks that don't contain the
> symbol.
> 
> This would give you O(1) symbol lookup time, effectively (A given
> symbol only searches a constant number of blocks).
> This would certainly make your debugger much faster to respond.


What you're proposing is to ignore the inefficiencies of
lookup_block_symbol by reducing the number of calls to it.  Instead,
why don't we fix the inefficiencies in lookup_block_symbol *and*
reduce the number of calls to it?  Imagine how the gdb users will
be dancing in the street, thanking buddha for their great fortune of
having such a wonderful debugger.

Concretely.  GDB has a number of symtabs, call it 'i', and a number of
_unique_ global blocks, call it 'j'.  Every symtab has a link to a global
block, but few of these global blocks are unique.  In gdb itself, i is
an order of magnitude larger than j.  I expect some programs will have
i two orders of magnitude larger than j.  Each global block has a number
of symbols in it, call the average of these 'n'.

In gdb 5.0, looking for an undefined symbol will happen on order of 
O(i * lg (n)).  In gdb 5.1, that search will be on the order of 
O(i * 1/2 * n).  Dan's proposal is to change this to O(j * 1/2 * n).  
I am looking for at _least_ O(i * lg (n)), but I'd really like to see 
us get both of these eventually for O(j * lg (n)).

If we define conservatively as i = j * 10, these become

 gdb 5.0                   O (j * 10 * lg (n))
 gdb 5.1                   O (j * 10 * 1/2 * n)
 My patch                  O (j * 10 * lg (n))

 What I'd like to see      O (j * lg (n))
 Dan's block patch alone   O (j * 1/2 * n))

Debugging gdb itself, "i" is 330, and "j" is 23.  The average number
of symbols was small - I don't have the number here right now, but
I think it was around 10 symbols.  All of these numbers are a lot
larger for a C++ program, obviously.

Jason