From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-10501-listarch-gdb=sourceware.cygnus.com@sources.redhat.com>
Received: (qmail 11643 invoked by alias); 23 Sep 2002 02:59:41 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb/>
List-Post: <mailto:gdb@sources.redhat.com>
List-Help: <mailto:gdb-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-owner@sources.redhat.com
Received: (qmail 11621 invoked from network); 23 Sep 2002 02:59:36 -0000
Received: from unknown (HELO zenia.red-bean.com) (66.244.67.22)
  by sources.redhat.com with SMTP; 23 Sep 2002 02:59:36 -0000
Received: (from jimb@localhost)
	by zenia.red-bean.com (8.11.6/8.11.6) id g8N2ieo21741;
	Sun, 22 Sep 2002 21:44:40 -0500
Date: Sun, 22 Sep 2002 19:59:00 -0000
Message-Id: <200209230244.g8N2ieo21741@zenia.red-bean.com>
From: Jim Blandy <jimb@redhat.com>
To: david carlton <carlton@math.stanford.edu>
CC: gdb@sources.redhat.com
Subject: suggestion for dictionary representation
X-SW-Source: 2002-09/txt/msg00344.txt.bz2


It seems to me that the `skip list' data structure simultaneously
meets a lot of the criteria that we're currently meeting by having
multiple representations.  Skip lists:
- provide (probabalistic) log n access time
- are low overhead ("They can easily be configured to require an
  average of 1 1/3 pointers per element (or even less).")
- are easy to build incrementally, and stay "balanced" automatically
- are obstack-friendly, since they don't involve realloc (as hash
  tables do)
- are an ordered structure, which would support completion nicely (and,
  by the way, make the `_Z' test for the C++ V3 ABI faster too)
- have a trivial iterator (walking the finest level of links)
- are pretty easy to understand

http://www.cs.umd.edu/~pugh points to a paper describing and analyzing
them.

Using skip lists, there'd be no need to distinguish `expandable' from
non-expandable blocks.  This one structure would scale to handle both
local blocks and the global environment (depending on how we handle
lazy symbol reading --- I'd like a more generic and descriptive term
than "partial symbol tables").

The only remaining special case would be function blocks, in which
parameter symbols must to appear in the order they appear in the
source.  I think it's pretty ugly to abuse the name array this way; it
introduces special cases in dumb places.  This kludge could be removed
by changing the `function' member of `struct block' to a pointer to
something like this:

   struct function_info {
       struct symbol *sym;
       int num_params;
       struct symbol **params;
   };

This would require extra space only for function blocks; non-function
blocks would remain the same size.  And this info would only be
consulted when we actually wanted to iterate over the parameters.
This would clean up a bunch of loops in GDB that currently have to
iterate over all the symbols in a function's block and do a switch on
each symbol's address class to find the arguments.  (And would this
also allow us to remove the arg/other distinction in enum
address_class?  Dunno.)

But if we were to remove function blocks as a special case, there
would only need to be a single structure for representing levels of
the namespace.

(Daniel Berlin pointed me at skip lists a long time ago --- thanks!)