From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 11643 invoked by alias); 23 Sep 2002 02:59:41 -0000 Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sources.redhat.com Received: (qmail 11621 invoked from network); 23 Sep 2002 02:59:36 -0000 Received: from unknown (HELO zenia.red-bean.com) (66.244.67.22) by sources.redhat.com with SMTP; 23 Sep 2002 02:59:36 -0000 Received: (from jimb@localhost) by zenia.red-bean.com (8.11.6/8.11.6) id g8N2ieo21741; Sun, 22 Sep 2002 21:44:40 -0500 Date: Sun, 22 Sep 2002 19:59:00 -0000 Message-Id: <200209230244.g8N2ieo21741@zenia.red-bean.com> From: Jim Blandy To: david carlton CC: gdb@sources.redhat.com Subject: suggestion for dictionary representation X-SW-Source: 2002-09/txt/msg00344.txt.bz2 It seems to me that the `skip list' data structure simultaneously meets a lot of the criteria that we're currently meeting by having multiple representations. Skip lists: - provide (probabalistic) log n access time - are low overhead ("They can easily be configured to require an average of 1 1/3 pointers per element (or even less).") - are easy to build incrementally, and stay "balanced" automatically - are obstack-friendly, since they don't involve realloc (as hash tables do) - are an ordered structure, which would support completion nicely (and, by the way, make the `_Z' test for the C++ V3 ABI faster too) - have a trivial iterator (walking the finest level of links) - are pretty easy to understand http://www.cs.umd.edu/~pugh points to a paper describing and analyzing them. Using skip lists, there'd be no need to distinguish `expandable' from non-expandable blocks. This one structure would scale to handle both local blocks and the global environment (depending on how we handle lazy symbol reading --- I'd like a more generic and descriptive term than "partial symbol tables"). The only remaining special case would be function blocks, in which parameter symbols must to appear in the order they appear in the source. I think it's pretty ugly to abuse the name array this way; it introduces special cases in dumb places. This kludge could be removed by changing the `function' member of `struct block' to a pointer to something like this: struct function_info { struct symbol *sym; int num_params; struct symbol **params; }; This would require extra space only for function blocks; non-function blocks would remain the same size. And this info would only be consulted when we actually wanted to iterate over the parameters. This would clean up a bunch of loops in GDB that currently have to iterate over all the symbols in a function's block and do a switch on each symbol's address class to find the arguments. (And would this also allow us to remove the arg/other distinction in enum address_class? Dunno.) But if we were to remove function blocks as a special case, there would only need to be a single structure for representing levels of the namespace. (Daniel Berlin pointed me at skip lists a long time ago --- thanks!)