To: gdb
Subject: the global symbol table
From: David Carlton
Date: Wed, 25 Sep 2002 12:59:00 -0000

I'm trying to get a handle on the global symbol table, and on what
would be a good way to reorganize its data in order to make
lookup_symbol (and related functions, e.g. search_symbols) faster and
to provide some sort of data structure that could also be used in
namespace searches. I'm not sure that I have any really coherent
questions, so I'm going to free-associate for a while, and see what
thoughts that jars loose from other people's brains. I've listed the
main issues that I'm worried about below.

As a reminder, here's what currently happens if you call lookup_symbol
and it doesn't find a local match somewhere:

* It searches the global blocks of all the symtabs for the symbol. If
  it finds it, it calls fixup_symbol_section (to get the bfd_section
  and section set correctly; this happens in the other cases below,
  too) and returns it.

* Then, #ifndef HPUXHPPA, it sometimes looks in the minimal symbol
  table.
  If it finds something, it looks for it in the global and static
  blocks of the corresponding symtab. (But why wouldn't it have
  already found the global match? Hmm.) Otherwise, it sometimes does
  some sort of mangled name lookup.

* Next, it looks in all the psymtabs. If it finds a match, it converts
  the psymtab to a symtab, and returns the corresponding symbol there.

* Next, it looks in the static symbols for all of the symtabs. (To
  what extent does this overlap with the lookup_minimal_symbol stuff
  earlier?)

* Then, it looks in the static symbols for all the psymtabs.

* And finally, #ifdef HPUXHPPA, it does the minimal symbol stuff that
  it otherwise did earlier.

This raises some issues:

* Do we _really_ want to search static symbols? My guess is that the
  answer is 'yes', but it's not entirely clear to me that that's the
  case. Certainly whenever we do a search that returns results that
  aren't strictly in line with what the language says, I get nervous.

* Even the simplest case of searching all the global blocks isn't
  implemented very well: surely we need some sort of fast lookup
  structure that won't require us to look at every single symtab.

  Assuming that we have some sort of expandable data structure in
  which lookups are fast, then exactly how should this work? Should we
  (1) replace the current global blocks by one big global table,
  (2) have a global table that duplicates the information in the
  global blocks (so struct symbols corresponding to global symbols
  would be stored in two separate places), or (3) have a global table
  that quickly maps names to symtabs, but have the actual symbols only
  stored in the global blocks of symtabs?

  I'm currently leaning towards the second solution, with the possible
  longer-term goal of migrating towards the first solution, but it's
  not clear to me exactly what the consequences of these choices are.

* Which of the structs symbol, partial_symbol, and minimal_symbol can
  be unified?
  I tend to think that partial_symbols should go away, to be replaced
  by special 'incomplete' sorts of struct symbols combined with code
  in lookup_symbol (or wherever) that says that, if you run into one
  of those kinds of incomplete symbols, read in the entire symbol
  table for that file and flesh out all of its partial symbols. (This
  could open a door to a general notion of incomplete symbols that
  could, depending on the debugging format, be completed in a more
  efficient manner than the current psymtab->symtab mechanisms.)

  I was hoping that minimal_symbols could also be turned into special
  kinds of symbols, but now I'm more dubious about that; if they have
  to remain separate, I should presumably tweak dictionaries to let
  them index any sort of struct general_symbol_info, and change the
  minimal symbol table in ways that correspond to the changes to the
  global symbol table.

* So just what's going on with the usage of minimal symbols in
  lookup_symbol? Are there any situations where looking in the minimal
  symbol table is actually helpful? If so, do those only have to do
  with ickiness related to mangled names, or is there a more serious
  issue that I'm missing?

* One issue that I didn't comment on above is that lookup_symbol has a
  'symtab' argument that stores the symtab in which the symbol is
  found. If we have to support that, that obviously affects the data
  structures involved. But it seems to me that we _don't_ have to
  support that: I did a cursory look through GDB's sources and, as far
  as I can tell, that argument is only used by linespec.c's
  decode_line_1. So it seems to me that we should remove that argument
  from lookup_symbol and provide some sort of alternate functionality
  to satisfy decode_line_1's needs.

David Carlton
carlton@math.stanford.edu
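
As a rough illustration of the second option discussed above (a global
table kept alongside the per-symtab global blocks), here is a minimal
sketch in C. Everything in it is a hypothetical stand-in: toy_symtab,
toy_symbol, global_table and the global_table_* functions are invented
for this example and are not GDB's real types or API. The sketch also
assumes the table stores pointers to the symbols rather than copies,
and that each symbol records its owning symtab, which is one possible
way to do without a separate 'symtab' result from the lookup.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT GDB's real struct symtab / struct symbol. */
struct toy_symtab { const char *filename; };
struct toy_symbol {
  const char *name;
  struct toy_symtab *symtab;   /* assumption: symbol knows its owner */
};

#define GLOBAL_TABLE_BUCKETS 1024

/* One chained hash entry; it points at the same symbol object that
   the owning symtab's global block would hold. */
struct global_entry {
  struct toy_symbol *sym;
  struct global_entry *next;
};

/* Must be zero-initialized before first use. */
struct global_table {
  struct global_entry *buckets[GLOBAL_TABLE_BUCKETS];
};

/* Simple multiplicative string hash (h = h * 33 + c). */
static unsigned long hash_name(const char *name)
{
  unsigned long h = 5381;
  while (*name != '\0')
    h = h * 33 + (unsigned char) *name++;
  return h;
}

/* Register a global symbol.  Returns 0 on success, -1 if out of memory. */
int global_table_add(struct global_table *t, struct toy_symbol *sym)
{
  size_t b = hash_name(sym->name) % GLOBAL_TABLE_BUCKETS;
  struct global_entry *e = malloc(sizeof *e);
  if (e == NULL)
    return -1;
  e->sym = sym;
  e->next = t->buckets[b];
  t->buckets[b] = e;
  return 0;
}

/* Find a global symbol by name without visiting every symtab.
   Returns NULL if there is no match; if several symbols share a
   name, the most recently added one is returned. */
struct toy_symbol *global_table_lookup(const struct global_table *t,
                                       const char *name)
{
  size_t b = hash_name(name) % GLOBAL_TABLE_BUCKETS;
  const struct global_entry *e;
  for (e = t->buckets[b]; e != NULL; e = e->next)
    if (strcmp(e->sym->name, name) == 0)
      return e->sym;
  return NULL;
}

/* Free all entries (the symbols themselves are owned elsewhere). */
void global_table_clear(struct global_table *t)
{
  size_t b;
  for (b = 0; b < GLOBAL_TABLE_BUCKETS; b++) {
    struct global_entry *e = t->buckets[b];
    while (e != NULL) {
      struct global_entry *next = e->next;
      free(e);
      e = next;
    }
    t->buckets[b] = NULL;
  }
}
```

With something like this, a caller would zero-initialize a struct
global_table, call global_table_add for each global symbol as its
symtab is read in, and use global_table_lookup in place of the scan
over every symtab's global block. Invalidation when a symtab is
discarded is deliberately left out of the sketch.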