To: gdb
Subject: How I learned to stop worrying and love decode_line_1
From: David Carlton
Date: Fri, 01 Nov 2002 14:53:00 -0000

I've been spending the week getting to know decode_line_1. One of the
things that I learned was that decode_line_1 thinks that it's its job
to take apart C++ expressions like A::B::C::x, where A and B are
namespaces, C is a class or a namespace, and x is a variable or a
member or something.

This cleared up some issues for me: e.g. why you don't have to put
single quotes around expressions involving static class members but do
have to put single quotes around expressions involving namespaces.
But, frankly, I was surprised that the function claimed to handle
namespaces at all (since, basically, it doesn't currently); moreover,
there's this lovely comment:

  Some versions of the HP ANSI C++ compiler (as also possibly other
  compilers) generate class/function/member names with embedded
  double-colons if they are inside namespaces. To handle this, we
  loop a few times, considering larger and larger prefixes of the
  string as though they were single symbols.
  So, if the initially supplied string is A::B::C::D::foo, we have to
  look up "A", then "A::B", then "A::B::C", then "A::B::C::D", and
  finally "A::B::C::D::foo" as single, monolithic symbols, because
  A, B, C or D may be namespaces.

I'm not sure I understand the context here: did/does HP's compiler
generate names for symbols in namespaces that, after demangling, look
fundamentally different from, say, the way GCC's do? The above sounds
the same as the way GCC's demangled symbol names work; but it hints at
possible compilers which might somehow generate a symbol 'A' if there
is a namespace 'A'. Are there any such compilers? I sure can't imagine
where GDB would currently do anything useful with that information.

So any insight about the context of that comment would be greatly
appreciated. I'm certainly glad I discovered this bit of
decode_line_1, at any rate: it suggests that some code that I was
planning to add to lookup_symbol should really go in decode_line_1.
(The only thing worse than having one gross hack of a parser is having
two gross hacks of a parser that are both trying to do the same
thing.)

For what it's worth, I've got decode_line_1 tamed a bit. The version
of it that I'll check into my branch later today is only 82 lines
long. (I'll include it after my signature; of course, it calls lots
of other functions that contain various bits of the current version of
the function.) It turns out that, underlying all that mess, there's
actually a function with perfectly reasonable control flow. In
particular, whoever put in those gotos originally should be given a
bit of a talking-to: there's absolutely no reason whatsoever for the
function to do any jumps like that.
Mind you, my version of the code still contains all of the myriad
special cases in the current version, but at least it's now pretty
obvious when the function might return, whether all the different uses
of the variables 'p' and 'copy' are really pointing to the same thing
or are only there because somebody decided to reuse variable names,
and so forth.

David Carlton
carlton@math.stanford.edu

struct symtabs_and_lines
decode_line_1 (char **argptr, int funfirstline,
               struct symtab *default_symtab,
               int default_line, char ***canonical)
{
  /* This is NULL if there are no parens in argptr, or a pointer to
     the closing parenthesis if there are parens. */
  char *paren_pointer;
  /* If a file name is specified, this is its symtab. */
  struct symtab *file_symtab = NULL;
  int is_quoted;
  char *saved_arg = *argptr;

  /* Defaults have defaults. */
  dl1_initialize_defaults (&default_symtab, &default_line);

  /* See if arg is *PC */
  if (**argptr == '*')
    return dl1_indirect (argptr);

  /* Set various flags.
   * 'paren_pointer' is important for overload checking, where
   * we allow things like:
   *     (gdb) break c::f(int)
   */
  dl1_set_flags (*argptr, &is_quoted, &paren_pointer);

  /* Check to see if it's a multipart linespec (with colons or
     periods). */
  {
    char *p;
    int is_quote_enclosed;

    /* Locate the first half of the linespec, ending in a colon,
       period, or whitespace. (More or less.) */
    p = dl1_locate_first_half (argptr, &is_quote_enclosed);

    /* Does it look like there actually were two parts? */
    if ((p[0] == ':' || p[0] == '.') && paren_pointer == NULL)
      {
        if (is_quoted)
          *argptr = *argptr + 1;

        /* Is it a C++ or Java compound data structure? */
        if (p[0] == '.' || p[1] == ':')
          return dl1_compound (argptr, funfirstline, canonical,
                               saved_arg, p);

        /* No, the first part is a filename; set file_symtab
           accordingly. Also, move argptr past the filename. */
        file_symtab = dl1_handle_filename (argptr, p, is_quote_enclosed);
      }
  }

  /* Check whether arg is all digits (and sign) */
  if (dl1_is_all_digits (*argptr))
    return dl1_all_digits (argptr, default_symtab, default_line,
                           canonical, file_symtab);

  /* Arg token is not digits => try it as a variable name
     Find the next token (everything up to end or next whitespace). */

  /* If it starts with $: may be a legitimate variable or routine name
     (e.g. HP-UX millicode routines such as $$dyncall), or it may
     be history value, or it may be a convenience variable */
  if (**argptr == '$')
    return dl1_dollar (argptr, funfirstline, default_symtab,
                       canonical, file_symtab);

  /* Look up that token as a variable.
     If file specified, use that file's per-file block to start with. */
  return dl1_variable (argptr, funfirstline, canonical, is_quoted,
                       paren_pointer, file_symtab);
}
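
For anyone who wants to see the "larger and larger prefixes" idea from
the quoted comment in isolation, here is a minimal sketch. It is not
GDB code: next_prefix_len, longest_known_prefix, lookup_fn and
toy_lookup are all hypothetical names made up for illustration, the
lookup is a toy string table rather than a real symbol table, and
recording the longest matching prefix is my assumption about what a
caller might want. It also ignores quoting, templates, parentheses and
a leading "::".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: nonzero if SYMBOL is known as a single,
   monolithic symbol. */
typedef int (*lookup_fn) (const char *symbol, void *data);

/* Return the length of the next larger prefix of NAME that ends just
   before a "::" separator (or at the end of NAME). PREV_LEN is the
   length of the previous prefix, or 0 to start. Returns 0 when there
   are no more prefixes. */
size_t
next_prefix_len (const char *name, size_t prev_len)
{
  size_t len, start;
  const char *sep;

  assert (name != NULL);
  len = strlen (name);
  if (prev_len >= len)
    return 0;

  /* Skip the "::" that terminated the previous prefix. */
  start = prev_len == 0 ? 0 : prev_len + 2;
  if (start > len)
    return 0;

  sep = strstr (name + start, "::");
  return sep != NULL ? (size_t) (sep - name) : len;
}

/* Try "A", then "A::B", then "A::B::C", ... as single symbols.
   Return the length of the longest prefix LOOKUP accepts, or 0 if
   none is accepted. Prefixes of 256 bytes or more stop the scan. */
size_t
longest_known_prefix (const char *name, lookup_fn lookup, void *data)
{
  char buf[256];
  size_t best = 0;
  size_t len = 0;

  while ((len = next_prefix_len (name, len)) != 0 && len < sizeof buf)
    {
      memcpy (buf, name, len);
      buf[len] = '\0';
      if (lookup (buf, data))
        best = len;
    }
  return best;
}

/* Toy lookup: DATA is a NULL-terminated array of known names. */
int
toy_lookup (const char *symbol, void *data)
{
  const char **known = data;

  for (; *known != NULL; known++)
    if (strcmp (*known, symbol) == 0)
      return 1;
  return 0;
}
```

With a table containing "A", "A::B" and "A::B::C", calling
longest_known_prefix ("A::B::C::D::foo", toy_lookup, table) tries the
five prefixes in the order the comment lists them and returns 7, the
length of "A::B::C".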