From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jim Blandy
To: Fernando Nasser
Cc: Elena Zannoni, Daniel Berlin, gdb-patches@sources.redhat.com
Subject: Re: [RFA] linespec.c change to stop "malformed template specification" error
Date: Thu, 07 Jun 2001 09:09:00 -0000
Message-id:
References: <87ofsldrgr.fsf@dynamic-addr-83-177.resnet.rochester.edu> <15134.47162.825017.119342@kwikemart.cygnus.com> <3B1F7A4B.4626F9D9@redhat.com>
X-SW-Source: 2001-06/msg00129.html

Fernando Nasser writes:
> Jim Blandy wrote:
> > So how does our poor little decode_line_1 handle that?  Basically, we
> > need to replace decode_line_1 with a real parser.
>
> It will be hard.  As it accepts a variety of types of input, there are
> ambiguities in the allowed syntax that are hard to describe in any
> formal language.

Don't worry about producing a grammar Bison would like.  We can write
N parsers, some of them Bison-based, some of them hand-coded, but each
one which handles a single case correctly and cleanly.  Then, we can
invoke them each in turn, and use the result from the first one which
doesn't return an error.  The C, C++, and other language parsers would
just be members of the list.

> Maybe we could improve things if GDB commands were parsed under some
> language context (e.g. care about C++ stuff or not) and even some
> host context (to distinguish filename syntaxes between Unix and
> Windows for instance).

Well, language is a per-compilation-unit kind of thing.  And the user
should just be able to say "break foo^..bratwurst" whenever
foo^..bratwurst is a well-defined breakpoint location, even if it's
not in the current compilation unit.

There's even a trick we can use to get our existing parsers to work
for this.  We don't need to write new grammars.

Right now, the start symbol for our C++ grammar is `start', which is
either an expression or a type.  Suppose we want to parse expressions,
types, and function names.
We make up three new magic token types: START_EXPRESSION, START_TYPE,
and START_FUNCTION_NAME.  We change the syntax of our start symbol to
be:

    start   :   START_EXPRESSION exp1
            |   START_TYPE type_exp
            |   START_FUNCTION_NAME function_name
            ;

(I don't think function_name exists yet, but that work is necessary no
matter how we do this.)

Then, we change our yylex function to return START_EXPRESSION,
START_TYPE, or START_FUNCTION_NAME as the first token, depending on
which one we want to parse.  These tokens don't correspond to anything
in the text stream at all --- they just serve to get the parser in the
right state to recognize what we're giving it.

But to be clear, we do *not* need to encode the full glory of
breakpoint locations here.  We extend the grammar to handle what it
can do naturally --- probably just template applications and
overloading.
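
To make the magic-token trick concrete, here is a minimal,
self-contained C sketch.  It is not GDB code and makes several
assumptions: next_token is a hypothetical stand-in for yylex, parse_as
is a hand-written stand-in for the Bison-generated parser, the token
codes are made up, and the toy rule bodies (a single NAME or NUMBER)
replace the real exp1, type_exp, and function_name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes.  In a Bison grammar these would come from
   %token declarations.  The three START_* tokens never appear in the
   input text.  */
enum token
{
  START_EXPRESSION, START_TYPE, START_FUNCTION_NAME,
  NAME, NUMBER, END_OF_INPUT, BAD_CHAR
};

struct lexer
{
  enum token start;   /* synthetic token to hand out first */
  int start_sent;     /* nonzero once that token has been returned */
  const char *p;      /* unread input */
};

/* Stand-in for yylex: the first call returns the synthetic start
   token; later calls scan the text.  */
static enum token
next_token (struct lexer *lx)
{
  if (!lx->start_sent)
    {
      lx->start_sent = 1;
      return lx->start;
    }
  while (*lx->p == ' ')
    lx->p++;
  if (*lx->p == '\0')
    return END_OF_INPUT;
  if (isalpha ((unsigned char) *lx->p) || *lx->p == '_')
    {
      while (isalnum ((unsigned char) *lx->p) || *lx->p == '_')
        lx->p++;
      return NAME;
    }
  if (isdigit ((unsigned char) *lx->p))
    {
      while (isdigit ((unsigned char) *lx->p))
        lx->p++;
      return NUMBER;
    }
  lx->p++;
  return BAD_CHAR;
}

/* Toy version of the start rule:
     start : START_EXPRESSION (NAME | NUMBER)
           | START_TYPE NAME
           | START_FUNCTION_NAME NAME ;
   Returns 1 if TEXT parses under the alternative selected by START,
   0 otherwise.  */
static int
parse_as (enum token start, const char *text)
{
  struct lexer lx;
  enum token first, body;

  assert (text != NULL);
  lx.start = start;
  lx.start_sent = 0;
  lx.p = text;

  first = next_token (&lx);   /* always the synthetic token */
  body = next_token (&lx);    /* first token from the real text */
  switch (first)
    {
    case START_EXPRESSION:
      if (body != NAME && body != NUMBER)
        return 0;
      break;
    case START_TYPE:
    case START_FUNCTION_NAME:
      if (body != NAME)
        return 0;
      break;
    default:
      return 0;               /* not a valid start token */
    }
  return next_token (&lx) == END_OF_INPUT;
}
```

With this sketch, parse_as (START_EXPRESSION, "42") succeeds while
parse_as (START_TYPE, "42") fails, even though the text is identical;
only the synthetic first token differs.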