From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jim Blandy <jimb@zwingli.cygnus.com>
To: Fernando Nasser <fnasser@redhat.com>
Cc: Elena Zannoni <ezannoni@cygnus.com>, Daniel Berlin <dan@cgsoftware.com>, gdb-patches@sources.redhat.com
Subject: Re: [RFA] linespec.c change to stop "malformed template specification"   error
Date: Thu, 07 Jun 2001 09:09:00 -0000
Message-id: <nplmn41dam.fsf@zwingli.cygnus.com>
References: <87ofsldrgr.fsf@dynamic-addr-83-177.resnet.rochester.edu> <15134.47162.825017.119342@kwikemart.cygnus.com> <npg0dd2b20.fsf@zwingli.cygnus.com> <3B1F7A4B.4626F9D9@redhat.com>
X-SW-Source: 2001-06/msg00129.html

Fernando Nasser <fnasser@redhat.com> writes:
> Jim Blandy wrote:
> > So how does our poor little decode_line_1 handle that?  Basically, we
> > need to replace decode_line_1 with a real parser.
> 
> It will be hard.  As it accepts a variety of types of input, there are
> ambiguities in the allowed syntax that are hard to describe in any
> formal language.

Don't worry about producing a grammar Bison would like.  We can write
N parsers, some of them Bison-based, some of them hand-coded, each of
which handles a single case correctly and cleanly.  Then we can invoke
them in turn, and use the result from the first one that doesn't
return an error.  The C, C++, and other language parsers would just be
members of the list.
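
Here is a minimal sketch, in C, of what such a first-success chain
could look like.  Everything in it is made up for illustration:
struct location, decode_location, and the three toy parsers are
hypothetical names, not existing GDB code, and the real parsers would
of course be far more capable.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not a GDB structure.  */
struct location
{
  const char *kind;	/* Which parser accepted the input.  */
  const char *text;	/* The text it accepted.  */
};

/* Each parser returns 1 and fills *LOC on success, 0 on error.  */
typedef int (*location_parser) (const char *input, struct location *loc);

static int
all_digits (const char *s)
{
  return *s != '\0' && strspn (s, "0123456789") == strlen (s);
}

/* Toy parser: a bare line number, e.g. "42".  */
static int
parse_line_number (const char *input, struct location *loc)
{
  if (!all_digits (input))
    return 0;
  loc->kind = "line";
  loc->text = input;
  return 1;
}

/* Toy parser: FILE:LINE, e.g. "foo.c:10".  */
static int
parse_file_and_line (const char *input, struct location *loc)
{
  const char *colon = strrchr (input, ':');

  if (colon == NULL || colon == input || !all_digits (colon + 1))
    return 0;
  loc->kind = "file:line";
  loc->text = input;
  return 1;
}

/* Toy parser: a plain C identifier, e.g. "main".  */
static int
parse_function_name (const char *input, struct location *loc)
{
  const char *p = input;

  if (!isalpha ((unsigned char) *p) && *p != '_')
    return 0;
  for (p++; *p != '\0'; p++)
    if (!isalnum ((unsigned char) *p) && *p != '_')
      return 0;
  loc->kind = "function";
  loc->text = input;
  return 1;
}

/* The list of parsers, tried in order.  Language-specific parsers
   would simply be further entries here.  */
static const location_parser parsers[] =
{
  parse_line_number,
  parse_file_and_line,
  parse_function_name,
};

/* Try each parser in turn; the first one that succeeds wins.
   Returns 0 if every parser rejects INPUT.  */
static int
decode_location (const char *input, struct location *loc)
{
  size_t i;

  for (i = 0; i < sizeof parsers / sizeof parsers[0]; i++)
    if (parsers[i] (input, loc))
      return 1;
  return 0;
}
```

The order of the list decides which reading wins when more than one
parser would accept the same text, so the ambiguities get resolved by
list position rather than by the grammar.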

> Maybe we could improve things if GDB commands were parsed under some
> language context (e.g. care about C++ stuff or not) and even some
> host context (to distinguish filename syntaxes between Unix and
> Windows for instance).

Well, language is a per-compilation-unit kind of thing.  And the user
should just be able to say "break foo^..bratwurst" whenever
foo^..bratwurst is a well-defined breakpoint location, even if it's
not in the current compilation unit.

There's even a trick we can use to get our existing parsers to work
for this.  We don't need to write new grammars.

Right now, the start symbol for our C++ grammar is `start', which is
either an expression or a type.  Suppose we want to parse expressions,
types, and function names.  We make up three new magic token types:
START_EXPRESSION, START_TYPE, and START_FUNCTION_NAME.  We change the
syntax of our start symbol to be:

start   :	START_EXPRESSION exp1
	|	START_TYPE type_exp
	|	START_FUNCTION_NAME function_name
	;

(I don't think function_name exists yet, but that work is necessary no
matter how we do this.)

Then, we change our yylex function to return START_EXPRESSION,
START_TYPE, or START_FUNCTION_NAME as the first token, depending on
which one we want to parse.  These tokens don't correspond to
anything in the text stream at all --- they just serve to get the
parser in the right state to recognize what we're giving it.
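
The lexer side of this can be sketched in C roughly as follows.  This
is a toy, not our actual yylex: the token values, lexer_init,
toy_yylex, and the two static variables are all assumptions made up
for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; Bison would normally generate these.  */
enum toy_token
{
  TOK_EOF = 0,
  START_EXPRESSION = 256,
  START_TYPE,
  START_FUNCTION_NAME,
  NAME,
  OTHER
};

static const char *toy_lexptr;	/* Next unread input character.  */
static int pending_start;	/* Magic token still to be returned,
				   or 0 once it has been delivered.  */

/* Prepare to lex TEXT, arranging for START_TOKEN to be delivered
   before any token that comes from the text itself.  */
static void
lexer_init (const char *text, int start_token)
{
  toy_lexptr = text;
  pending_start = start_token;
}

static int
toy_yylex (void)
{
  /* The first call returns the magic start token and consumes no
     input; it only steers the parser into the right state.  */
  if (pending_start != 0)
    {
      int tok = pending_start;

      pending_start = 0;
      return tok;
    }

  while (*toy_lexptr == ' ')
    toy_lexptr++;

  if (*toy_lexptr == '\0')
    return TOK_EOF;

  if (isalpha ((unsigned char) *toy_lexptr) || *toy_lexptr == '_')
    {
      while (isalnum ((unsigned char) *toy_lexptr) || *toy_lexptr == '_')
	toy_lexptr++;
      return NAME;
    }

  /* Anything else is a single-character token in this toy.  */
  toy_lexptr++;
  return OTHER;
}
```

A caller that wants a function name would call
lexer_init (text, START_FUNCTION_NAME) and then run the parser as
usual; the same grammar serves all three entry points.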

But to be clear, we do *not* need to encode the full glory of
breakpoint locations here.  We extend the grammar to handle what it
can do naturally --- probably just template applications and
overloading.