From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Berlin <dan@cgsoftware.com>
To: Jim Blandy <jimb@zwingli.cygnus.com>
Cc: Elena Zannoni <ezannoni@cygnus.com>, Daniel Berlin <dan@cgsoftware.com>, gdb-patches@sources.redhat.com
Subject: Re: [RFA] linespec.c change to stop "malformed template specification" error
Date: Wed, 06 Jun 2001 23:36:00 -0000
Message-id: <87bso07q5u.fsf@cgsoftware.com>
References: <87ofsldrgr.fsf@dynamic-addr-83-177.resnet.rochester.edu> <15134.47162.825017.119342@kwikemart.cygnus.com> <npg0dd2b20.fsf@zwingli.cygnus.com>
X-SW-Source: 2001-06/msg00115.html

Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Elena Zannoni <ezannoni@cygnus.com> writes:
>> Daniel Berlin writes:
>>  > This error is caused by find_toplevel_char not knowing that '<' and '>'
>>  > increase and decrease the depth we are at.
>>  > 
>>  > The result is that if you say "break _Rb_tree<int, int>", when it goes
>>  > to look for a comma at the top level, it thinks it found one right
>>  > after the "int", and temporarily truncates the string to '_Rb_tree<int,'
>>  > When we then proceed to go through the string, we see the "<", and
>>  > then go to find the end of the template name, and can't, because we've
>>  > truncated the string in the wrong place, and issue an error.
>>  > 
>>  > Cute, no?
>>  > 
>>  > --Dan
>>  > 
>> 
>> Seems OK to me, but could you update the comment on top of the
>> find_toplevel_char() to reflect that the char is looked for also
>> outside of '<' and '>' pairs?
>> 
>> Any of the other maintainers (Jim, Fernando) has any comments?
> 
> Operators like '<' can appear in template arguments.  For example, you
> could define a template like this:
> 
>         template <int i> struct list { int a[i], b[i]; };
> 
> and then use it like this:
> 
>         struct list <20> l;
> 
> and you get the same thing as if you'd written:
> 
>         struct { int a[20], b[20]; } l;
> 
> At least I think so, anyway.  I don't really know C++.  But the point
> is, those template arguments can be any arbitrary constant expression.
> So I could have a template invocation like this:
> 
>         struct list < (x < y) ? 10 : 20 > l;
> 
> So how does our poor little decode_line_1 handle that?  Basically, we
> need to replace decode_line_1 with a real parser.

Wait, stop, this is a complete red herring.

It's irrelevant.

decode_line_1 is used for breakpoints and listing (break and list), *NOT*
for an expression like "p a<5 < 6>".

And decode_line_1 says the string can be:

"LINENUM
FILENAME:LINENUM
FUNCTION
VARIABLE
FILE:FUNCTION
*EXPR"


Since we don't give a shit what the actual arguments to the template
specification are (you can't very well expect us to *evaluate* them in
this context, and we certainly don't now), you end up with a grammar like
the one I'm about to show.  It excludes *EXPR, which you just special-case,
as we do now.  This grammar appears to work fine.
The reason COLONCOLON and COLON are separate from the rest of the
allowed characters is only that a double colon is part of a function
name and a single colon isn't.

Pretend we did symbol lookup along the way, rather than at the end, as
we do now.

%token IDENT
%token NUM
%token COLONCOLON
%token COLON
%%
input : line_reference 
        | function_name;
line_reference: NUM
        | IDENT COLON NUM;
function_name:
        IDENT COLON function_name /* function name with filename in front */
        | IDENT COLONCOLON function_name  /* scoped function name of some sort*/
        | IDENT; /* just a name of some sort */


Now, two things.

One:

IDENT here is [a-zA-Z][A-Za-z0-9<>, ]*

This means daniel<int>::fred is a function name.
As is daniel<int bob, int fred>::george


Two:
Realize that the fact that this grammar accepts "DANIEL<int>::fred:fred"
is irrelevant.  We'll never find a match for it, and we'll alert the user
to that.

Same with the fact that you could put DANIEL<int::5.
We'll never find DANIEL<int, and we'll point that out to you when we fail
to find it.

Same if you start making up invalid template arguments
(daniel<int int int int int>::fred): we'll fail the first lookup, and
point it out.
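
To make the "accepts garbage, lookup catches it" point concrete, here is
a rough hand-written recognizer for the grammar above.  It is only a
sketch, not the flex/bison code: the function names are made up for
illustration, NUM is assumed to be a run of decimal digits (the grammar
above doesn't spell that out), and IDENT is the character class given
under "One".  It does syntax only; no symbol lookup happens here.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the IDENT at s, i.e. [a-zA-Z][A-Za-z0-9<>, ]*; 0 if none.  */
static size_t ident_len(const char *s)
{
    size_t n;

    if (!isalpha((unsigned char)s[0]))
        return 0;
    n = 1;
    while (isalnum((unsigned char)s[n]) || s[n] == '<' || s[n] == '>'
           || s[n] == ',' || s[n] == ' ')
        n++;
    return n;
}

/* Length of the NUM at s (assumed: decimal digits only); 0 if none.  */
static size_t num_len(const char *s)
{
    size_t n = 0;

    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* function_name: IDENT COLON function_name
                | IDENT COLONCOLON function_name
                | IDENT
   The right recursion is written as a loop.  */
static int is_function_name(const char *s)
{
    for (;;) {
        size_t n = ident_len(s);

        if (n == 0)
            return 0;
        s += n;
        if (*s == '\0')
            return 1;
        if (*s != ':')
            return 0;
        s += (s[1] == ':') ? 2 : 1;   /* COLONCOLON or COLON */
    }
}

/* line_reference: NUM | IDENT COLON NUM  */
static int is_line_reference(const char *s)
{
    size_t n = num_len(s);

    if (n > 0)
        return s[n] == '\0';
    n = ident_len(s);
    if (n == 0 || s[n] != ':')
        return 0;
    s += n + 1;
    n = num_len(s);
    return n > 0 && s[n] == '\0';
}

/* input: line_reference | function_name.  Nonzero if s is well formed.  */
int linespec_syntax_ok(const char *s)
{
    return is_line_reference(s) || is_function_name(s);
}
```

With this, "daniel<int>::fred" and "DANIEL<int>::fred:fred" both pass
the syntax check, as does "daniel<int int int int int>::fred"; rejecting
the bogus ones is left to the symbol lookup.  Note that the character
class as written has no underscore, so a name like _Rb_tree would need
it added.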


No shift-reduce conflicts, BTW, because of the proper placement of
function_name (function_name COLONCOLON IDENT would give you one, for
instance).

The only thing doing symbol evaluation along the way buys us is better
error messages (well, less painful, better error messages.  We could
still do it without symbol evaluation along the way; it's just a pain
in the ass to track the same info, and we'd likely point to the wrong
thing as the real error).
The tricky part of it is deferring an error message on scoped names
until the very end (i.e., until we've failed all the possibilities), or
something like that.
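
As a rough illustration of what lookup along the way buys for error
messages, here is a sketch that checks each "::"-delimited prefix of a
scoped name and reports where the first lookup fails.  Everything in it
is hypothetical: the table of known names stands in for the real symbol
lookup, and it makes no attempt at the deferral described above (it
gives up at the first failure rather than trying other possibilities).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the symbol table, for illustration only.  */
static const char *const known_names[] = {
    "daniel<int>",
    "daniel<int>::fred",
};

/* Nonzero if the first len characters of name match a known name.  */
static int is_known(const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof known_names / sizeof known_names[0]; i++)
        if (strlen(known_names[i]) == len
            && strncmp(known_names[i], name, len) == 0)
            return 1;
    return 0;
}

/* Look up each "::"-delimited prefix of name in turn.  Return a pointer
   just past the first prefix that fails lookup, or NULL if every prefix
   resolves.  Simplification: a "::" inside template arguments would be
   split on too.  */
const char *first_failed_lookup(const char *name)
{
    const char *p = name;

    for (;;) {
        const char *sep = strstr(p, "::");
        size_t len = sep ? (size_t)(sep - name) : strlen(name);

        if (!is_known(name, len))
            return name + len;   /* the error is in name[0..len) */
        if (sep == NULL)
            return NULL;         /* whole name resolved */
        p = sep + 2;
    }
}
```

For daniel<int int int int int>::fred this stops right after the bogus
template, so the message can point at the template name rather than at
fred.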

We could just do what we effectively do now, defer decoding a function name till
the parse is done, and deal with it then. 


I've got this all flex and bisonified (as you can tell from the above)
already, and it works, if you guys think I should submit it.

--Dan




> 
> In the mean time, however, I think it's more important to recognize
> the template argument brackets at all than to handle template
> arguments that contain < and > operators.
> 
> So with this caveat, I think the change is fine.
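
For reference, the depth tracking discussed at the top of the thread
can be sketched roughly as below.  This is not the patch itself: the
function name is made up, and the quote handling is an assumption about
what a top-level scan would want.  The point is only that '<' and '>'
adjust the nesting depth the same way '(' and ')' do.

```c
#include <stddef.h>

/* Hypothetical sketch of a top-level character search: return a pointer
   to the first occurrence of c in s that is outside quotes and outside
   any (...) or <...> pair, or NULL if there is none.  */
const char *toplevel_char_sketch(const char *s, char c)
{
    char quoted = 0;   /* 0, or the quote character we are inside */
    int depth = 0;     /* unclosed '(' and '<' seen so far */

    for (; *s != '\0'; s++) {
        if (quoted) {
            if (*s == quoted)
                quoted = 0;
            else if (*s == '\\' && s[1] != '\0')
                s++;                     /* skip the escaped character */
        } else if (*s == c && depth == 0) {
            return s;
        } else if (*s == '"' || *s == '\'') {
            quoted = *s;
        } else if (*s == '(' || *s == '<') {
            depth++;
        } else if ((*s == ')' || *s == '>') && depth > 0) {
            depth--;
        }
    }
    return NULL;
}
```

With this, searching "_Rb_tree<int, int>" for a top-level comma finds
nothing, so the string is no longer truncated after the first "int".
A '<' used as an operator inside a template argument still leaves the
depth unbalanced, which is the caveat above.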

-- 
"I lost a button hole today.  Where am I gonna find another one?"
-Steven Wright