Mirror of the gdb mailing list
* How I learned to stop worrying and love decode_line_1
From: David Carlton @ 2002-11-01 14:53 UTC
  To: gdb

I've been spending the week getting to know decode_line_1.  One of the
things that I learned was that decode_line_1 thinks that it's its job
to take apart C++ expressions like A::B::C::x, where A and B are
namespaces, C is a class or a namespace, and x is a variable or a
member or something.

This cleared up some issues for me: e.g. why you don't have to put
single quotes around expressions involving static class members but do
have to put single quotes around expressions involving namespaces.
But, frankly, I was surprised that the function claimed to handle
namespaces at all (since, basically, it doesn't currently); moreover,
there's this lovely comment:

  Some versions of the HP ANSI C++ compiler (as also possibly
  other compilers) generate class/function/member names with
  embedded double-colons if they are inside namespaces. To
  handle this, we loop a few times, considering larger and
  larger prefixes of the string as though they were single
  symbols.  So, if the initially supplied string is
  A::B::C::D::foo, we have to look up "A", then "A::B",
  then "A::B::C", then "A::B::C::D", and finally
  "A::B::C::D::foo" as single, monolithic symbols, because
  A, B, C or D may be namespaces.

I'm not sure I understand the context here: did/does HP's compiler
generate names for symbols in namespaces that, after demangling, looked
fundamentally different from, say, the ones GCC generates?  The above sounds
the same as the way GCC's demangled symbol names work; but it hints at
possible compilers which might somehow generate a symbol 'A' if there
is a namespace 'A'.  Are there any such compilers?  I sure can't
imagine where GDB would currently do anything useful with that
information.

So any insight about the context of that comment would be greatly
appreciated.
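
To make the prefix loop in that comment concrete, here is a minimal
sketch in C.  It is not GDB code: nth_prefix is a hypothetical helper
invented for illustration, and a real implementation would presumably
hand each prefix to a symbol lookup rather than just copying it out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of GDB.  Copies into OUT the prefix of
   NAME that ends just before its Nth "::" separator (N counted from 0).
   If NAME contains exactly N separators, the prefix is the whole
   string.  Returns 1 on success, or 0 if NAME has fewer than N
   separators or the prefix does not fit in OUT.  */
static int
nth_prefix (const char *name, int n, char *out, size_t outsize)
{
  const char *p = name;
  int seen = 0;
  size_t len;

  assert (name != NULL && out != NULL && n >= 0);

  for (;;)
    {
      const char *sep = strstr (p, "::");

      if (sep == NULL)
        {
          /* No more separators: only the full name is left.  */
          if (seen != n)
            return 0;
          len = strlen (name);
          break;
        }
      if (seen == n)
        {
          /* Stop just before this separator.  */
          len = (size_t) (sep - name);
          break;
        }
      seen++;
      p = sep + 2;
    }

  if (len >= outsize)
    return 0;
  memcpy (out, name, len);
  out[len] = '\0';
  return 1;
}
```

Calling it with n = 0, 1, 2, 3, 4 on A::B::C::D::foo yields "A",
"A::B", "A::B::C", "A::B::C::D", and then the whole string, which is
the sequence of lookups the comment describes.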

I'm certainly glad I discovered this bit of decode_line_1, at any
rate: it suggests that some code that I was planning to add to
lookup_symbol should really go in decode_line_1.  (The only thing
worse than having one gross hack of a parser is having two gross hacks
of a parser that are both trying to do the same thing.)

For what it's worth, I've got decode_line_1 tamed a bit.  The version
of it that I'll check into my branch later today is only 82 lines
long.  (I'll include it after my signature; of course, it calls lots
of other functions that contain various bits of the current version of
the function.) It turns out that, underlying all that mess, there's
actually a function with perfectly reasonable control flow.  In
particular, whoever put in those gotos originally should be given a
bit of a talking-to: there's absolutely no reason whatsoever for the
function to do any jumps like that.  Mind you, my version of the code
still contains all of the myriad special cases in the current version,
but at least it's now pretty obvious when the function might return,
whether all the different uses of the variables 'p' and 'copy' are
really pointing to the same thing or are only there because somebody
decided to reuse variable names, and so forth.

David Carlton
carlton@math.stanford.edu

struct symtabs_and_lines
decode_line_1 (char **argptr, int funfirstline, struct symtab *default_symtab,
	       int default_line, char ***canonical)
{
  /* This is NULL if there are no parens in argptr, or a pointer to
     the closing parenthesis if there are parens.  */
  char *paren_pointer;
  /* If a file name is specified, this is its symtab.  */
  struct symtab *file_symtab = NULL;
  int is_quoted;
  char *saved_arg = *argptr;

  /* Defaults have defaults.  */

  dl1_initialize_defaults (&default_symtab, &default_line);
  
  /* See if arg is *PC.  */

  if (**argptr == '*')
    return dl1_indirect (argptr);

  /* Set various flags.
   * 'paren_pointer' is important for overload checking, where
   * we allow things like: 
   *     (gdb) break c::f(int)
   */

  dl1_set_flags (*argptr, &is_quoted, &paren_pointer);

  /* Check to see if it's a multipart linespec (with colons or periods).  */
  {
    char *p;
    int is_quote_enclosed;
    
    /* Locate the first half of the linespec, ending in a colon, period,
       or whitespace.  (More or less.)  */

    p = dl1_locate_first_half (argptr, &is_quote_enclosed);

    /* Does it look like there actually were two parts?  */

    if ((p[0] == ':' || p[0] == '.') && paren_pointer == NULL)
      {
	if (is_quoted)
	  *argptr = *argptr + 1;
      
	/* Is it a C++ or Java compound data structure?  */
      
	if (p[0] == '.' || p[1] == ':')
	  return dl1_compound (argptr, funfirstline, canonical,
			       saved_arg, p);

	/* No, the first part is a filename; set file_symtab
	   accordingly.  Also, move argptr past the filename.  */
      
	file_symtab = dl1_handle_filename (argptr, p, is_quote_enclosed);
      }
  }

  /* Check whether arg is all digits (and sign).  */

  if (dl1_is_all_digits (*argptr))
    return dl1_all_digits (argptr, default_symtab, default_line,
			   canonical, file_symtab);

  /* Arg token is not digits => try it as a variable name.
     Find the next token (everything up to end or next whitespace).  */

  /* If it starts with $: may be a legitimate variable or routine name
     (e.g. HP-UX millicode routines such as $$dyncall), or it may
     be a history value, or it may be a convenience variable.  */
  
  if (**argptr == '$')
    return dl1_dollar (argptr, funfirstline, default_symtab, canonical,
		       file_symtab);
  
  /* Look up that token as a variable.
     If file specified, use that file's per-file block to start with.  */

  return dl1_variable (argptr, funfirstline, canonical, is_quoted,
		       paren_pointer, file_symtab);
}
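
The order of checks in the function above can be sketched as a toy
classifier.  This is an illustration only, under stated assumptions:
classify_linespec and the LS_* names are hypothetical and do not exist
in GDB, and the sketch ignores quoting and the period (Java) case.  It
also simplifies paren handling by treating a '(' anywhere in the
argument as disabling the multipart check.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical labels for the branches of decode_line_1; not GDB names.  */
enum linespec_kind
{
  LS_ADDRESS,      /* "*PC" form.  */
  LS_COMPOUND,     /* "A::x" form.  */
  LS_LINE_NUMBER,  /* All digits, with an optional sign.  */
  LS_DOLLAR,       /* Starts with '$'.  */
  LS_SYMBOL        /* Anything else.  */
};

/* Classify ARG in the same order as the checks in decode_line_1.
   Sets *HAS_FILE to 1 if a "file:" prefix was skipped, else to 0.  */
static enum linespec_kind
classify_linespec (const char *arg, int *has_file)
{
  const char *q;

  assert (arg != NULL && has_file != NULL);
  *has_file = 0;

  if (arg[0] == '*')
    return LS_ADDRESS;

  /* Multipart check, skipped when the argument contains parens.  */
  if (strchr (arg, '(') == NULL)
    {
      const char *p = arg + strcspn (arg, ": \t");

      if (p[0] == ':')
        {
          if (p[1] == ':')
            return LS_COMPOUND;
          /* A single colon: treat the first part as a file name
             and keep classifying what follows it.  */
          *has_file = 1;
          arg = p + 1;
        }
    }

  /* All digits, with an optional leading sign?  */
  q = arg;
  if (*q == '-' || *q == '+')
    q++;
  if (isdigit ((unsigned char) *q))
    {
      while (isdigit ((unsigned char) *q))
        q++;
      if (*q == '\0' || *q == ' ' || *q == '\t')
        return LS_LINE_NUMBER;
    }

  if (arg[0] == '$')
    return LS_DOLLAR;

  return LS_SYMBOL;
}
```

With this sketch, "foo.c:12" is classified as a line number with
*has_file set, "A::x" as compound, and "c::f(int)" falls through to
the symbol case because of its parentheses, mirroring the
paren_pointer == NULL test in the real function.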



* Re: How I learned to stop worrying and love decode_line_1
From: Fernando Nasser @ 2002-11-04 17:28 UTC
  To: David Carlton; +Cc: gdb, Jim Blandy, Elena Zannoni

David,

Jim Blandy brought your message to my attention.

Would you be willing to submit a patch with your clean-up?

Regards,
Fernando


-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9



* Re: How I learned to stop worrying and love decode_line_1
From: David Carlton @ 2002-11-04 20:41 UTC
  To: Fernando Nasser; +Cc: gdb, Jim Blandy, Elena Zannoni

On Mon, 04 Nov 2002 20:31:39 -0500, Fernando Nasser <fnasser@redhat.com> said:

> Jim Blandy brought your message to my attention.

> Would you be willing to submit a patch with your clean-up?

Absolutely.  I want to do a little more polishing, but then I'll
submit an RFC, probably around the end of this week or the beginning
of next week.

David Carlton
carlton@math.stanford.edu

