Mirror of the gdb-patches mailing list
From: Jim Blandy <jimb@redhat.com>
To: Neil Booth <neil@daikokuya.demon.co.uk>
Cc: gdb-patches@sources.redhat.com, gcc@gcc.gnu.org
Subject: Re: RFC: C/C++ preprocessor macro support for GDB
Date: Sun, 17 Mar 2002 20:35:00 -0000
Message-ID: <npbsdmxzla.fsf@zwingli.cygnus.com>
In-Reply-To: <20020317101938.GA2636@daikokuya.demon.co.uk>


Neil Booth <neil@daikokuya.demon.co.uk> writes:
> What are the issues with using libcpp?  It would be a good test of its
> viability as an independent library to have it used somewhere else.

I think there are two issues.  Both might simply be my
misunderstanding of the libcpp header files and code I read; I'd love
to be set straight.

- GDB has commands like this:

        (gdb) break *ADDRESS if CONDITION

  This sets a conditional breakpoint at the address computed by
  evaluating the expression ADDRESS, whose condition is CONDITION.
  ADDRESS needs to be evaluated in the current scope --- the currently
  selected frame and its PC --- but CONDITION needs to be evaluated in
  the scope in force at the *breakpoint's* address.  So you can't just
  take the whole command and smoosh it through an expander all at
  once: ADDRESS and CONDITION might have totally different contexts,
  as far as the preprocessor is concerned.

  This means you've got to decide if there's an `if' in the command
  before you can macro-expand things.  Obviously, an `if' in a string,
  or as part of a larger identifier, doesn't count --- you really need
  to work in terms of tokens.

  (There's a similar situation involving commas: sometimes the parser
  is supposed to stop when it finds its first comma outside of any
  parens.)
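
  To make the token-level requirement concrete, here is a minimal
  sketch of the two scans described above.  The names skip_literal,
  find_if_token, and find_top_level_comma are hypothetical; none of
  them is taken from GDB's sources, and a real scanner would have to
  handle more of the C lexical grammar than this one does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: S points at the opening quote of a string or
   character literal.  Return a pointer just past the closing quote,
   honoring backslash escapes.  */
const char *
skip_literal (const char *s)
{
  char quote = *s++;

  while (*s != '\0' && *s != quote)
    {
      if (*s == '\\' && s[1] != '\0')
        s++;
      s++;
    }
  return *s == quote ? s + 1 : s;
}

/* Hypothetical helper: return a pointer to the first `if' in S that
   stands alone as a token, or NULL if there is none.  An `if' inside
   a literal, or inside a longer identifier such as `diff', is
   ignored.  */
const char *
find_if_token (const char *s)
{
  assert (s != NULL);
  while (*s != '\0')
    {
      if (*s == '"' || *s == '\'')
        s = skip_literal (s);
      else if (isalnum ((unsigned char) *s) || *s == '_')
        {
          /* Scan one whole identifier or number.  */
          const char *start = s;

          while (isalnum ((unsigned char) *s) || *s == '_')
            s++;
          if (s - start == 2 && start[0] == 'i' && start[1] == 'f')
            return start;
        }
      else
        s++;
    }
  return NULL;
}

/* Hypothetical helper: return a pointer to the first comma in S that
   is outside all parentheses and literals, or NULL if there is none.  */
const char *
find_top_level_comma (const char *s)
{
  int depth = 0;

  assert (s != NULL);
  while (*s != '\0')
    {
      if (*s == '"' || *s == '\'')
        {
          s = skip_literal (s);
          continue;
        }
      if (*s == '(')
        depth++;
      else if (*s == ')' && depth > 0)
        depth--;
      else if (*s == ',' && depth == 0)
        return s;
      s++;
    }
  return NULL;
}
```

  Neither scan knows anything about macros, which is the catch: the
  text on either side of the `if' or comma still has to be expanded
  in its own context afterwards.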

  So my macro expander has the following function in its public
  interface:

    /* If the null-terminated string pointed to by *LEXPTR begins with a
       macro invocation, return the result of expanding that invocation as
       a null-terminated string, and set *LEXPTR to the next character
       after the invocation.  The result is completely expanded; it
       contains no further macro invocations.

       Otherwise, if *LEXPTR does not start with a macro invocation,
       return zero, and leave *LEXPTR unchanged.

       Use LOOKUP_FUNC and LOOKUP_BATON to find macro definitions.

       If this function returns a string, the caller is responsible for
       freeing it, using xfree.

       We need this expand-one-token-at-a-time interface in order to
       accommodate GDB's C expression parser, which may not consume the
       entire string.  When the user enters a command like

          (gdb) break *func+20 if x == 5

       the parser is expected to consume `func+20', and then stop when it
       sees the "if".  But of course, "if" appearing in a character string
       or as part of a larger identifier doesn't count.  So you pretty
       much have to do tokenization to find the end of the string that
       needs to be macro-expanded.  Our C/C++ tokenizer isn't really
       designed to be called by anything but the yacc parser engine.  */
    char *macro_expand_next (char **lexptr,
                             macro_lookup_ftype *lookup_func,
                             void *lookup_baton);

  I changed GDB's lexer to call macro_expand_next before carving out
  each token.  This means we don't have to worry about commas or `if's
  in macro invocations being confused with a terminating comma or
  `if': the expander consumes them before we ever see them.

  As far as I can tell, libcpp doesn't provide an analogous
  token-by-token entry point.
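
  The calling convention can be sketched with a toy stand-in for
  macro_expand_next.  Everything here is an assumption made for
  illustration: the two-field struct macro_definition, toy_lookup,
  token_len, and expand_until_if are hypothetical, malloc and free
  stand in for xmalloc and xfree, and the toy handles only object-like
  macros and does not rescan the replacement text, so it does not meet
  the "completely expanded" guarantee documented above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; GDB's real struct macro_definition differs.  */
struct macro_definition { const char *name; const char *replacement; };

typedef struct macro_definition *(macro_lookup_ftype) (const char *name,
                                                       void *baton);

/* Toy lookup: BATON is an array of definitions ending with a NULL name.  */
struct macro_definition *
toy_lookup (const char *name, void *baton)
{
  struct macro_definition *d;

  for (d = baton; d->name != NULL; d++)
    if (strcmp (d->name, name) == 0)
      return d;
  return NULL;
}

/* Length of the token at S: a literal, an identifier or number, or a
   single other character.  S must not point at the terminating NUL.  */
size_t
token_len (const char *s)
{
  size_t n = 1;

  if (*s == '"' || *s == '\'')
    {
      while (s[n] != '\0' && s[n] != s[0])
        n += (s[n] == '\\' && s[n + 1] != '\0') ? 2 : 1;
      return s[n] == s[0] ? n + 1 : n;
    }
  if (isalnum ((unsigned char) *s) || *s == '_')
    while (isalnum ((unsigned char) s[n]) || s[n] == '_')
      n++;
  return n;
}

/* Toy macro_expand_next: if *LEXPTR starts with the name of a macro,
   return a malloc'd copy of its replacement and advance *LEXPTR past
   the name; otherwise return NULL and leave *LEXPTR unchanged.  */
char *
macro_expand_next (char **lexptr, macro_lookup_ftype *lookup_func,
                   void *lookup_baton)
{
  char name[64];
  size_t n;
  struct macro_definition *def;
  char *result;

  if (!isalpha ((unsigned char) **lexptr) && **lexptr != '_')
    return NULL;
  n = token_len (*lexptr);
  if (n >= sizeof name)
    return NULL;
  memcpy (name, *lexptr, n);
  name[n] = '\0';
  def = lookup_func (name, lookup_baton);
  if (def == NULL)
    return NULL;
  result = malloc (strlen (def->replacement) + 1);
  if (result == NULL)
    abort ();
  strcpy (result, def->replacement);
  *lexptr += n;
  return result;
}

/* Copy the expression at INPUT into OUT (capacity OUT_SIZE), trying
   the expander before each token, and stop at a free-standing `if'.
   Return the position where scanning stopped.  */
char *
expand_until_if (char *input, char *out, size_t out_size,
                 macro_lookup_ftype *lookup_func, void *baton)
{
  size_t used = 0;

  out[0] = '\0';
  while (*input != '\0')
    {
      size_t n = token_len (input);
      const char *text = input;
      char *expansion;

      if (n == 2 && memcmp (input, "if", 2) == 0)
        break;
      expansion = macro_expand_next (&input, lookup_func, baton);
      if (expansion != NULL)
        {
          text = expansion;
          n = strlen (expansion);
        }
      else
        input += n;             /* Not a macro: copy the token as is.  */
      assert (used + n < out_size);
      memcpy (out + used, text, n);
      used += n;
      out[used] = '\0';
      free (expansion);         /* free (NULL) is a no-op.  */
    }
  return input;
}
```

  With LIMIT defined as `(base + 20)', running expand_until_if over
  `func+LIMIT if x == 5' leaves `func+(base + 20) ' in OUT and returns
  a pointer to the `if', so the condition is still unexpanded and can
  be handled later with a different baton.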

  Another way to deal with this would be to lex the command string
  twice: once to find the `if' or comma, and then again to do the real
  parsing, after macro-expanding each of the various expressions
  properly.  The only difficulty here is that GDB's lexer expects to
  be called by a yacc-style parsing engine; it deposits tokens'
  semantic values in yylval, etc.  To work around this, we'd need to
  make the lexer independent of yacc --- give it some other way to
  return semantic values, mostly --- and hook that into both yacc and
  the code looking for `if's and commas.  But that approach wouldn't
  require any change to libcpp's interface.

  There's nothing too hard there.  But I wanted to put together a
  patch which actually worked, while disturbing the existing GDB code
  as little as possible.  And I think there's something unsatisfying
  about the two-pass approach; parsers ought to be able to leave input
  unconsumed if they want.  It's a common enough idiom.  Shouldn't
  libcpp support it?

- GDB's macro data structures record all the macros that were ever
  #defined in a compilation unit, and the line numbers at which they
  were in force.  Given a name and an #inclusion and a line number (or
  in libcpp's terminology, a logical line number?), it can find the
  #definition in scope at that point.

  This is a bit different from libcpp's data structures, which only
  record the macros currently in force as libcpp makes a pass through
  the file's text.  (At least, that's the impression I got.)

  My macro expander is completely ignorant of the lookup table's
  structure; you pass it a function and a data pointer that it uses
  blindly for lookups.  Here's the relevant typedef, and one of the
  prototypes, from the expander's public interface:

    /* A function for looking up preprocessor macro definitions.  Return
       the preprocessor definition of NAME in scope according to BATON, or
       zero if NAME is not defined as a preprocessor macro.

       The caller must not free or modify the definition returned.  It is
       probably unwise for the caller to hold pointers to it for very
       long; it probably lives in some objfile's obstacks.  */
    typedef struct macro_definition *(macro_lookup_ftype) (const char *name,
                                                           void *baton);

    /* Expand any preprocessor macros in SOURCE, and return the expanded
       text.  Use LOOKUP_FUNC and LOOKUP_FUNC_BATON to find identifiers'
       preprocessor definitions.  SOURCE is a null-terminated string.  The
       result is a null-terminated string, allocated using xmalloc; it is
       the caller's responsibility to free it.  */
    char *macro_expand (const char *source,
                        macro_lookup_ftype *lookup_func,
                        void *lookup_func_baton);

  When expanding an expression, GDB packages up the #inclusion and
  line number in the baton argument, and provides a lookup_func that
  takes those together with the macro name to search the macro table.
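
  A rough sketch of that arrangement follows.  The layouts are
  assumptions, not GDB's: struct macro_definition is given
  hypothetical file and line-range fields, struct macro_scope is a
  hypothetical baton, and scoped_lookup is a hypothetical lookup_func.
  It also ignores the fact that definitions made in an #included
  header are visible in the includer, which a real table must handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; GDB's real struct macro_definition differs.  */
struct macro_definition
{
  const char *name;
  const char *file;        /* File the definition appears in.  */
  int first_line;          /* Line of the #define.  */
  int last_line;           /* Last line on which it is in force,
                              or 0 if it lasts to the end.  */
  const char *replacement;
};

typedef struct macro_definition *(macro_lookup_ftype) (const char *name,
                                                       void *baton);

/* Hypothetical baton: the table to search plus the scope, given as a
   file name and line number, in which names should be resolved.  */
struct macro_scope
{
  struct macro_definition *table;
  size_t count;
  const char *file;
  int line;
};

/* A lookup function of type macro_lookup_ftype.  Return the
   definition of NAME in force at the file and line recorded in
   BATON, or NULL if NAME is not defined there.  A definition takes
   effect on the line after its #define.  */
struct macro_definition *
scoped_lookup (const char *name, void *baton)
{
  struct macro_scope *scope = baton;
  size_t i;

  assert (scope != NULL);
  for (i = 0; i < scope->count; i++)
    {
      struct macro_definition *d = &scope->table[i];

      if (strcmp (d->name, name) == 0
          && strcmp (d->file, scope->file) == 0
          && d->first_line < scope->line
          && (d->last_line == 0 || scope->line <= d->last_line))
        return d;
    }
  return NULL;
}
```

  Because the expander only ever sees the function pointer and the
  baton, the same expander code can resolve ADDRESS and CONDITION in
  different scopes simply by being handed two different batons.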


