* [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) @ 2022-09-04 8:36 Enze Li via Gdb-patches 2022-09-04 8:42 ` Andreas Schwab 2022-09-05 13:57 ` [PATCH v2] " Enze Li via Gdb-patches 0 siblings, 2 replies; 8+ messages in thread From: Enze Li via Gdb-patches @ 2022-09-04 8:36 UTC (permalink / raw) To: gdb-patches; +Cc: enze.li PR cli/24124 points out that `b *804874d` or `b *804874f` gives output `Invalid number "804874d".` or `Invalid number "804874f".` And the output of `b *804874e` is `Breakpoint 1 at 0xc480a`. That is to say, when "e" or "E" appears after a decimal value, it will be incorrectly parsed as a floating point number. Importantly, this parsing is not consistent with the C language. The initial idea was to perform a "0x" or "0X" check of the address. But Tom pointed out that the text after the "*" is an arbitrary expression, not just an integer. Therefore, I realized that this idea was going in the wrong direction. After digging a bit deeper, I found that the root cause of this problem is that the lex_one_token function doesn't check for the case where the exponent has no digits. If we check this, GDB will not continue parsing the invalid numbers. Before this patch applied, things like (gdb) b *804874d Invalid number "804874d". (gdb) b *804874e Breakpoint 1 at 0xc480a (gdb) print 80d Invalid number "80d". (gdb) ptype 80d Invalid number "80d". (gdb) print 80e $1 = 80 (gdb) ptype 80e type = double The new behavior is (gdb) b *804874d Invalid number "804874d". (gdb) b *804874e Invalid number "804874e". (gdb) print 80d Invalid number "80d". (gdb) ptype 80d Invalid number "80d". (gdb) print 80e Invalid number "80e". (gdb) ptype 80e Invalid number "80e". Tested on x86_64-linux. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24124 --- gdb/c-exp.y | 3 ++- gdb/testsuite/gdb.base/commands.exp | 7 +++++++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/gdb/c-exp.y b/gdb/c-exp.y index 61a61fcba09c..b13de967b1b6 100644 --- a/gdb/c-exp.y +++ b/gdb/c-exp.y @@ -2769,7 +2769,8 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) /* This test includes !hex because 'e' is a valid hex digit and thus does not indicate a floating point number when the radix is hex. */ - if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E')) + if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E') + && p[1] >= '0' && p[1] <= '9') got_dot = got_e = 1; else if (!got_e && !got_p && (*p == 'p' || *p == 'P')) got_dot = got_p = 1; diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp index 3eb4463cd1a5..ca30b757e38a 100644 --- a/gdb/testsuite/gdb.base/commands.exp +++ b/gdb/testsuite/gdb.base/commands.exp @@ -278,6 +278,13 @@ proc_with_prefix breakpoint_command_test {} { gdb_test "continue" \ "Breakpoint \[0-9\]*, factorial.*Now the value is 5" gdb_test "print value" " = 5" + + gdb_test "break *804874d" "Invalid number.*" "804874d is an invalid number" + gdb_test "break *804874e" "Invalid number.*" "804874e is an invalid number" + gdb_test "print 80d" "Invalid number.*" "(print) 80d is an invalid number" + gdb_test "ptype 80d" "Invalid number.*" "(ptype) 80d is an invalid number" + gdb_test "print 80e" "Invalid number.*" "(print) 80e is an invalid number" + gdb_test "ptype 80e" "Invalid number.*" "(ptype) 80e is an invalid number" } # Test clearing the commands of several breakpoints with one single "end". -- 2.37.2 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-04 8:36 [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) Enze Li via Gdb-patches @ 2022-09-04 8:42 ` Andreas Schwab 2022-09-04 10:01 ` Enze Li via Gdb-patches 2022-09-05 13:57 ` [PATCH v2] " Enze Li via Gdb-patches 1 sibling, 1 reply; 8+ messages in thread From: Andreas Schwab @ 2022-09-04 8:42 UTC (permalink / raw) To: Enze Li via Gdb-patches; +Cc: enze.li On Sep 04 2022, Enze Li via Gdb-patches wrote: > diff --git a/gdb/c-exp.y b/gdb/c-exp.y > index 61a61fcba09c..b13de967b1b6 100644 > --- a/gdb/c-exp.y > +++ b/gdb/c-exp.y > @@ -2769,7 +2769,8 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) > /* This test includes !hex because 'e' is a valid hex digit > and thus does not indicate a floating point number when > the radix is hex. */ > - if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E')) > + if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E') > + && p[1] >= '0' && p[1] <= '9') 'e' can be followed by a signed number. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-04 8:42 ` Andreas Schwab @ 2022-09-04 10:01 ` Enze Li via Gdb-patches 0 siblings, 0 replies; 8+ messages in thread From: Enze Li via Gdb-patches @ 2022-09-04 10:01 UTC (permalink / raw) To: Andreas Schwab; +Cc: Enze Li via Gdb-patches, enze.li On Sun, Sep 04, 2022 at 10:42:43 AM +0200, Andreas Schwab wrote: > On Sep 04 2022, Enze Li via Gdb-patches wrote: > >> diff --git a/gdb/c-exp.y b/gdb/c-exp.y >> index 61a61fcba09c..b13de967b1b6 100644 >> --- a/gdb/c-exp.y >> +++ b/gdb/c-exp.y >> @@ -2769,7 +2769,8 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) >> /* This test includes !hex because 'e' is a valid hex digit >> and thus does not indicate a floating point number when >> the radix is hex. */ >> - if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E')) >> + if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E') >> + && p[1] >= '0' && p[1] <= '9') > > 'e' can be followed by a signed number. Oops, I didn't metion that. Thank you for reminding me. I'll soon update this patch in current thread. Thanks, Enze ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-04 8:36 [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) Enze Li via Gdb-patches 2022-09-04 8:42 ` Andreas Schwab @ 2022-09-05 13:57 ` Enze Li via Gdb-patches 2022-09-21 18:12 ` Tom Tromey 1 sibling, 1 reply; 8+ messages in thread From: Enze Li via Gdb-patches @ 2022-09-05 13:57 UTC (permalink / raw) To: gdb-patches; +Cc: schwab, enze.li PR cli/24124 points out that `b *804874d` or `b *804874f` gives output `Invalid number "804874d".` or `Invalid number "804874f".` And the output of `b *804874e` is `Breakpoint 1 at 0xc480a`. That is to say, when "e" or "E" appears after a decimal value, it will be incorrectly parsed as a floating point number. Importantly, this parsing is not consistent with the C language. The initial idea was to perform a "0x" or "0X" check of the address. But Tom pointed out that the text after the "*" is an arbitrary expression, not just an integer. Therefore, I realized that this idea was going in the wrong direction. After digging a bit deeper, I found that the root cause of this problem is that the lex_one_token function doesn't check for the case where the exponent has no digits. If we check this, GDB will not continue parsing the invalid numbers. Before this patch applied, things like (gdb) b *804874d Invalid number "804874d". (gdb) b *804874e Breakpoint 1 at 0xc480a (gdb) print 80d Invalid number "80d". (gdb) ptype 80d Invalid number "80d". (gdb) print 80e $1 = 80 (gdb) ptype 80e type = double The new behavior is (gdb) b *804874d Invalid number "804874d". (gdb) b *804874e Invalid number "804874e". (gdb) print 80d Invalid number "80d". (gdb) ptype 80d Invalid number "80d". (gdb) print 80e Invalid number "80e". (gdb) ptype 80e Invalid number "80e". New in v2: - Address Andreas's comments, avoid breaking the resolution of the sign of the exponent. - Add test cases for exponent with sign. Tested on x86_64-linux. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24124 --- gdb/c-exp.y | 10 ++++++++++ gdb/testsuite/gdb.base/commands.exp | 11 +++++++++++ 2 files changed, 21 insertions(+) diff --git a/gdb/c-exp.y b/gdb/c-exp.y index 61a61fcba09c..9827de9ad549 100644 --- a/gdb/c-exp.y +++ b/gdb/c-exp.y @@ -2752,6 +2752,7 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) int got_dot = 0, got_e = 0, got_p = 0, toktype; const char *p = tokstart; int hex = input_radix > 10; + int exp_num = 0; if (c == '0' && (p[1] == 'x' || p[1] == 'X')) { @@ -2783,6 +2784,9 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) /* This is the sign of the exponent, not the end of the number. */ continue; + /* This is the digit of the exponent. */ + else if (got_e && *p >= '0' && *p <= '9') + exp_num++; /* We will take any letters or digits. parse_number will complain if past the radix, or if L or U are not final. */ else if ((*p < '0' || *p > '9') @@ -2790,6 +2794,12 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name) && (*p < 'A' || *p > 'Z'))) break; } + + /* If the exponent has no digits, it must be invalid. There is + no need to continue parsing. */ + if (got_e && exp_num == 0) + error (_("Invalid number \"%s\"."), tokstart); + toktype = parse_number (par_state, tokstart, p - tokstart, got_dot | got_e | got_p, &yylval); if (toktype == ERROR) diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp index 3eb4463cd1a5..10c4c2722901 100644 --- a/gdb/testsuite/gdb.base/commands.exp +++ b/gdb/testsuite/gdb.base/commands.exp @@ -278,6 +278,17 @@ proc_with_prefix breakpoint_command_test {} { gdb_test "continue" \ "Breakpoint \[0-9\]*, factorial.*Now the value is 5" gdb_test "print value" " = 5" + + gdb_test "break *804874d" "Invalid number.*" "804874d is an invalid number" + gdb_test "break *804874e" "Invalid number.*" "804874e is an invalid number" + gdb_test "print 80d" "Invalid number.*" "(print) 80d is an invalid number" + gdb_test "ptype 80d" "Invalid number.*" "(ptype) 80d is an invalid number" + gdb_test "print 80e" "Invalid number.*" "(print) 80e is an invalid number" + gdb_test "ptype 80e" "Invalid number.*" "(ptype) 80e is an invalid number" + gdb_test "print 80e0" " = 80" "(print) 80e0" + gdb_test "print 80e1" " = 800" "(print) 80e1" + gdb_test "print 80e+2" " = 8000" "(print) 80e+2" + gdb_test "ptype 80e-1" " = double" "(ptype) 80e-1" } # Test clearing the commands of several breakpoints with one single "end". -- 2.37.2 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-05 13:57 ` [PATCH v2] " Enze Li via Gdb-patches @ 2022-09-21 18:12 ` Tom Tromey 2022-09-22 14:09 ` Enze Li via Gdb-patches 0 siblings, 1 reply; 8+ messages in thread From: Tom Tromey @ 2022-09-21 18:12 UTC (permalink / raw) To: Enze Li via Gdb-patches; +Cc: schwab, enze.li >>>>> Enze Li via Gdb-patches <gdb-patches@sourceware.org> writes: > PR cli/24124 points out that `b *804874d` or `b *804874f` gives output > `Invalid number "804874d".` or `Invalid number "804874f".` And the > output of `b *804874e` is `Breakpoint 1 at 0xc480a`. Thank you for the patch. > That is to say, when "e" or "E" appears after a decimal value, it will > be incorrectly parsed as a floating point number. Importantly, this > parsing is not consistent with the C language. I suspect the change has to be in c-exp.y:parse_number, in order to support the case where the input radix is 16. In this situation, something like "80e" is a valid number. The input radix seems like a misfeature, IMNSHO, but I think people do actually use it. Tom ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-21 18:12 ` Tom Tromey @ 2022-09-22 14:09 ` Enze Li via Gdb-patches 2022-09-23 13:47 ` Tom Tromey 0 siblings, 1 reply; 8+ messages in thread From: Enze Li via Gdb-patches @ 2022-09-22 14:09 UTC (permalink / raw) To: Tom Tromey; +Cc: schwab, Enze Li via Gdb-patches, enze.li Hi Tom, Thanks for your review. A few questions, please see below. On Wed, Sep 21 2022 at 12:12:45 PM -0600, Tom Tromey wrote: >>>>>> Enze Li via Gdb-patches <gdb-patches@sourceware.org> writes: > >> PR cli/24124 points out that `b *804874d` or `b *804874f` gives output >> `Invalid number "804874d".` or `Invalid number "804874f".` And the >> output of `b *804874e` is `Breakpoint 1 at 0xc480a`. > > Thank you for the patch. > >> That is to say, when "e" or "E" appears after a decimal value, it will >> be incorrectly parsed as a floating point number. Importantly, this >> parsing is not consistent with the C language. > > I suspect the change has to be in c-exp.y:parse_number, in order to > support the case where the input radix is 16. In this situation, ^^ Sorry, I didn't get that. Shouldn't this radix be 10? > something like "80e" is a valid number. If the radix is 16, is that mean that we should treat the "80" as a hex number, and "80e" equals "128e0"(radix is 10)? > > The input radix seems like a misfeature, IMNSHO, but I think people do > actually use it. In addition, do these test cases below meet expectations? ...... + gdb_test "break *804874d" "Invalid number.*" "804874d is an invalid number" + gdb_test "break *804874e" "Invalid number.*" "804874e is an invalid number" + gdb_test "print 80d" "Invalid number.*" "(print) 80d is an invalid number" + gdb_test "ptype 80d" "Invalid number.*" "(ptype) 80d is an invalid number" + gdb_test "print 80e" "Invalid number.*" "(print) 80e is an invalid number" + gdb_test "ptype 80e" "Invalid number.*" "(ptype) 80e is an invalid number" + gdb_test "print 80e0" " = 80" "(print) 80e0" + gdb_test "print 80e1" " = 800" "(print) 80e1" + gdb_test "print 80e+2" " = 8000" "(print) 80e+2" + gdb_test "ptype 80e-1" " = double" "(ptype) 80e-1" ...... Thanks, Enze ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-22 14:09 ` Enze Li via Gdb-patches @ 2022-09-23 13:47 ` Tom Tromey 2022-10-02 12:15 ` Enze Li via Gdb-patches 0 siblings, 1 reply; 8+ messages in thread From: Tom Tromey @ 2022-09-23 13:47 UTC (permalink / raw) To: Enze Li via Gdb-patches; +Cc: Tom Tromey, schwab, enze.li >>>>> "Enze" == Enze Li via Gdb-patches <gdb-patches@sourceware.org> writes: >> I suspect the change has to be in c-exp.y:parse_number, in order to >> support the case where the input radix is 16. In this situation, Enze> ^^ Enze> Sorry, I didn't get that. Shouldn't this radix be 10? No, what I mean is: (gdb) set input-radix 16 Input radix now set to decimal 16, hex 10, octal 20. (gdb) print 80e $1 = 2062 I think with your patch this will issue an error. Enze> In addition, do these test cases below meet expectations? They seem fine to me. Tom ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] gdb: add a numeric check after the exponent (PR cli/24124) 2022-09-23 13:47 ` Tom Tromey @ 2022-10-02 12:15 ` Enze Li via Gdb-patches 0 siblings, 0 replies; 8+ messages in thread From: Enze Li via Gdb-patches @ 2022-10-02 12:15 UTC (permalink / raw) To: Tom Tromey; +Cc: schwab, Enze Li via Gdb-patches, enze.li Hi Tom, Sorry for the delayed response. On Fri, Sep 23 2022 at 07:47:51 AM -0600, Tom Tromey wrote: >>>>>> "Enze" == Enze Li via Gdb-patches <gdb-patches@sourceware.org> writes: > >>> I suspect the change has to be in c-exp.y:parse_number, in order to >>> support the case where the input radix is 16. In this situation, > Enze> ^^ > Enze> Sorry, I didn't get that. Shouldn't this radix be 10? > > No, what I mean is: > > (gdb) set input-radix 16 > Input radix now set to decimal 16, hex 10, octal 20. > (gdb) print 80e > $1 = 2062 > > I think with your patch this will issue an error. I got the same output as you metioned above with v2 of the patch applied. If I set input-radix to 16, the "input_radix"(line 2754) variable is 16, and "hex" will be 1. In this case, GDB assumes that the "80e" is a hex number. None of the following if and else-if conditions are satisfied. When GDB runs to line 2767, "hex" is still 1, this is vital. ...... 2751 /* It's a number. */ 2752 int got_dot = 0, got_e = 0, got_p = 0, toktype; 2753 const char *p = tokstart; 2754 int hex = input_radix > 10; 2755 int exp_num = 0; 2756 2757 if (c == '0' && (p[1] == 'x' || p[1] == 'X')) 2758 { 2759 p += 2; 2760 hex = 1; 2761 } 2762 else if (c == '0' && (p[1]=='t' || p[1]=='T' || p[1]=='d' || p[1]=='D')) 2763 { 2764 p += 2; 2765 hex = 0; 2766 } 2767 ...... Now, GDB runs to line 2773, the condition cannot be satisfied because "hex" is 1, which means that "got_e" cannot be set to 1. Then GDB runs to line 2788, the judgment condition contains "got_e", and "got_e" is now 0, which does not satisfy this condition, so it is not affected for "80e". ...... 2768 for (;; ++p) 2769 { 2770 /* This test includes !hex because 'e' is a valid hex digit 2771 and thus does not indicate a floating point number when 2772 the radix is hex. */ 2773 if (!hex && !got_e && !got_p && (*p == 'e' || *p == 'E')) 2774 got_dot = got_e = 1; 2775 else if (!got_e && !got_p && (*p == 'p' || *p == 'P')) 2776 got_dot = got_p = 1; 2777 /* This test does not include !hex, because a '.' always indicates 2778 a decimal floating point number regardless of the radix. */ 2779 else if (!got_dot && *p == '.') 2780 got_dot = 1; 2781 else if (((got_e && (p[-1] == 'e' || p[-1] == 'E')) 2782 || (got_p && (p[-1] == 'p' || p[-1] == 'P'))) 2783 && (*p == '-' || *p == '+')) 2784 /* This is the sign of the exponent, not the end of the 2785 number. */ 2786 continue; 2787 /* This is the digit of the exponent. */ 2788 else if (got_e && *p >= '0' && *p <= '9') 2789 exp_num++; ...... This is my understanding. Did I miss something? Best Regards, Enze > > Enze> In addition, do these test cases below meet expectations? > > They seem fine to me. > > Tom ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-10-02 12:15 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-09-04 8:36 [PATCH] gdb: add a numeric check after the exponent (PR cli/24124) Enze Li via Gdb-patches 2022-09-04 8:42 ` Andreas Schwab 2022-09-04 10:01 ` Enze Li via Gdb-patches 2022-09-05 13:57 ` [PATCH v2] " Enze Li via Gdb-patches 2022-09-21 18:12 ` Tom Tromey 2022-09-22 14:09 ` Enze Li via Gdb-patches 2022-09-23 13:47 ` Tom Tromey 2022-10-02 12:15 ` Enze Li via Gdb-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox