[PATCH 00/18] Refactor character printing

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [PATCH 00/18] Refactor character printing
@ 2022-02-16 13:55 Tom Tromey
  2022-02-16 13:55 ` [PATCH 01/18] Fix latent quote char bug in generic_printstr Tom Tromey
                   ` (17 more replies)
  0 siblings, 18 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches

Right now, each language has to implement two character output methods
-- printchar and emitchar.  Furthermore, the generic string printing
function does not use either of these methods, so language
implementors must choose between copying the string-printing code
(which is pretty big) to change the escaping style, or just always
using C style, no matter what the language in question actually does.

This patch simplifies this situation.  emitchar is removed, and the
generic string- and character-printing code is augmented to let
languages override some of the printing choices.  Then, all languages
are switched to the new scheme.

Regression tested on x86-64 Fedora 34.
Let me know what you think.

Tom



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/18] Fix latent quote char bug in generic_printstr
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 15:38   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 02/18] Boolify need_escape in generic_emit_char Tom Tromey
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

generic_printstr prints an empty string like:

      fputs_filtered ("\"\"", stream);

However, this seems wrong to me if the quote character is something
other than double quote.  This patch fixes this latent bug, though I'm
not sure how to test it -- I don't know Fortran and in my experiment I
was unable to make a zero-length Fortran string.
---
 gdb/valprint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index d6ec64845f4..e4d68381189 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2632,7 +2632,7 @@ generic_printstr (struct ui_file *stream, struct type *type,
 
   if (length == 0)
     {
-      fputs_filtered ("\"\"", stream);
+      fprintf_filtered (stream, "%c%c", quote_char, quote_char);
       return;
     }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 02/18] Boolify need_escape in generic_emit_char
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
  2022-02-16 13:55 ` [PATCH 01/18] Fix latent quote char bug in generic_printstr Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 15:39   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 03/18] Remove c_emit_char Tom Tromey
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This changes 'need_escape' in generic_emit_char to be of type bool,
rather than int.
---
 gdb/valprint.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index e4d68381189..e758c1d1066 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2191,11 +2191,11 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	     int orig_len, int width,
 	     enum bfd_endian byte_order,
 	     struct obstack *output,
-	     int quoter, int *need_escapep)
+	     int quoter, bool *need_escapep)
 {
-  int need_escape = *need_escapep;
+  bool need_escape = *need_escapep;
 
-  *need_escapep = 0;
+  *need_escapep = false;
 
   /* iswprint implementation on Windows returns 1 for tab character.
      In order to avoid different printout on this host, we explicitly
@@ -2265,7 +2265,7 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 		  ++i;
 		}
 
-	      *need_escapep = 1;
+	      *need_escapep = true;
 	    }
 	  break;
 	}
@@ -2283,7 +2283,7 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
   enum bfd_endian byte_order
     = type_byte_order (type);
   gdb_byte *c_buf;
-  int need_escape = 0;
+  bool need_escape = false;
 
   c_buf = (gdb_byte *) alloca (TYPE_LENGTH (type));
   pack_long (c_buf, type, c);
@@ -2448,7 +2448,7 @@ print_converted_chars_to_obstack (struct obstack *obstack,
   const converted_character *elem;
   enum {START, SINGLE, REPEAT, INCOMPLETE, FINISH} state, last;
   gdb_wchar_t wide_quote_char = gdb_btowc (quote_char);
-  int need_escape = 0;
+  bool need_escape = false;
 
   /* Set the start state.  */
   idx = 0;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 03/18] Remove c_emit_char
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
  2022-02-16 13:55 ` [PATCH 01/18] Fix latent quote char bug in generic_printstr Tom Tromey
  2022-02-16 13:55 ` [PATCH 02/18] Boolify need_escape in generic_emit_char Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 15:40   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 04/18] Remove c_printstr Tom Tromey
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This renames c_emit_char, removing a layer of indirection.
---
 gdb/c-lang.c   | 4 ++--
 gdb/c-lang.h   | 3 ---
 gdb/language.c | 9 ---------
 3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 1f7cac7bef1..342109c94ef 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -144,8 +144,8 @@ classify_type (struct type *elttype, struct gdbarch *gdbarch,
    for printing characters and strings is language specific.  */
 
 void
-c_emit_char (int c, struct type *type,
-	     struct ui_file *stream, int quoter)
+language_defn::emitchar (int c, struct type *type,
+			 struct ui_file *stream, int quoter) const
 {
   const char *encoding;
 
diff --git a/gdb/c-lang.h b/gdb/c-lang.h
index 46e562df055..16c9b116393 100644
--- a/gdb/c-lang.h
+++ b/gdb/c-lang.h
@@ -103,9 +103,6 @@ extern void c_printstr (struct ui_file * stream,
 extern void c_language_arch_info (struct gdbarch *gdbarch,
 				  struct language_arch_info *lai);
 
-extern void c_emit_char (int c, struct type *type,
-			 struct ui_file *stream, int quoter);
-
 /* These are in c-typeprint.c: */
 
 extern void c_type_print_base (struct type *, struct ui_file *,
diff --git a/gdb/language.c b/gdb/language.c
index 69c73b0318e..931abcd3076 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -635,15 +635,6 @@ language_defn::value_print_inner
 
 /* See language.h.  */
 
-void
-language_defn::emitchar (int ch, struct type *chtype,
-			 struct ui_file * stream, int quoter) const
-{
-  c_emit_char (ch, chtype, stream, quoter);
-}
-
-/* See language.h.  */
-
 void
 language_defn::printstr (struct ui_file *stream, struct type *elttype,
 			 const gdb_byte *string, unsigned int length,
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 04/18] Remove c_printstr
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (2 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 03/18] Remove c_emit_char Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 15:46   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 05/18] Don't use wchar_printable in print_wchar Tom Tromey
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This renames c_printstr, removing a layer of indirection.
---
 gdb/c-lang.c    |  8 ++++----
 gdb/c-lang.h    |  8 --------
 gdb/language.c  | 12 ------------
 gdb/rust-lang.c |  5 +++--
 4 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 342109c94ef..fbbecb696b9 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -190,10 +190,10 @@ language_defn::printchar (int c, struct type *type,
    characters, or if FORCE_ELLIPSES.  */
 
 void
-c_printstr (struct ui_file *stream, struct type *type, 
-	    const gdb_byte *string, unsigned int length, 
-	    const char *user_encoding, int force_ellipses,
-	    const struct value_print_options *options)
+language_defn::printstr (struct ui_file *stream, struct type *type, 
+			 const gdb_byte *string, unsigned int length, 
+			 const char *user_encoding, int force_ellipses,
+			 const struct value_print_options *options) const
 {
   c_string_type str_type;
   const char *type_encoding;
diff --git a/gdb/c-lang.h b/gdb/c-lang.h
index 16c9b116393..5441bfe10c7 100644
--- a/gdb/c-lang.h
+++ b/gdb/c-lang.h
@@ -92,14 +92,6 @@ extern void c_value_print (struct value *, struct ui_file *,
 
 extern void c_printchar (int, struct type *, struct ui_file *);
 
-extern void c_printstr (struct ui_file * stream,
-			struct type *elttype,
-			const gdb_byte *string,
-			unsigned int length,
-			const char *user_encoding,
-			int force_ellipses,
-			const struct value_print_options *options);
-
 extern void c_language_arch_info (struct gdbarch *gdbarch,
 				  struct language_arch_info *lai);
 
diff --git a/gdb/language.c b/gdb/language.c
index 931abcd3076..20b6d8ccf9b 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -635,18 +635,6 @@ language_defn::value_print_inner
 
 /* See language.h.  */
 
-void
-language_defn::printstr (struct ui_file *stream, struct type *elttype,
-			 const gdb_byte *string, unsigned int length,
-			 const char *encoding, int force_ellipses,
-			 const struct value_print_options *options) const
-{
-  c_printstr (stream, elttype, string, length, encoding, force_ellipses,
-	      options);
-}
-
-/* See language.h.  */
-
 void
 language_defn::print_typedef (struct type *type, struct symbol *new_symbol,
 			      struct ui_file *stream) const
diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c
index 7584d2572fa..3668d2d0c6d 100644
--- a/gdb/rust-lang.c
+++ b/gdb/rust-lang.c
@@ -283,8 +283,9 @@ rust_language::printstr (struct ui_file *stream, struct type *type,
 	{
 	  /* This is probably some C string, so let's let C deal with
 	     it.  */
-	  c_printstr (stream, type, string, length, user_encoding,
-		      force_ellipses, options);
+	  this->language_defn::printstr (stream, type, string, length,
+					 user_encoding, force_ellipses,
+					 options);
 	  return;
 	}
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 05/18] Don't use wchar_printable in print_wchar
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (3 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 04/18] Remove c_printstr Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 15:52   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 06/18] Fix a latent bug " Tom Tromey
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

print_wchar uses wchar_printable, but this isn't needed -- all the
relevant cases are already handled by the 'switch'.  This changes the
code to use gdb_iswprint, and removes a somewhat confusing comment
related to this code.
---
 gdb/valprint.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index e758c1d1066..17ad46c87b5 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2197,9 +2197,6 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 
   *need_escapep = false;
 
-  /* iswprint implementation on Windows returns 1 for tab character.
-     In order to avoid different printout on this host, we explicitly
-     use wchar_printable function.  */
   switch (w)
     {
       case LCST ('\a'):
@@ -2225,9 +2222,9 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	break;
       default:
 	{
-	  if (wchar_printable (w) && (!need_escape || (!gdb_iswdigit (w)
-						       && w != LCST ('8')
-						       && w != LCST ('9'))))
+	  if (gdb_iswprint (w) && (!need_escape || (!gdb_iswdigit (w)
+						    && w != LCST ('8')
+						    && w != LCST ('9'))))
 	    {
 	      gdb_wchar_t wchar = w;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 06/18] Fix a latent bug in print_wchar
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (4 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 05/18] Don't use wchar_printable in print_wchar Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 16:02   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 07/18] Remove language_defn::emitchar Tom Tromey
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

print_wchar keeps track of when escape sequences are emitted, to force
an escape sequence if needed by a subsequent character.  For example
for the string concatenation "\0" "1", gdb will print "\000\061" --
because printing "\0001" might be confusing.

However, this code has a logic error.  It's intended to reject '8' and
'9', as they are not octal digits, but it fails to do so.  This patch
fixes the bug, and slightly rewrites the expression to be a bit more
clear.

I suspect that, actually, this code can be removed entirely, because
octal sequences are limited to 3 digits.  However, I'm not certain, so
I've left it in place.
---
 gdb/testsuite/gdb.base/charset.exp | 4 ++++
 gdb/valprint.c                     | 7 ++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/gdb/testsuite/gdb.base/charset.exp b/gdb/testsuite/gdb.base/charset.exp
index 5df2ec1a8de..97d49b9c461 100644
--- a/gdb/testsuite/gdb.base/charset.exp
+++ b/gdb/testsuite/gdb.base/charset.exp
@@ -503,6 +503,10 @@ gdb_test "print '\\9'" " = \[0-9\]+ '9'"
 # An octal escape can only be 3 digits.
 gdb_test "print \"\\1011\"" " = \"A1\""
 
+# Neither 8 nor 9 need escaping in this situation.
+gdb_test "print \"\\0\" \"8\"" " = \"\\\\0008\""
+gdb_test "print \"\\0\" \"9\"" " = \"\\\\0009\""
+
 # Tests for wide- or unicode- strings.  L is the prefix letter to use,
 # either "L" (for wide strings), "u" (for UTF-16), or "U" (for UTF-32).
 # NAME is used in the test names and should be related to the prefix
diff --git a/gdb/valprint.c b/gdb/valprint.c
index 17ad46c87b5..7e67cb953f4 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2222,9 +2222,10 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	break;
       default:
 	{
-	  if (gdb_iswprint (w) && (!need_escape || (!gdb_iswdigit (w)
-						    && w != LCST ('8')
-						    && w != LCST ('9'))))
+	  if (gdb_iswprint (w)
+	      && !(need_escape && (gdb_iswdigit (w)
+				   && w != LCST ('8')
+				   && w != LCST ('9'))))
 	    {
 	      gdb_wchar_t wchar = w;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 07/18] Remove language_defn::emitchar
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (5 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 06/18] Fix a latent bug " Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 16:12   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 08/18] Add gdb_iswcntrl Tom Tromey
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Nothing outside of the specific language implementations ever calls
language_defn::emitchar.  This patch removes this method and updates
the rest of the code.  In some spots, the method is entirely removed;
in others, just the 'override' is removed.
---
 gdb/ada-lang.c  |  8 --------
 gdb/c-lang.c    | 19 +++----------------
 gdb/f-lang.h    | 12 ++----------
 gdb/language.c  |  9 ---------
 gdb/language.h  |  6 ------
 gdb/m2-lang.h   |  2 +-
 gdb/p-lang.h    |  2 +-
 gdb/rust-lang.c | 10 ++++++----
 gdb/rust-lang.h | 12 +-----------
 9 files changed, 14 insertions(+), 66 deletions(-)

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index d2f620cbb04..e3bd1fa7c17 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -13317,14 +13317,6 @@ class ada_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override
-  {
-    ada_emit_char (ch, chtype, stream, quoter, 1);
-  }
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index fbbecb696b9..1a6e6969a84 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -139,20 +139,6 @@ classify_type (struct type *elttype, struct gdbarch *gdbarch,
   return result;
 }
 
-/* Print the character C on STREAM as part of the contents of a
-   literal string whose delimiter is QUOTER.  Note that that format
-   for printing characters and strings is language specific.  */
-
-void
-language_defn::emitchar (int c, struct type *type,
-			 struct ui_file *stream, int quoter) const
-{
-  const char *encoding;
-
-  classify_type (type, type->arch (), &encoding);
-  generic_emit_char (c, type, stream, quoter, encoding);
-}
-
 /* See language.h.  */
 
 void
@@ -160,8 +146,9 @@ language_defn::printchar (int c, struct type *type,
 			  struct ui_file * stream) const
 {
   c_string_type str_type;
+  const char *encoding;
 
-  str_type = classify_type (type, type->arch (), NULL);
+  str_type = classify_type (type, type->arch (), &encoding);
   switch (str_type)
     {
     case C_CHAR:
@@ -178,7 +165,7 @@ language_defn::printchar (int c, struct type *type,
     }
 
   fputc_filtered ('\'', stream);
-  emitchar (c, type, stream, '\'');
+  generic_emit_char (c, type, stream, '\'', encoding);
   fputc_filtered ('\'', stream);
 }
 
diff --git a/gdb/f-lang.h b/gdb/f-lang.h
index 11debd5569f..388c832dcdb 100644
--- a/gdb/f-lang.h
+++ b/gdb/f-lang.h
@@ -149,20 +149,12 @@ class f_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override
-  {
-    const char *encoding = get_encoding (chtype);
-    generic_emit_char (ch, chtype, stream, quoter, encoding);
-  }
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
     fputs_filtered ("'", stream);
-    emitchar (ch, chtype, stream, '\'');
+    const char *encoding = get_encoding (chtype);
+    generic_emit_char (ch, chtype, stream, '\'', encoding);
     fputs_filtered ("'", stream);
   }
 
diff --git a/gdb/language.c b/gdb/language.c
index 20b6d8ccf9b..dff9f8bdaf9 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -786,15 +786,6 @@ class auto_or_unknown_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override
-  {
-    error (_("emit character not implemented for language \"%s\""),
-	   natural_name ());
-  }
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
diff --git a/gdb/language.h b/gdb/language.h
index f2885000259..2820eca0050 100644
--- a/gdb/language.h
+++ b/gdb/language.h
@@ -524,12 +524,6 @@ struct language_defn
 
   virtual int parser (struct parser_state *ps) const;
 
-  /* Print the character CH (of type CHTYPE) on STREAM as part of the
-     contents of a literal string whose delimiter is QUOTER.  */
-
-  virtual void emitchar (int ch, struct type *chtype,
-			 struct ui_file *stream, int quoter) const;
-
   virtual void printchar (int ch, struct type *chtype,
 			  struct ui_file * stream) const;
 
diff --git a/gdb/m2-lang.h b/gdb/m2-lang.h
index 86a093e5f1b..ee2fc6fb8bc 100644
--- a/gdb/m2-lang.h
+++ b/gdb/m2-lang.h
@@ -93,7 +93,7 @@ class m2_language : public language_defn
   /* See language.h.  */
 
   void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override;
+		 struct ui_file *stream, int quoter) const;
 
   /* See language.h.  */
 
diff --git a/gdb/p-lang.h b/gdb/p-lang.h
index 429ef23aea4..b831bde0af6 100644
--- a/gdb/p-lang.h
+++ b/gdb/p-lang.h
@@ -110,7 +110,7 @@ class pascal_language : public language_defn
   /* See language.h.  */
 
   void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override
+		 struct ui_file *stream, int quoter) const
   {
     int in_quotes = 0;
 
diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c
index 3668d2d0c6d..41e973797d2 100644
--- a/gdb/rust-lang.c
+++ b/gdb/rust-lang.c
@@ -1592,13 +1592,14 @@ rust_language::print_type (struct type *type, const char *varstring,
 /* See language.h.  */
 
 void
-rust_language::emitchar (int ch, struct type *chtype,
-			 struct ui_file *stream, int quoter) const
+rust_language::printchar (int ch, struct type *chtype,
+			  struct ui_file *stream) const
 {
+  fputs_filtered ("'", stream);
   if (!rust_chartype_p (chtype))
-    generic_emit_char (ch, chtype, stream, quoter,
+    generic_emit_char (ch, chtype, stream, '\'',
 		       target_charset (chtype->arch ()));
-  else if (ch == '\\' || ch == quoter)
+  else if (ch == '\\')
     fprintf_filtered (stream, "\\%c", ch);
   else if (ch == '\n')
     fputs_filtered ("\\n", stream);
@@ -1614,6 +1615,7 @@ rust_language::emitchar (int ch, struct type *chtype,
     fprintf_filtered (stream, "\\x%02x", ch);
   else
     fprintf_filtered (stream, "\\u{%06x}", ch);
+  fputs_filtered ("'", stream);
 }
 
 /* See language.h.  */
diff --git a/gdb/rust-lang.h b/gdb/rust-lang.h
index 60a1967ff45..1db83fe75ae 100644
--- a/gdb/rust-lang.h
+++ b/gdb/rust-lang.h
@@ -170,18 +170,8 @@ class rust_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const override;
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
-		  struct ui_file *stream) const override
-  {
-    fputs_filtered ("'", stream);
-    emitchar (ch, chtype, stream, '\'');
-    fputs_filtered ("'", stream);
-  }
+		  struct ui_file *stream) const override;
 
   /* See language.h.  */
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 08/18] Add gdb_iswcntrl
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (6 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 07/18] Remove language_defn::emitchar Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 16:13   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 09/18] Include \0 in printable wide characters Tom Tromey
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

A future patch will need 'iswcntrl', so introduce the appropriate
wrappers in gdb_wchar.h.
---
 gdb/gdb_wchar.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gdb/gdb_wchar.h b/gdb/gdb_wchar.h
index ba5baf3a2a0..78e880c8f18 100644
--- a/gdb/gdb_wchar.h
+++ b/gdb/gdb_wchar.h
@@ -67,6 +67,7 @@ typedef wint_t gdb_wint_t;
 #define gdb_wcslen wcslen
 #define gdb_iswprint iswprint
 #define gdb_iswdigit iswdigit
+#define gdb_iswcntrl iswcntrl
 #define gdb_btowc btowc
 #define gdb_WEOF WEOF
 
@@ -104,6 +105,7 @@ typedef int gdb_wint_t;
 #define gdb_wcslen strlen
 #define gdb_iswprint isprint
 #define gdb_iswdigit isdigit
+#define gdb_iswcntrl iscntrl
 #define gdb_btowc /* empty */
 #define gdb_WEOF EOF
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 09/18] Include \0 in printable wide characters
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (7 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 08/18] Add gdb_iswcntrl Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 17:19   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 10/18] Use a ui_file in print_wchar Tom Tromey
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

print_wchar can always display \0, so include it in the list of
"printable" characters in wchar_printable.  This is useful in a coming
patch to change Rust to use the generic character-printing code.
---
 gdb/valprint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index 7e67cb953f4..dec1dce9d3c 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2160,7 +2160,7 @@ wchar_printable (gdb_wchar_t w)
 	  || w == LCST ('\a') || w == LCST ('\b')
 	  || w == LCST ('\f') || w == LCST ('\n')
 	  || w == LCST ('\r') || w == LCST ('\t')
-	  || w == LCST ('\v'));
+	  || w == LCST ('\v') || w == LCST ('\0'));
 }
 
 /* A helper function that converts the contents of STRING to wide
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 10/18] Use a ui_file in print_wchar
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (8 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 09/18] Include \0 in printable wide characters Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 17:25   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char Tom Tromey
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This introduces a new ui_file subclass, that turns a host string into
a wide string and appends it to an obstack.  print_wchar is rewritten
to use this new ui_file.  This will be more useful in a later patch.
---
 gdb/valprint.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index dec1dce9d3c..00c0cd2c72a 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2163,19 +2163,29 @@ wchar_printable (gdb_wchar_t w)
 	  || w == LCST ('\v') || w == LCST ('\0'));
 }
 
-/* A helper function that converts the contents of STRING to wide
-   characters and then appends them to OUTPUT.  */
-
-static void
-append_string_as_wide (const char *string,
-		       struct obstack *output)
-{
-  for (; *string; ++string)
-    {
-      gdb_wchar_t w = gdb_btowc (*string);
-      obstack_grow (output, &w, sizeof (gdb_wchar_t));
-    }
-}
+/* A ui_file that writes wide characters to an obstack.  */
+class obstack_wide_file : public ui_file
+{
+public:
+  explicit obstack_wide_file (struct obstack *output)
+    : m_output (output)
+  {
+  }
+
+  ~obstack_wide_file () = default;
+
+  void write (const char *buf, long length_buf) override
+  {
+    for (long i = 0; i < length_buf; ++i)
+      {
+	gdb_wchar_t w = gdb_btowc (buf[i]);
+	obstack_grow (m_output, &w, sizeof (gdb_wchar_t));
+      }
+  }
+
+private:
+  struct obstack *m_output;
+};
 
 /* Print a wide character W to OUTPUT.  ORIG is a pointer to the
    original (target) bytes representing the character, ORIG_LEN is the
@@ -2236,10 +2246,10 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	  else
 	    {
 	      int i;
+	      obstack_wide_file file (output);
 
 	      for (i = 0; i + width <= orig_len; i += width)
 		{
-		  char octal[30];
 		  ULONGEST value;
 
 		  value = extract_unsigned_integer (&orig[i], width,
@@ -2247,19 +2257,14 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 		  /* If the value fits in 3 octal digits, print it that
 		     way.  Otherwise, print it as a hex escape.  */
 		  if (value <= 0777)
-		    xsnprintf (octal, sizeof (octal), "\\%.3o",
-			       (int) (value & 0777));
+		    fprintf_filtered (&file, "\\%.3o", (int) (value & 0777));
 		  else
-		    xsnprintf (octal, sizeof (octal), "\\x%lx", (long) value);
-		  append_string_as_wide (octal, output);
+		    fprintf_filtered (&file, "\\x%lx", (long) value);
 		}
 	      /* If we somehow have extra bytes, print them now.  */
 	      while (i < orig_len)
 		{
-		  char octal[5];
-
-		  xsnprintf (octal, sizeof (octal), "\\%.3o", orig[i] & 0xff);
-		  append_string_as_wide (octal, output);
+		  fprintf_filtered (&file, "\\%.3o", orig[i] & 0xff);
 		  ++i;
 		}
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (9 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 10/18] Use a ui_file in print_wchar Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 17:47   ` Andrew Burgess via Gdb-patches
  2022-02-16 13:55 ` [PATCH 12/18] Add a default encoding to generic_emit_char and generic_printstr Tom Tromey
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This adds an emitter callback to generic_printstr and
generic_emit_char, passing it through to print_wchar.  print_wchar is
modified to call it, if possible.  This will be used to let languages
override the way that escape sequences are emitted.  Nothing uses this
yet, that comes later in the series.
---
 gdb/valprint.c | 41 ++++++++++++++++++++++++++++-------------
 gdb/valprint.h | 30 ++++++++++++++++++++++++++++--
 2 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index 00c0cd2c72a..c0e5e678005 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2201,12 +2201,19 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	     int orig_len, int width,
 	     enum bfd_endian byte_order,
 	     struct obstack *output,
-	     int quoter, bool *need_escapep)
+	     int quoter, bool *need_escapep,
+	     emit_char_ftype emitter)
 {
   bool need_escape = *need_escapep;
 
   *need_escapep = false;
 
+  obstack_wide_file file (output);
+  if (emitter != nullptr
+      && emitter (&file, w, gdb::make_array_view (orig, orig_len),
+		  width, byte_order, quoter))
+    return;
+
   switch (w)
     {
       case LCST ('\a'):
@@ -2246,7 +2253,6 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 	  else
 	    {
 	      int i;
-	      obstack_wide_file file (output);
 
 	      for (i = 0; i + width <= orig_len; i += width)
 		{
@@ -2281,7 +2287,8 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
 
 void
 generic_emit_char (int c, struct type *type, struct ui_file *stream,
-		   int quoter, const char *encoding)
+		   int quoter, const char *encoding,
+		   emit_char_ftype emitter)
 {
   enum bfd_endian byte_order
     = type_byte_order (type);
@@ -2330,14 +2337,16 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
 	      for (i = 0; i < num_chars; ++i)
 		print_wchar (chars[i], buf, buflen,
 			     TYPE_LENGTH (type), byte_order,
-			     &wchar_buf, quoter, &need_escape);
+			     &wchar_buf, quoter, &need_escape,
+			     emitter);
 	    }
 	}
 
       /* This handles the NUM_CHARS == 0 case as well.  */
       if (print_escape)
 	print_wchar (gdb_WEOF, buf, buflen, TYPE_LENGTH (type),
-		     byte_order, &wchar_buf, quoter, &need_escape);
+		     byte_order, &wchar_buf, quoter, &need_escape,
+		     emitter);
     }
 
   /* The output in the host encoding.  */
@@ -2445,7 +2454,8 @@ print_converted_chars_to_obstack (struct obstack *obstack,
 				  const std::vector<converted_character> &chars,
 				  int quote_char, int width,
 				  enum bfd_endian byte_order,
-				  const struct value_print_options *options)
+				  const struct value_print_options *options,
+				  emit_char_ftype emitter)
 {
   unsigned int idx;
   const converted_character *elem;
@@ -2486,10 +2496,12 @@ print_converted_chars_to_obstack (struct obstack *obstack,
 	      {
 		if (elem->result == wchar_iterate_ok)
 		  print_wchar (elem->chars[0], elem->buf, elem->buflen, width,
-			       byte_order, obstack, quote_char, &need_escape);
+			       byte_order, obstack, quote_char, &need_escape,
+			       emitter);
 		else
 		  print_wchar (gdb_WEOF, elem->buf, elem->buflen, width,
-			       byte_order, obstack, quote_char, &need_escape);
+			       byte_order, obstack, quote_char, &need_escape,
+			       emitter);
 	      }
 	  }
 	  break;
@@ -2514,10 +2526,12 @@ print_converted_chars_to_obstack (struct obstack *obstack,
 	    obstack_grow_wstr (obstack, LCST ("'"));
 	    if (elem->result == wchar_iterate_ok)
 	      print_wchar (elem->chars[0], elem->buf, elem->buflen, width,
-			   byte_order, obstack, quote_char, &need_escape);
+			   byte_order, obstack, quote_char, &need_escape,
+			   emitter);
 	    else
 	      print_wchar (gdb_WEOF, elem->buf, elem->buflen, width,
-			   byte_order, obstack, quote_char, &need_escape);
+			   byte_order, obstack, quote_char, &need_escape,
+			   emitter);
 	    obstack_grow_wstr (obstack, LCST ("'"));
 	    std::string s = string_printf (_(" <repeats %u times>"),
 					   elem->repeat_count);
@@ -2543,7 +2557,7 @@ print_converted_chars_to_obstack (struct obstack *obstack,
 	  /* Output the incomplete sequence string.  */
 	  obstack_grow_wstr (obstack, LCST ("<incomplete sequence "));
 	  print_wchar (gdb_WEOF, elem->buf, elem->buflen, width, byte_order,
-		       obstack, 0, &need_escape);
+		       obstack, 0, &need_escape, emitter);
 	  obstack_grow_wstr (obstack, LCST (">"));
 
 	  /* We do not attempt to output anything after this.  */
@@ -2602,7 +2616,8 @@ generic_printstr (struct ui_file *stream, struct type *type,
 		  const gdb_byte *string, unsigned int length, 
 		  const char *encoding, int force_ellipses,
 		  int quote_char, int c_style_terminator,
-		  const struct value_print_options *options)
+		  const struct value_print_options *options,
+		  emit_char_ftype emitter)
 {
   enum bfd_endian byte_order = type_byte_order (type);
   unsigned int i;
@@ -2678,7 +2693,7 @@ generic_printstr (struct ui_file *stream, struct type *type,
 
   /* Print the output string to the obstack.  */
   print_converted_chars_to_obstack (&wchar_buf, converted_chars, quote_char,
-				    width, byte_order, options);
+				    width, byte_order, options, emitter);
 
   if (force_ellipses || !finished)
     obstack_grow_wstr (&wchar_buf, LCST ("..."));
diff --git a/gdb/valprint.h b/gdb/valprint.h
index 0586836f9e6..64fea1ccb4a 100644
--- a/gdb/valprint.h
+++ b/gdb/valprint.h
@@ -233,14 +233,40 @@ extern void generic_value_print (struct value *val, struct ui_file *stream,
 				 const struct value_print_options *options,
 				 const struct generic_val_print_decorations *d);
 
+/* A callback that can be used to print a representation of a wide
+   character to a stream.
+   
+   If the character can be represented by this callback, it will
+   return true.  A false return indicates that the default behavior
+   should be taken -- for printable characters, the default is to emit
+   it verbatim; for non-printable characters, C-style escapes are
+   used.  Normally a callback should always return false for
+   printable, non-control characters.
+
+   STREAM is the stream to write to.
+   W is the character.  It might be gdb_WEOF, meaning an unconvertible
+   sequence.
+   ORIG is the original (target) bytes corresponding to W.
+   WIDTH is the width of a base character in the encoding.
+   BYTE_ORDER is the character type's byte order.
+   QUOTER is the quote character used -- this is a host character.  */
+typedef gdb::function_view<bool (ui_file *stream,
+				 gdb_wint_t w,
+				 gdb::array_view<const gdb_byte> orig,
+				 int width,
+				 enum bfd_endian byte_order,
+				 int quoter)> emit_char_ftype;
+
 extern void generic_emit_char (int c, struct type *type, struct ui_file *stream,
-			       int quoter, const char *encoding);
+			       int quoter, const char *encoding,
+			       emit_char_ftype emitter = nullptr);
 
 extern void generic_printstr (struct ui_file *stream, struct type *type, 
 			      const gdb_byte *string, unsigned int length, 
 			      const char *encoding, int force_ellipses,
 			      int quote_char, int c_style_terminator,
-			      const struct value_print_options *options);
+			      const struct value_print_options *options,
+			      emit_char_ftype emitter = nullptr);
 
 /* Run the "output" command.  ARGS and FROM_TTY are the usual
    arguments passed to all command implementations, except ARGS is
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 12/18] Add a default encoding to generic_emit_char and generic_printstr
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (10 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 13/18] Change generic_emit_char to print the quotes Tom Tromey
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This adds a default encoding to generic_emit_char and
generic_printstr.  The default is pretty basic: use the target charset
for single-byte characters, use the wide charset for wchar_t, and
assume UTF-16/32 for the appropriately-sized other characters.
Languages for which these do not hold can be modified to do something
else if need be.
---
 gdb/valprint.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index c0e5e678005..554c0e0564f 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2281,6 +2281,37 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
     }
 }
 
+/* Helper function to get the default encoding, given a type.  */
+static const char *
+get_default_encoding (struct type *chtype)
+{
+  const char *encoding;
+  if (TYPE_LENGTH (chtype) == 1)
+    encoding = target_charset (chtype->arch ());
+  else if (streq (chtype->name (), "wchar_t"))
+    encoding = target_wide_charset (chtype->arch ());
+  else if (TYPE_LENGTH (chtype) == 2)
+    {
+      if (type_byte_order (chtype) == BFD_ENDIAN_BIG)
+	encoding = "UTF-16BE";
+      else
+	encoding = "UTF-16LE";
+    }
+  else if (TYPE_LENGTH (chtype) == 4)
+    {
+      if (type_byte_order (chtype) == BFD_ENDIAN_BIG)
+	encoding = "UTF-32BE";
+      else
+	encoding = "UTF-32LE";
+    }
+  else
+    {
+      /* No idea.  */
+      encoding = target_charset (chtype->arch ());
+    }
+  return encoding;
+}
+
 /* Print the character C on STREAM as part of the contents of a
    literal string whose delimiter is QUOTER.  ENCODING names the
    encoding of C.  */
@@ -2290,6 +2321,8 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
 		   int quoter, const char *encoding,
 		   emit_char_ftype emitter)
 {
+  if (encoding == nullptr)
+    encoding = get_default_encoding (type);
   enum bfd_endian byte_order
     = type_byte_order (type);
   gdb_byte *c_buf;
@@ -2619,6 +2652,8 @@ generic_printstr (struct ui_file *stream, struct type *type,
 		  const struct value_print_options *options,
 		  emit_char_ftype emitter)
 {
+  if (encoding == nullptr)
+    encoding = get_default_encoding (type);
   enum bfd_endian byte_order = type_byte_order (type);
   unsigned int i;
   int width = TYPE_LENGTH (type);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 13/18] Change generic_emit_char to print the quotes
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (11 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 12/18] Add a default encoding to generic_emit_char and generic_printstr Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 14/18] Use generic_emit_char in Rust Tom Tromey
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

All callers of generic_emit_char print the quotes around the
character, then pass the quote character to the function.  It seemed
better to just have generic_emit_char print the quotes itself.
---
 gdb/c-lang.c   | 2 --
 gdb/f-lang.h   | 2 --
 gdb/valprint.c | 2 ++
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 1a6e6969a84..86604c7da49 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -164,9 +164,7 @@ language_defn::printchar (int c, struct type *type,
       break;
     }
 
-  fputc_filtered ('\'', stream);
   generic_emit_char (c, type, stream, '\'', encoding);
-  fputc_filtered ('\'', stream);
 }
 
 /* Print the character string STRING, printing at most LENGTH
diff --git a/gdb/f-lang.h b/gdb/f-lang.h
index 388c832dcdb..fc2cdc1786b 100644
--- a/gdb/f-lang.h
+++ b/gdb/f-lang.h
@@ -152,10 +152,8 @@ class f_language : public language_defn
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
-    fputs_filtered ("'", stream);
     const char *encoding = get_encoding (chtype);
     generic_emit_char (ch, chtype, stream, '\'', encoding);
-    fputs_filtered ("'", stream);
   }
 
   /* See language.h.  */
diff --git a/gdb/valprint.c b/gdb/valprint.c
index 554c0e0564f..a38b46dfc3f 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2331,6 +2331,7 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
   c_buf = (gdb_byte *) alloca (TYPE_LENGTH (type));
   pack_long (c_buf, type, c);
 
+  fputc_filtered (quoter, stream);
   wchar_iterator iter (c_buf, TYPE_LENGTH (type), encoding, TYPE_LENGTH (type));
 
   /* This holds the printable form of the wchar_t data.  */
@@ -2392,6 +2393,7 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
   obstack_1grow (&output, '\0');
 
   fputs_filtered ((const char *) obstack_base (&output), stream);
+  fputc_filtered (quoter, stream);
 }
 
 /* Return the repeat count of the next character/byte in ITER,
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 14/18] Use generic_emit_char in Rust
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (12 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 13/18] Change generic_emit_char to print the quotes Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 15/18] Use generic_emit_char in Ada Tom Tromey
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This changes the Rust code to use generic_emit_char, passing in a
function to handle Rust escape sequences correctly.  This is
PR rust/20164.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20164
---
 gdb/rust-lang.c                    | 86 ++++++++++++++++++------------
 gdb/testsuite/gdb.rust/expr.exp    |  6 +--
 gdb/testsuite/gdb.rust/unicode.exp |  3 +-
 3 files changed, 56 insertions(+), 39 deletions(-)

diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c
index 41e973797d2..28438827c1b 100644
--- a/gdb/rust-lang.c
+++ b/gdb/rust-lang.c
@@ -218,16 +218,6 @@ rust_u8_type_p (struct type *type)
 	  && TYPE_LENGTH (type) == 1);
 }
 
-/* Return true if TYPE is a Rust character type.  */
-
-static bool
-rust_chartype_p (struct type *type)
-{
-  return (type->code () == TYPE_CODE_CHAR
-	  && TYPE_LENGTH (type) == 4
-	  && type->is_unsigned ());
-}
-
 /* If VALUE represents a trait object pointer, return the underlying
    pointer with the correct (i.e., runtime) type.  Otherwise, return
    NULL.  */
@@ -263,6 +253,57 @@ rust_get_trait_object_pointer (struct value *value)
 
 \f
 
+/* A callback function for generic_emit_char and generic_printstr that
+   escapes characters Rust-style.  */
+static bool
+rust_emit_char (ui_file *stream,
+		gdb_wint_t w,
+		gdb::array_view<const gdb_byte> orig,
+		int width,
+		enum bfd_endian byte_order,
+		int quoter)
+{
+  if (w == LCST ('\\'))
+    fputs_filtered ("\\\\", stream);
+  else if (w == gdb_btowc (quoter))
+    fprintf_filtered (stream, "\\%c", quoter);
+  else if (gdb_iswprint (w) && !gdb_iswcntrl (w))
+    {
+      /* Let the caller deal with all other printable characters.  */
+      return false;
+    }
+  else if (w == LCST ('\n'))
+    fputs_filtered ("\\n", stream);
+  else if (w == LCST ('\r'))
+    fputs_filtered ("\\r", stream);
+  else if (w == LCST ('\t'))
+    fputs_filtered ("\\t", stream);
+  else if (w == LCST ('\0'))
+    fputs_filtered ("\\0", stream);
+  else
+    {
+      int i;
+
+      for (i = 0; i + width <= orig.size (); i += width)
+	{
+	  ULONGEST value = extract_unsigned_integer (&orig[i], width,
+						     byte_order);
+	  if (value <= 255)
+	    fprintf_filtered (stream, "\\x%02x", (int) value);
+	  else
+	    fprintf_filtered (stream, "\\u{%06lx}", (unsigned long) value);
+	}
+
+      /* If we somehow have extra bytes, print them now.  */
+      while (i < orig.size ())
+	{
+	  fprintf_filtered (stream, "\\x%02x", orig[i] & 0xff);
+	  ++i;
+	}
+    }
+  return true;
+}
+
 /* See language.h.  */
 
 void
@@ -290,9 +331,8 @@ rust_language::printstr (struct ui_file *stream, struct type *type,
 	}
     }
 
-  /* This is not ideal as it doesn't use our character printer.  */
   generic_printstr (stream, type, string, length, encoding, force_ellipses,
-		    '"', 0, options);
+		    '"', 0, options, rust_emit_char);
 }
 
 \f
@@ -1595,27 +1635,7 @@ void
 rust_language::printchar (int ch, struct type *chtype,
 			  struct ui_file *stream) const
 {
-  fputs_filtered ("'", stream);
-  if (!rust_chartype_p (chtype))
-    generic_emit_char (ch, chtype, stream, '\'',
-		       target_charset (chtype->arch ()));
-  else if (ch == '\\')
-    fprintf_filtered (stream, "\\%c", ch);
-  else if (ch == '\n')
-    fputs_filtered ("\\n", stream);
-  else if (ch == '\r')
-    fputs_filtered ("\\r", stream);
-  else if (ch == '\t')
-    fputs_filtered ("\\t", stream);
-  else if (ch == '\0')
-    fputs_filtered ("\\0", stream);
-  else if (ch >= 32 && ch <= 127 && isprint (ch))
-    fputc_filtered (ch, stream);
-  else if (ch <= 255)
-    fprintf_filtered (stream, "\\x%02x", ch);
-  else
-    fprintf_filtered (stream, "\\u{%06x}", ch);
-  fputs_filtered ("'", stream);
+  generic_emit_char (ch, chtype, stream, '\'', nullptr, rust_emit_char);
 }
 
 /* See language.h.  */
diff --git a/gdb/testsuite/gdb.rust/expr.exp b/gdb/testsuite/gdb.rust/expr.exp
index 0c445897338..34eba4bf997 100644
--- a/gdb/testsuite/gdb.rust/expr.exp
+++ b/gdb/testsuite/gdb.rust/expr.exp
@@ -115,10 +115,8 @@ gdb_test "print \[1,2 3" "',' or ']' expected"
 gdb_test "print \[1 2" "',', ';', or ']' expected"
 
 gdb_test "print b\"hi rust\"" " = b\"hi rust\""
-# This isn't rusty syntax yet, but that's another bug -- this is just
-# testing that byte escapes work properly.
-gdb_test "print b\"\\xddhi bob\"" " = b\"\\\\335hi bob\""
-gdb_test "print b\"has\\0nul\"" " = b\"has\\\\000nul\""
+gdb_test "print b\"\\xddhi bob\"" " = b\"\\\\xddhi bob\""
+gdb_test "print b\"has\\0nul\"" " = b\"has\\\\0nul\""
 
 gdb_test "print br##\"hi\"##" " = b\"hi\""
 gdb_test "print br##\"hi" "Unexpected EOF in string"
diff --git a/gdb/testsuite/gdb.rust/unicode.exp b/gdb/testsuite/gdb.rust/unicode.exp
index 9de0a0e724f..8378195bf5d 100644
--- a/gdb/testsuite/gdb.rust/unicode.exp
+++ b/gdb/testsuite/gdb.rust/unicode.exp
@@ -43,8 +43,7 @@ if {![runto ${srcfile}:$line]} {
 
 gdb_test "print 𝕯" " = 98" "print D"
 gdb_test "print \"𝕯\"" " = \"𝕯\"" "print D in string"
-# This output is maybe not ideal, but it also isn't incorrect.
-gdb_test "print '𝕯'" " = 120175 '\\\\u\\\{01d56f\\\}'" \
+gdb_test "print '𝕯'" " = 120175 '𝕯'" \
     "print D as char"
 gdb_test "print cç" " = 97" "print cc"
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 15/18] Use generic_emit_char in Ada
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (13 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 14/18] Use generic_emit_char in Rust Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 16/18] Use generic_emit_char in Modula-2 Tom Tromey
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This changes the Ada code to use generic_emit_char and
generic_printstr.  This simplifies gdb somewhat.
---
 gdb/ada-lang.c     |   7 +-
 gdb/ada-lang.h     |  12 ++--
 gdb/ada-valprint.c | 163 ++++++++++-----------------------------------
 3 files changed, 45 insertions(+), 137 deletions(-)

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index e3bd1fa7c17..35474909f20 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -59,6 +59,7 @@
 #include "gdbsupport/byte-vector.h"
 #include <algorithm>
 #include "ada-exp.h"
+#include "charset.h"
 
 /* Define whether or not the C operator '/' truncates towards zero for
    differently signed operands (truncation direction is undefined in C).
@@ -13320,7 +13321,7 @@ class ada_language : public language_defn
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
-    ada_printchar (ch, chtype, stream);
+    generic_emit_char (ch, chtype, stream, '\'', nullptr, ada_emit_char);
   }
 
   /* See language.h.  */
@@ -13330,8 +13331,8 @@ class ada_language : public language_defn
 		 const char *encoding, int force_ellipses,
 		 const struct value_print_options *options) const override
   {
-    ada_printstr (stream, elttype, string, length, encoding,
-		  force_ellipses, options);
+    generic_printstr (stream, elttype, string, length, encoding,
+		      force_ellipses, '"', 0, options, ada_emit_char);
   }
 
   /* See language.h.  */
diff --git a/gdb/ada-lang.h b/gdb/ada-lang.h
index a6caf35b826..2ec9e48f875 100644
--- a/gdb/ada-lang.h
+++ b/gdb/ada-lang.h
@@ -172,13 +172,11 @@ extern void ada_value_print (struct value *, struct ui_file *,
 
 				/* Defined in ada-lang.c */
 
-extern void ada_emit_char (int, struct type *, struct ui_file *, int, int);
-
-extern void ada_printchar (int, struct type *, struct ui_file *);
-
-extern void ada_printstr (struct ui_file *, struct type *, const gdb_byte *,
-			  unsigned int, const char *, int,
-			  const struct value_print_options *);
+extern bool ada_emit_char (ui_file *stream, gdb_wint_t w,
+			   gdb::array_view<const gdb_byte> orig,
+			   int width,
+			   enum bfd_endian byte_order,
+			   int quoter);
 
 struct value *ada_convert_actual (struct value *actual,
 				  struct type *formal_type0);
diff --git a/gdb/ada-valprint.c b/gdb/ada-valprint.c
index a59c392bef4..edd5ed8be18 100644
--- a/gdb/ada-valprint.c
+++ b/gdb/ada-valprint.c
@@ -255,29 +255,43 @@ val_print_packed_array_elements (struct type *type, const gdb_byte *valaddr,
   value_free_to_mark (mark);
 }
 
-/* Print the character C on STREAM as part of the contents of a literal
-   string whose delimiter is QUOTER.  TYPE_LEN is the length in bytes
-   of the character.  */
-
-void
-ada_emit_char (int c, struct type *type, struct ui_file *stream,
-	       int quoter, int type_len)
+/* A callback for generic_emit_char and generic_printstr that escapes
+   characters Ada-style.  */
+
+bool
+ada_emit_char (ui_file *stream, gdb_wint_t w,
+	       gdb::array_view<const gdb_byte> orig,
+	       int width,
+	       enum bfd_endian byte_order,
+	       int quoter)
 {
-  /* If this character fits in the normal ASCII range, and is
-     a printable character, then print the character as if it was
-     an ASCII character, even if this is a wide character.
-     The UCHAR_MAX check is necessary because the isascii function
-     requires that its argument have a value of an unsigned char,
-     or EOF (EOF is obviously not printable).  */
-  if (c <= UCHAR_MAX && isascii (c) && isprint (c))
+  if (quoter == '"' && w == LCST ('"'))
+    fprintf_filtered (stream, "\"\"");
+  else if (gdb_iswprint (w) && !gdb_iswcntrl (w))
     {
-      if (c == quoter && c == '"')
-	fprintf_filtered (stream, "\"\"");
-      else
-	fprintf_filtered (stream, "%c", c);
+      /* Let the caller handle printable characters.  */
+      return false;
     }
   else
-    fprintf_filtered (stream, "[\"%0*x\"]", type_len * 2, c);
+    {
+      int i;
+
+      for (i = 0; i + width <= orig.size (); i += width)
+	{
+	  ULONGEST value = extract_unsigned_integer (&orig[i], width,
+						     byte_order);
+	  fprintf_filtered (stream, "[\"%0*lx\"]", width * 2,
+			    (unsigned long) value);
+	}
+
+      /* If we somehow have extra bytes, print them now.  */
+      while (i < orig.size ())
+	{
+	  fprintf_filtered (stream, "[\"%02x\"]", orig[i] & 0xff);
+	  ++i;
+	}
+    }
+  return true;
 }
 
 /* Character #I of STRING, given that TYPE_LEN is the size in bytes
@@ -348,14 +362,6 @@ ada_print_floating (const gdb_byte *valaddr, struct type *type,
     fprintf_filtered (stream, "%s", &s[skip_count]);
 }
 
-void
-ada_printchar (int c, struct type *type, struct ui_file *stream)
-{
-  fputs_filtered ("'", stream);
-  ada_emit_char (c, type, stream, '\'', TYPE_LENGTH (type));
-  fputs_filtered ("'", stream);
-}
-
 /* [From print_type_scalar in typeprint.c].   Print VAL on STREAM in a
    form appropriate for TYPE, if non-NULL.  If TYPE is NULL, print VAL
    like a default signed integer.  */
@@ -436,103 +442,6 @@ ada_print_scalar (struct type *type, LONGEST val, struct ui_file *stream)
     }
 }
 
-/* Print the character string STRING, printing at most LENGTH characters.
-   Printing stops early if the number hits print_max; repeat counts
-   are printed as appropriate.  Print ellipses at the end if we
-   had to stop before printing LENGTH characters, or if FORCE_ELLIPSES.
-   TYPE_LEN is the length (1 or 2) of the character type.  */
-
-static void
-printstr (struct ui_file *stream, struct type *elttype, const gdb_byte *string,
-	  unsigned int length, int force_ellipses, int type_len,
-	  const struct value_print_options *options)
-{
-  enum bfd_endian byte_order = type_byte_order (elttype);
-  unsigned int i;
-  unsigned int things_printed = 0;
-  int in_quotes = 0;
-  int need_comma = 0;
-
-  if (length == 0)
-    {
-      fputs_filtered ("\"\"", stream);
-      return;
-    }
-
-  for (i = 0; i < length && things_printed < options->print_max; i += 1)
-    {
-      /* Position of the character we are examining
-	 to see whether it is repeated.  */
-      unsigned int rep1;
-      /* Number of repetitions we have detected so far.  */
-      unsigned int reps;
-
-      QUIT;
-
-      if (need_comma)
-	{
-	  fputs_filtered (", ", stream);
-	  need_comma = 0;
-	}
-
-      rep1 = i + 1;
-      reps = 1;
-      while (rep1 < length
-	     && char_at (string, rep1, type_len, byte_order)
-		== char_at (string, i, type_len, byte_order))
-	{
-	  rep1 += 1;
-	  reps += 1;
-	}
-
-      if (reps > options->repeat_count_threshold)
-	{
-	  if (in_quotes)
-	    {
-	      fputs_filtered ("\", ", stream);
-	      in_quotes = 0;
-	    }
-	  fputs_filtered ("'", stream);
-	  ada_emit_char (char_at (string, i, type_len, byte_order),
-			 elttype, stream, '\'', type_len);
-	  fputs_filtered ("'", stream);
-	  fprintf_filtered (stream, _(" %p[<repeats %u times>%p]"),
-			    metadata_style.style ().ptr (), reps, nullptr);
-	  i = rep1 - 1;
-	  things_printed += options->repeat_count_threshold;
-	  need_comma = 1;
-	}
-      else
-	{
-	  if (!in_quotes)
-	    {
-	      fputs_filtered ("\"", stream);
-	      in_quotes = 1;
-	    }
-	  ada_emit_char (char_at (string, i, type_len, byte_order),
-			 elttype, stream, '"', type_len);
-	  things_printed += 1;
-	}
-    }
-
-  /* Terminate the quotes if necessary.  */
-  if (in_quotes)
-    fputs_filtered ("\"", stream);
-
-  if (force_ellipses || i < length)
-    fputs_filtered ("...", stream);
-}
-
-void
-ada_printstr (struct ui_file *stream, struct type *type,
-	      const gdb_byte *string, unsigned int length,
-	      const char *encoding, int force_ellipses,
-	      const struct value_print_options *options)
-{
-  printstr (stream, type, string, length, force_ellipses, TYPE_LENGTH (type),
-	    options);
-}
-
 static int
 print_variant_part (struct value *value, int field_num,
 		    struct value *outer_value,
@@ -707,8 +616,8 @@ ada_val_print_string (struct type *type, const gdb_byte *valaddr,
       len = temp_len;
     }
 
-  printstr (stream, elttype, valaddr + offset_aligned, len, 0,
-	    eltlen, options);
+  current_language->printstr (stream, elttype, valaddr + offset_aligned,
+			      len, nullptr, 0, options);
 }
 
 /* Implement Ada value_print'ing for the case where TYPE is a
@@ -802,7 +711,7 @@ ada_value_print_num (struct value *val, struct ui_file *stream, int recurse,
 
 	      fputs_filtered (" ", stream);
 	      c = unpack_long (type, valaddr);
-	      ada_printchar (c, type, stream);
+	      current_language->printchar (c, type, stream);
 	    }
 	}
       return;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 16/18] Use generic_emit_char in Modula-2
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (14 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 15/18] Use generic_emit_char in Ada Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 17/18] Use generic_emit_char in Pascal Tom Tromey
  2022-02-16 13:55 ` [PATCH 18/18] Simplify Fortran string printing Tom Tromey
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This changes the Modula-2 code to use generic_emit_char and
generic_printstr.  I have no way to test this.  However, the Modula-2
code seems to be pretty much a copy of the old C code from before the
charset work.
---
 gdb/m2-lang.c | 145 ++++++++++----------------------------------------
 gdb/m2-lang.h |   5 --
 2 files changed, 27 insertions(+), 123 deletions(-)

diff --git a/gdb/m2-lang.c b/gdb/m2-lang.c
index 7673426b7a8..281c8733ff5 100644
--- a/gdb/m2-lang.c
+++ b/gdb/m2-lang.c
@@ -139,137 +139,46 @@ m2_language::language_arch_info (struct gdbarch *gdbarch,
   lai->set_bool_type (builtin->builtin_bool, "BOOLEAN");
 }
 
+/* A callback function for generic_emit_char and generic_printstr that
+   escapes characters Modula 2-style.  */
+
+static bool
+m2_emit_char (ui_file *stream,
+	      gdb_wint_t w,
+	      gdb::array_view<const gdb_byte> orig,
+	      int width,
+	      enum bfd_endian byte_order,
+	      int quoter)
+{
+  /* Historically the Modula-2 code in gdb handled \e as well.  */
+  if (w == LCST ('\033'))
+    {
+      fputs_filtered ("\\e", stream);
+      return true;
+    }
+  /* Let the caller handle everything else.  */
+  return false;
+}
+
 /* See languge.h.  */
 
 void
 m2_language::printchar (int c, struct type *type,
 			struct ui_file *stream) const
 {
-  fputs_filtered ("'", stream);
-  emitchar (c, type, stream, '\'');
-  fputs_filtered ("'", stream);
+  generic_emit_char (c, type, stream, '\'', nullptr, m2_emit_char);
 }
 
 /* See language.h.  */
 
 void
 m2_language::printstr (struct ui_file *stream, struct type *elttype,
-			const gdb_byte *string, unsigned int length,
-			const char *encoding, int force_ellipses,
-			const struct value_print_options *options) const
+		       const gdb_byte *string, unsigned int length,
+		       const char *encoding, int force_ellipses,
+		       const struct value_print_options *options) const
 {
-  unsigned int i;
-  unsigned int things_printed = 0;
-  int in_quotes = 0;
-  int need_comma = 0;
-
-  if (length == 0)
-    {
-      puts_filtered ("\"\"");
-      return;
-    }
-
-  for (i = 0; i < length && things_printed < options->print_max; ++i)
-    {
-      /* Position of the character we are examining
-	 to see whether it is repeated.  */
-      unsigned int rep1;
-      /* Number of repetitions we have detected so far.  */
-      unsigned int reps;
-
-      QUIT;
-
-      if (need_comma)
-	{
-	  fputs_filtered (", ", stream);
-	  need_comma = 0;
-	}
-
-      rep1 = i + 1;
-      reps = 1;
-      while (rep1 < length && string[rep1] == string[i])
-	{
-	  ++rep1;
-	  ++reps;
-	}
-
-      if (reps > options->repeat_count_threshold)
-	{
-	  if (in_quotes)
-	    {
-	      fputs_filtered ("\", ", stream);
-	      in_quotes = 0;
-	    }
-	  printchar (string[i], elttype, stream);
-	  fprintf_filtered (stream, " <repeats %u times>", reps);
-	  i = rep1 - 1;
-	  things_printed += options->repeat_count_threshold;
-	  need_comma = 1;
-	}
-      else
-	{
-	  if (!in_quotes)
-	    {
-	      fputs_filtered ("\"", stream);
-	      in_quotes = 1;
-	    }
-	  emitchar (string[i], elttype, stream, '"');
-	  ++things_printed;
-	}
-    }
-
-  /* Terminate the quotes if necessary.  */
-  if (in_quotes)
-    fputs_filtered ("\"", stream);
-
-  if (force_ellipses || i < length)
-    fputs_filtered ("...", stream);
-}
-
-/* See language.h.  */
-
-void
-m2_language::emitchar (int ch, struct type *chtype,
-		       struct ui_file *stream, int quoter) const
-{
-  ch &= 0xFF;			/* Avoid sign bit follies.  */
-
-  if (PRINT_LITERAL_FORM (ch))
-    {
-      if (ch == '\\' || ch == quoter)
-	fputs_filtered ("\\", stream);
-      fprintf_filtered (stream, "%c", ch);
-    }
-  else
-    {
-      switch (ch)
-	{
-	case '\n':
-	  fputs_filtered ("\\n", stream);
-	  break;
-	case '\b':
-	  fputs_filtered ("\\b", stream);
-	  break;
-	case '\t':
-	  fputs_filtered ("\\t", stream);
-	  break;
-	case '\f':
-	  fputs_filtered ("\\f", stream);
-	  break;
-	case '\r':
-	  fputs_filtered ("\\r", stream);
-	  break;
-	case '\033':
-	  fputs_filtered ("\\e", stream);
-	  break;
-	case '\007':
-	  fputs_filtered ("\\a", stream);
-	  break;
-	default:
-	  fprintf_filtered (stream, "\\%.3o", (unsigned int) ch);
-	  break;
-	}
-    }
+  generic_printstr (stream, elttype, string, length, encoding, force_ellipses,
+		    '"', 0, options, m2_emit_char);
 }
 
 /* Called during architecture gdbarch initialisation to create language
diff --git a/gdb/m2-lang.h b/gdb/m2-lang.h
index ee2fc6fb8bc..2f516bc6427 100644
--- a/gdb/m2-lang.h
+++ b/gdb/m2-lang.h
@@ -92,11 +92,6 @@ class m2_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const;
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 17/18] Use generic_emit_char in Pascal
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (15 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 16/18] Use generic_emit_char in Modula-2 Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  2022-02-16 13:55 ` [PATCH 18/18] Simplify Fortran string printing Tom Tromey
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

This changes the Pascal to use generic_emit_char and generic_printstr.
The output isn't identical, though it does pass the Pascal tests
(during which I learned that FPC does not work with gold...).

I think this is an improvement, nevertheless, because the Pascal
expression parser accepts the C-style output that is now generated.
---
 gdb/p-lang.c | 132 +++++++--------------------------------------------
 gdb/p-lang.h |  12 -----
 2 files changed, 17 insertions(+), 127 deletions(-)

diff --git a/gdb/p-lang.c b/gdb/p-lang.c
index 70ff404bac3..64e2fa2d9b6 100644
--- a/gdb/p-lang.c
+++ b/gdb/p-lang.c
@@ -143,31 +143,23 @@ pascal_is_string_type (struct type *type,int *length_pos, int *length_size,
   return 0;
 }
 
-/* See p-lang.h.  */
-
-void
-pascal_language::print_one_char (int c, struct ui_file *stream,
-				 int *in_quotes) const
+/* A callback function for generic_emit_char and generic_printstr that
+   escapes characters Modula 2-style.  */
+
+static bool
+pascal_emit_char (ui_file *stream,
+		  gdb_wint_t w,
+		  gdb::array_view<const gdb_byte> orig,
+		  int width,
+		  enum bfd_endian byte_order,
+		  int quoter)
 {
-  if (c == '\'' || ((unsigned int) c <= 0xff && (PRINT_LITERAL_FORM (c))))
+  if (w == LCST ('\''))
     {
-      if (!(*in_quotes))
-	fputs_filtered ("'", stream);
-      *in_quotes = 1;
-      if (c == '\'')
-	{
-	  fputs_filtered ("''", stream);
-	}
-      else
-	fprintf_filtered (stream, "%c", c);
-    }
-  else
-    {
-      if (*in_quotes)
-	fputs_filtered ("'", stream);
-      *in_quotes = 0;
-      fprintf_filtered (stream, "#%d", (unsigned int) c);
+      fputs_filtered ("''", stream);
+      return true;
     }
+  return false;
 }
 
 /* See language.h.  */
@@ -176,11 +168,7 @@ void
 pascal_language::printchar (int c, struct type *type,
 			    struct ui_file *stream) const
 {
-  int in_quotes = 0;
-
-  print_one_char (c, stream, &in_quotes);
-  if (in_quotes)
-    fputs_filtered ("'", stream);
+  generic_emit_char (c, type, stream, '\'', nullptr, pascal_emit_char);
 }
 
 \f
@@ -228,94 +216,8 @@ pascal_language::printstr (struct ui_file *stream, struct type *elttype,
 			   const char *encoding, int force_ellipses,
 			   const struct value_print_options *options) const
 {
-  enum bfd_endian byte_order = type_byte_order (elttype);
-  unsigned int i;
-  unsigned int things_printed = 0;
-  int in_quotes = 0;
-  int need_comma = 0;
-  int width;
-
-  /* Preserve ELTTYPE's original type, just set its LENGTH.  */
-  check_typedef (elttype);
-  width = TYPE_LENGTH (elttype);
-
-  /* If the string was not truncated due to `set print elements', and
-     the last byte of it is a null, we don't print that, in traditional C
-     style.  */
-  if ((!force_ellipses) && length > 0
-      && extract_unsigned_integer (string + (length - 1) * width, width,
-				   byte_order) == 0)
-    length--;
-
-  if (length == 0)
-    {
-      fputs_filtered ("''", stream);
-      return;
-    }
-
-  for (i = 0; i < length && things_printed < options->print_max; ++i)
-    {
-      /* Position of the character we are examining
-	 to see whether it is repeated.  */
-      unsigned int rep1;
-      /* Number of repetitions we have detected so far.  */
-      unsigned int reps;
-      unsigned long int current_char;
-
-      QUIT;
-
-      if (need_comma)
-	{
-	  fputs_filtered (", ", stream);
-	  need_comma = 0;
-	}
-
-      current_char = extract_unsigned_integer (string + i * width, width,
-					       byte_order);
-
-      rep1 = i + 1;
-      reps = 1;
-      while (rep1 < length
-	     && extract_unsigned_integer (string + rep1 * width, width,
-					  byte_order) == current_char)
-	{
-	  ++rep1;
-	  ++reps;
-	}
-
-      if (reps > options->repeat_count_threshold)
-	{
-	  if (in_quotes)
-	    {
-	      fputs_filtered ("', ", stream);
-	      in_quotes = 0;
-	    }
-	  printchar (current_char, elttype, stream);
-	  fprintf_filtered (stream, " %p[<repeats %u times>%p]",
-			    metadata_style.style ().ptr (),
-			    reps, nullptr);
-	  i = rep1 - 1;
-	  things_printed += options->repeat_count_threshold;
-	  need_comma = 1;
-	}
-      else
-	{
-	  if ((!in_quotes) && (PRINT_LITERAL_FORM (current_char)))
-	    {
-	      fputs_filtered ("'", stream);
-	      in_quotes = 1;
-	    }
-	  print_one_char (current_char, stream, &in_quotes);
-	  ++things_printed;
-	}
-    }
-
-  /* Terminate the quotes if necessary.  */
-  if (in_quotes)
-    fputs_filtered ("'", stream);
-
-  if (force_ellipses || i < length)
-    fputs_filtered ("...", stream);
+  generic_printstr (stream, elttype, string, length, encoding, force_ellipses,
+		    '\'', 0, options, pascal_emit_char);
 }
 
 /* Single instance of the Pascal language class.  */
diff --git a/gdb/p-lang.h b/gdb/p-lang.h
index b831bde0af6..79f8fd7c28d 100644
--- a/gdb/p-lang.h
+++ b/gdb/p-lang.h
@@ -109,18 +109,6 @@ class pascal_language : public language_defn
 
   /* See language.h.  */
 
-  void emitchar (int ch, struct type *chtype,
-		 struct ui_file *stream, int quoter) const
-  {
-    int in_quotes = 0;
-
-    print_one_char (ch, stream, &in_quotes);
-    if (in_quotes)
-      fputs_filtered ("'", stream);
-  }
-
-  /* See language.h.  */
-
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 18/18] Simplify Fortran string printing
  2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
                   ` (16 preceding siblings ...)
  2022-02-16 13:55 ` [PATCH 17/18] Use generic_emit_char in Pascal Tom Tromey
@ 2022-02-16 13:55 ` Tom Tromey
  17 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 13:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Now that generic_emit_char has a default encoding, Fortran can remove
its get_encoding method and just rely on the default.
---
 gdb/f-lang.c | 27 ---------------------------
 gdb/f-lang.h | 13 +------------
 2 files changed, 1 insertion(+), 39 deletions(-)

diff --git a/gdb/f-lang.c b/gdb/f-lang.c
index eaeda884aef..7fdc767ccdb 100644
--- a/gdb/f-lang.c
+++ b/gdb/f-lang.c
@@ -74,33 +74,6 @@ static value *fortran_prepare_argument (struct expression *exp,
 					int arg_num, bool is_internal_call_p,
 					struct type *func_type, enum noside noside);
 
-/* Return the encoding that should be used for the character type
-   TYPE.  */
-
-const char *
-f_language::get_encoding (struct type *type)
-{
-  const char *encoding;
-
-  switch (TYPE_LENGTH (type))
-    {
-    case 1:
-      encoding = target_charset (type->arch ());
-      break;
-    case 4:
-      if (type_byte_order (type) == BFD_ENDIAN_BIG)
-	encoding = "UTF-32BE";
-      else
-	encoding = "UTF-32LE";
-      break;
-
-    default:
-      error (_("unrecognized character type"));
-    }
-
-  return encoding;
-}
-
 \f
 
 /* A helper function for the "bound" intrinsics that checks that TYPE
diff --git a/gdb/f-lang.h b/gdb/f-lang.h
index fc2cdc1786b..c3c76193dc4 100644
--- a/gdb/f-lang.h
+++ b/gdb/f-lang.h
@@ -152,8 +152,7 @@ class f_language : public language_defn
   void printchar (int ch, struct type *chtype,
 		  struct ui_file *stream) const override
   {
-    const char *encoding = get_encoding (chtype);
-    generic_emit_char (ch, chtype, stream, '\'', encoding);
+    generic_emit_char (ch, chtype, stream, '\'', nullptr);
   }
 
   /* See language.h.  */
@@ -163,14 +162,9 @@ class f_language : public language_defn
 		 const char *encoding, int force_ellipses,
 		 const struct value_print_options *options) const override
   {
-    const char *type_encoding = get_encoding (elttype);
-
     if (TYPE_LENGTH (elttype) == 4)
       fputs_filtered ("4_", stream);
 
-    if (!encoding || !*encoding)
-      encoding = type_encoding;
-
     generic_printstr (stream, elttype, string, length, encoding,
 		      force_ellipses, '\'', 0, options);
   }
@@ -223,11 +217,6 @@ class f_language : public language_defn
 	(const lookup_name_info &lookup_name) const override;
 
 private:
-  /* Return the encoding that should be used for the character type
-     TYPE.  */
-
-  static const char *get_encoding (struct type *type);
-
   /* Print any asterisks or open-parentheses needed before the variable
      name (to describe its type).
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/18] Fix latent quote char bug in generic_printstr
  2022-02-16 13:55 ` [PATCH 01/18] Fix latent quote char bug in generic_printstr Tom Tromey
@ 2022-02-16 15:38   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 15:38 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:01 -0700]:

> generic_printstr prints an empty string like:
> 
>       fputs_filtered ("\"\"", stream);
> 
> However, this seems wrong to me if the quote character is something
> other than double quote.  This patch fixes this latent bug, though I'm
> not sure how to test it -- I don't know Fortran and in my experiment I
> was unable to make a zero-length Fortran string.

This change looks good to me.   Below is a test case which you are
welcome to include with this patch.

Thanks,
Andrew

---

commit 3ede663e51ebeaef24280891ea8dced86571e9e7
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Feb 16 15:37:15 2022 +0000

    A test

diff --git a/gdb/testsuite/gdb.fortran/empty-string.exp b/gdb/testsuite/gdb.fortran/empty-string.exp
new file mode 100644
index 00000000000..892f189784a
--- /dev/null
+++ b/gdb/testsuite/gdb.fortran/empty-string.exp
@@ -0,0 +1,33 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/> .
+
+# Test printing of an empty Fortran string.
+
+if {[skip_fortran_tests]} { return -1 }
+
+standard_testfile ".f90"
+load_lib fortran.exp
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug f90}]} {
+    return -1
+}
+
+if ![fortran_runto_main] {
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "Break here"]
+gdb_continue_to_breakpoint "run to test location"
+gdb_test "print string" " = ''"
diff --git a/gdb/testsuite/gdb.fortran/empty-string.f90 b/gdb/testsuite/gdb.fortran/empty-string.f90
new file mode 100644
index 00000000000..574ed7fb7aa
--- /dev/null
+++ b/gdb/testsuite/gdb.fortran/empty-string.f90
@@ -0,0 +1,30 @@
+! Copyright 2022 Free Software Foundation, Inc.
+!
+! This program is free software; you can redistribute it and/or modify
+! it under the terms of the GNU General Public License as published by
+! the Free Software Foundation; either version 3 of the License, or
+! (at your option) any later version.
+!
+! This program is distributed in the hope that it will be useful,
+! but WITHOUT ANY WARRANTY; without even the implied warranty of
+! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+! GNU General Public License for more details.
+!
+! You should have received a copy of the GNU General Public License
+! along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+program empty_string
+  implicit none
+
+  integer :: ret
+
+  ret = string_length('')
+
+contains
+
+  integer(kind=4) function string_length(string)
+    character*(*) :: string
+    string_length = len(string)		! Break here.
+  end function string_length
+
+end program empty_string


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/18] Boolify need_escape in generic_emit_char
  2022-02-16 13:55 ` [PATCH 02/18] Boolify need_escape in generic_emit_char Tom Tromey
@ 2022-02-16 15:39   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 15:39 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:02 -0700]:

> This changes 'need_escape' in generic_emit_char to be of type bool,
> rather than int.

LGTM.

Thanks,
Andrew

> ---
>  gdb/valprint.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gdb/valprint.c b/gdb/valprint.c
> index e4d68381189..e758c1d1066 100644
> --- a/gdb/valprint.c
> +++ b/gdb/valprint.c
> @@ -2191,11 +2191,11 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  	     int orig_len, int width,
>  	     enum bfd_endian byte_order,
>  	     struct obstack *output,
> -	     int quoter, int *need_escapep)
> +	     int quoter, bool *need_escapep)
>  {
> -  int need_escape = *need_escapep;
> +  bool need_escape = *need_escapep;
>  
> -  *need_escapep = 0;
> +  *need_escapep = false;
>  
>    /* iswprint implementation on Windows returns 1 for tab character.
>       In order to avoid different printout on this host, we explicitly
> @@ -2265,7 +2265,7 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  		  ++i;
>  		}
>  
> -	      *need_escapep = 1;
> +	      *need_escapep = true;
>  	    }
>  	  break;
>  	}
> @@ -2283,7 +2283,7 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
>    enum bfd_endian byte_order
>      = type_byte_order (type);
>    gdb_byte *c_buf;
> -  int need_escape = 0;
> +  bool need_escape = false;
>  
>    c_buf = (gdb_byte *) alloca (TYPE_LENGTH (type));
>    pack_long (c_buf, type, c);
> @@ -2448,7 +2448,7 @@ print_converted_chars_to_obstack (struct obstack *obstack,
>    const converted_character *elem;
>    enum {START, SINGLE, REPEAT, INCOMPLETE, FINISH} state, last;
>    gdb_wchar_t wide_quote_char = gdb_btowc (quote_char);
> -  int need_escape = 0;
> +  bool need_escape = false;
>  
>    /* Set the start state.  */
>    idx = 0;
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 03/18] Remove c_emit_char
  2022-02-16 13:55 ` [PATCH 03/18] Remove c_emit_char Tom Tromey
@ 2022-02-16 15:40   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 15:40 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:03 -0700]:

> This renames c_emit_char, removing a layer of indirection.

LGTM.

Thanks,
Andrew


> ---
>  gdb/c-lang.c   | 4 ++--
>  gdb/c-lang.h   | 3 ---
>  gdb/language.c | 9 ---------
>  3 files changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/gdb/c-lang.c b/gdb/c-lang.c
> index 1f7cac7bef1..342109c94ef 100644
> --- a/gdb/c-lang.c
> +++ b/gdb/c-lang.c
> @@ -144,8 +144,8 @@ classify_type (struct type *elttype, struct gdbarch *gdbarch,
>     for printing characters and strings is language specific.  */
>  
>  void
> -c_emit_char (int c, struct type *type,
> -	     struct ui_file *stream, int quoter)
> +language_defn::emitchar (int c, struct type *type,
> +			 struct ui_file *stream, int quoter) const
>  {
>    const char *encoding;
>  
> diff --git a/gdb/c-lang.h b/gdb/c-lang.h
> index 46e562df055..16c9b116393 100644
> --- a/gdb/c-lang.h
> +++ b/gdb/c-lang.h
> @@ -103,9 +103,6 @@ extern void c_printstr (struct ui_file * stream,
>  extern void c_language_arch_info (struct gdbarch *gdbarch,
>  				  struct language_arch_info *lai);
>  
> -extern void c_emit_char (int c, struct type *type,
> -			 struct ui_file *stream, int quoter);
> -
>  /* These are in c-typeprint.c: */
>  
>  extern void c_type_print_base (struct type *, struct ui_file *,
> diff --git a/gdb/language.c b/gdb/language.c
> index 69c73b0318e..931abcd3076 100644
> --- a/gdb/language.c
> +++ b/gdb/language.c
> @@ -635,15 +635,6 @@ language_defn::value_print_inner
>  
>  /* See language.h.  */
>  
> -void
> -language_defn::emitchar (int ch, struct type *chtype,
> -			 struct ui_file * stream, int quoter) const
> -{
> -  c_emit_char (ch, chtype, stream, quoter);
> -}
> -
> -/* See language.h.  */
> -
>  void
>  language_defn::printstr (struct ui_file *stream, struct type *elttype,
>  			 const gdb_byte *string, unsigned int length,
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 04/18] Remove c_printstr
  2022-02-16 13:55 ` [PATCH 04/18] Remove c_printstr Tom Tromey
@ 2022-02-16 15:46   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 15:46 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:04 -0700]:

> This renames c_printstr, removing a layer of indirection.
> ---
>  gdb/c-lang.c    |  8 ++++----
>  gdb/c-lang.h    |  8 --------
>  gdb/language.c  | 12 ------------
>  gdb/rust-lang.c |  5 +++--
>  4 files changed, 7 insertions(+), 26 deletions(-)
> 
> diff --git a/gdb/c-lang.c b/gdb/c-lang.c
> index 342109c94ef..fbbecb696b9 100644
> --- a/gdb/c-lang.c
> +++ b/gdb/c-lang.c
> @@ -190,10 +190,10 @@ language_defn::printchar (int c, struct type *type,
>     characters, or if FORCE_ELLIPSES.  */
>  
>  void
> -c_printstr (struct ui_file *stream, struct type *type, 
> -	    const gdb_byte *string, unsigned int length, 
> -	    const char *user_encoding, int force_ellipses,
> -	    const struct value_print_options *options)
> +language_defn::printstr (struct ui_file *stream, struct type *type, 
> +			 const gdb_byte *string, unsigned int length, 

I know you didn't add them, but there's a trailing whitespace on both
the preceding lines, would you mind fixing these please?

> +			 const char *user_encoding, int force_ellipses,
> +			 const struct value_print_options *options) const
>  {
>    c_string_type str_type;
>    const char *type_encoding;
> diff --git a/gdb/c-lang.h b/gdb/c-lang.h
> index 16c9b116393..5441bfe10c7 100644
> --- a/gdb/c-lang.h
> +++ b/gdb/c-lang.h
> @@ -92,14 +92,6 @@ extern void c_value_print (struct value *, struct ui_file *,
>  
>  extern void c_printchar (int, struct type *, struct ui_file *);
>  
> -extern void c_printstr (struct ui_file * stream,
> -			struct type *elttype,
> -			const gdb_byte *string,
> -			unsigned int length,
> -			const char *user_encoding,
> -			int force_ellipses,
> -			const struct value_print_options *options);
> -
>  extern void c_language_arch_info (struct gdbarch *gdbarch,
>  				  struct language_arch_info *lai);
>  
> diff --git a/gdb/language.c b/gdb/language.c
> index 931abcd3076..20b6d8ccf9b 100644
> --- a/gdb/language.c
> +++ b/gdb/language.c
> @@ -635,18 +635,6 @@ language_defn::value_print_inner
>  
>  /* See language.h.  */
>  
> -void
> -language_defn::printstr (struct ui_file *stream, struct type *elttype,
> -			 const gdb_byte *string, unsigned int length,
> -			 const char *encoding, int force_ellipses,
> -			 const struct value_print_options *options) const
> -{
> -  c_printstr (stream, elttype, string, length, encoding, force_ellipses,
> -	      options);
> -}
> -
> -/* See language.h.  */
> -
>  void
>  language_defn::print_typedef (struct type *type, struct symbol *new_symbol,
>  			      struct ui_file *stream) const
> diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c
> index 7584d2572fa..3668d2d0c6d 100644
> --- a/gdb/rust-lang.c
> +++ b/gdb/rust-lang.c
> @@ -283,8 +283,9 @@ rust_language::printstr (struct ui_file *stream, struct type *type,
>  	{
>  	  /* This is probably some C string, so let's let C deal with
>  	     it.  */
> -	  c_printstr (stream, type, string, length, user_encoding,
> -		      force_ellipses, options);
> +	  this->language_defn::printstr (stream, type, string, length,
> +					 user_encoding, force_ellipses,
> +					 options);

Personally, not a fan of the 'this->' when we're already using the
parent class name.  In ada_language::read_var_value we don't use
'this->'.  But I think it's up to you really, I'm not that offended by
it.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 05/18] Don't use wchar_printable in print_wchar
  2022-02-16 13:55 ` [PATCH 05/18] Don't use wchar_printable in print_wchar Tom Tromey
@ 2022-02-16 15:52   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 15:52 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:05 -0700]:

> print_wchar uses wchar_printable, but this isn't needed -- all the
> relevant cases are already handled by the 'switch'.  This changes the
> code to use gdb_iswprint, and removes a somewhat confusing comment
> related to this code.

LGTM.

Thanks,
Andrew


> ---
>  gdb/valprint.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/gdb/valprint.c b/gdb/valprint.c
> index e758c1d1066..17ad46c87b5 100644
> --- a/gdb/valprint.c
> +++ b/gdb/valprint.c
> @@ -2197,9 +2197,6 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  
>    *need_escapep = false;
>  
> -  /* iswprint implementation on Windows returns 1 for tab character.
> -     In order to avoid different printout on this host, we explicitly
> -     use wchar_printable function.  */
>    switch (w)
>      {
>        case LCST ('\a'):
> @@ -2225,9 +2222,9 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  	break;
>        default:
>  	{
> -	  if (wchar_printable (w) && (!need_escape || (!gdb_iswdigit (w)
> -						       && w != LCST ('8')
> -						       && w != LCST ('9'))))
> +	  if (gdb_iswprint (w) && (!need_escape || (!gdb_iswdigit (w)
> +						    && w != LCST ('8')
> +						    && w != LCST ('9'))))
>  	    {
>  	      gdb_wchar_t wchar = w;
>  
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 06/18] Fix a latent bug in print_wchar
  2022-02-16 13:55 ` [PATCH 06/18] Fix a latent bug " Tom Tromey
@ 2022-02-16 16:02   ` Andrew Burgess via Gdb-patches
  2022-02-17 22:02     ` Tom Tromey
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 16:02 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:06 -0700]:

> print_wchar keeps track of when escape sequences are emitted, to force
> an escape sequence if needed by a subsequent character.  For example
> for the string concatenation "\0" "1", gdb will print "\000\061" --
> because printing "\0001" might be confusing.
> 
> However, this code has a logic error.  It's intended to reject '8' and
> '9', as they are not octal digits, but it fails to do so.  This patch
> fixes the bug, and slightly rewrites the expression to be a bit more
> clear.
> 
> I suspect that, actually, this code can be removed entirely, because
> octal sequences are limited to 3 digits.  However, I'm not certain, so
> I've left it in place.

The change looks good, I have just one question...

> ---
>  gdb/testsuite/gdb.base/charset.exp | 4 ++++
>  gdb/valprint.c                     | 7 ++++---
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/gdb/testsuite/gdb.base/charset.exp b/gdb/testsuite/gdb.base/charset.exp
> index 5df2ec1a8de..97d49b9c461 100644
> --- a/gdb/testsuite/gdb.base/charset.exp
> +++ b/gdb/testsuite/gdb.base/charset.exp
> @@ -503,6 +503,10 @@ gdb_test "print '\\9'" " = \[0-9\]+ '9'"
>  # An octal escape can only be 3 digits.
>  gdb_test "print \"\\1011\"" " = \"A1\""
>  
> +# Neither 8 nor 9 need escaping in this situation.
> +gdb_test "print \"\\0\" \"8\"" " = \"\\\\0008\""
> +gdb_test "print \"\\0\" \"9\"" " = \"\\\\0009\""

Do we actually test the 0 -> 7 cases anywhere?  I assume if we did
you've have added the new tests next to that.  So, would it not be a
good idea to add tests to cover the 0 -> 7 cases too?

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/18] Remove language_defn::emitchar
  2022-02-16 13:55 ` [PATCH 07/18] Remove language_defn::emitchar Tom Tromey
@ 2022-02-16 16:12   ` Andrew Burgess via Gdb-patches
  2022-02-17 22:02     ` Tom Tromey
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 16:12 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:07 -0700]:

> Nothing outside of the specific language implementations ever calls
> language_defn::emitchar.  This patch removes this method and updates
> the rest of the code.  In some spots, the method is entirely removed;
> in others, just the 'override' is removed.

This looks good.  My only feedback would be that for pascal_language
and m2_language, it would be nice if the emitchar function was made
private.  Maybe that would be better done in a separate commit though,
so that the diff in this one is smaller, and focuses just on the core
change.

Also, I've not looked ahead yet, so maybe this problem is already
solved by a later commit.

Thanks,
Andrew


> ---
>  gdb/ada-lang.c  |  8 --------
>  gdb/c-lang.c    | 19 +++----------------
>  gdb/f-lang.h    | 12 ++----------
>  gdb/language.c  |  9 ---------
>  gdb/language.h  |  6 ------
>  gdb/m2-lang.h   |  2 +-
>  gdb/p-lang.h    |  2 +-
>  gdb/rust-lang.c | 10 ++++++----
>  gdb/rust-lang.h | 12 +-----------
>  9 files changed, 14 insertions(+), 66 deletions(-)
> 
> diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
> index d2f620cbb04..e3bd1fa7c17 100644
> --- a/gdb/ada-lang.c
> +++ b/gdb/ada-lang.c
> @@ -13317,14 +13317,6 @@ class ada_language : public language_defn
>  
>    /* See language.h.  */
>  
> -  void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override
> -  {
> -    ada_emit_char (ch, chtype, stream, quoter, 1);
> -  }
> -
> -  /* See language.h.  */
> -
>    void printchar (int ch, struct type *chtype,
>  		  struct ui_file *stream) const override
>    {
> diff --git a/gdb/c-lang.c b/gdb/c-lang.c
> index fbbecb696b9..1a6e6969a84 100644
> --- a/gdb/c-lang.c
> +++ b/gdb/c-lang.c
> @@ -139,20 +139,6 @@ classify_type (struct type *elttype, struct gdbarch *gdbarch,
>    return result;
>  }
>  
> -/* Print the character C on STREAM as part of the contents of a
> -   literal string whose delimiter is QUOTER.  Note that that format
> -   for printing characters and strings is language specific.  */
> -
> -void
> -language_defn::emitchar (int c, struct type *type,
> -			 struct ui_file *stream, int quoter) const
> -{
> -  const char *encoding;
> -
> -  classify_type (type, type->arch (), &encoding);
> -  generic_emit_char (c, type, stream, quoter, encoding);
> -}
> -
>  /* See language.h.  */
>  
>  void
> @@ -160,8 +146,9 @@ language_defn::printchar (int c, struct type *type,
>  			  struct ui_file * stream) const
>  {
>    c_string_type str_type;
> +  const char *encoding;
>  
> -  str_type = classify_type (type, type->arch (), NULL);
> +  str_type = classify_type (type, type->arch (), &encoding);
>    switch (str_type)
>      {
>      case C_CHAR:
> @@ -178,7 +165,7 @@ language_defn::printchar (int c, struct type *type,
>      }
>  
>    fputc_filtered ('\'', stream);
> -  emitchar (c, type, stream, '\'');
> +  generic_emit_char (c, type, stream, '\'', encoding);
>    fputc_filtered ('\'', stream);
>  }
>  
> diff --git a/gdb/f-lang.h b/gdb/f-lang.h
> index 11debd5569f..388c832dcdb 100644
> --- a/gdb/f-lang.h
> +++ b/gdb/f-lang.h
> @@ -149,20 +149,12 @@ class f_language : public language_defn
>  
>    /* See language.h.  */
>  
> -  void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override
> -  {
> -    const char *encoding = get_encoding (chtype);
> -    generic_emit_char (ch, chtype, stream, quoter, encoding);
> -  }
> -
> -  /* See language.h.  */
> -
>    void printchar (int ch, struct type *chtype,
>  		  struct ui_file *stream) const override
>    {
>      fputs_filtered ("'", stream);
> -    emitchar (ch, chtype, stream, '\'');
> +    const char *encoding = get_encoding (chtype);
> +    generic_emit_char (ch, chtype, stream, '\'', encoding);
>      fputs_filtered ("'", stream);
>    }
>  
> diff --git a/gdb/language.c b/gdb/language.c
> index 20b6d8ccf9b..dff9f8bdaf9 100644
> --- a/gdb/language.c
> +++ b/gdb/language.c
> @@ -786,15 +786,6 @@ class auto_or_unknown_language : public language_defn
>  
>    /* See language.h.  */
>  
> -  void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override
> -  {
> -    error (_("emit character not implemented for language \"%s\""),
> -	   natural_name ());
> -  }
> -
> -  /* See language.h.  */
> -
>    void printchar (int ch, struct type *chtype,
>  		  struct ui_file *stream) const override
>    {
> diff --git a/gdb/language.h b/gdb/language.h
> index f2885000259..2820eca0050 100644
> --- a/gdb/language.h
> +++ b/gdb/language.h
> @@ -524,12 +524,6 @@ struct language_defn
>  
>    virtual int parser (struct parser_state *ps) const;
>  
> -  /* Print the character CH (of type CHTYPE) on STREAM as part of the
> -     contents of a literal string whose delimiter is QUOTER.  */
> -
> -  virtual void emitchar (int ch, struct type *chtype,
> -			 struct ui_file *stream, int quoter) const;
> -
>    virtual void printchar (int ch, struct type *chtype,
>  			  struct ui_file * stream) const;
>  
> diff --git a/gdb/m2-lang.h b/gdb/m2-lang.h
> index 86a093e5f1b..ee2fc6fb8bc 100644
> --- a/gdb/m2-lang.h
> +++ b/gdb/m2-lang.h
> @@ -93,7 +93,7 @@ class m2_language : public language_defn
>    /* See language.h.  */
>  
>    void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override;
> +		 struct ui_file *stream, int quoter) const;
>  
>    /* See language.h.  */
>  
> diff --git a/gdb/p-lang.h b/gdb/p-lang.h
> index 429ef23aea4..b831bde0af6 100644
> --- a/gdb/p-lang.h
> +++ b/gdb/p-lang.h
> @@ -110,7 +110,7 @@ class pascal_language : public language_defn
>    /* See language.h.  */
>  
>    void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override
> +		 struct ui_file *stream, int quoter) const
>    {
>      int in_quotes = 0;
>  
> diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c
> index 3668d2d0c6d..41e973797d2 100644
> --- a/gdb/rust-lang.c
> +++ b/gdb/rust-lang.c
> @@ -1592,13 +1592,14 @@ rust_language::print_type (struct type *type, const char *varstring,
>  /* See language.h.  */
>  
>  void
> -rust_language::emitchar (int ch, struct type *chtype,
> -			 struct ui_file *stream, int quoter) const
> +rust_language::printchar (int ch, struct type *chtype,
> +			  struct ui_file *stream) const
>  {
> +  fputs_filtered ("'", stream);
>    if (!rust_chartype_p (chtype))
> -    generic_emit_char (ch, chtype, stream, quoter,
> +    generic_emit_char (ch, chtype, stream, '\'',
>  		       target_charset (chtype->arch ()));
> -  else if (ch == '\\' || ch == quoter)
> +  else if (ch == '\\')
>      fprintf_filtered (stream, "\\%c", ch);
>    else if (ch == '\n')
>      fputs_filtered ("\\n", stream);
> @@ -1614,6 +1615,7 @@ rust_language::emitchar (int ch, struct type *chtype,
>      fprintf_filtered (stream, "\\x%02x", ch);
>    else
>      fprintf_filtered (stream, "\\u{%06x}", ch);
> +  fputs_filtered ("'", stream);
>  }
>  
>  /* See language.h.  */
> diff --git a/gdb/rust-lang.h b/gdb/rust-lang.h
> index 60a1967ff45..1db83fe75ae 100644
> --- a/gdb/rust-lang.h
> +++ b/gdb/rust-lang.h
> @@ -170,18 +170,8 @@ class rust_language : public language_defn
>  
>    /* See language.h.  */
>  
> -  void emitchar (int ch, struct type *chtype,
> -		 struct ui_file *stream, int quoter) const override;
> -
> -  /* See language.h.  */
> -
>    void printchar (int ch, struct type *chtype,
> -		  struct ui_file *stream) const override
> -  {
> -    fputs_filtered ("'", stream);
> -    emitchar (ch, chtype, stream, '\'');
> -    fputs_filtered ("'", stream);
> -  }
> +		  struct ui_file *stream) const override;
>  
>    /* See language.h.  */
>  
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 08/18] Add gdb_iswcntrl
  2022-02-16 13:55 ` [PATCH 08/18] Add gdb_iswcntrl Tom Tromey
@ 2022-02-16 16:13   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 16:13 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:08 -0700]:

> A future patch will need 'iswcntrl', so introduce the appropriate
> wrappers in gdb_wchar.h.

LGTM.

Thanks,
Andrew

> ---
>  gdb/gdb_wchar.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/gdb/gdb_wchar.h b/gdb/gdb_wchar.h
> index ba5baf3a2a0..78e880c8f18 100644
> --- a/gdb/gdb_wchar.h
> +++ b/gdb/gdb_wchar.h
> @@ -67,6 +67,7 @@ typedef wint_t gdb_wint_t;
>  #define gdb_wcslen wcslen
>  #define gdb_iswprint iswprint
>  #define gdb_iswdigit iswdigit
> +#define gdb_iswcntrl iswcntrl
>  #define gdb_btowc btowc
>  #define gdb_WEOF WEOF
>  
> @@ -104,6 +105,7 @@ typedef int gdb_wint_t;
>  #define gdb_wcslen strlen
>  #define gdb_iswprint isprint
>  #define gdb_iswdigit isdigit
> +#define gdb_iswcntrl iscntrl
>  #define gdb_btowc /* empty */
>  #define gdb_WEOF EOF
>  
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 09/18] Include \0 in printable wide characters
  2022-02-16 13:55 ` [PATCH 09/18] Include \0 in printable wide characters Tom Tromey
@ 2022-02-16 17:19   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 17:19 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:09 -0700]:

> print_wchar can always display \0, so include it in the list of

maybe s/always/already/ ?

> "printable" characters in wchar_printable.  This is useful in a coming
> patch to change Rust to use the generic character-printing code.

Otherwise, LGTM.

Thanks,
Andrew


> ---
>  gdb/valprint.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gdb/valprint.c b/gdb/valprint.c
> index 7e67cb953f4..dec1dce9d3c 100644
> --- a/gdb/valprint.c
> +++ b/gdb/valprint.c
> @@ -2160,7 +2160,7 @@ wchar_printable (gdb_wchar_t w)
>  	  || w == LCST ('\a') || w == LCST ('\b')
>  	  || w == LCST ('\f') || w == LCST ('\n')
>  	  || w == LCST ('\r') || w == LCST ('\t')
> -	  || w == LCST ('\v'));
> +	  || w == LCST ('\v') || w == LCST ('\0'));
>  }
>  
>  /* A helper function that converts the contents of STRING to wide
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/18] Use a ui_file in print_wchar
  2022-02-16 13:55 ` [PATCH 10/18] Use a ui_file in print_wchar Tom Tromey
@ 2022-02-16 17:25   ` Andrew Burgess via Gdb-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 17:25 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:10 -0700]:

> This introduces a new ui_file subclass, that turns a host string into
> a wide string and appends it to an obstack.  print_wchar is rewritten
> to use this new ui_file.  This will be more useful in a later patch.

LGTM.

Thanks,
Andrew

> ---
>  gdb/valprint.c | 49 +++++++++++++++++++++++++++----------------------
>  1 file changed, 27 insertions(+), 22 deletions(-)
> 
> diff --git a/gdb/valprint.c b/gdb/valprint.c
> index dec1dce9d3c..00c0cd2c72a 100644
> --- a/gdb/valprint.c
> +++ b/gdb/valprint.c
> @@ -2163,19 +2163,29 @@ wchar_printable (gdb_wchar_t w)
>  	  || w == LCST ('\v') || w == LCST ('\0'));
>  }
>  
> -/* A helper function that converts the contents of STRING to wide
> -   characters and then appends them to OUTPUT.  */
> -
> -static void
> -append_string_as_wide (const char *string,
> -		       struct obstack *output)
> -{
> -  for (; *string; ++string)
> -    {
> -      gdb_wchar_t w = gdb_btowc (*string);
> -      obstack_grow (output, &w, sizeof (gdb_wchar_t));
> -    }
> -}
> +/* A ui_file that writes wide characters to an obstack.  */
> +class obstack_wide_file : public ui_file
> +{
> +public:
> +  explicit obstack_wide_file (struct obstack *output)
> +    : m_output (output)
> +  {
> +  }
> +
> +  ~obstack_wide_file () = default;
> +
> +  void write (const char *buf, long length_buf) override
> +  {
> +    for (long i = 0; i < length_buf; ++i)
> +      {
> +	gdb_wchar_t w = gdb_btowc (buf[i]);
> +	obstack_grow (m_output, &w, sizeof (gdb_wchar_t));
> +      }
> +  }
> +
> +private:
> +  struct obstack *m_output;
> +};
>  
>  /* Print a wide character W to OUTPUT.  ORIG is a pointer to the
>     original (target) bytes representing the character, ORIG_LEN is the
> @@ -2236,10 +2246,10 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  	  else
>  	    {
>  	      int i;
> +	      obstack_wide_file file (output);
>  
>  	      for (i = 0; i + width <= orig_len; i += width)
>  		{
> -		  char octal[30];
>  		  ULONGEST value;
>  
>  		  value = extract_unsigned_integer (&orig[i], width,
> @@ -2247,19 +2257,14 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  		  /* If the value fits in 3 octal digits, print it that
>  		     way.  Otherwise, print it as a hex escape.  */
>  		  if (value <= 0777)
> -		    xsnprintf (octal, sizeof (octal), "\\%.3o",
> -			       (int) (value & 0777));
> +		    fprintf_filtered (&file, "\\%.3o", (int) (value & 0777));
>  		  else
> -		    xsnprintf (octal, sizeof (octal), "\\x%lx", (long) value);
> -		  append_string_as_wide (octal, output);
> +		    fprintf_filtered (&file, "\\x%lx", (long) value);
>  		}
>  	      /* If we somehow have extra bytes, print them now.  */
>  	      while (i < orig_len)
>  		{
> -		  char octal[5];
> -
> -		  xsnprintf (octal, sizeof (octal), "\\%.3o", orig[i] & 0xff);
> -		  append_string_as_wide (octal, output);
> +		  fprintf_filtered (&file, "\\%.3o", orig[i] & 0xff);
>  		  ++i;
>  		}
>  
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char
  2022-02-16 13:55 ` [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char Tom Tromey
@ 2022-02-16 17:47   ` Andrew Burgess via Gdb-patches
  2022-02-16 20:40     ` Tom Tromey
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Burgess via Gdb-patches @ 2022-02-16 17:47 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

* Tom Tromey <tom@tromey.com> [2022-02-16 06:55:11 -0700]:

> This adds an emitter callback to generic_printstr and
> generic_emit_char, passing it through to print_wchar.  print_wchar is
> modified to call it, if possible.  This will be used to let languages
> override the way that escape sequences are emitted.  Nothing uses this
> yet, that comes later in the series.
> ---
>  gdb/valprint.c | 41 ++++++++++++++++++++++++++++-------------
>  gdb/valprint.h | 30 ++++++++++++++++++++++++++++--
>  2 files changed, 56 insertions(+), 15 deletions(-)
> 
> diff --git a/gdb/valprint.c b/gdb/valprint.c
> index 00c0cd2c72a..c0e5e678005 100644
> --- a/gdb/valprint.c
> +++ b/gdb/valprint.c
> @@ -2201,12 +2201,19 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  	     int orig_len, int width,
>  	     enum bfd_endian byte_order,
>  	     struct obstack *output,
> -	     int quoter, bool *need_escapep)
> +	     int quoter, bool *need_escapep,
> +	     emit_char_ftype emitter)
>  {
>    bool need_escape = *need_escapep;
>  
>    *need_escapep = false;
>  
> +  obstack_wide_file file (output);
> +  if (emitter != nullptr
> +      && emitter (&file, w, gdb::make_array_view (orig, orig_len),
> +		  width, byte_order, quoter))

Rather than having the emitter called sometimes.  Did you consider
moving everything below this point into a separate function, say,
default_emit_char, then change the declarations to something like:

  extern void generic_emit_char (int c, struct type *type, struct ui_file *stream,
  			       int quoter, const char *encoding,
  			       emit_char_ftype emitter = default_emit_char);

Your emitter signature would need updating to not return a bool.

Language specific implementation would then do their own thing and/or
call default_emit_char as they saw fit.

I started prototyping this, to see what it might look like, and the
problem I ran into is that your emitter doesn't pass through the
need_escapep argument, which I think is a mistake.

As I understand it, the need_escapep allows us to handle the case you
fixed in patch #6 - printing things like "\0" "1" as "\000\061" - but
if this emitter is need to modify how escape sequences are printed,
then surely we're going to need to know this information, right?  I
haven't looked ahead yet, but, surely, there's at least the
possibility that a language might need to track if the last character
was an escape or not so it can avoid the same sort of issues?  Or what
if the language prints something as an escape, and then the default
code, for the next character, prints something that might merge with
the escape?

Thanks,
Andrew

> +    return;
> +
>    switch (w)
>      {
>        case LCST ('\a'):
> @@ -2246,7 +2253,6 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  	  else
>  	    {
>  	      int i;
> -	      obstack_wide_file file (output);
>  
>  	      for (i = 0; i + width <= orig_len; i += width)
>  		{
> @@ -2281,7 +2287,8 @@ print_wchar (gdb_wint_t w, const gdb_byte *orig,
>  
>  void
>  generic_emit_char (int c, struct type *type, struct ui_file *stream,
> -		   int quoter, const char *encoding)
> +		   int quoter, const char *encoding,
> +		   emit_char_ftype emitter)
>  {
>    enum bfd_endian byte_order
>      = type_byte_order (type);
> @@ -2330,14 +2337,16 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
>  	      for (i = 0; i < num_chars; ++i)
>  		print_wchar (chars[i], buf, buflen,
>  			     TYPE_LENGTH (type), byte_order,
> -			     &wchar_buf, quoter, &need_escape);
> +			     &wchar_buf, quoter, &need_escape,
> +			     emitter);
>  	    }
>  	}
>  
>        /* This handles the NUM_CHARS == 0 case as well.  */
>        if (print_escape)
>  	print_wchar (gdb_WEOF, buf, buflen, TYPE_LENGTH (type),
> -		     byte_order, &wchar_buf, quoter, &need_escape);
> +		     byte_order, &wchar_buf, quoter, &need_escape,
> +		     emitter);
>      }
>  
>    /* The output in the host encoding.  */
> @@ -2445,7 +2454,8 @@ print_converted_chars_to_obstack (struct obstack *obstack,
>  				  const std::vector<converted_character> &chars,
>  				  int quote_char, int width,
>  				  enum bfd_endian byte_order,
> -				  const struct value_print_options *options)
> +				  const struct value_print_options *options,
> +				  emit_char_ftype emitter)
>  {
>    unsigned int idx;
>    const converted_character *elem;
> @@ -2486,10 +2496,12 @@ print_converted_chars_to_obstack (struct obstack *obstack,
>  	      {
>  		if (elem->result == wchar_iterate_ok)
>  		  print_wchar (elem->chars[0], elem->buf, elem->buflen, width,
> -			       byte_order, obstack, quote_char, &need_escape);
> +			       byte_order, obstack, quote_char, &need_escape,
> +			       emitter);
>  		else
>  		  print_wchar (gdb_WEOF, elem->buf, elem->buflen, width,
> -			       byte_order, obstack, quote_char, &need_escape);
> +			       byte_order, obstack, quote_char, &need_escape,
> +			       emitter);
>  	      }
>  	  }
>  	  break;
> @@ -2514,10 +2526,12 @@ print_converted_chars_to_obstack (struct obstack *obstack,
>  	    obstack_grow_wstr (obstack, LCST ("'"));
>  	    if (elem->result == wchar_iterate_ok)
>  	      print_wchar (elem->chars[0], elem->buf, elem->buflen, width,
> -			   byte_order, obstack, quote_char, &need_escape);
> +			   byte_order, obstack, quote_char, &need_escape,
> +			   emitter);
>  	    else
>  	      print_wchar (gdb_WEOF, elem->buf, elem->buflen, width,
> -			   byte_order, obstack, quote_char, &need_escape);
> +			   byte_order, obstack, quote_char, &need_escape,
> +			   emitter);
>  	    obstack_grow_wstr (obstack, LCST ("'"));
>  	    std::string s = string_printf (_(" <repeats %u times>"),
>  					   elem->repeat_count);
> @@ -2543,7 +2557,7 @@ print_converted_chars_to_obstack (struct obstack *obstack,
>  	  /* Output the incomplete sequence string.  */
>  	  obstack_grow_wstr (obstack, LCST ("<incomplete sequence "));
>  	  print_wchar (gdb_WEOF, elem->buf, elem->buflen, width, byte_order,
> -		       obstack, 0, &need_escape);
> +		       obstack, 0, &need_escape, emitter);
>  	  obstack_grow_wstr (obstack, LCST (">"));
>  
>  	  /* We do not attempt to output anything after this.  */
> @@ -2602,7 +2616,8 @@ generic_printstr (struct ui_file *stream, struct type *type,
>  		  const gdb_byte *string, unsigned int length, 
>  		  const char *encoding, int force_ellipses,
>  		  int quote_char, int c_style_terminator,
> -		  const struct value_print_options *options)
> +		  const struct value_print_options *options,
> +		  emit_char_ftype emitter)
>  {
>    enum bfd_endian byte_order = type_byte_order (type);
>    unsigned int i;
> @@ -2678,7 +2693,7 @@ generic_printstr (struct ui_file *stream, struct type *type,
>  
>    /* Print the output string to the obstack.  */
>    print_converted_chars_to_obstack (&wchar_buf, converted_chars, quote_char,
> -				    width, byte_order, options);
> +				    width, byte_order, options, emitter);
>  
>    if (force_ellipses || !finished)
>      obstack_grow_wstr (&wchar_buf, LCST ("..."));
> diff --git a/gdb/valprint.h b/gdb/valprint.h
> index 0586836f9e6..64fea1ccb4a 100644
> --- a/gdb/valprint.h
> +++ b/gdb/valprint.h
> @@ -233,14 +233,40 @@ extern void generic_value_print (struct value *val, struct ui_file *stream,
>  				 const struct value_print_options *options,
>  				 const struct generic_val_print_decorations *d);
>  
> +/* A callback that can be used to print a representation of a wide
> +   character to a stream.
> +   
> +   If the character can be represented by this callback, it will
> +   return true.  A false return indicates that the default behavior
> +   should be taken -- for printable characters, the default is to emit
> +   it verbatim; for non-printable characters, C-style escapes are
> +   used.  Normally a callback should always return false for
> +   printable, non-control characters.
> +
> +   STREAM is the stream to write to.
> +   W is the character.  It might be gdb_WEOF, meaning an unconvertible
> +   sequence.
> +   ORIG is the original (target) bytes corresponding to W.
> +   WIDTH is the width of a base character in the encoding.
> +   BYTE_ORDER is the character type's byte order.
> +   QUOTER is the quote character used -- this is a host character.  */
> +typedef gdb::function_view<bool (ui_file *stream,
> +				 gdb_wint_t w,
> +				 gdb::array_view<const gdb_byte> orig,
> +				 int width,
> +				 enum bfd_endian byte_order,
> +				 int quoter)> emit_char_ftype;
> +
>  extern void generic_emit_char (int c, struct type *type, struct ui_file *stream,
> -			       int quoter, const char *encoding);
> +			       int quoter, const char *encoding,
> +			       emit_char_ftype emitter = nullptr);
>  
>  extern void generic_printstr (struct ui_file *stream, struct type *type, 
>  			      const gdb_byte *string, unsigned int length, 
>  			      const char *encoding, int force_ellipses,
>  			      int quote_char, int c_style_terminator,
> -			      const struct value_print_options *options);
> +			      const struct value_print_options *options,
> +			      emit_char_ftype emitter = nullptr);
>  
>  /* Run the "output" command.  ARGS and FROM_TTY are the usual
>     arguments passed to all command implementations, except ARGS is
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char
  2022-02-16 17:47   ` Andrew Burgess via Gdb-patches
@ 2022-02-16 20:40     ` Tom Tromey
  2022-02-16 21:00       ` Tom Tromey
  0 siblings, 1 reply; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 20:40 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Tom Tromey, gdb-patches

Andrew> Rather than having the emitter called sometimes.  Did you consider
Andrew> moving everything below this point into a separate function, say,
Andrew> default_emit_char

Nope.

Andrew> I started prototyping this, to see what it might look like, and the
Andrew> problem I ran into is that your emitter doesn't pass through the
Andrew> need_escapep argument, which I think is a mistake.

Actually, I think we probably can just remove this code.
I don't think it's needed, because gdb always emits 3 octal digits, and
the C parser at least limits octal escapes to 3 digits as well.

Andrew> I
Andrew> haven't looked ahead yet, but, surely, there's at least the
Andrew> possibility that a language might need to track if the last character
Andrew> was an escape or not so it can avoid the same sort of issues?  Or what
Andrew> if the language prints something as an escape, and then the default
Andrew> code, for the next character, prints something that might merge with
Andrew> the escape?

If a language has something like this, it's easy to handle by having the
callback be a closure.

I'll remove the need_escape stuff in v2 and change the character emitter
as you suggest.

Tom

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char
  2022-02-16 20:40     ` Tom Tromey
@ 2022-02-16 21:00       ` Tom Tromey
  0 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-16 21:00 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Andrew Burgess, gdb-patches

Tom> Actually, I think we probably can just remove this code.
Tom> I don't think it's needed, because gdb always emits 3 octal digits, and
Tom> the C parser at least limits octal escapes to 3 digits as well.

So the problem is not with octal at all, but actually with hex, because
with hex escapes there isn't a digit limit.  This means the current
behavior is actually wrong:

(gdb) print L"\xfffe" "f"
$1 = L"\xfffef"

I think this should probably print L"\xfffe\x66", though maybe string
concatenation would be prettier.

Anyway, while I did find a spot with a bug, I "found" the wrong bug.

Tom

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 06/18] Fix a latent bug in print_wchar
  2022-02-16 16:02   ` Andrew Burgess via Gdb-patches
@ 2022-02-17 22:02     ` Tom Tromey
  0 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-17 22:02 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Tom Tromey, gdb-patches

>> +# Neither 8 nor 9 need escaping in this situation.
>> +gdb_test "print \"\\0\" \"8\"" " = \"\\\\0008\""
>> +gdb_test "print \"\\0\" \"9\"" " = \"\\\\0009\""

Andrew> Do we actually test the 0 -> 7 cases anywhere?  I assume if we did
Andrew> you've have added the new tests next to that.  So, would it not be a
Andrew> good idea to add tests to cover the 0 -> 7 cases too?

v2 checks all the hex digits here.

Tom

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/18] Remove language_defn::emitchar
  2022-02-16 16:12   ` Andrew Burgess via Gdb-patches
@ 2022-02-17 22:02     ` Tom Tromey
  0 siblings, 0 replies; 34+ messages in thread
From: Tom Tromey @ 2022-02-17 22:02 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Tom Tromey, gdb-patches

>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:

>> Nothing outside of the specific language implementations ever calls
>> language_defn::emitchar.  This patch removes this method and updates
>> the rest of the code.  In some spots, the method is entirely removed;
>> in others, just the 'override' is removed.

Andrew> This looks good.  My only feedback would be that for pascal_language
Andrew> and m2_language, it would be nice if the emitchar function was made
Andrew> private.  Maybe that would be better done in a separate commit though,
Andrew> so that the diff in this one is smaller, and focuses just on the core
Andrew> change.

Andrew> Also, I've not looked ahead yet, so maybe this problem is already
Andrew> solved by a later commit.

Yeah, this code is all deleted by the end of the series.

Tom

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2022-02-17 22:02 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-16 13:55 [PATCH 00/18] Refactor character printing Tom Tromey
2022-02-16 13:55 ` [PATCH 01/18] Fix latent quote char bug in generic_printstr Tom Tromey
2022-02-16 15:38   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 02/18] Boolify need_escape in generic_emit_char Tom Tromey
2022-02-16 15:39   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 03/18] Remove c_emit_char Tom Tromey
2022-02-16 15:40   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 04/18] Remove c_printstr Tom Tromey
2022-02-16 15:46   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 05/18] Don't use wchar_printable in print_wchar Tom Tromey
2022-02-16 15:52   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 06/18] Fix a latent bug " Tom Tromey
2022-02-16 16:02   ` Andrew Burgess via Gdb-patches
2022-02-17 22:02     ` Tom Tromey
2022-02-16 13:55 ` [PATCH 07/18] Remove language_defn::emitchar Tom Tromey
2022-02-16 16:12   ` Andrew Burgess via Gdb-patches
2022-02-17 22:02     ` Tom Tromey
2022-02-16 13:55 ` [PATCH 08/18] Add gdb_iswcntrl Tom Tromey
2022-02-16 16:13   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 09/18] Include \0 in printable wide characters Tom Tromey
2022-02-16 17:19   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 10/18] Use a ui_file in print_wchar Tom Tromey
2022-02-16 17:25   ` Andrew Burgess via Gdb-patches
2022-02-16 13:55 ` [PATCH 11/18] Add an emitter callback to generic_printstr and generic_emit_char Tom Tromey
2022-02-16 17:47   ` Andrew Burgess via Gdb-patches
2022-02-16 20:40     ` Tom Tromey
2022-02-16 21:00       ` Tom Tromey
2022-02-16 13:55 ` [PATCH 12/18] Add a default encoding to generic_emit_char and generic_printstr Tom Tromey
2022-02-16 13:55 ` [PATCH 13/18] Change generic_emit_char to print the quotes Tom Tromey
2022-02-16 13:55 ` [PATCH 14/18] Use generic_emit_char in Rust Tom Tromey
2022-02-16 13:55 ` [PATCH 15/18] Use generic_emit_char in Ada Tom Tromey
2022-02-16 13:55 ` [PATCH 16/18] Use generic_emit_char in Modula-2 Tom Tromey
2022-02-16 13:55 ` [PATCH 17/18] Use generic_emit_char in Pascal Tom Tromey
2022-02-16 13:55 ` [PATCH 18/18] Simplify Fortran string printing Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox