From: Tom de Vries <tdevries@suse.de>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH v3] [gdb/tui] Handle unicode chars in prompt
Date: Sun, 6 Jul 2025 09:47:54 +0200
Message-ID: <b89ba668-b3f2-46b2-833a-87fb2fcbfe5f@suse.de>
In-Reply-To: <20250706074003.6476-1-tdevries@suse.de>

On 7/6/25 09:40, Tom de Vries wrote:
> Let's try to set the prompt using a unicode character, say '❯', aka U+276F
> (heavy right-pointing angle quotation mark ornament).
>
> This works fine on an xterm with CLI (with X marking the position of the
> blinking cursor):
> ...
> $ gdb -q -ex "set prompt GDB❯ "
> GDB❯ X
> ...
> but with TUI:
> ...
> $ gdb -q -tui -ex "set prompt GDB❯ "
> ...
> we get instead:
> ...
> GDB GDB X
> ...
>
> We can use the test-case gdb.tui/unicode-prompt.exp to get more details, using
> tuiterm.
>
> With Term::dump_screen we have:
> ...
> 16 (gdb) set prompt GDB❯
> 17 GDB❯ GDB❯ GDB❯ set prompt (gdb)
> 18 (gdb)
> ...
> and with Term::dump_screen_with_attrs (summarizing using attribute sets <attrs1>
> and <attrs2>):
> ...
> 16 (gdb) set prompt GDB❯
> 17 GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> set prompt (gdb)
> 18 (gdb)
> ...
> where:
> ...
> <attrs1> == <reverse:1><invisible:1><blinking:1><intensity:bold>
> <attrs2> == <reverse:0><invisible:0><blinking:0><intensity:normal>
> ...
>
> This explains why we didn't see the unicode char on xterm: it's hidden
> because the invisible attribute is set.
>
> So, there seem to be two problems:
> - the attributes are incorrect, and
> - the prompt is repeated a couple of times.
>
> In TUI, the prompt is written out by tui_puts_internal, which outputs one byte
> at a time using waddch, which apparently breaks multi-byte char support.
>
> Fix this by detecting multi-byte chars in tui_puts_internal, and printing them using
> waddnstr.
>
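
To illustrate why byte-at-a-time output is a problem, here is a minimal
standalone sketch (not part of the patch).  It assumes the host charset is
UTF-8 and uses hypothetical helper names utf8_seq_len and split_chars; the
patch itself relies on wchar_iterator and host_charset () instead, and unlike
this sketch those also validate the continuation bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length in bytes of the UTF-8 sequence whose lead
// byte is B.  Returns 1 for ASCII and for bytes that are not a valid lead
// byte, so the caller can fall back to single-byte handling.
static std::size_t
utf8_seq_len (unsigned char b)
{
  if (b < 0x80)
    return 1;			// ASCII
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  if ((b & 0xF8) == 0xF0)
    return 4;
  return 1;			// Continuation or invalid byte.
}

// Hypothetical helper: split S into one chunk per character.  A sequence
// truncated by the end of the string is emitted one byte at a time.
static std::vector<std::string>
split_chars (const std::string &s)
{
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < s.size ())
    {
      std::size_t len = utf8_seq_len (static_cast<unsigned char> (s[i]));
      if (len > s.size () - i)
	len = 1;
      assert (len >= 1 && len <= s.size () - i);
      out.push_back (s.substr (i, len));
      i += len;
    }
  return out;
}
```

With the prompt "GDB❯ " encoded as UTF-8, split_chars yields five chunks, the
fourth being the three bytes E2 9D AF of U+276F.  Passing those three bytes to
curses in a single call keeps the character intact, whereas three separate
single-byte writes do not.
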

This v3 is roughly the same as what I posted here:
https://sourceware.org/pipermail/gdb-patches/2023-June/200296.html

Changes compared to that post:
- rebased on current trunk
- updated test-case for FreeBSD
- updated test-case copyright year to 2025
- fixed formatting issue in tui-io.c

Thanks,
- Tom

> Tested on x86_64-linux and x86_64-freebsd.
>
> Reported-By: wuzy01@qq.com
>
> PR tui/28800
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28800
> ---
> gdb/charset.c | 20 +++++
> gdb/charset.h | 7 ++
> gdb/testsuite/gdb.tui/unicode-prompt.exp | 71 ++++++++++++++++
> gdb/tui/tui-io.c | 100 +++++++++++++++++++----
> 4 files changed, 183 insertions(+), 15 deletions(-)
> create mode 100644 gdb/testsuite/gdb.tui/unicode-prompt.exp
>
> diff --git a/gdb/charset.c b/gdb/charset.c
> index 259362563b2..9c0df83c0d3 100644
> --- a/gdb/charset.c
> +++ b/gdb/charset.c
> @@ -690,6 +690,26 @@ wchar_iterator::iterate (enum wchar_iterate_result *out_result,
> return -1;
> }
>
> +/* See charset.h. */
> +
> +void
> +wchar_iterator::skip (size_t len)
> +{
> + m_input += len;
> +
> + gdb_assert (len <= m_bytes);
> + m_bytes -= len;
> +}
> +
> +/* See charset.h. */
> +
> +void
> +wchar_iterator::reset (const gdb_byte *input, size_t bytes)
> +{
> + m_input = input;
> + m_bytes = bytes;
> +}
> +
> struct charset_vector
> {
> ~charset_vector ()
> diff --git a/gdb/charset.h b/gdb/charset.h
> index a0f109da5ee..4d68e61f35a 100644
> --- a/gdb/charset.h
> +++ b/gdb/charset.h
> @@ -126,6 +126,13 @@ class wchar_iterator
> int iterate (enum wchar_iterate_result *out_result, gdb_wchar_t **out_chars,
> const gdb_byte **ptr, size_t *len);
>
> + /* Increase the input buffer pointer by LEN bytes. */
> + void skip (size_t len);
> +
> + /* Reset the input buffer pointer to INPUT and the number of bytes in the
> + input buffer to BYTES. */
> + void reset (const gdb_byte *input, size_t bytes);
> +
> private:
>
> /* The underlying iconv descriptor. */
> diff --git a/gdb/testsuite/gdb.tui/unicode-prompt.exp b/gdb/testsuite/gdb.tui/unicode-prompt.exp
> new file mode 100644
> index 00000000000..ac2b6202c04
> --- /dev/null
> +++ b/gdb/testsuite/gdb.tui/unicode-prompt.exp
> @@ -0,0 +1,71 @@
> +# Copyright 2025 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +require allow_tui_tests
> +
> +tuiterm_env
> +
> +save_vars { env(LC_ALL) } {
> + # Override "C" settings from default_gdb_init.
> + setenv LC_ALL "C.UTF-8"
> +
> + Term::clean_restart 24 80
> +}
> +
> +if {![Term::enter_tui]} {
> + unsupported "TUI not supported"
> + return
> +}
> +
> +set unicode_char "\u276F"
> +set unicode_char_unsupported "❯"
> +
> +set color_on [string cat {\033} "\[" "31m"]
> +set color_off [string cat {\033} "\[" "0m"]
> +
> +set prompt "GDB$color_on$unicode_char$color_off "
> +set prompt_no_color "GDB$unicode_char "
> +set prompt_no_color_re [string_to_regexp $prompt_no_color]
> +
> +if { [ishost *-*-*bsd*] } {
> + set issue_eol "\r\n"
> +} else {
> + set issue_eol "\n"
> +}
> +
> +# Set new prompt.
> +send_gdb "set prompt $prompt$issue_eol"
> +# Set old prompt back.
> +send_gdb "set prompt (gdb) $issue_eol"
> +
> +gdb_assert \
> + { [Term::wait_for "^${prompt_no_color_re}set prompt $gdb_prompt "] } \
> + "prompt with unicode char"
> +
> +set line [Term::get_line_with_attrs [expr $Term::_cur_row - 1]]
> +verbose -log "line with attrs: '$line'"
> +
> +set prompt_with_attrs_re "GDB<fg:red>$unicode_char<fg:default> "
> +set prompt_unsupported_re "GDB$unicode_char_unsupported "
> +
> +set test "colored unicode char"
> +if { [regexp "^${prompt_with_attrs_re}set prompt .*$" $line] } {
> + pass $test
> +} elseif { [regexp "^${prompt_unsupported_re}set prompt .*$" $line] } {
> + # I get this on freebsd: no color, and unicode char not recognized.
> + unsupported $test
> +} else {
> + fail $test
> +}
> diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
> index 1b4cc82cce8..5c39e42362a 100644
> --- a/gdb/tui/tui-io.c
> +++ b/gdb/tui/tui-io.c
> @@ -45,6 +45,7 @@
> #include "gdbsupport/unordered_map.h"
> #include "pager.h"
> #include "gdbsupport/gdb-checked-static-cast.h"
> +#include "charset.h"
>
> /* This redefines CTRL if it is not already defined, so it must come
> after terminal state related include files like <term.h> and
> @@ -539,30 +540,99 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
> char c;
> int prev_col = 0;
> bool saw_nl = false;
> + size_t skip = 0;
> + wchar_iterator it ((gdb_byte *)string, strlen (string), host_charset (), 1);
>
> - while ((c = *string++) != 0)
> + while (true)
> {
> - if (c == '\1' || c == '\2')
> - {
> - /* Ignore these, they are readline escape-marking
> - sequences. */
> - continue;
> - }
> + bool handled = false;
> +
> + /* Get iterator in sync with string. */
> + it.skip (skip);
> + skip = 0;
> +
> + /* Detect and handle multibyte chars. */
> + {
> + enum wchar_iterate_result res2;
> + gdb_wchar_t *dummy1;
> + const gdb_byte *dummy2;
> + size_t len;
> + int res = it.iterate (&res2, &dummy1, &dummy2, &len);
> + if (res < 0)
> + {
> + /* End of string. */
> + gdb_assert (res2 == wchar_iterate_eof);
> + break;
> + }
> +
> + if (res == 0)
> + {
> + if (res2 == wchar_iterate_invalid)
> + {
> + /* Let single-byte char code handle it. */
> + gdb_assert (len == 1);
> + }
> + else if (res2 == wchar_iterate_incomplete)
> + {
> + /* Iterator has been setup to return end-of-string on next
> + call to iterate. Make that an advance-by-one instead, and
> + let single-byte char code handle it. */
> + it.reset ((gdb_byte *)(string + 1), strlen (string + 1));
> + }
> + else
> + gdb_assert_not_reached ("");
> + }
> + else
> + {
> + /* res > 0. */
> + gdb_assert (res2 == wchar_iterate_ok);
> + if (len > 1)
> + {
> + /* Multi-byte char. Handle it. */
> + waddnstr (w, string, len);
> + string += len;
> + handled = true;
> + }
> + else
> + {
> + /* Single-byte char. Let single-byte char code handle it. */
> + gdb_assert (len == 1);
> + }
> + }
> + }
>
> - if (c == '\033')
> + if (!handled)
> {
> - size_t bytes_read = apply_ansi_escape (w, string - 1);
> - if (bytes_read > 0)
> + c = *string++;
> + if (c == '\0')
> + {
> + /* End of string. */
> + break;
> + }
> +
> + if (c == '\1' || c == '\2')
> {
> - string = string + bytes_read - 1;
> + /* Ignore these, they are readline escape-marking
> + sequences. */
> continue;
> }
> - }
>
> - if (c == '\n')
> - saw_nl = true;
> + if (c == '\033')
> + {
> + size_t bytes_read = apply_ansi_escape (w, string - 1);
> + if (bytes_read > 0)
> + {
> + skip = bytes_read - 1;
> + string += skip;
> + continue;
> + }
> + }
> +
> + if (c == '\n')
> + saw_nl = true;
>
> - do_tui_putc (w, c);
> + do_tui_putc (w, c);
> + }
>
> if (height != nullptr)
> {
>
> base-commit: 87f5e2edca1412326ae40489e2780821093481cb
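
For readers following the tui-io.c hunk, the control flow of the reworked
loop can be modeled outside gdb roughly as below.  This is only a sketch
under stated assumptions: it hard-codes UTF-8 where the patch uses
wchar_iterator, it classifies segments instead of writing to a curses window,
and sgr_len is a hypothetical, much simplified stand-in for
apply_ansi_escape.  The names seg_kind, segment, lead_len and scan are
likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class seg_kind { plain_byte, multibyte, escape };

struct segment
{
  seg_kind kind;
  std::string text;
};

// Hypothetical stand-in for apply_ansi_escape: return the length of an
// "ESC [ <digits and ;> m" sequence starting at S[POS], or 0 if none.
static std::size_t
sgr_len (const std::string &s, std::size_t pos)
{
  if (pos + 1 >= s.size () || s[pos] != '\033' || s[pos + 1] != '[')
    return 0;
  std::size_t i = pos + 2;
  while (i < s.size () && ((s[i] >= '0' && s[i] <= '9') || s[i] == ';'))
    i++;
  return (i < s.size () && s[i] == 'm') ? i + 1 - pos : 0;
}

// UTF-8 sequence length for lead byte B, or 1 if B is not a lead byte.
static std::size_t
lead_len (unsigned char b)
{
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  if ((b & 0xF8) == 0xF0)
    return 4;
  return 1;
}

// Walk S in the same order as the patched loop: multi-byte chars first,
// then readline markers, then ANSI escapes, then plain single bytes.
static std::vector<segment>
scan (const std::string &s)
{
  std::vector<segment> out;
  std::size_t i = 0;
  while (i < s.size ())
    {
      unsigned char b = static_cast<unsigned char> (s[i]);
      std::size_t len = lead_len (b);
      if (len > 1 && len <= s.size () - i)
	{
	  // Multi-byte char: emit it whole (waddnstr in the patch).
	  out.push_back ({seg_kind::multibyte, s.substr (i, len)});
	  i += len;
	}
      else if (b == '\1' || b == '\2')
	i++;			// Readline escape markers are dropped.
      else if (b == '\033' && sgr_len (s, i) > 0)
	{
	  std::size_t n = sgr_len (s, i);
	  out.push_back ({seg_kind::escape, s.substr (i, n)});
	  i += n;
	}
      else
	{
	  // Everything else goes through the single-byte path.
	  out.push_back ({seg_kind::plain_byte, s.substr (i, 1)});
	  i++;
	}
      assert (i <= s.size ());
    }
  return out;
}
```

Feeding scan the colored prompt from the test-case ("GDB", a red-on escape,
U+276F, a color-off escape, and a trailing space) gives seven segments: three
plain bytes, an escape, one multi-byte char, another escape, and a plain byte.
The point of the ordering is that the multi-byte check runs before the
single-byte code, so no byte of a multi-byte char reaches the single-byte
path.
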