Re: printing wchar_t*

Mirror of the gdb mailing list
 help / color / mirror / Atom feed

From: "Jim Blandy" <jimb@red-bean.com>
To: "Eli Zaretskii" <eliz@gnu.org>
Cc: "Vladimir Prus" <ghost@cs.msu.su>, gdb@sources.redhat.com
Subject: Re: printing wchar_t*
Date: Fri, 14 Apr 2006 18:03:00 -0000	[thread overview]
Message-ID: <8f2776cb0604141053v73e512e3o2d1c9086312316bd@mail.gmail.com> (raw)
In-Reply-To: <uirpc19u8.fsf@gnu.org>

I think folks are seeing difficult problems where there aren't any. 
Even if the host character set (that is, the character set GDB is
using to communicate with its user, or in its MI communications) is
plain, old ASCII, GDB can, without any loss of information, convey the
contents of a wide string using an arbitrary target character set via
MI to a GUI, using code the GUI must already have.

Suppose we have a wide string where wchar_t values are Unicode code
points.  Suppose our host character set is plain ASCII.  Suppose the
user's program has a string containing the digits '123', followed by
some funky Tibetan characters U+0F04 U+0FCC, followed by the letters
'xyz'.  When asked to print that string, GDB should print the
following twenty-one ASCII characters:

L"123\x0f04\x0fccxyz"

Since this is a valid way to write that string in a source program, a
user at the GDB command line should understand it.  Since consumers of
MI information must contain parsers for C values already, they can
reliably find the contents of the string.

Note that this gets a GUI the contents of the string in the *target*
character set.  The GUI itself should be responsible for converting
target characters to whatever character set it wants to use to present
data to its user.  Here, GDB's 'host' character set is just the
character set used to carry information from GDB to the GUI; it should
probably be set to ASCII, just to avoid needless variation.  But
either way, it's just acting as a medium for values in C source code
syntax, and has no bearing on either the character set the target
program is using, or the character set the GUI will use to present
data to its user.

Unicode technical report #17 lays out the terminology the Unicode
folks use for all this stuff, with good explanations:
http://www.unicode.org/reports/tr17/

According to the ISO C standard, the coding character set used by
wchar_t must be a superset of that used by char for members of the
basic character set.  See ISO/IEC 9899:1999 (E) section 7.17,
paragraph 2.  So I think it's sufficient for the user to specify the
coding character set used by wide characters; that fixes the ccs used
for char values.

next prev parent reply	other threads:[~2006-04-14 17:53 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-04-13 17:07 Vladimir Prus
2006-04-13 17:25 ` Eli Zaretskii
2006-04-14  7:29   ` Vladimir Prus
2006-04-14  8:47     ` Eli Zaretskii
2006-04-14 12:47       ` Vladimir Prus
2006-04-14 13:05         ` Eli Zaretskii
2006-04-14 13:06           ` Vladimir Prus
2006-04-14 13:15             ` Robert Dewar
2006-04-14 13:17           ` Daniel Jacobowitz
2006-04-14 13:59             ` Robert Dewar
2006-04-14 14:37             ` Eli Zaretskii
2006-04-14 14:08       ` Paul Koning
2006-04-14 14:47         ` Eli Zaretskii
2006-04-14 15:00           ` Vladimir Prus
2006-04-14 17:53             ` Eli Zaretskii
2006-04-17  7:05               ` Vladimir Prus
2006-04-17  8:35                 ` Eli Zaretskii
2006-04-13 18:06 ` Jim Blandy
2006-04-13 21:18   ` Eli Zaretskii
2006-04-14  6:02     ` Jim Blandy
2006-04-14  8:43       ` Eli Zaretskii
2006-04-14  7:58   ` Vladimir Prus
2006-04-14  8:07     ` Jim Blandy
2006-04-14  8:30       ` Vladimir Prus
2006-04-14  8:57     ` Eli Zaretskii
2006-04-14 12:52       ` Vladimir Prus
2006-04-14 13:07         ` Daniel Jacobowitz
2006-04-14 14:23           ` Eli Zaretskii
2006-04-14 14:29             ` Daniel Jacobowitz
2006-04-14 14:53               ` Eli Zaretskii
2006-04-14 17:10                 ` Daniel Jacobowitz
2006-04-14 17:55               ` Jim Blandy
2006-04-14 18:27                 ` Eli Zaretskii
2006-04-14 18:30                   ` Jim Blandy
2006-04-14 19:19                     ` Eli Zaretskii
2006-04-14 14:16         ` Eli Zaretskii
2006-04-14 14:50           ` Vladimir Prus
2006-04-14 17:18             ` Eli Zaretskii
2006-04-14 18:03               ` Jim Blandy [this message]
2006-04-14 19:16                 ` Eli Zaretskii
2006-04-14 19:22                   ` Jim Blandy
2006-04-14 22:18                     ` Daniel Jacobowitz
2006-04-16 11:39                       ` Jim Blandy
2006-04-16 15:07                         ` Eli Zaretskii
2006-04-15  7:14                     ` Eli Zaretskii
2006-04-17  7:16                       ` Vladimir Prus
2006-04-17  8:58                         ` Eli Zaretskii
2006-04-17 10:35                           ` Vladimir Prus
2006-04-17 12:26                             ` Eli Zaretskii
2006-04-17 13:56                               ` Vladimir Prus
2006-04-18  5:31                                 ` Eli Zaretskii
2006-04-14 19:53                 ` Mark Kettenis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8f2776cb0604141053v73e512e3o2d1c9086312316bd@mail.gmail.com \
    --to=jimb@red-bean.com \
    --cc=eliz@gnu.org \
    --cc=gdb@sources.redhat.com \
    --cc=ghost@cs.msu.su \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox