From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-61206-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 23226 invoked by alias); 16 Jan 2009 09:36:55 -0000
Received: (qmail 23218 invoked by uid 22791); 16 Jan 2009 09:36:54 -0000
X-SWARE-Spam-Status: No, hits=-0.7 required=5.0 	tests=AWL,BAYES_00,RCVD_IN_SORBS_WEB,SPF_SOFTFAIL
X-Spam-Check-By: sourceware.org
Received: from mtaout2.012.net.il (HELO mtaout2.012.net.il) (84.95.2.4)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 16 Jan 2009 09:36:13 +0000
Received: from conversion-daemon.i_mtaout2.012.net.il by i_mtaout2.012.net.il (HyperSendmail v2004.12) id <0KDK0060052M6H00@i_mtaout2.012.net.il> for gdb-patches@sourceware.org; Fri, 16 Jan 2009 11:36:21 +0200 (IST)
Received: from HOME-C4E4A596F7 ([77.127.202.36]) by i_mtaout2.012.net.il (HyperSendmail v2004.12) with ESMTPA id <0KDK00H3Y5CK0ZS1@i_mtaout2.012.net.il>; Fri, 16 Jan 2009 11:36:21 +0200 (IST)
Date: Fri, 16 Jan 2009 09:36:00 -0000
From: Eli Zaretskii <eliz@gnu.org>
Subject: Re: [PATCH/WIP] C/C++ wchar_t/Unicode printing support
In-reply-to: <20090115202411.5f154657@rex.config>
To: Julian Brown <julian@codesourcery.com>
Cc: gdb-patches@sourceware.org, tromey@redhat.com
Reply-to: Eli Zaretskii <eliz@gnu.org>
Message-id: <u3afjwq2c.fsf@gnu.org>
References: <20090115202411.5f154657@rex.config>
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2009-01/txt/msg00376.txt.bz2

> Date: Thu, 15 Jan 2009 20:24:11 +0000
> From: Julian Brown <julian@codesourcery.com>
> Cc: tromey@redhat.com
> 
> This patch contains (at least the start of) support for printing
> wchar_t strings from a debugged program within GDB. This is the subject
> for GDB bugs 9103 (and its duplicates 9369, 9268) and maybe 7821.

Thank you!

> OK to apply

Not without documentation, sorry.  Such an important feature should
not go in undocumented.

> or any comments?

A few:

> (gdb) show host-charset
> The host character set is "UTF-8" (auto).

Elsewhere in GDB, we show such settings in a slightly different form:

    (gdb) show language
    The current source language is "auto; currently c".

I like this latter form better: it first says that the setting is
"auto", then shows what the detected state is.
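
As a rough sketch of that ordering, the displayed value could be built
like this.  The helper name format_auto_setting is hypothetical; it is
not part of the patch or of GDB, and only illustrates the "auto first,
detected value second" layout:

```c
#include <stdio.h>

/* Hypothetical helper, not from the patch or from GDB.  Formats a
   setting value the way "show language" does: when the setting is
   "auto", say so first, then give the detected value.  */
static void
format_auto_setting (char *buf, size_t size, int is_auto, const char *value)
{
  if (is_auto)
    snprintf (buf, size, "auto; currently %s", value);
  else
    snprintf (buf, size, "%s", value);
}
```

With is_auto nonzero and value "UTF-8", buf holds
"auto; currently UTF-8", which would then go inside the quotes of the
"show host-charset" message.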

> + #ifndef GDB_DEFAULT_TARGET_WIDE_CHARSET
> + #define GDB_DEFAULT_TARGET_WIDE_CHARSET "UTF-32"
> + #endif
> + 
> + #ifndef GDB_INTERNAL_CODESET
> + #define GDB_INTERNAL_CODESET "UCS-4LE"
> + #endif

Why are these the defaults?  Because of what GNU/Linux (i.e. glibc)
does, or for some other reason?  If the former, shouldn't this be
autoconfigured?
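
To make the question concrete, here is a minimal sketch of what a
build-time choice might look like.  It assumes, purely for
illustration, that the target's wchar_t has the same width as the
host's, which does not hold when cross-debugging.  The function name
default_wide_charset is hypothetical and does not appear in the patch:

```c
#include <wchar.h>

/* Hypothetical helper, not part of the patch.  Picks a default wide
   charset name from the width of the host's wchar_t.  ASSUMPTION: the
   target uses the same wchar_t width as the host.  */
static const char *
default_wide_charset (void)
{
  /* A 2-byte wchar_t (e.g. on Windows) normally holds UTF-16 code
     units; a 4-byte wchar_t (e.g. with glibc) holds whole code
     points.  */
  return sizeof (wchar_t) == 2 ? "UTF-16" : "UCS-4";
}
```

A real configure-time test would have to probe the target rather than
the host, so this only shows the shape of the decision.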

> + static const char *target_wide_charset_enum[] =
> + {
> +   "UCS-2",
> +   "UCS-2LE",
> +   "UCS-2BE",
> +   "UCS-4",
> +   "UCS-4LE",
> +   "UCS-4BE",
> +   "UTF-16",
> +   "UTF-16LE",
> +   "UTF-16BE",
> +   "UTF-32",
> +   "UTF-32LE",
> +   "UTF-32BE",
> +   0
> + };

Why do we need the UCS-2 charsets?  That's just confusing; are there
important platforms that support UCS-2 instead of UTF-16?  I'd also
suggest considering removal of UTF-32 and its endian variants, since
they are identical to UCS-4.  (Unless someone wants to support the
Emacs 23 internal representation, but that one should be called by
its own name anyway.)
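
For reference, the practical difference between UCS-2 and UTF-16 is
that UTF-16 can represent code points above U+FFFF by using a pair of
surrogate code units, while UCS-2 is limited to a single 16-bit unit.
A minimal sketch of the UTF-16 encoding step follows; the function
name utf16_encode is made up for this illustration and is not from the
patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only, not from the patch.  Encode one code point as
   UTF-16 code units.  Returns the number of units written (1 or 2),
   or 0 if CP is a surrogate value or lies above U+10FFFF.  */
static size_t
utf16_encode (uint32_t cp, uint16_t out[2])
{
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;			/* Surrogates are not code points.  */
  if (cp < 0x10000)
    {
      /* Basic Multilingual Plane: same single unit as in UCS-2.  */
      out[0] = (uint16_t) cp;
      return 1;
    }
  if (cp > 0x10FFFF)
    return 0;			/* Outside the Unicode range.  */

  /* Supplementary planes: split the 20-bit offset into a high and a
     low surrogate.  UCS-2 has no equivalent of this case.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
  return 2;
}
```

For example, U+0041 comes out as the single unit 0x0041, while U+1D11E
comes out as the pair 0xD834 0xDD1E.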