From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-24885-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 8826 invoked by alias); 14 Apr 2006 08:43:24 -0000
Received: (qmail 8818 invoked by uid 22791); 14 Apr 2006 08:43:23 -0000
X-Spam-Check-By: sourceware.org
Received: from nitzan.inter.net.il (HELO nitzan.inter.net.il) (192.114.186.20)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 14 Apr 2006 08:43:22 +0000
Received: from HOME-C4E4A596F7 (IGLD-80-230-89-169.inter.net.il [80.230.89.169]) 	by nitzan.inter.net.il (MOS 3.7.3-GA) 	with ESMTP id DDF28732 (AUTH halo1); 	Fri, 14 Apr 2006 11:43:18 +0300 (IDT)
Date: Fri, 14 Apr 2006 08:57:00 -0000
Message-Id: <u3bgg35u9.fsf@gnu.org>
From: Eli Zaretskii <eliz@gnu.org>
To: Vladimir Prus <ghost@cs.msu.su>
CC: gdb@sources.redhat.com
In-reply-to: <e1necb$gen$1@sea.gmane.org> (message from Vladimir Prus on Fri, 	14 Apr 2006 10:10:19 +0400)
Subject: Re: printing wchar_t*
Reply-to: Eli Zaretskii <eliz@gnu.org>
References: <e1lsqg$aml$1@sea.gmane.org> <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com> <e1necb$gen$1@sea.gmane.org>
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-04/txt/msg00173.txt.bz2

> From:  Vladimir Prus <ghost@cs.msu.su>
> Date:  Fri, 14 Apr 2006 10:10:19 +0400
> 
> The problem is that I don't see any way how gdb can print wchar_t in a way
> that does not require post-processing. It can print it as UTF8, but then
> for printing char* gdb should use local 8 bit encoding, which is likely to
> be *not* UTF8.

You are talking about a GUI front-end, aren't you?  In that case, you
will need to code a routine that accepts a wchar_t string, and then
_displays_ it using the appropriate font.  It is wrong to talk about
``printing'' it and about ``local 8-bit encoding'', because you don't
want to encode it, you want to display it using the appropriate font.

In particular, if the original wchar_t uses Unicode codepoints, then
presumably there should be some GUI API call, specific to your
windowing system, that would accept such a wchar_t string and display
it using a Unicode font.

So if you are going to do this in the front-end, I think all you need
is to ask GDB to supply the wchar_t string using the array notation;
the rest will have to be done inside the front-end.  Am I missing
something?
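
To make the array-notation idea concrete, here is a minimal sketch of
the front-end side.  It assumes the front-end has issued something like
"print/d *ws@3" (where ws is a hypothetical wchar_t* variable in the
debuggee) and that GDB answered with a plain comma-separated list such
as "{104, 105, 0}".  The function name parse_gdb_int_array is made up
for illustration, and the sketch does not handle GDB's "<repeats N
times>" abbreviation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: parse text such as "{104, 105, 0}" into out[].
   Returns the number of elements stored, or (size_t)-1 if the text is
   malformed or does not fit into cap elements. */
size_t parse_gdb_int_array(const char *text, wchar_t *out, size_t cap)
{
    assert(text != NULL && out != NULL);
    const char *p = text;
    size_t n = 0;

    while (*p == ' ')
        p++;
    if (*p != '{')
        return (size_t)-1;
    p++;

    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading blanks itself */
        if (end == p)
            return (size_t)-1;          /* no digits where a value was expected */
        if (n == cap)
            return (size_t)-1;          /* output buffer is full */
        out[n++] = (wchar_t)v;
        p = end;
        while (*p == ' ')
            p++;
        if (*p == ',') {
            p++;
            continue;
        }
        if (*p == '}')
            return n;
        return (size_t)-1;              /* unexpected character */
    }
}
```

The resulting wchar_t buffer is what the front-end would then hand to
whatever text-drawing call its windowing system provides.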

> Gdb can probably use some extra markers for values: like:
> 
>    "foo"  for string in local 8-bit encoding
>    L"foo" for string in UTF8 encoding.
> 
> It's also possible to use "\u" escapes.

Why do you need any of these?  16-bit Unicode characters are just
integers, so ask GDB to send them as integers.  That should be all you
need, since displaying them is something your FE will need to do
itself, no?

> But then there's a problem:
> 
>    - Do we assume that wchar_t is always UTF-16 or UTF-32?

You don't need to assume; you can ask the application.  Wouldn't
"sizeof(wchar_t)" do the trick?

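As a rough sketch of how a front-end might use the answer to
"print sizeof(wchar_t)": the enum and function names below are
hypothetical, and the mapping is only a guess, since the C standard
does not require wchar_t to hold Unicode at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the debuggee's wchar_t. */
enum wchar_guess { WCHAR_UNKNOWN, WCHAR_UTF16, WCHAR_UTF32 };

/* Guess the wide-character representation from the value that the
   debuggee reports for sizeof(wchar_t).  This is a heuristic only. */
enum wchar_guess guess_wchar_encoding(size_t wchar_size)
{
    assert(wchar_size > 0);
    switch (wchar_size) {
    case 2:
        return WCHAR_UTF16;   /* 16-bit wchar_t */
        
    case 4:
        return WCHAR_UTF32;   /* 32-bit wchar_t */
    default:
        return WCHAR_UNKNOWN;
    }
}
```

The front-end would pass in the number GDB reports for the debuggee,
not its own sizeof(wchar_t), since the two may differ when debugging
remotely or across platforms.
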
>      - how user-specified encoding will be handled

wchar_t is not an encoding; it holds the character codes themselves.
Encoded characters are (in general) multibyte character strings, not
wchar_t.  See, for example, the description of library functions
mbsinit, mbrlen, mbrtowc, etc., for more about this distinction.
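
To illustrate the distinction, here is a small sketch that decodes a
multibyte (encoded) string into wchar_t codes with mbrtowc.  The
function name decode_multibyte is made up for this example; which
encoding mbrtowc expects depends on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Decode the multibyte string mbs into at most cap wide characters.
   Returns the number of wide characters stored, or (size_t)-1 on an
   invalid or incomplete sequence or if out[] is too small. */
size_t decode_multibyte(const char *mbs, wchar_t *out, size_t cap)
{
    assert(mbs != NULL && out != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);    /* start in the initial shift state */
    size_t remaining = strlen(mbs);
    size_t n = 0;

    while (remaining > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, mbs, remaining, &state);
        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (len == 0)
            break;                      /* decoded a NUL; stop defensively */
        if (n == cap)
            return (size_t)-1;          /* output buffer is full */
        out[n++] = wc;                  /* one character code per element */
        mbs += len;                     /* a character may span several bytes */
        remaining -= len;
    }
    return n;
}
```

A program would normally call setlocale(LC_ALL, "") first so that
mbrtowc uses the user's encoding; in the default "C" locale only plain
ASCII input is certain to decode.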