Date: Fri, 14 Apr 2006 08:57:00 -0000
From: Eli Zaretskii
To: Vladimir Prus
CC: gdb@sources.redhat.com
In-reply-to: (message from Vladimir Prus on Fri, 14 Apr 2006 10:10:19 +0400)
Subject: Re: printing wchar_t*
Reply-to: Eli Zaretskii
References: <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com>
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
X-SW-Source: 2006-04/txt/msg00173.txt.bz2

> From: Vladimir Prus
> Date: Fri, 14 Apr 2006 10:10:19 +0400
>
> The problem is that I don't see any way how gdb can print wchar_t in a way
> that does not require post-processing. It can print it as UTF8, but then
> for printing char* gdb should use local 8 bit encoding, which is likely to
> be *not* UTF8.

You are talking about a GUI front-end, aren't you?  In that case, you
will need to code a routine that accepts a wchar_t string, and then
_displays_ it using the appropriate font.  It is wrong to talk about
``printing'' it and about ``local 8-bit encoding'', because you don't
want to encode it, you want to display it using the appropriate font.
In particular, if the original wchar_t uses Unicode codepoints, then
presumably there should be some GUI API call, specific to your
windowing system, that would accept such a wchar_t string and display
it using a Unicode font.

So if you are going to do this in the front-end, I think all you need
is ask GDB to supply the wchar_t string using the array notation; the
rest will have to be done inside the front-end.  Am I missing
something?

> Gdb can probably use some extra markers for values: like:
>
>    "foo" for string in local 8-bit encoding
>    L"foo" for string in UTF8 encoding.
>
> It's also possible to use "\u" escapes.

Why do you need any of these?  16-bit Unicode characters are just
integers, so ask GDB to send them as integers.  That should be all you
need, since displaying them is something your FE will need to do
itself, no?

> But then there's a problem:
>
> - Do we assume that wchar_t is always UTF-16 or UTF-32?

You don't need to assume, you can ask the application.  Wouldn't
"sizeof(wchar_t)" do the trick?

> - how user-specified encoding will be handled

wchar_t is not an encoding, it's the characters' codes themselves.
Encoded characters are (in general) multibyte character strings, not
wchar_t.  See, for example, the description of library functions
mbsinit, mbrlen, mbrtowc, etc., for more about this distinction.
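
To make the distinction concrete, here is a minimal C sketch of the
three points above: a multibyte (encoded) string is converted to
wchar_t codes with mbrtowc, the codes are reported as plain integers,
and the width of wchar_t is taken from sizeof.  This is only an
illustration, not GDB code: the helper names decode_multibyte,
format_codes and wchar_bits are made up for the example, and the
brace-and-comma output merely imitates array notation.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t in bits, as the program itself
   reports it (typically 16 or 32, depending on the platform). */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: decode a multibyte (encoded) string into wchar_t
   character codes using the current locale's encoding.  Returns the
   number of wide characters stored, or (size_t)-1 if the input holds an
   invalid or incomplete multibyte sequence. */
size_t decode_multibyte(const char *src, wchar_t *dst, size_t dst_len)
{
    mbstate_t state;
    size_t remaining = strlen(src);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* start in the initial state */
    while (remaining > 0 && count < dst_len) {
        size_t used = mbrtowc(&dst[count], src, remaining, &state);
        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (used == 0)
            break;                      /* decoded a terminating null */
        src += used;
        remaining -= used;
        count++;
    }
    return count;
}

/* Hypothetical helper: write the character codes as integers, e.g.
   "{72, 105}".  Returns 0 on success, -1 if the buffer is too small. */
int format_codes(const wchar_t *ws, size_t n, char *out, size_t out_len)
{
    size_t pos;
    int written = snprintf(out, out_len, "{");

    if (written < 0 || (size_t)written >= out_len)
        return -1;
    pos = (size_t)written;
    for (size_t i = 0; i < n; i++) {
        written = snprintf(out + pos, out_len - pos, "%s%lu",
                           i ? ", " : "", (unsigned long)ws[i]);
        if (written < 0 || (size_t)written >= out_len - pos)
            return -1;
        pos += (size_t)written;
    }
    written = snprintf(out + pos, out_len - pos, "}");
    if (written < 0 || (size_t)written >= out_len - pos)
        return -1;
    return 0;
}
```

For input outside plain ASCII, the caller would first have to select
the user's locale with setlocale(LC_ALL, ""), because mbrtowc decodes
according to the current locale.  A front-end receiving the integers
still has to pick a font and draw the characters itself.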