From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 14 Apr 2006 08:47:00 -0000
From: Eli Zaretskii
To: Vladimir Prus
CC: gdb@sources.redhat.com
In-reply-to: (message from Vladimir Prus on Fri, 14 Apr 2006 10:01:57 +0400)
Subject: Re: printing wchar_t*
Reply-to: Eli Zaretskii
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
X-SW-Source: 2006-04/txt/msg00172.txt.bz2

> From: Vladimir Prus
> Date: Fri, 14 Apr 2006 10:01:57 +0400
>
> > What character set is used by the wide characters in the wchar_t
> > arrays?  GDB has some support for a few single-byte character sets,
> > see the node "Character Sets" in the manual.
>
> Relatively safe bet would be to assume it's some zero-terminated character
> set. I plan to assume it's either UTF-16 or UTF-32 in the GUI (the
> conversion code is the same for both encodings), but gdb can just print raw
> values.

We should get our terminology right: UTF-16 is not a character set,
it's an encoding (and a multibyte encoding, btw).  As for UTF-32, I
don't think such a beast exists at all.  I think you meant 16-bit
Unicode characters (a.k.a. the BMP) and 32-bit Unicode characters,
respectively.

> > It's one possibility, the other one being to call a function in the
> > debuggee to produce the string.
>
> And what will such a function return? char* in local 8-bit encoding? In that
> case, not all wchar_t* variables can be printed.

If you want to display non-ASCII strings, it means you already have
some way of displaying such characters.  The function I mentioned
would not return anything, it would actually _display_ the string.
For example, in the command-line version of GDB, if the terminal
supports UTF-8 encoded characters, that function would output a UTF-8
encoding of the non-ASCII string, and then the terminal would display
it with the correct glyphs.

> > Yet another possibility is to do the
> > conversion in your GUI front end.
>
> That's what I'm going to do, but first I need to get raw data, preferably
> without issuing an MI command for every single character.

A wchar_t string is just an array, and GDB already has a feature to
produce N elements of an array.  In CLI, you say "print *array@20" to
print the first 20 elements of the named array.
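
To make the front-end side of this concrete, here is a minimal sketch
of the conversion step, assuming the front end has already pulled the
array elements out of GDB (for example with "print *array@20") as plain
integers.  The function name decode_wchar_units and its unit_bits
parameter are hypothetical, not part of GDB or MI, and parsing GDB's
output into integers is left out.  The sketch also assumes the debuggee
stores either 16-bit units (with surrogate pairs allowed) or 32-bit
code points, as proposed earlier in the thread.

```python
def decode_wchar_units(units, unit_bits):
    """Turn raw wchar_t element values into a Python string.

    units     -- iterable of non-negative integers, one per array element
    unit_bits -- 16 or 32, the assumed width of wchar_t in the debuggee
    """
    if unit_bits not in (16, 32):
        raise ValueError("unit_bits must be 16 or 32")

    # Keep only the elements before the terminating zero, if there is one.
    payload = []
    for unit in units:
        if unit == 0:
            break
        payload.append(unit)

    # Re-pack the integers as little-endian bytes and let Python's own
    # codecs do the decoding; invalid sequences become U+FFFD.
    width = unit_bits // 8
    raw = b"".join(unit.to_bytes(width, "little") for unit in payload)
    codec = "utf-16-le" if unit_bits == 16 else "utf-32-le"
    return raw.decode(codec, errors="replace")


if __name__ == "__main__":
    # 'H', 'i', terminating zero, then leftover elements past the end.
    print(decode_wchar_units([0x48, 0x69, 0, 0x41], 16))  # prints "Hi"
```

A single request for N elements followed by one decoding pass like
this avoids issuing a command per character.  A value that does not fit
in the assumed width (for instance a negative number from a signed
wchar_t) makes to_bytes raise OverflowError, so a real front end would
have to normalize such values first.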