From: Vladimir Prus
To: Eli Zaretskii
Subject: Re: printing wchar_t*
Date: Fri, 14 Apr 2006 12:47:00 -0000
Cc: gdb@sources.redhat.com
Message-Id: <200604141246.58094.ghost@cs.msu.su>

On Friday 14 April 2006 12:30, Eli Zaretskii wrote:

> > Relatively safe bet would be to assume it's some zero-terminated
> > character set. I plan to assume it's either UTF-16 or UTF-32 in the GUI
> > (the conversion code is the same for both encodings), but gdb can just
> > print raw values.
>
> We should get our terminology right: UTF-16 is not a character set,
> it's an encoding (and a multibyte encoding, btw). As for UTF-32, I
> don't think such a beast exists at all.
>
> I think you meant 16-bit Unicode characters (a.k.a. the BMP) and
> 32-bit Unicode characters, respectively.

No, I meant the UTF-16 encoding (the one with surrogate pairs) and the
UTF-32 encoding (which does exist, in the Unicode standard).
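As an illustration of the claim that one conversion routine can serve both encodings, here is a minimal sketch. It assumes the front end has already obtained the code units as plain integers; the function name `decode_wide_units` and its interface are hypothetical, not part of gdb or KDevelop. With 32-bit units a well-formed string never contains values in the surrogate range, so the pairing branch simply never fires.

```python
def decode_wide_units(units):
    """Decode zero-terminated UTF-16 or UTF-32 code units into a str.

    `units` is any iterable of non-negative integers. The name and the
    interface are illustrative only.
    """
    chars = []
    pending_high = None  # high surrogate waiting for its low half
    for unit in units:
        if unit == 0:  # terminator: stop, as for a C wchar_t* string
            break
        if pending_high is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # Valid surrogate pair: combine into one code point.
                code_point = (0x10000
                              + ((pending_high - 0xD800) << 10)
                              + (unit - 0xDC00))
                chars.append(chr(code_point))
                pending_high = None
                continue
            # Unpaired high surrogate: emit U+FFFD, then handle `unit`.
            chars.append("\ufffd")
            pending_high = None
        if 0xD800 <= unit <= 0xDBFF:
            pending_high = unit
        elif 0xDC00 <= unit <= 0xDFFF or unit > 0x10FFFF:
            # Lone low surrogate or a value outside the Unicode range.
            chars.append("\ufffd")
        else:
            chars.append(chr(unit))
    if pending_high is not None:
        # String ended in the middle of a surrogate pair.
        chars.append("\ufffd")
    return "".join(chars)
```

The same call handles `[0x48, 0x69, 0xD83D, 0xDE00, 0]` (16-bit units) and `[0x48, 0x69, 0x1F600, 0]` (32-bit units); both decode to the same three-character string.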

> > > It's one possibility, the other one being to call a function in the
> > > debuggee to produce the string.
> >
> > And what such a function will return? char* in local 8-bit encoding? In
> > that case, not all wchar_t* variables can be printed.
>
> If you want to display non-ASCII strings, it means you already have
> some way of displaying such characters. The function I mentioned
> would not return anything, it would actually _display_ the string.
>
> For example, in command-line version of GDB, if the terminal supports
> UTF-8 encoded characters, that function would output a UTF-8 encoding
> of the non-ASCII string, and then the terminal will display them with
> the correct glyphs.

This is a non-starter. I can't have the debuggee send data to KDevelop
widgets.

> > > Yet another possibility is to do the
> > > conversion in your GUI front end.
> >
> > That's what I'm going to do, but first I need to get raw data,
> > preferably without issuing an MI command for every single character.
>
> A wchar_t string is just an array, and GDB already has a feature to
> produce N elements of an array. In CLI, you say "print *array@20" to
> print the first 20 elements of the named array.

I don't know how many elements there are, as a wchar_t* string is zero
terminated, so I'd like gdb to compute the length automatically.

- Volodya
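
On the length question, a minimal sketch of what "compute the length automatically" could look like when memory is fetched in chunks rather than one character per request. Everything here is an assumption made for illustration: `read_memory(address, length)` is a hypothetical callback that returns up to `length` bytes of debuggee memory, and `read_wide_units`, `chunk_units` and `max_units` are invented names, not gdb or MI features.

```python
def read_wide_units(read_memory, address, unit_size, byteorder="little",
                    chunk_units=64, max_units=4096):
    """Read code units from `address` up to (not including) a zero unit.

    Returns (units, terminated). `max_units` bounds the scan in case the
    pointer is garbage and no terminator is ever found.
    """
    units = []
    while len(units) < max_units:
        count = min(chunk_units, max_units - len(units))
        data = read_memory(address, count * unit_size)
        # Walk the chunk one whole code unit at a time.
        for offset in range(0, len(data) - unit_size + 1, unit_size):
            unit = int.from_bytes(data[offset:offset + unit_size], byteorder)
            if unit == 0:
                return units, True  # terminator found
            units.append(unit)
        if len(data) < count * unit_size:
            break  # short read: ran off readable memory
        address += count * unit_size
    return units, False  # gave up before finding a terminator
```

The number of round trips is then proportional to the string length divided by `chunk_units`, and the `terminated` flag lets the caller mark a string that was cut off at `max_units` or at unreadable memory.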