From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2996 invoked by alias); 17 Apr 2006 06:18:09 -0000
Received: (qmail 2986 invoked by uid 22791); 17 Apr 2006 06:18:08 -0000
X-Spam-Check-By: sourceware.org
Received: from zigzag.lvk.cs.msu.su (HELO zigzag.lvk.cs.msu.su) (158.250.17.23) by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 17 Apr 2006 06:18:07 +0000
Received: from Debian-exim by zigzag.lvk.cs.msu.su with spam-scanned (Exim 4.50) id 1FVN3f-0004zF-IW for gdb@sources.redhat.com; Mon, 17 Apr 2006 10:18:04 +0400
Received: from zigzag.lvk.cs.msu.su ([158.250.17.23]) by zigzag.lvk.cs.msu.su with esmtp (Exim 4.50) id 1FVN3O-0004vx-86; Mon, 17 Apr 2006 10:17:42 +0400
From: Vladimir Prus
To: Eli Zaretskii
Subject: Re: printing wchar_t*
Date: Mon, 17 Apr 2006 07:05:00 -0000
User-Agent: KMail/1.7.2
Cc: pkoning@equallogic.com, gdb@sources.redhat.com
References: <200604141850.08495.ghost@cs.msu.su>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200604171017.41504.ghost@cs.msu.su>
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-04/txt/msg00225.txt.bz2

On Friday 14 April 2006 21:10, Eli Zaretskii wrote:

> > > If we want to support wchar_t arrays that store UTF-16, we will need
> > > to add a feature to GDB to convert UTF-16 to the full UCS-4
> > > codepoints, and output those.
> >
> > That's what I mentioned in a reply to Jim -- since the current string
> > printing code operated "one wchar_t at a time", it's not suitable for
> > outputing UTF-16 encoded wchar_t values to the user.
>
> I don't understand: if the wchar_t array holds a UTF-16 encoding, then
> when you receive the entire string, you have a UTF-16 encoding of what
> you want to display, and you yourself said that displaying a UTF-16
> encoded string is easy for you. So where is the problem? is that only
> that you cannot know the length of the UTF-16 encoded string? or is
> there something else missing?

For my frontend there's no problem: I can handle UTF-16 myself. However, if
gdb is ever to produce UTF-8 output that a console can display, then gdb
should handle surrogate pairs itself. Converting the first and second
elements of a surrogate pair to UTF-8 individually won't work: each half
would become a separate three-byte sequence encoding a surrogate code
point, which is not valid UTF-8. The pair has to be combined into a single
code point first.

- Volodya
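
To illustrate the surrogate pair point above, here is a minimal sketch in C of the conversion gdb would need. The helper names utf16_decode and utf8_encode are hypothetical and are not existing gdb functions. The sketch assumes the wchar_t data has already been read into an array of 16-bit units in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gdb): decode one code point from a
   UTF-16 buffer of LEN units.  Returns the number of 16-bit units
   consumed (1 or 2), or 0 on empty input or an unpaired surrogate.  */
static size_t
utf16_decode (const uint16_t *in, size_t len, uint32_t *cp)
{
  if (len == 0)
    return 0;

  /* Anything outside D800..DFFF is a complete code point by itself.  */
  if (in[0] < 0xD800 || in[0] > 0xDFFF)
    {
      *cp = in[0];
      return 1;
    }

  /* A high surrogate must be followed by a low surrogate; the two
     halves carry 10 bits each of (code point - 0x10000).  */
  if (in[0] <= 0xDBFF && len >= 2 && in[1] >= 0xDC00 && in[1] <= 0xDFFF)
    {
      *cp = 0x10000 + (((uint32_t) (in[0] - 0xD800) << 10)
                       | (uint32_t) (in[1] - 0xDC00));
      return 2;
    }

  return 0;
}

/* Hypothetical helper (not part of gdb): encode CP as UTF-8 into OUT,
   which must have room for 4 bytes.  Returns the number of bytes
   written, or 0 if CP is a surrogate or is above U+10FFFF.  */
static size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      /* This is where converting each half individually fails:
         a lone surrogate has no valid UTF-8 form.  */
      if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}
```

For example, the pair D834 DD1E decodes to U+1D11E and encodes as the four bytes F0 9D 84 9E. Passing D834 or DD1E to utf8_encode on its own is rejected, which is the case a "one wchar_t at a time" printer runs into.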