From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-24880-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 17972 invoked by alias); 14 Apr 2006 06:10:30 -0000
Received: (qmail 17964 invoked by uid 22791); 14 Apr 2006 06:10:29 -0000
X-Spam-Check-By: sourceware.org
Received: from main.gmane.org (HELO ciao.gmane.org) (80.91.229.2)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 14 Apr 2006 06:10:28 +0000
Received: from list by ciao.gmane.org with local (Exim 4.43) 	id 1FUHVg-0007hC-7c 	for gdb@sources.redhat.com; Fri, 14 Apr 2006 08:10:24 +0200
Received: from zigzag.lvk.cs.msu.su ([158.250.17.23])         by main.gmane.org with esmtp (Gmexim 0.1 (Debian))         id 1AlnuQ-0007hv-00         for <gdb@sources.redhat.com>; Fri, 14 Apr 2006 08:10:24 +0200
Received: from ghost by zigzag.lvk.cs.msu.su with local (Gmexim 0.1 (Debian))         id 1AlnuQ-0007hv-00         for <gdb@sources.redhat.com>; Fri, 14 Apr 2006 08:10:24 +0200
To: gdb@sources.redhat.com
From:  Vladimir Prus <ghost@cs.msu.su>
Subject:  Re: printing wchar_t*
Date: Fri, 14 Apr 2006 07:58:00 -0000
Message-ID:  <e1necb$gen$1@sea.gmane.org>
References:  <e1lsqg$aml$1@sea.gmane.org> <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com>
Mime-Version:  1.0
Content-Type:  text/plain; charset=us-ascii
Content-Transfer-Encoding:  7Bit
User-Agent: KNode/0.8.2
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-04/txt/msg00168.txt.bz2

Jim Blandy wrote:

> On 4/13/06, Vladimir Prus <ghost@cs.msu.su> wrote:
>> I have a user-defined command that can produce the output I want, but is
>> defining a custom command the right approach?
> 
> Well, you'd like wide strings to be printed properly when they appear
> in structures, as arguments to functions, and so on, right?  So a
> user-defined command isn't ideal.

I think I'll still need to do some processing for wchar_t* on the frontend
side. The problem is that I don't see any way for gdb to print wchar_t that
does not require post-processing. It can print it as UTF8, but then for
printing char* gdb should use the local 8-bit encoding, which is likely to
be *not* UTF8. Gdb can probably use some extra markers for values, like:

   "foo"  for string in local 8-bit encoding
   L"foo" for string in UTF8 encoding.

It's also possible to use "\u" escapes.
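
To make the "\u" option concrete, here is a rough sketch of what such output
could look like. It assumes the wide string has already been decoded to
UTF-32 code points; format_with_escapes is a made-up helper name, not
anything that exists in gdb.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of gdb: render code points (assumed to
   be already decoded to UTF-32) as plain ASCII, using \uXXXX for
   non-printable or non-ASCII characters in the BMP and \UXXXXXXXX for
   anything above it.  Backslash and double quote are escaped too, so
   the result stays unambiguous.  Returns the number of bytes written
   (excluding the NUL), or -1 if the output buffer is too small.  */
int
format_with_escapes (const uint32_t *src, size_t len, char *out, size_t outsz)
{
  size_t used = 0;

  if (outsz == 0)
    return -1;
  out[0] = '\0';

  for (size_t i = 0; i < len; i++)
    {
      uint32_t c = src[i];
      int n;

      if (c >= 0x20 && c < 0x7f && c != '\\' && c != '"')
        n = snprintf (out + used, outsz - used, "%c", (int) c);
      else if (c <= 0xffff)
        n = snprintf (out + used, outsz - used, "\\u%04x", (unsigned) c);
      else
        n = snprintf (out + used, outsz - used, "\\U%08x", (unsigned) c);

      /* snprintf reports the length it wanted; treat truncation as failure.  */
      if (n < 0 || (size_t) n >= outsz - used)
        return -1;
      used += (size_t) n;
    }
  return (int) used;
}
```

With this, a frontend would get pure ASCII from gdb and could turn the
escapes back into whatever encoding it uses for display.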

But then there's a problem:

   - Do we assume that wchar_t is always UTF-16 or UTF-32?
   - If not:
     - how can the user select this?
     - how will the user-specified encoding be handled?
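
The UTF-16 versus UTF-32 question matters because the two cannot be read
the same way: with 4-byte elements each element is a code point, while with
2-byte elements a character outside the BMP takes two elements. A minimal
sketch of the UTF-16 side follows, assuming the target data has already
been fetched into host byte order; utf16_decode_one is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of gdb: decode one code point from a
   buffer of UTF-16 units (assumed to be in host byte order).  Stores
   the code point in *cp and returns the number of units consumed
   (1 or 2), or 0 if the buffer is empty.  An unpaired surrogate is
   replaced by U+FFFD.  */
size_t
utf16_decode_one (const uint16_t *src, size_t len, uint32_t *cp)
{
  if (len == 0)
    return 0;

  uint16_t hi = src[0];

  /* A high surrogate followed by a low surrogate forms one code point.  */
  if (hi >= 0xd800 && hi <= 0xdbff && len >= 2)
    {
      uint16_t lo = src[1];

      if (lo >= 0xdc00 && lo <= 0xdfff)
        {
          *cp = 0x10000 + ((uint32_t) (hi - 0xd800) << 10) + (lo - 0xdc00);
          return 2;
        }
    }

  /* Any surrogate that reaches this point is unpaired.  */
  if (hi >= 0xd800 && hi <= 0xdfff)
    {
      *cp = 0xfffd;
      return 1;
    }

  *cp = hi;
  return 1;
}
```

For UTF-32 no such step is needed, so whatever setting selects the encoding
would also have to select between these two code paths.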
 
> The best approach would be to extend charset.[ch] to handle wide
> character sets as well, and then add code to the language-specific
> printing routines to use the charset functions.  (This is fortunately
> much simpler than adding support for multibyte characters.)

So, for each wchar_t element, language-specific code will call
'target_wchar_t_to_host', which will output a specific representation of
that wchar_t. Hmm, the interface there seems to assume there's a 1<->1
mapping between target and host characters.  This makes both the L"UTF8"
format and the ASCII string with \u escapes format impossible, it seems.
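
To illustrate the 1<->1 concern: a single wide character can need several
host bytes. The sketch below is a generic UTF-8 encoder, not gdb code, and
utf8_encode is a made-up name; it only shows that the output length per
input element varies from 1 to 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of gdb: encode one code point as UTF-8.
   Writes 1 to 4 bytes into OUT and returns how many were written, or 0
   if CP is a surrogate or lies beyond U+10FFFF.  */
size_t
utf8_encode (uint32_t cp, unsigned char out[4])
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3f));
      return 2;
    }
  if (cp < 0x10000)
    {
      if (cp >= 0xd800 && cp <= 0xdfff)
        return 0;               /* Surrogates are not valid code points.  */
      out[0] = (unsigned char) (0xe0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[2] = (unsigned char) (0x80 | (cp & 0x3f));
      return 3;
    }
  if (cp < 0x110000)
    {
      out[0] = (unsigned char) (0xf0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[3] = (unsigned char) (0x80 | (cp & 0x3f));
      return 4;
    }
  return 0;
}
```

An interface that returns exactly one host character per target character
has no way to hand back those 2 to 4 bytes, and the same goes for a
six-character "\uXXXX" escape.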

- Volodya