From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-24877-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 16810 invoked by alias); 13 Apr 2006 18:06:18 -0000
Received: (qmail 16724 invoked by uid 22791); 13 Apr 2006 18:06:17 -0000
X-Spam-Check-By: sourceware.org
Received: from xproxy.gmail.com (HELO xproxy.gmail.com) (66.249.82.203)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 13 Apr 2006 18:06:15 +0000
Received: by xproxy.gmail.com with SMTP id h29so1181857wxd         for <gdb@sources.redhat.com>; Thu, 13 Apr 2006 11:06:13 -0700 (PDT)
Received: by 10.70.110.3 with SMTP id i3mr1007723wxc;         Thu, 13 Apr 2006 11:06:12 -0700 (PDT)
Received: by 10.70.125.5 with HTTP; Thu, 13 Apr 2006 11:06:12 -0700 (PDT)
Message-ID: <8f2776cb0604131106q7fef0342ic5d0b4990b42b66a@mail.gmail.com>
Date: Fri, 14 Apr 2006 06:02:00 -0000
From: "Jim Blandy" <jimb@red-bean.com>
To: "Eli Zaretskii" <eliz@gnu.org>
Subject: Re: printing wchar_t*
Cc: ghost@cs.msu.su, gdb@sources.redhat.com
In-Reply-To: <ur7412x9q.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
References: <e1lsqg$aml$1@sea.gmane.org> 	 <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com> 	 <ur7412x9q.fsf@gnu.org>
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-04/txt/msg00165.txt.bz2

On 4/13/06, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Thu, 13 Apr 2006 10:31:18 -0700
> > From: "Jim Blandy" <jimb@red-bean.com>
> > Cc: gdb@sources.redhat.com
> >
> > The best approach would be to extend charset.[ch] to handle wide
> > character sets as well, and then add code to the language-specific
> > printing routines to use the charset functions.  (This is fortunately
> > much simpler than adding support for multibyte characters.)
>
> Can you tell why you think it's much simpler?

Okay --- just to be clear, the points below are about multi-byte
characters, not the wide characters Volodya was asking about.

- The code for limiting how much of a string GDB will print, and for
detecting repetitions, seemed like it would be hard to adapt to
multibyte encodings.  Remember that you've got to be completely
agnostic about the encoding; there are stateful encodings out there in
widespread use, etc.

- I don't think GDB should use off-the-shelf conversion stuff like
iconv.  For example, if you're looking at ISO-2022 text with the
character set switching escape codes in there, I'd argue it'd be wrong
for GDB to display those strings without showing the escape codes.
It's a debugger, so people are looking at strings and corresponding
indexes into those strings, and they need to be able to see exactly
what's in there.  iconv handles the escape codes silently.

- Most programs can just print an error message and die if they see
ill-formed multi-byte sequences: you gave them junk; fix it.  GDB
needs to do something more useful; its job is to be helpful exactly
when your program is misbehaving and you don't know why.
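
To make the last two points concrete, here is a minimal sketch in
Python (not GDB code) of the kind of rendering I have in mind.  It
assumes the target string is UTF-8, which is a stateless encoding, so
it does not deal with the stateful case at all.  The function name
render_bytes and the \xNN / \u{...} escape notation are my own
choices for illustration.  Ill-formed bytes and control characters
such as ESC are shown as escapes instead of being dropped or causing
an error.

```python
def render_bytes(raw: bytes) -> str:
    """Render UTF-8 bytes for display, keeping every problem byte visible."""
    out = []
    pos = 0
    while pos < len(raw):
        try:
            # Try to decode everything that is left in one go.
            good, bad = raw[pos:].decode("utf-8"), b""
            pos = len(raw)
        except UnicodeDecodeError as err:
            # err.start and err.end index into the slice we tried to decode.
            good = raw[pos:pos + err.start].decode("utf-8")
            bad = raw[pos + err.start:pos + err.end]
            pos += err.end
        for ch in good:
            if ch == "\\":
                # Double literal backslashes so escapes stay unambiguous.
                out.append("\\\\")
            elif ch.isprintable():
                out.append(ch)
            elif ord(ch) < 0x100:
                # Control characters, including ESC (0x1b), stay visible.
                out.append("\\x%02x" % ord(ch))
            else:
                out.append("\\u{%x}" % ord(ch))
        # Ill-formed bytes are shown one by one rather than rejected.
        out.extend("\\x%02x" % b for b in bad)
    return "".join(out)


print(render_bytes(b"ab\xffcd"))   # prints ab\xffcd
print(render_bytes(b"\x1b$B"))     # prints \x1b$B
```

A real implementation would also have to map each displayed
character back to its byte offset in the target string, and cope
with stateful encodings, neither of which this sketch attempts.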