Message-ID: <8f2776cb0604131106q7fef0342ic5d0b4990b42b66a@mail.gmail.com>
Date: Fri, 14 Apr 2006 06:02:00 -0000
From: "Jim Blandy"
To: "Eli Zaretskii"
Subject: Re: printing wchar_t*
Cc: ghost@cs.msu.su, gdb@sources.redhat.com
References: <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com>
X-SW-Source: 2006-04/txt/msg00165.txt.bz2

On 4/13/06, Eli Zaretskii wrote:
> > Date: Thu, 13 Apr 2006 10:31:18 -0700
> > From: "Jim Blandy"
> > Cc: gdb@sources.redhat.com
> >
> > The best approach would be to extend charset.[ch] to handle wide
> > character sets as well, and then add code to the language-specific
> > printing routines to use the charset functions.  (This is fortunately
> > much simpler than adding support for multibyte characters.)
>
> Can you tell why you think it's much simpler?

Okay --- just to be clear, this is about multi-byte characters, not
wide characters, which is what Volodya was asking about.
- The code for limiting how much of a string GDB will print, and for
  detecting repetitions, seemed like it would be hard to adapt to
  multibyte encodings.  Remember that you've got to be completely
  agnostic about the encoding; there are stateful encodings out there
  in widespread use, etc.

- I don't think GDB should use off-the-shelf conversion stuff like
  iconv.  For example, if you're looking at ISO-2022 text with the
  character set switching escape codes in there, I'd argue it'd be
  wrong for GDB to display those strings without showing the escape
  codes.  It's a debugger, so people are looking at strings and
  corresponding indexes into those strings, and they need to be able
  to see exactly what's in there.  iconv handles the escape codes
  silently.

- Most programs can just print an error message and die if they see
  ill-formed multi-byte sequences: you gave them junk; fix it.  GDB
  needs to do something more useful; its job is to be helpful exactly
  when your program is misbehaving and you don't know why.
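
To make the second and third points concrete, here is a minimal Python
sketch. It is not GDB code: `debugger_repr` is a hypothetical helper
name, the sample bytes are an assumed ISO-2022-JP string, and the octal
escape style is only an approximation of how a debugger might render
non-printable bytes. It shows that a codec-based conversion consumes the
charset-switching escape sequences, so character indexes no longer match
byte indexes, while a byte-by-byte rendering keeps everything visible and
cannot fail on ill-formed input. It does not attempt the print-limit or
repetition logic from the first point.

```python
def debugger_repr(data: bytes) -> str:
    """Render every byte visibly (hypothetical helper, not a GDB function).

    Printable ASCII is shown as-is; backslash and double quote are
    backslash-escaped; every other byte becomes a three-digit octal escape.
    No encoding is assumed, so ill-formed input cannot raise an error.
    """
    out = []
    for b in data:
        if b == 0x5C:                 # backslash
            out.append("\\\\")
        elif b == 0x22:               # double quote
            out.append('\\"')
        elif 0x20 <= b < 0x7F:        # printable ASCII
            out.append(chr(b))
        else:                         # control bytes, ESC, high-bit bytes
            out.append("\\%03o" % b)
    return '"' + "".join(out) + '"'


# Assumed sample: 'A', ESC $ B (switch to JIS X 0208), one two-byte
# character, ESC ( B (switch back to ASCII), 'B'.
sample = b'A\x1b$B$"\x1b(BB'

# A codec-style conversion hides the escape sequences entirely.
decoded = sample.decode("iso2022_jp")
print(len(sample), "bytes ->", len(decoded), "characters after decoding")

# The byte-level view shows the escape codes and preserves byte indexes.
print(debugger_repr(sample))

# Ill-formed input: a strict decoder raises, the byte-level view still works.
junk = b"ok\xff"
try:
    junk.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)
print(debugger_repr(junk))
```

The design choice in the sketch is to treat the string as raw bytes first
and leave any charset interpretation as an optional layer on top, which
is one way to stay agnostic about the encoding.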