To: Joel Brobecker
Cc: Julian Brown, gdb-patches@sourceware.org
Subject: Re: [PATCH/WIP] C/C++ wchar_t/Unicode printing support
References: <20090115202411.5f154657@rex.config> <20090130194343.GA3964@adacore.com> <20090201182344.GD4597@caradoc.them.org>
From: Tom Tromey
Date: Sun, 01 Feb 2009 22:42:00 -0000
In-Reply-To: <20090201182344.GD4597@caradoc.them.org> (Daniel Jacobowitz's message of "Sun\, 1 Feb 2009 13\:23\:44 -0500")

>>>>> "Daniel" == Daniel Jacobowitz writes:

>> I suppose one option would be to have a
>> degraded mode where we require that the host charset and the target
>> charset be the same.  Then maybe we could make it work by redefining
>> iswprint and wchar_t.

Daniel> I don't see the connection between the iconv dependency and
Daniel> iswprint / wchar_t.  Are there portability issues for those
Daniel> too?  They don't come from libiconv.

Yeah, there isn't a direct connection.

Basically there are two problems to solve.

One is having a way to convert from a target charset of some kind to a
host charset of some kind.

The other issue is deciding how to print things on the host.  We want
to use a host wide character of some sort, so that we can print a
larger subset of characters on a capable terminal.  This also lets us
handle "set print repeat", on some platforms anyway, without needing
details about a possible host-side variable-length encoding.

I chose to solve the first problem by using iconv for all the
conversions, and the second by using wchar_t and iswprint for host
printability decisions.

There may be portability issues for the use of wchar_t and iswprint.
I don't know.  It would be helpful if someone with access to the more
exotic hosts out there could take a look.

Daniel> It seems like a dummy version of iconv_open which only
Daniel> succeeds if the two character sets are the same, plus a
Daniel> pass-through version of iconv, would be enough to remove the
Daniel> iconv dependency.  That degraded mode covers all local
Daniel> debugging.

The wchar_t issue comes into play because we actually do two
conversions when printing: one from the target charset to the host
wchar_t, and then a second one from the host wchar_t to the host
"narrow" charset.

This just adds a wrinkle to the implementation, though -- the general
plan still applies.  We could either pretend that wchar_t == char, or
we could make an iconv that uses the mb* functions.

I can implement this, but I'd rather do it only if it is truly
needed.  How are you planning to handle this for Code Sourcery?
Really I would like to hear the answer to this from anybody shipping a
gdb executable.

I suppose my recommendation would be to put GNU libiconv into your
local tree, with some configury tweaks to make it build a static
library.  This does not seem very hard, though I suppose it is only
suitable if you are not too concerned about the resulting executable
size.

Another portability question is whether there is a platform that does
not have iconv at all.  If every host we care about has some form of
iconv, even a bad one, perhaps we don't have to worry much -- users
could still have a functional-for-basic-native-debugging gdb.

Daniel> There'd need to be a little additional
Daniel> logic too, to allow you to set all the charset variables at
Daniel> once

I think "set charset" already does this.  It doesn't handle the target
wide charset, but that seems ok in the degraded functionality mode.

Tom
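
For illustration, the degraded mode Daniel describes (an iconv_open
stand-in that succeeds only when the two charsets are the same, plus a
pass-through iconv) might look roughly like the sketch below.  This is
not code from the patch.  Every degraded_iconv_* name is hypothetical,
the descriptor is a plain int rather than a real iconv_t, and charset
names are compared with strcmp, so "UTF-8" and "utf-8" would count as
different.  It also takes the "pretend that wchar_t == char" route,
copying bytes without any wide-character step.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for iconv_t; not a gdb or libiconv type. */
typedef int degraded_iconv_t;
#define DEGRADED_ICONV_FAIL ((degraded_iconv_t) -1)

/* Succeed only when source and destination charsets are identical.
   Assumption: exact string match; a real shim would likely want a
   case-insensitive comparison and alias handling. */
degraded_iconv_t
degraded_iconv_open (const char *to, const char *from)
{
  if (strcmp (to, from) != 0)
    {
      errno = EINVAL;   /* same errno iconv_open uses for unsupported pairs */
      return DEGRADED_ICONV_FAIL;
    }
  return 0;
}

/* Pass-through conversion: copy as many bytes as fit, advancing the
   caller's pointers and counts the way iconv does. */
size_t
degraded_iconv (degraded_iconv_t cd, const char **inbuf, size_t *inleft,
                char **outbuf, size_t *outleft)
{
  assert (cd != DEGRADED_ICONV_FAIL);

  /* A null input means "reset state"; there is no state to reset. */
  if (inbuf == NULL || *inbuf == NULL)
    return 0;

  size_t n = *inleft < *outleft ? *inleft : *outleft;
  memcpy (*outbuf, *inbuf, n);
  *inbuf += n;
  *inleft -= n;
  *outbuf += n;
  *outleft -= n;

  if (*inleft > 0)
    {
      errno = E2BIG;    /* output buffer too small, as with iconv */
      return (size_t) -1;
    }
  return 0;             /* no non-reversible conversions performed */
}

int
degraded_iconv_close (degraded_iconv_t cd)
{
  (void) cd;            /* nothing was allocated */
  return 0;
}
```

A caller would use these exactly as it would use iconv_open, iconv and
iconv_close, which is the point of the suggestion: the printing code
stays the same, and only hosts without a usable iconv get the
restricted behavior.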