From: "Pierre Muller"
To: "'Tom Tromey'"
Subject: [RFC-v5] Handle cygwin wchar_t specifics
Date: Tue, 19 Apr 2011 13:56:00 -0000
Message-ID: <000c01cbfe99$8c9d1fb0$a5d75f10$@muller@ics-cnrs.unistra.fr>

> -----Original Message-----
> From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-owner@sourceware.org] On Behalf Of Tom Tromey
> Sent: Tuesday, 19 April 2011 15:19
> To: Pierre Muller
> Cc: gdb-patches@sourceware.org
> Subject: Re: [RFC-v4] Handle cygwin wchar_t specifics
>
> >>>>> "Pierre" == Pierre Muller writes:
>
> Pierre> 1) we should really use "gdb_wchar_t" type, not "wchar_t"
>
> Yeah.
>
> Pierre> 2) If sizeof(gdb_wchar_t) == 1
> Pierre> I don't think that UTF-8LE and UTF-8BE exist, do they?
> Pierre> At least they are not in the iconv -l list for current cygwin.
>
> A platform where this is true should not define __STDC_ISO_10646__.
> You might as well just assert that the size is 2 or 4.
>
> Pierre> 3) WORD_BIGENDIAN is not defined at all on Cygwin,
> Pierre> so that your code would probably not compile.
>
> Yeah, I forgot, you need #if.  See config.in.
>
> Pierre> A further question is whether UTF-32 is always supported...
>
> If someone can find a platform where wchar_t is 4 bytes, where
> __STDC_ISO_10646__ is defined, and where UTF-32 is not understood, then
> we can complain bitterly and change the code again.
>
> Pierre> Below is yet another proposal:
> Pierre> it transforms INTERMEDIATE_ENCODING macro into a call to
> Pierre> intermediate_encoding function.
>
> I'd prefer it if the new code is only used in the __STDC_ISO_10646__
> case.

Done below.

> Pierre> +#ifdef WORDS_BIGENDIAN
>
> #if

OK, corrected below.

> Pierre> +const char *
> Pierre> +intermediate_encoding (void)
>
> New functions require an introductory comment.

I wrote a minimal description; feel free to improve it.

> Pierre>  #ifdef PHONY_ICONV
> Pierre> -#define INTERMEDIATE_ENCODING "wchar_t"
> Pierre> +#define DEFAULT_INTERMEDIATE_ENCODING "wchar_t"
>
> I don't think DEFAULT_INTERMEDIATE_ENCODING is needed.

I assumed you meant that it is not necessary if PHONY_ICONV is defined,
and this is what I changed below.
(Personally, I would have preferred to remove the INTERMEDIATE_ENCODING
macro completely and call the function directly.)

> Tom

Thanks for your comments.
I tried to take all of them into account in the new version below.

Checked on cygwin (where __STDC_ISO_10646__ is defined),
mingw32 (not defined) and mingw64 (no iconv at all, and consequently no
intermediate_encoding function).
On all three, GDB at least prints its version correctly.

More comments?

Pierre

2011-04-19  Pierre Muller

	* gdb_wchar.h (DEFAULT_INTERMEDIATE_ENCODING): New macro.
	(INTERMEDIATE_ENCODING): Change value to intermediate_encoding
	function call.
	(intermediate_encoding): New prototype.
	* charset.c (ENDIAN_SUFFIX): New macro.
	(intermediate_encoding): New function.

Index: charset.c
===================================================================
RCS file: /cvs/src/src/gdb/charset.c,v
retrieving revision 1.43
diff -u -p -r1.43 charset.c
--- charset.c	11 Jan 2011 15:10:01 -0000	1.43
+++ charset.c	19 Apr 2011 13:42:54 -0000
@@ -922,6 +922,59 @@ default_auto_wide_charset (void)
   return GDB_DEFAULT_TARGET_WIDE_CHARSET;
 }
 
+
+#ifndef PHONY_ICONV
+/* Macro used for UTF or UCS endianness suffix.  */
+#if WORDS_BIGENDIAN
+#define ENDIAN_SUFFIX "BE"
+#else
+#define ENDIAN_SUFFIX "LE"
+#endif
+
+/* intermediate_encoding returns the charset used internally by
+   GDB to convert between target and host encodings.  */
+
+const char *
+intermediate_encoding (void)
+{
+#ifdef __STDC_ISO_10646__
+  if (sizeof (gdb_wchar_t) == 2)
+    {
+      static const char *stored_result = NULL;
+      const char *result;
+      int i;
+
+      if (stored_result)
+        return stored_result;
+      result = "UTF-16" ENDIAN_SUFFIX;
+      /* Check that the name is in the list of handled charsets.  */
+      for (i = 0; charset_enum[i]; i++)
+        {
+          if (strcmp (result, charset_enum[i]) == 0)
+            {
+              stored_result = result;
+              return result;
+            }
+        }
+      /* Second try, with UCS-2 type.  */
+      result = "UCS-2" ENDIAN_SUFFIX;
+      /* Check that the name is in the list of handled charsets.  */
+      for (i = 0; charset_enum[i]; i++)
+        {
+          if (strcmp (result, charset_enum[i]) == 0)
+            {
+              stored_result = result;
+              return result;
+            }
+        }
+    }
+#endif /* __STDC_ISO_10646__ */
+  /* If gdb_wchar_t is not of size 2, or if "UTF-16XE" and "UCS-2XE" are
+     not known, use DEFAULT_INTERMEDIATE_ENCODING macro.  */
+  return DEFAULT_INTERMEDIATE_ENCODING;
+}
+#endif /* not PHONY_ICONV */
+
 void
 _initialize_charset (void)
 {
Index: gdb_wchar.h
===================================================================
RCS file: /cvs/src/src/gdb/gdb_wchar.h,v
retrieving revision 1.6
diff -u -p -r1.6 gdb_wchar.h
--- gdb_wchar.h	1 Jan 2011 15:33:05 -0000	1.6
+++ gdb_wchar.h	19 Apr 2011 13:42:54 -0000
@@ -79,18 +79,20 @@ typedef wint_t gdb_wint_t;
    hosts that emit a BOM when the unadorned name is used.  */
 #if defined (__STDC_ISO_10646__)
 #if WORDS_BIGENDIAN
-#define INTERMEDIATE_ENCODING "UCS-4BE"
+#define DEFAULT_INTERMEDIATE_ENCODING "UCS-4BE"
 #else
-#define INTERMEDIATE_ENCODING "UCS-4LE"
+#define DEFAULT_INTERMEDIATE_ENCODING "UCS-4LE"
 #endif
 #elif defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x108
-#define INTERMEDIATE_ENCODING "wchar_t"
+#define DEFAULT_INTERMEDIATE_ENCODING "wchar_t"
 #else
 /* This shouldn't happen, because the earlier #if should have filtered
    out this case.  */
 #error "Neither __STDC_ISO_10646__ nor _LIBICONV_VERSION defined"
 #endif
 
+#define INTERMEDIATE_ENCODING intermediate_encoding ()
+
 #else
 
 /* If we got here and have wchar_t support, we might be on a system
@@ -117,9 +119,13 @@ typedef int gdb_wint_t;
 #ifdef PHONY_ICONV
 #define INTERMEDIATE_ENCODING "wchar_t"
 #else
-#define INTERMEDIATE_ENCODING host_charset ()
+#define DEFAULT_INTERMEDIATE_ENCODING host_charset ()
+#endif
+
 #endif
 
+#ifndef PHONY_ICONV
+const char *intermediate_encoding (void);
 #endif
 
 #endif /* GDB_WCHAR_H */