Date: Mon, 18 Apr 2011 10:57:00 -0000
From: Eli Zaretskii
Subject: Re: [RFA-v2] Handle cygwin wchar_t specifics
In-reply-to: <00a801cbfdb4$551214a0$ff363de0$%muller@ics-cnrs.unistra.fr>
To: Pierre Muller
Cc: jan.kratochvil@redhat.com, tromey@redhat.com, gdb-patches@sourceware.org
Reply-to: Eli Zaretskii
Message-id: <83aafnsu8w.fsf@gnu.org>
References: <5928.31498147479$1302882967@news.gmane.org>
 <005101cbfc50$193136b0$4b93a410$%muller@ics-cnrs.unistra.fr>
 <20110416162455.GA5599@host1.jankratochvil.net>
 <000001cbfc7d$3f67f440$be37dcc0$%muller@ics-cnrs.unistra.fr>
 <83zknpoacd.fsf@gnu.org>
 <00a801cbfdb4$551214a0$ff363de0$%muller@ics-cnrs.unistra.fr>
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
X-SW-Source: 2011-04/txt/msg00274.txt.bz2

> From: "Pierre Muller"
> Date: Mon, 18 Apr 2011 12:35:26 +0200
>
> > > -/* If __STDC_ISO_10646__ is defined, then the host wchar_t is UCS-4.
> > > +/* If __STDC_ISO_10646__ is defined, then the host wchar_t is UCS-4 or UCS-2.
> >
> > Please use UTF-16, not UCS-2.  What Windows uses is the former.  The
> > latter is the old name from the days when Unicode covered only the
> > BMP; it was superseded by UTF-16 that covers more than that.
>
> Are you sure this is correct?
> I tried what you said, but "UTF-16" seems to mean "UTF-16BE"
> while "UTF-16LE" seems to do a better job.

UTF-16 means both LE and BE varieties.  I meant to use UTF-16 in the
comment, instead of UCS-2.  In the code, you need to use the variety
that suits the endianness of the host platform.

> But if UTF-16 is better than UCS-2,
> shouldn't we also favor UTF-32 over UCS-4?

IMO, there's no need, since Unicode still hasn't exceeded 32 bits.
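
To illustrate the point about picking the UTF-16 variety that matches the host's endianness, here is a minimal sketch in C. It is not taken from the patch under review: the helper names host_is_little_endian and host_utf16_charset_name are hypothetical, and the only strings it relies on are the "UTF-16LE" and "UTF-16BE" charset names already mentioned in this thread. It assumes a host that is either purely little-endian or purely big-endian.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GDB: return 1 if the host stores
   the least significant byte first, 0 otherwise.  It inspects the
   first byte in memory of a 16-bit value equal to 1.  */
static int
host_is_little_endian (void)
{
  const uint16_t probe = 1;

  return *(const unsigned char *) &probe == 1;
}

/* Hypothetical helper, not part of GDB: return the explicit-endian
   charset name to use for a host whose wchar_t holds UTF-16 code
   units, so that the byte order is never left for the conversion
   library to guess.  */
static const char *
host_utf16_charset_name (void)
{
  return host_is_little_endian () ? "UTF-16LE" : "UTF-16BE";
}
```

On a little-endian host such as x86 Windows or Cygwin, host_utf16_charset_name returns "UTF-16LE", which would match the observation above that "UTF-16LE" does the better job there. The comment in the source can still say plain UTF-16, since that name covers both byte orders.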