Date: Wed, 15 Jun 2011 19:42:00 -0000
From: Corinna Vinschen
To: gcc-patches@gcc.gnu.org, gdb-patches@sourceware.org
Subject: Re: [RFA/libiberty] Darwin has case-insensitive filesystems
Message-ID: <20110615194152.GC18126@calimero.vinschen.de>
In-Reply-To: <8362o79f9j.fsf@gnu.org>
References: <1308087182-26577-1-git-send-email-brobecker@adacore.com> <201106142201.p5EM1vOd006127@greed.delorie.com> <20110615082236.GP12140@calimero.vinschen.de> <8362o79f9j.fsf@gnu.org>

On Jun 15 20:27, Eli Zaretskii wrote:
> > Date: Wed, 15 Jun 2011 10:22:36 +0200
> > From: Corinna Vinschen <...>
> >
> > Talking about case-insensitive comparison, the filename_cmp and
> > filename_ncmp functions don't work for multibyte codesets, only for
> > singlebyte codesets.  Given that UTF-8 is standard nowadays, shouldn't
> > these functions be replaced with multibyte-aware versions?
>
> I agree, but if we go that way, shouldn't we support UTF-16, which is
> used by the native Windows APIs?  Windows does not use UTF-8 for file
> names.

I don't think so.  UTF-16 is Windows' wchar_t (or WCHAR) codeset, but
the applications calling the libiberty functions are using the char
datatype with single- or multibyte codesets.  If the filename_cmp
function converts the multibyte input strings to wchar_t and compares
the wide char strings case insensitive(*), they would use UTF-16 under
the hood on Windows anyway.

(*) As proposed in
    http://sourceware.org/ml/gdb-patches/2011-06/msg00210.html,
    basically like this:

    #ifdef _WIN32
    #define wcscasecmp _wcsicmp
    #endif

    mbstowcs (wide_a, a);
    mbstowcs (wide_b, b);
    return wcscasecmp (wide_a, wide_b);


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
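
For illustration only, the convert-then-compare idea sketched in the footnote above could be fleshed out roughly as follows. This is a minimal sketch, not the libiberty implementation: the names filename_cmp_mb and to_wide are hypothetical, the three-argument mbstowcs call with a NULL destination is assumed to return the required length (as POSIX and the Microsoft CRT document), and a hand-written towlower loop stands in for wcscasecmp / _wcsicmp so that the code stays within ISO C. The fallback to strcmp on undecodable input is also an assumption, not something the thread specifies.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: convert a multibyte string to a freshly
   allocated wide string using the current LC_CTYPE locale.  Returns
   NULL on an invalid sequence or allocation failure; the caller
   frees the result.  */
static wchar_t *
to_wide (const char *s)
{
  /* Assumption: a NULL destination makes mbstowcs return the number
     of wide characters needed, without writing anything.  */
  size_t len = mbstowcs (NULL, s, 0);
  wchar_t *w;

  if (len == (size_t) -1)
    return NULL;
  w = malloc ((len + 1) * sizeof (wchar_t));
  if (w == NULL)
    return NULL;
  mbstowcs (w, s, len + 1);     /* len + 1 leaves room for the L'\0' */
  return w;
}

/* Hypothetical multibyte-aware, case-insensitive comparison.  Returns
   a value less than, equal to, or greater than zero, like strcmp.  */
int
filename_cmp_mb (const char *a, const char *b)
{
  wchar_t *wa = to_wide (a);
  wchar_t *wb = to_wide (b);
  int result;

  if (wa == NULL || wb == NULL)
    {
      /* Assumed fallback: plain byte-wise, case-sensitive compare.  */
      result = strcmp (a, b);
    }
  else
    {
      const wchar_t *p = wa;
      const wchar_t *q = wb;
      wint_t cp, cq;

      /* Advance while both strings agree after case folding.  */
      while (*p != L'\0'
             && towlower ((wint_t) *p) == towlower ((wint_t) *q))
        {
          p++;
          q++;
        }
      cp = towlower ((wint_t) *p);
      cq = towlower ((wint_t) *q);
      result = (cp > cq) - (cp < cq);
    }

  free (wa);
  free (wb);
  return result;
}
```

A caller would normally run setlocale (LC_CTYPE, "") once at startup so that mbstowcs decodes the user's codeset (UTF-8, for instance) rather than the default "C" locale. On Windows the wide strings are UTF-16, so the comparison uses UTF-16 under the hood there, as described above, while callers keep passing plain char strings.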