Date: Thu, 19 Dec 2019 15:17:00 -0000
Message-Id: <831rt02vlb.fsf@gnu.org>
From: Eli Zaretskii
To: gdb@sourceware.org
Subject: Thread names and non-ASCII characters

Can someone tell me what GDB assumes to be the character encoding of the thread names we get from system APIs (such as pthread_getname_np)? It sounds like we assume the host character set, since the functions used to display the thread name don't perform any encoding conversion. Is my understanding correct?

I'm asking because Windows 10 introduces a new API for setting and getting a thread's name, but this API wants a UTF-16 encoded string, so if we want to use it, we need to decide which encoding sits on the other side of the conversion to and from UTF-16.
The current code in windows-nat.c that processes the special MSVC exception used on older platforms to set thread names for debugging purposes simply copies the name as an array of 'char', so it, too, implicitly assumes the host encoding (a.k.a. "system codepage" in Windows parlance). Am I missing something?
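
To make the conversion question concrete, here is a minimal, portable sketch of turning a UTF-16 thread name into a byte string. It assumes UTF-8 as the target encoding purely for illustration; the question above is precisely whether the target should instead be the host character set. The helper name utf16_to_utf8 is hypothetical and is not part of GDB. On Windows itself one would more likely call WideCharToMultiByte with the chosen code page, on a name obtained from the new API (presumably GetThreadDescription).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GDB: convert the NUL-terminated
   UTF-16 string IN to UTF-8 in OUT, which holds OUT_SIZE bytes.
   Returns the number of bytes written, excluding the terminating NUL,
   or (size_t) -1 if OUT is too small.  Unpaired surrogates are
   replaced with U+FFFD.  */
size_t
utf16_to_utf8 (const uint16_t *in, char *out, size_t out_size)
{
  size_t n = 0;

  assert (in != NULL && out != NULL);
  if (out_size == 0)
    return (size_t) -1;

  while (*in != 0)
    {
      uint32_t cp = *in++;

      if (cp >= 0xD800 && cp <= 0xDBFF)
        {
          /* High surrogate: valid only if a low surrogate follows.  */
          if (*in >= 0xDC00 && *in <= 0xDFFF)
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t) (*in++ - 0xDC00);
          else
            cp = 0xFFFD;
        }
      else if (cp >= 0xDC00 && cp <= 0xDFFF)
        cp = 0xFFFD;            /* Stray low surrogate.  */

      size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;

      /* Leave room for this character and the terminating NUL.  */
      if (n + len + 1 > out_size)
        return (size_t) -1;

      if (len == 1)
        out[n++] = (char) cp;
      else if (len == 2)
        {
          out[n++] = (char) (0xC0 | (cp >> 6));
          out[n++] = (char) (0x80 | (cp & 0x3F));
        }
      else if (len == 3)
        {
          out[n++] = (char) (0xE0 | (cp >> 12));
          out[n++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          out[n++] = (char) (0x80 | (cp & 0x3F));
        }
      else
        {
          out[n++] = (char) (0xF0 | (cp >> 18));
          out[n++] = (char) (0x80 | ((cp >> 12) & 0x3F));
          out[n++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          out[n++] = (char) (0x80 | (cp & 0x3F));
        }
    }

  out[n] = '\0';
  return n;
}
```

A caller would pass the UTF-16 name and a char buffer, then treat the result as UTF-8. If GDB really does assume the host character set for thread names, a further conversion from UTF-8 to that charset (or a direct UTF-16 to host-charset conversion) would still be needed before display.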