Date: Thu, 26 Jun 2025 08:51:10 +0300
Message-Id: <86o6ubcapd.fsf@gnu.org>
From: Eli Zaretskii
To: Pedro Alves
Cc: tromey@adacore.com, aburgess@redhat.com, gdb-patches@sourceware.org
In-Reply-To: (message from Pedro Alves on Wed, 25 Jun 2025 20:11:07 +0100)
Subject: Re: [PATCH 2/2] Allow check-mark to be changed for CLI
References: <20250509-emoji-check-mark-v1-0-63b6c52411f3@adacore.com>
 <20250509-emoji-check-mark-v1-2-63b6c52411f3@adacore.com>
 <87zfffnqrj.fsf@redhat.com> <87wmag63jg.fsf@tromey.com>
 <86ikm0y1hi.fsf@gnu.org>
List-Id: Gdb-patches mailing list

> Date: Wed, 25 Jun 2025 20:11:07 +0100
> Cc: aburgess@redhat.com, gdb-patches@sourceware.org
> From: Pedro Alves
>
> >> I don't think there's any way to figure out what the display width of a
> >> string might be. At least, not unless gdb adds a dependency on
> >> something like libicu.
> >
> > I don't think even libicu will be enough, because the terminal
> > emulators vary in this respect.
> >
> > There's some recent protocol to help with this, but I don't think it's
> > supported widely enough for GDB to rely on it.
>
> I'm curious on what protocol is this, and what it addresses.

That's "Mode 2027". For the details, see

  https://github.com/contour-terminal/terminal-unicode-core

and

  https://mitchellh.com/writing/grapheme-clusters-in-terminals

> If we assume monospace font in the terminal, what goes wrong?

A monospace font is only relevant for single-character strings. Once
you try to display a string that has more than one Unicode character,
a terminal may or may not combine those characters into one or more
font glyphs (it's a many-to-many relation, so you could have M
characters displayed as N glyphs, where M could be less than, equal
to, or greater than N, depending on the characters in the string and
the capabilities of the font used by the terminal). If that happens,
wcswidth will usually produce an incorrect result, because it doesn't
(and cannot) know about these features of the terminal, and basically
just sums the results of wcwidth for each character in the string.

> Is there a downside to using the number of columns wcwidth says the character occupies,
> and default to 1 if wcwidth returns -1?

See above. This discussion started in the context of showing Emoji,
where the case of combining several characters into one or more glyphs
happens very frequently. For example, the sequence U+2713 followed by
U+FE0F VARIATION SELECTOR-16 should display a single glyph on capable
terminals (try it in a GUI session in Emacs to see that), although
it's two characters. In some cases, wcswidth will do the job, because
control characters like U+FE0F have zero width in its database, but
that is not universally so. For example, the following string of
characters:

  U+1F468 U+200D U+1F469 U+200D U+1F466

will be displayed by capable terminals as a single glyph (again, try
in Emacs to see), but wcswidth will probably return 6 for it.

And even for single-character strings wcwidth will sometimes produce
incorrect results, because the fonts used by modern terminal emulators
for some exotic characters (like Emoji) are not monospaced.
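
To make the two sequences above concrete, here is a rough Python
sketch. It is not GDB code and it does not call the C library: the
helpers approx_wcwidth and approx_wcswidth are hypothetical names, and
they only approximate what a typical wcwidth database does, using
Python's unicodedata module (assumption: combining marks and format
controls count as zero columns, East Asian Wide/Fullwidth characters
as two, everything else as one; the -1 case for non-printable
characters is not modelled).

```python
import unicodedata


def approx_wcwidth(ch):
    """Rough stand-in for wcwidth(): columns assumed for one code point."""
    # Assumption: combining marks (Mn, Me) and format controls (Cf),
    # which include U+FE0F and U+200D, occupy zero columns.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # Assumption: East Asian Wide and Fullwidth characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def approx_wcswidth(s):
    """Sum the per-code-point widths, with no knowledge of the terminal."""
    return sum(approx_wcwidth(ch) for ch in s)


# U+2713 followed by U+FE0F VARIATION SELECTOR-16.
check_mark = "\u2713\ufe0f"
# U+1F468 U+200D U+1F469 U+200D U+1F466.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F466"

for name, s in (("check mark", check_mark), ("family", family)):
    print(name, "code points:", len(s), "summed width:", approx_wcswidth(s))
```

Under these assumptions the first sequence sums to 1 column and the
second to 6, whatever the terminal actually draws. The sum looks at
each code point in isolation, so it has no way to learn that a capable
terminal renders the second sequence as a single glyph.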