Mirror of the gdb-patches mailing list
From: "Pierre Muller" <pierre.muller@ics-cnrs.unistra.fr>
To: "'Tom Tromey'" <tromey@redhat.com>
Cc: <gdb-patches@sourceware.org>
Subject: [RFC-v4] Handle cygwin wchar_t specifics
Date: Tue, 19 Apr 2011 09:18:00 -0000
Message-ID: <004f01cbfe72$adddeb40$0999c1c0$@muller@ics-cnrs.unistra.fr>
In-Reply-To: <m3zknn7a2v.fsf@fleche.redhat.com>



> -----Original Message-----
> From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-owner@sourceware.org] On Behalf Of Tom Tromey
> Sent: Monday, April 18, 2011 19:18
> To: Pierre Muller
> Cc: 'Eli Zaretskii'; jan.kratochvil@redhat.com; gdb-patches@sourceware.org
> Subject: Re: [RFA-v3] Handle cygwin wchar_t specifics
> 
> >>>>> "Pierre" == Pierre Muller <pierre.muller@ics-cnrs.unistra.fr> writes:
> 
> Pierre>   This patch also changes the intermediate_encoding for mingw hosts,
> Pierre>   from "wchar_t" to "UTF-16LE", but this seems to work nicely
> Pierre> for both mingw32 and mingw64 (and only if iconv is found,
> Pierre> otherwise gdb_wchar_t is simply char and phony functions are used).
> 
> Pierre> -#define INTERMEDIATE_ENCODING host_charset ()
> Pierre> +#define DEFAULT_INTERMEDIATE_ENCODING host_charset ()
> 
> This changes the behavior if the gdb user changes the host encoding.
> This is an unusual situation, admittedly, but it seems to me that it is
> just as easy to only introduce the `intermediate_encoding' global in the
> UTF-{16,32} case.
> 
> Pierre> +  intermediate_encoding = DEFAULT_INTERMEDIATE_ENCODING;
> Pierre> +# if defined (USE_WIN32API) || defined (__CYGWIN__)
> Pierre> +  if (sizeof (gdb_wchar_t) == 2)
> Pierre> +    intermediate_encoding = "UTF-16LE";
> Pierre> +# endif
> 
> Here, instead of a special case for __CYGWIN__, and instead of
> hard-coding the endian-ness, just use the same code for all
> __STDC_ISO_10646__ platforms.  Maybe something like:
> 
> intermediate_encoding = xstrprintf ("UTF-%d%s", 8 * sizeof (wchar_t),
>                                     WORDS_BIGENDIAN ? "BE" : "LE");

  Three problems here:

1) We should really use the "gdb_wchar_t" type, not "wchar_t".
2) If sizeof (gdb_wchar_t) == 1, the format string would produce
"UTF-8LE" or "UTF-8BE", and I don't think those names exist, do they?
At least they are not in the "iconv -l" list for current Cygwin.
3) WORDS_BIGENDIAN is not defined at all on Cygwin,
so your code would probably not compile there.

A further question is whether UTF-32 is always supported...
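
One way to answer this kind of question on a given host is to ask iconv
directly instead of reading the "iconv -l" output. The following is only a
minimal sketch, not part of the patch: charset_is_supported and
report_candidates are hypothetical helper names, and the probe relies only on
the POSIX iconv_open and iconv_close calls. On hosts that use a separate
libiconv, linking may need -liconv.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if the local iconv accepts NAME as a
   source charset when converting to UTF-8, and 0 otherwise.  */
int
charset_is_supported (const char *name)
{
  iconv_t cd = iconv_open ("UTF-8", name);

  if (cd == (iconv_t) -1)
    return 0;
  iconv_close (cd);
  return 1;
}

/* Hypothetical helper: print the probe result for the charset names
   discussed in this thread.  The result depends on the host's iconv.  */
void
report_candidates (void)
{
  static const char *const candidates[] = {
    "UTF-16LE", "UTF-16BE", "UCS-2LE", "UCS-2BE",
    "UTF-32LE", "UTF-32BE", "UCS-4LE", "UCS-4BE",
    "UTF-8LE", "UTF-8BE", NULL
  };
  size_t i;

  for (i = 0; candidates[i] != NULL; i++)
    printf ("%-10s %s\n", candidates[i],
            charset_is_supported (candidates[i]) ? "supported" : "unknown");
}
```

Calling report_candidates () from a small main () lists which of these names
the local iconv knows.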

Below is yet another proposal:
it turns the INTERMEDIATE_ENCODING macro into a call to a new
intermediate_encoding function.
This function specifically handles the case where gdb_wchar_t is 2 bytes long:
it first tries UTF-16XE (with X equal to L or B, depending on endianness) and,
if that name is not in the list of supported charsets, tries UCS-2XE.
If neither is known, it falls back to DEFAULT_INTERMEDIATE_ENCODING.
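
The lookup order described above can be sketched outside of gdb as follows.
This is an illustration under assumptions, not the patch itself:
name_in_list and pick_intermediate_encoding are hypothetical names, the
charset list and the fallback are passed in as parameters instead of coming
from charset_enum and DEFAULT_INTERMEDIATE_ENCODING, and endianness is a
run-time argument rather than a WORDS_BIGENDIAN test.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if NAME appears in the NULL-terminated
   array NAMES, and 0 otherwise.  */
int
name_in_list (const char *name, const char *const *names)
{
  size_t i;

  for (i = 0; names[i] != NULL; i++)
    if (strcmp (name, names[i]) == 0)
      return 1;
  return 0;
}

/* Hypothetical sketch of the selection order: for a 2-byte wide
   character type, prefer UTF-16 with the host endianness suffix, then
   UCS-2 with the same suffix; in every other case return FALLBACK.  */
const char *
pick_intermediate_encoding (size_t wchar_size, int big_endian,
                            const char *const *names, const char *fallback)
{
  const char *utf16 = big_endian ? "UTF-16BE" : "UTF-16LE";
  const char *ucs2 = big_endian ? "UCS-2BE" : "UCS-2LE";

  if (wchar_size == 2)
    {
      if (name_in_list (utf16, names))
        return utf16;
      if (name_in_list (ucs2, names))
        return ucs2;
    }
  return fallback;
}
```

Using one shared lookup helper avoids writing the search loop twice; the
patch below keeps the two loops inline and caches the result in a static
variable.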

  As there is apparently no advantage in using UTF-32 over UCS-4 (according
to Eli), I did not extend the change to the 4-byte case.

  Comments welcome,

Pierre Muller


2011-04-19  Pierre Muller  <muller@ics.u-strasbg.fr>

	* gdb_wchar.h (DEFAULT_INTERMEDIATE_ENCODING): New macro.
	(INTERMEDIATE_ENCODING): Change value to intermediate_encoding
	function call.
	(intermediate_encoding): New prototype.
	* charset.c (intermediate_encoding): New function.

Index: charset.c
===================================================================
RCS file: /cvs/src/src/gdb/charset.c,v
retrieving revision 1.43
diff -u -p -r1.43 charset.c
--- charset.c	11 Jan 2011 15:10:01 -0000	1.43
+++ charset.c	19 Apr 2011 09:05:43 -0000
@@ -922,6 +922,50 @@ default_auto_wide_charset (void)
   return GDB_DEFAULT_TARGET_WIDE_CHARSET;
 }
 
+#ifdef WORDS_BIGENDIAN
+#define ENDIAN_SUFFIX "BE"
+#else
+#define ENDIAN_SUFFIX "LE"
+#endif
+
+const char *
+intermediate_encoding (void)
+{
+  if (sizeof (gdb_wchar_t) == 2)
+    {
+      static const char *stored_result = NULL;
+      const char *result;
+      int i;
+
+      if (stored_result)
+	return stored_result;
+      result = "UTF-16" ENDIAN_SUFFIX;
+      /* Check that the name is in the list of handled charsets.  */
+      for (i = 0; charset_enum[i]; i++)
+	{
+	  if (strcmp (result, charset_enum[i]) == 0)
+	    {
+	      stored_result = result;
+	      return result;
+	    }
+	}
+      /* Second try, with UCS-2 type.  */
+      result = "UCS-2" ENDIAN_SUFFIX;
+      /* Check that the name is in the list of handled charsets.  */
+      for (i = 0; charset_enum[i]; i++)
+	{
+	  if (strcmp (result, charset_enum[i]) == 0)
+	    {
+	      stored_result = result;
+	      return result;
+	    }
+	}
+    }
+  /* if gdb_wchar_t is not of size 2, or if "UTF-16XE" and "UCS-2XE" are
+     not known, use DEFAULT_INTERMEDIATE_ENCODING macro.  */
+  return DEFAULT_INTERMEDIATE_ENCODING;
+}
+
 void
 _initialize_charset (void)
 {
Index: gdb_wchar.h
===================================================================
RCS file: /cvs/src/src/gdb/gdb_wchar.h,v
retrieving revision 1.6
diff -u -p -r1.6 gdb_wchar.h
--- gdb_wchar.h	1 Jan 2011 15:33:05 -0000	1.6
+++ gdb_wchar.h	19 Apr 2011 09:05:43 -0000
@@ -79,12 +79,12 @@ typedef wint_t gdb_wint_t;
    hosts that emit a BOM when the unadorned name is used.  */
 #if defined (__STDC_ISO_10646__)
 #if WORDS_BIGENDIAN
-#define INTERMEDIATE_ENCODING "UCS-4BE"
+#define DEFAULT_INTERMEDIATE_ENCODING "UCS-4BE"
 #else
-#define INTERMEDIATE_ENCODING "UCS-4LE"
+#define DEFAULT_INTERMEDIATE_ENCODING "UCS-4LE"
 #endif
 #elif defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x108
-#define INTERMEDIATE_ENCODING "wchar_t"
+#define DEFAULT_INTERMEDIATE_ENCODING "wchar_t"
 #else
 /* This shouldn't happen, because the earlier #if should have filtered
    out this case.  */
@@ -115,11 +115,14 @@ typedef int gdb_wint_t;
    also providing a phony iconv, we might as well just stick with
    "wchar_t".  */
 #ifdef PHONY_ICONV
-#define INTERMEDIATE_ENCODING "wchar_t"
+#define DEFAULT_INTERMEDIATE_ENCODING "wchar_t"
 #else
-#define INTERMEDIATE_ENCODING host_charset ()
+#define DEFAULT_INTERMEDIATE_ENCODING host_charset ()
 #endif
 
 #endif
 
+#define INTERMEDIATE_ENCODING intermediate_encoding ()
+const char *intermediate_encoding (void);
+
 #endif /* GDB_WCHAR_H */

