From: "Pierre Muller" <pierre.muller@ics-cnrs.unistra.fr>
To: "'Tom Tromey'" <tromey@redhat.com>
Cc: <gdb-patches@sourceware.org>
Subject: RE: [RFC-v5] Handle cygwin wchar_t specifics
Date: Wed, 20 Apr 2011 07:59:00 -0000
Message-ID: <000201cbff30$e2266030$a6732090$@muller@ics-cnrs.unistra.fr>
In-Reply-To: <m3wriq3zbu.fsf@fleche.redhat.com>
> Tom> I don't think DEFAULT_INTERMEDIATE_ENCODING is needed.
>
> Pierre> I assumed you meant: not necessary if PHONY_ICONV is defined,
> Pierre> and this is what I changed below.
> Pierre> (I would personally have favored to completely remove
> Pierre> INTERMEDIATE_ENCODING macro and call the function directly.)
>
> Sorry, that isn't what I meant.
Hopefully I got it right this time...
> All this new code is needed only in the __STDC_ISO_10646__ case.
> All other cases are already handled ok.
> So, I think it is best to only introduce new code along the
> __STDC_ISO_10646__ branches. Thus far your patches have touched all the
> other branches -- but there is no reason to do that, and I think it just
> makes it more complicated without an associated benefit.
>
> Pierre> +#ifdef __STDC_ISO_10646__
> Pierre> + if (sizeof (gdb_wchar_t) == 2)
>
> You might as well unify the 2 and 4 byte cases like I said earlier, and
> just die for any other value.
Done below.
> You can use a static assert trick to make
> it die during compilation, which I think is better than dying at
> runtime. E.g.:
> extern char your_platform_is_bogus[(sizeof (gdb_wchar_t) == 2
> || sizeof (gdb_wchar_t) == 4)
> ? 1 : -1];
Used below (renamed your_gdb_wchar_t_is_bogus).
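For readers unfamiliar with the trick, here is a minimal standalone sketch of the negative-array-size check. The names demo_wchar_t, demo_wchar_size_is_supported and demo_wchar_bits are hypothetical stand-ins chosen for illustration and are not part of the patch; plain wchar_t is assumed in place of gdb_wchar_t.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for gdb_wchar_t; the real type comes from
   gdb_wchar.h.  */
typedef wchar_t demo_wchar_t;

/* The array size evaluates to 1 when the size is supported and to -1
   otherwise.  A negative array size is a constraint violation, so an
   unsupported size stops the build instead of failing at runtime.  */
extern char demo_wchar_size_is_supported[(sizeof (demo_wchar_t) == 2
                                          || sizeof (demo_wchar_t) == 4)
                                         ? 1 : -1];

/* Number of bits in one wide character.  */
static int
demo_wchar_bits (void)
{
  int bits = (int) (sizeof (demo_wchar_t) * 8);

  /* Redundant with the declaration above: if this file compiled,
     the value can only be 16 or 32.  */
  assert (bits == 16 || bits == 32);
  return bits;
}
```

Because the declaration is extern and never referenced, it should not require a definition and should not trigger an unused-variable warning.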
> Pierre> + /* Check that the name is in the list of handled charsets. */
> Pierre> + for (i = 0; charset_enum[i]; i++)
>
> I don't think this is really needed either.
> Or, if you really want to do the check, do it by calling iconv_open at
> initialization, and then just make gdb die early -- whatever platform
> does this is really messed up.
Also done below.
> Pierre> + /* if gdb_wchar_t is not of size 2, or if "UTF-16XE" and "UCS-2XE" are
> Pierre> + not known, use DEFAULT_INTERMEDIATE_ENCODING macro. */
> Pierre> + return DEFAULT_INTERMEDIATE_ENCODING;
>
> I don't think this will generally do the right thing.
> For example, your patch defines DEFAULT_INTERMEDIATE_ENCODING to
> "UCS-4LE" in the !WORDS_BIGENDIAN case. But we already know that
> gdb_wchar_t has 2 bytes. So I think this will just result in the same
> bug as today.
I hope I have now understood what you wanted:
the new code makes fewer changes to gdb_wchar.h.
It uses the intermediate_encoding function only in the case where
UCS-4LE/BE was set before.
To avoid compiling this code in other cases,
I defined a new macro called USE_INTERMEDIATE_ENCODING_FUNCTION,
and the charset.c changes are limited to this conditional.
I used iconv_open to check for working charset names
and added a call to error if none is found.
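As a rough standalone illustration of the probing order described above (UTF first, then UCS, otherwise failure), here is a sketch outside of GDB. The helpers demo_conversion_supported and demo_pick_encoding are hypothetical names; the sketch uses snprintf and a caller-supplied buffer instead of xstrprintf, takes the host charset as a parameter instead of calling host_charset, and returns NULL where the patch calls error.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if iconv_open accepts a conversion from FROM to TO,
   0 otherwise.  The descriptor is closed again immediately.  */
static int
demo_conversion_supported (const char *to, const char *from)
{
  iconv_t desc = iconv_open (to, from);

  if (desc == (iconv_t) -1)
    return 0;
  iconv_close (desc);
  return 1;
}

/* Write the first supported candidate name into BUF and return BUF,
   or return NULL if neither candidate is accepted.  WCHAR_BYTES is
   assumed to be 2 or 4.  */
static const char *
demo_pick_encoding (char *buf, size_t buflen, int wchar_bytes,
                    int big_endian, const char *host)
{
  const char *suffix = big_endian ? "BE" : "LE";

  /* First try: UTF-16 or UTF-32, named by bit width.  */
  snprintf (buf, buflen, "UTF-%d%s", wchar_bytes * 8, suffix);
  if (demo_conversion_supported (buf, host))
    return buf;

  /* Second try: UCS-2 or UCS-4, named by byte width.  */
  snprintf (buf, buflen, "UCS-%d%s", wchar_bytes, suffix);
  if (demo_conversion_supported (buf, host))
    return buf;

  return NULL;
}
```

A caller could pass a 32-byte buffer, the wide character size, and a host charset such as "UTF-8"; which name is returned depends on the iconv implementation in use.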
Comments?
Pierre
2011-04-20 Pierre Muller <muller@ics.u-strasbg.fr>
* gdb_wchar.h (USE_INTERMEDIATE_ENCODING_FUNCTION): New macro.
(INTERMEDIATE_ENCODING): Change value to intermediate_encoding
function call if __STDC_ISO_10646__ macro is defined.
(intermediate_encoding): New prototype.
* charset.c (your_gdb_wchar_t_is_bogus): New test variable
to generate compile time error for unsupported gdb_wchar_t
size.
(ENDIAN_SUFFIX): New macro.
(intermediate_encoding): New function.
Index: charset.c
===================================================================
RCS file: /cvs/src/src/gdb/charset.c,v
retrieving revision 1.43
diff -u -p -r1.43 charset.c
--- charset.c 11 Jan 2011 15:10:01 -0000 1.43
+++ charset.c 20 Apr 2011 07:48:21 -0000
@@ -922,6 +922,70 @@ default_auto_wide_charset (void)
return GDB_DEFAULT_TARGET_WIDE_CHARSET;
}
+
+#ifdef USE_INTERMEDIATE_ENCODING_FUNCTION
+/* Macro used for UTF or UCS endianness suffix. */
+#if WORDS_BIGENDIAN
+#define ENDIAN_SUFFIX "BE"
+#else
+#define ENDIAN_SUFFIX "LE"
+#endif
+
+/* The code below serves to generate a compile time error if
+ the size of gdb_wchar_t is neither 2 nor 4, despite the fact that
+ macro __STDC_ISO_10646__ is defined.
+ This is better than a gdb_assert call, because GDB cannot handle
+ strings correctly if this size is different. */
+
+static char your_gdb_wchar_t_is_bogus[(sizeof (gdb_wchar_t) == 2
+ || sizeof (gdb_wchar_t) == 4)
+ ? 1 : -1];
+
+/* intermediate_encoding returns the charset used internally by
+ GDB to convert between target and host encodings. As the test above
+ compiled, sizeof (gdb_wchar_t) is either 2 or 4 bytes.
+ UTF-16/32 is tested first, UCS-2/4 is tested as a second option,
+ otherwise an error is generated. */
+
+const char *
+intermediate_encoding (void)
+{
+ iconv_t desc;
+ static const char *stored_result = NULL;
+ const char *result;
+ int i;
+
+ if (stored_result)
+ return stored_result;
+ result = xstrprintf ("UTF-%d%s", (int) (sizeof (gdb_wchar_t) * 8), ENDIAN_SUFFIX);
+ /* Check that the name is supported by iconv_open. */
+ desc = iconv_open (result, host_charset ());
+ if (desc != (iconv_t) -1)
+ {
+ iconv_close (desc);
+ stored_result = result;
+ return result;
+ }
+ /* Not valid, free the allocated memory. */
+ xfree ((void *) result);
+ /* Second try, with UCS-2 type. */
+ result = xstrprintf ("UCS-%d%s", (int) sizeof (gdb_wchar_t), ENDIAN_SUFFIX);
+ /* Check that the name is supported by iconv_open. */
+ desc = iconv_open (result, host_charset ());
+ if (desc != (iconv_t) -1)
+ {
+ iconv_close (desc);
+ stored_result = result;
+ return result;
+ }
+ /* Not valid, free the allocated memory. */
+ xfree ((void *) result);
+ /* No valid charset found, generate error here. */
+ error ("Unable to find a valid charset for string conversions");
+}
+
+#endif /* USE_INTERMEDIATE_ENCODING_FUNCTION */
+
void
_initialize_charset (void)
{
Index: gdb_wchar.h
===================================================================
RCS file: /cvs/src/src/gdb/gdb_wchar.h,v
retrieving revision 1.6
diff -u -p -r1.6 gdb_wchar.h
--- gdb_wchar.h 1 Jan 2011 15:33:05 -0000 1.6
+++ gdb_wchar.h 20 Apr 2011 07:48:21 -0000
@@ -78,11 +78,10 @@ typedef wint_t gdb_wint_t;
iconv_open. We put the endianness into the encoding name to avoid
hosts that emit a BOM when the unadorned name is used. */
#if defined (__STDC_ISO_10646__)
-#if WORDS_BIGENDIAN
-#define INTERMEDIATE_ENCODING "UCS-4BE"
-#else
-#define INTERMEDIATE_ENCODING "UCS-4LE"
-#endif
+#define USE_INTERMEDIATE_ENCODING_FUNCTION
+#define INTERMEDIATE_ENCODING intermediate_encoding ()
+const char *intermediate_encoding (void);
+
#elif defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x108
#define INTERMEDIATE_ENCODING "wchar_t"
#else