* [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
@ 2010-02-28 13:05 Corinna Vinschen
2010-02-28 14:29 ` Daniel Jacobowitz
2010-02-28 16:56 ` Eli Zaretskii
0 siblings, 2 replies; 64+ messages in thread
From: Corinna Vinschen @ 2010-02-28 13:05 UTC
To: gdb-patches
Hi,
today I tested the wide char printing in GDB for the first time on a
Cygwin build and it didn't work well. The reason was that the default
for GDB_DEFAULT_TARGET_WIDE_CHARSET is UTF-32, which is bad for a system
defining wchar_t to be UTF-16. That's Windows for you.
So I applied the below patch, which makes GDB for Cygwin happy. I also
set the GDB_DEFAULT_TARGET_CHARSET for Cygwin and Win32 to a more sane
default for both systems.
Ok to apply?
Thanks,
Corinna
* defs.h (GDB_DEFAULT_TARGET_WIDE_CHARSET): Define as UTF-16 for
Windows-based systems.
(GDB_DEFAULT_TARGET_CHARSET): Set to UTF-8 for Cygwin and CP1252
for other Windows-based systems.
Index: defs.h
===================================================================
RCS file: /cvs/src/src/gdb/defs.h,v
retrieving revision 1.265
diff -u -p -r1.265 defs.h
--- defs.h 25 Feb 2010 20:30:58 -0000 1.265
+++ defs.h 28 Feb 2010 13:04:26 -0000
@@ -1192,6 +1192,22 @@ extern int use_windows;
#define ISATTY(FP) (isatty (fileno (FP)))
#endif
+/* The default wide character charset is UTF-32, which is wrong for
+ Windows-based systems which use UTF-16. */
+#if defined (__CYGWIN__) || defined (_WIN32)
+#define GDB_DEFAULT_TARGET_WIDE_CHARSET "UTF-16"
+#endif
+
+/* The default output charset for Cygwin is UTF-8. The default output
+ charset for Win32 is the default ANSI codepage of the system, which
+ depends on the localization of the underlying Windows system. However,
+ CP1252 is a good default replacement for ISO-8859-1 at least. */
+#if defined (__CYGWIN__)
+#define GDB_DEFAULT_TARGET_CHARSET "UTF-8"
+#elif defined (_WIN32)
+#define GDB_DEFAULT_TARGET_CHARSET "CP1252"
+#endif
+
/* Ensure that V is aligned to an N byte boundary (B's assumed to be a
power of 2). Round up/down when necessary. Examples of correct
use include:
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-02-28 14:29 UTC
To: gdb-patches
On Sun, Feb 28, 2010 at 02:05:00PM +0100, Corinna Vinschen wrote:
> Hi,
>
> today I tested the wide char printing in GDB for the first time on a
> Cygwin build and it didn't work well. The reason was that the default
> for GDB_DEFAULT_TARGET_WIDE_CHARSET is UTF-32, which is bad for a system
> defining wchar_t to be UTF-16. That's Windows for you.
>
> So I applied the below patch, which makes GDB for Cygwin happy. I also
> set the GDB_DEFAULT_TARGET_CHARSET for Cygwin and Win32 to a more sane
> default for both systems.
>
>
> Ok to apply?
No, this isn't right. __CYGWIN__ and _WIN32 are host checks, and
these charsets are target properties. So this will not have the
desired effect.
Hmm, does this mean these macros should become gdbarch properties?
That could be awkward in practice.
--
Daniel Jacobowitz
CodeSourcery
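The host/target distinction raised here can be illustrated with a minimal sketch. It is not GDB code: the function names and the target OS strings are hypothetical, chosen only to show why a preprocessor test on __CYGWIN__ or _WIN32 describes the machine the debugger was compiled for rather than the program being debugged.

```c
#include <assert.h>
#include <string.h>

/* Host-based choice: fixed when the debugger itself is compiled.  A
   Windows-hosted cross debugger for arm-linux would still get "UTF-16"
   here, whatever it is debugging.  */
static const char *
host_based_wide_default (void)
{
#if defined (__CYGWIN__) || defined (_WIN32)
  return "UTF-16";
#else
  return "UTF-32";
#endif
}

/* Target-based choice: made at run time from a description of the
   program being debugged.  TARGET_OS is a hypothetical string such as
   "mingw32", "cygwin" or "linux".  */
static const char *
target_based_wide_default (const char *target_os)
{
  assert (target_os != NULL);
  if (strcmp (target_os, "mingw32") == 0
      || strcmp (target_os, "cygwin") == 0)
    return "UTF-16";
  return "UTF-32";
}
```

Only the second function can give different answers for different targets within one debugger binary, which is the property the thread goes on to ask for.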
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-02-28 15:03 UTC
To: gdb-patches
On Feb 28 09:29, Daniel Jacobowitz wrote:
> On Sun, Feb 28, 2010 at 02:05:00PM +0100, Corinna Vinschen wrote:
> > Hi,
> >
> > today I tested the wide char printing in GDB for the first time on a
> > Cygwin build and it didn't work well. The reason was that the default
> > for GDB_DEFAULT_TARGET_WIDE_CHARSET is UTF-32, which is bad for a system
> > defining wchar_t to be UTF-16. That's Windows for you.
> >
> > So I applied the below patch, which makes GDB for Cygwin happy. I also
> > set the GDB_DEFAULT_TARGET_CHARSET for Cygwin and Win32 to a more sane
> > default for both systems.
> >
> >
> > Ok to apply?
>
> No, this isn't right. __CYGWIN__ and _WIN32 are host checks, and
> these charsets are target properties. So this will not have the
> desired effect.
>
> Hmm, does this mean these macros should become gdbarch properties?
> That could be awkward in practice.
I don't know because I don't know the affected code well enough, but as
it is now, the code doesn't work on Cygwin, and probably on no UTF-16
target.
If the codeset is target-specific anyway, then the idea of the
GDB_DEFAULT_TARGET_WIDE_CHARSET and GDB_DEFAULT_TARGET_CHARSET variables
is either wrong, or it must be possible to define them somewhere
on a per-target base. What about windows-tdep.h?
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-02-28 18:48 UTC
To: gdb-patches
On Sun, Feb 28, 2010 at 04:03:18PM +0100, Corinna Vinschen wrote:
> If the codeset is target-specific anyway, then the idea of the
> GDB_DEFAULT_TARGET_WIDE_CHARSET and GDB_DEFAULT_TARGET_CHARSET variables
> is either wrong, or it must be possible to define them somewhere
> on a per-target base. What about windows-tdep.h?
If it has to be per-target at all, it needs to go in gdbarch; it can't
be a #define any more, because we support multiple compiled-in
architectures. This is going to be significantly more complicated,
though; there needs to be an "auto (currently cp1252)" style setting.
--
Daniel Jacobowitz
CodeSourcery
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-02-28 19:22 UTC
To: gdb-patches
On Feb 28 13:47, Daniel Jacobowitz wrote:
> On Sun, Feb 28, 2010 at 04:03:18PM +0100, Corinna Vinschen wrote:
> > If the codeset is target-specific anyway, then the idea of the
> > GDB_DEFAULT_TARGET_WIDE_CHARSET and GDB_DEFAULT_TARGET_CHARSET variables
> > is either wrong, or it must be possible to define them somewhere
> > on a per-target base. What about windows-tdep.h?
>
> If it has to be per-target at all, it needs to go in gdbarch; it can't
> be a #define any more, because we support multiple compiled-in
> architectures. This is going to be significantly more complicated,
> though; there needs to be an "auto (currently cp1252)" style setting.
There is already a setting for the target charsets. We're just
talking about setting a sane default.
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-02-28 22:27 UTC
To: gdb-patches
On Sun, Feb 28, 2010 at 08:21:59PM +0100, Corinna Vinschen wrote:
> On Feb 28 13:47, Daniel Jacobowitz wrote:
> > On Sun, Feb 28, 2010 at 04:03:18PM +0100, Corinna Vinschen wrote:
> > > If the codeset is target-specific anyway, then the idea of the
> > > GDB_DEFAULT_TARGET_WIDE_CHARSET and GDB_DEFAULT_TARGET_CHARSET variables
> > > is either wrong, or it must be possible to define them somewhere
> > > on a per-target base. What about windows-tdep.h?
> >
> > If it has to be per-target at all, it needs to go in gdbarch; it can't
> > be a #define any more, because we support multiple compiled-in
> > architectures. This is going to be significantly more complicated,
> > though; there needs to be an "auto (currently cp1252)" style setting.
>
> There is already a setting for the target charsets. We're just
> talking about setting a sane default.
We're talking past each other. Compare show_host_charset_name
(which has an auto setting) to show_target_charset_name (which does
not).
If the default becomes dependent on the target, we need to distinguish
"user specified iso-8859-1" or "user didn't say anything, but now
we're debugging i686-mingw32, and that usually uses cp1252".
--
Daniel Jacobowitz
CodeSourcery
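The distinction between an explicit user choice and "nothing chosen" can be sketched in a few lines of C. This is only an illustration under assumptions: effective_charset and its parameters are hypothetical names, and the "auto" sentinel mirrors the existing host charset setting mentioned above.

```c
#include <assert.h>
#include <string.h>

/* Return the charset actually in effect.  SETTING is what the user
   selected, or "auto" if the user selected nothing.  TARGET_DEFAULT is
   what the current target usually uses.  Both names are hypothetical.  */
static const char *
effective_charset (const char *setting, const char *target_default)
{
  assert (setting != NULL && target_default != NULL);
  if (strcmp (setting, "auto") == 0)
    return target_default;   /* Follow the target.  */
  return setting;            /* An explicit choice always wins.  */
}
```

With this shape, a user who typed iso-8859-1 keeps it when the target changes, while a user who typed nothing gets cp1252 once an i686-mingw32 program is being debugged.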
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-01 10:31 UTC
To: gdb-patches
On Feb 28 17:27, Daniel Jacobowitz wrote:
> On Sun, Feb 28, 2010 at 08:21:59PM +0100, Corinna Vinschen wrote:
> > On Feb 28 13:47, Daniel Jacobowitz wrote:
> > > On Sun, Feb 28, 2010 at 04:03:18PM +0100, Corinna Vinschen wrote:
> > > > If the codeset is target-specific anyway, then the idea of the
> > > > GDB_DEFAULT_TARGET_WIDE_CHARSET and GDB_DEFAULT_TARGET_CHARSET variables
> > > > is either wrong, or it must be possible to define them somewhere
> > > > on a per-target base. What about windows-tdep.h?
> > >
> > > If it has to be per-target at all, it needs to go in gdbarch; it can't
> > > be a #define any more, because we support multiple compiled-in
> > > architectures. This is going to be significantly more complicated,
> > > though; there needs to be an "auto (currently cp1252)" style setting.
> >
> > There is already a setting for the target charsets. We're just
> > talking about setting a sane default.
>
> We're talking past each other. Compare show_host_charset_name
> (which has an auto setting) to show_target_charset_name (which does
> not).
>
> If the default becomes dependent on the target, we need to distinguish
> "user specified iso-8859-1" or "user didn't say anything, but now
> we're debugging i686-mingw32, and that usually uses cp1252".
Ok. Would the below be sufficient for a start?
Corinna
* charset.c (auto_target_charset_name): New variable.
(target_charset_name): Set to "auto" by default.
(show_target_charset_name): Take "auto" setting into account.
(auto_target_wide_charset_name): New variable.
(target_wide_charset_name): Set to "auto" by default.
(show_target_wide_charset_name): Take "auto" setting into account.
(set_be_le_names): Ditto.
(validate): Ditto.
(set_auto_target_wide_charset): New function to set
auto_target_wide_charset_name.
(set_auto_target_charset): New function to set auto_target_charset_name.
(target_charset): Take "auto" setting into account.
(target_wide_charset): Ditto.
(_initialize_charset): Initialize auto_target_charset_name rather than
target_charset_name with value of auto_host_charset_name.
* charset.h (set_auto_target_wide_charset): Declare.
(set_auto_target_charset): Declare.
* windows-nat.c: Include charset.h.
(_initialize_windows_nat): Initialize auto target charsets to sane
values for native targets.
Index: charset.c
===================================================================
RCS file: /cvs/src/src/gdb/charset.c,v
retrieving revision 1.28
diff -u -p -r1.28 charset.c
--- charset.c 1 Jan 2010 07:31:30 -0000 1.28
+++ charset.c 1 Mar 2010 10:27:57 -0000
@@ -213,22 +213,37 @@ show_host_charset_name (struct ui_file *
   fprintf_filtered (file, _("The host character set is \"%s\".\n"), value);
 }

-static const char *target_charset_name = GDB_DEFAULT_TARGET_CHARSET;
+static const char *auto_target_charset_name = GDB_DEFAULT_TARGET_CHARSET;
+static const char *target_charset_name = "auto";
 static void
 show_target_charset_name (struct ui_file *file, int from_tty,
                           struct cmd_list_element *c, const char *value)
 {
-  fprintf_filtered (file, _("The target character set is \"%s\".\n"),
-                    value);
+  if (!strcmp (value, "auto"))
+    fprintf_filtered (file,
+                      _("The target character set is \"auto; "
+                        "currently %s\".\n"),
+                      auto_target_charset_name);
+  else
+    fprintf_filtered (file, _("The target character set is \"%s\".\n"),
+                      value);
 }

-static const char *target_wide_charset_name = GDB_DEFAULT_TARGET_WIDE_CHARSET;
+static const char *auto_target_wide_charset_name
+  = GDB_DEFAULT_TARGET_WIDE_CHARSET;
+static const char *target_wide_charset_name = "auto";
 static void
 show_target_wide_charset_name (struct ui_file *file, int from_tty,
                                struct cmd_list_element *c, const char *value)
 {
-  fprintf_filtered (file, _("The target wide character set is \"%s\".\n"),
-                    value);
+  if (!strcmp (value, "auto"))
+    fprintf_filtered (file,
+                      _("The target wide character set is \"auto; "
+                        "currently %s\".\n"),
+                      auto_target_wide_charset_name);
+  else
+    fprintf_filtered (file, _("The target wide character set is \"%s\".\n"),
+                      value);
 }

 static const char *default_charset_names[] =
@@ -252,14 +267,19 @@ static void
 set_be_le_names (void)
 {
   int i, len;
+  const char *target_wide;

   target_wide_charset_le_name = NULL;
   target_wide_charset_be_name = NULL;

-  len = strlen (target_wide_charset_name);
+  target_wide = target_wide_charset_name;
+  if (!strcmp (target_wide, "auto"))
+    target_wide = auto_target_wide_charset_name;
+
+  len = strlen (target_wide);
   for (i = 0; charset_enum[i]; ++i)
     {
-      if (strncmp (target_wide_charset_name, charset_enum[i], len))
+      if (strncmp (target_wide, charset_enum[i], len))
         continue;
       if ((charset_enum[i][len] == 'B'
            || charset_enum[i][len] == 'L')
@@ -282,17 +302,21 @@ validate (void)
 {
   iconv_t desc;
   const char *host_cset = host_charset ();
+  const char *target_cset = target_charset ();
+  const char *target_wide_cset = target_wide_charset_name;
+  if (!strcmp (target_wide_cset, "auto"))
+    target_wide_cset = auto_target_wide_charset_name;

-  desc = iconv_open (target_wide_charset_name, host_cset);
+  desc = iconv_open (target_wide_cset, host_cset);
   if (desc == (iconv_t) -1)
     error ("Cannot convert between character sets `%s' and `%s'",
-           target_wide_charset_name, host_cset);
+           target_wide_cset, host_cset);
   iconv_close (desc);

-  desc = iconv_open (target_charset_name, host_cset);
+  desc = iconv_open (target_cset, host_cset);
   if (desc == (iconv_t) -1)
     error ("Cannot convert between character sets `%s' and `%s'",
-           target_charset_name, host_cset);
+           target_cset, host_cset);
   iconv_close (desc);

   set_be_le_names ();
@@ -342,6 +366,21 @@ show_charset (struct ui_file *file, int
   show_target_wide_charset_name (file, from_tty, c, target_wide_charset_name);
 }

+/* These functions allow to set the auto target charsets. This isn't for
+   the user, rather it's to be used to set the auto target charsets from
+   target-specific code. */
+void
+set_auto_target_wide_charset (char *charset)
+{
+  auto_target_wide_charset_name = charset;
+}
+
+void
+set_auto_target_charset (char *charset)
+{
+  auto_target_charset_name = charset;
+}
+
 \f
 /* Accessor functions. */

@@ -356,6 +395,8 @@ host_charset (void)
 const char *
 target_charset (void)
 {
+  if (!strcmp (target_charset_name, "auto"))
+    return auto_target_charset_name;
   return target_charset_name;
 }

@@ -373,6 +414,8 @@ target_wide_charset (enum bfd_endian byt
         return target_wide_charset_le_name;
     }

+  if (!strcmp (target_wide_charset_name, "auto"))
+    return auto_target_wide_charset_name;
   return target_wide_charset_name;
 }

@@ -874,7 +917,7 @@ _initialize_charset (void)
      which GNU libiconv doesn't like (infinite loop). */
   if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name)
     auto_host_charset_name = "ASCII";
-  target_charset_name = auto_host_charset_name;
+  auto_target_charset_name = auto_host_charset_name;

   set_be_le_names ();
 #endif
Index: charset.h
===================================================================
RCS file: /cvs/src/src/gdb/charset.h,v
retrieving revision 1.10
diff -u -p -r1.10 charset.h
--- charset.h 1 Jan 2010 07:31:30 -0000 1.10
+++ charset.h 1 Mar 2010 10:27:57 -0000
@@ -36,6 +36,12 @@ const char *host_charset (void);
 const char *target_charset (void);
 const char *target_wide_charset (enum bfd_endian byte_order);

+/* These functions allow to set the auto target charsets. This isn't for
+   the user, rather it's to be used to set the auto target charsets from
+   target-specific code. */
+void set_auto_target_wide_charset (char *charset);
+void set_auto_target_charset (char *charset);
+
 /* These values are used to specify the type of transliteration done
    by convert_between_encodings. */
 enum transliterations
Index: windows-nat.c
===================================================================
RCS file: /cvs/src/src/gdb/windows-nat.c,v
retrieving revision 1.204
diff -u -p -r1.204 windows-nat.c
--- windows-nat.c 1 Mar 2010 09:09:24 -0000 1.204
+++ windows-nat.c 1 Mar 2010 10:27:57 -0000
@@ -65,6 +65,7 @@
 #include "windows-nat.h"
 #include "i386-nat.h"
 #include "complaints.h"
+#include "charset.h"

 #define AdjustTokenPrivileges dyn_AdjustTokenPrivileges
 #define DebugActiveProcessStop dyn_DebugActiveProcessStop
@@ -2296,12 +2297,20 @@ void
 _initialize_windows_nat (void)
 {
   struct cmd_list_element *c;
+#ifndef __CYGWIN__
+  static char codepage[32];
+#endif

   init_windows_ops ();

 #ifdef __CYGWIN__
   cygwin_internal (CW_SET_DOS_FILE_WARNING, 0);
+  set_auto_target_charset ("UTF-8");
+#else
+  snprintf (codepage, 32, "CP%u", GetACP ());
+  set_auto_target_charset (codepage);
 #endif
+  set_auto_target_wide_charset ("UTF-16");

   c = add_com ("dll-symbols", class_files, dll_symbol_command,
                _("Load dll library symbols from FILE."));
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Eli Zaretskii @ 2010-03-01 17:11 UTC
To: gdb-patches
> Date: Mon, 1 Mar 2010 11:31:25 +0100
> From: Corinna Vinschen <vinschen@redhat.com>
>
> @@ -2296,12 +2297,20 @@ void
>  _initialize_windows_nat (void)
>  {
>    struct cmd_list_element *c;
> +#ifndef __CYGWIN__
> +  static char codepage[32];
> +#endif
>
>    init_windows_ops ();
>
>  #ifdef __CYGWIN__
>    cygwin_internal (CW_SET_DOS_FILE_WARNING, 0);
> +  set_auto_target_charset ("UTF-8");
> +#else
> +  snprintf (codepage, 32, "CP%u", GetACP ());
                         ^^
Could this be `sizeof(codepage)' instead, please?
Thanks.
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-01 17:28 UTC
To: gdb-patches
On Mar 1 19:11, Eli Zaretskii wrote:
> > Date: Mon, 1 Mar 2010 11:31:25 +0100
> > From: Corinna Vinschen <vinschen@redhat.com>
> >
> > @@ -2296,12 +2297,20 @@ void
> >  _initialize_windows_nat (void)
> >  {
> >    struct cmd_list_element *c;
> > +#ifndef __CYGWIN__
> > +  static char codepage[32];
> > +#endif
> >
> >    init_windows_ops ();
> >
> >  #ifdef __CYGWIN__
> >    cygwin_internal (CW_SET_DOS_FILE_WARNING, 0);
> > +  set_auto_target_charset ("UTF-8");
> > +#else
> > +  snprintf (codepage, 32, "CP%u", GetACP ());
>                          ^^
> Could this be `sizeof(codepage)' instead, please?
Yes, no problem.
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
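The requested change can be shown in isolation with a small sketch. The helper codepage_name is hypothetical, and its acp parameter stands in for the value GetACP () would return on Windows, so the example builds on any host.

```c
#include <assert.h>
#include <stdio.h>

/* Static storage, so the returned name stays valid after the call.  */
static char codepage[32];

/* Format a code page number as an iconv-style name such as "CP1252".
   Using sizeof (codepage) keeps the size limit tied to the array
   declaration, so resizing the array cannot leave a stale constant.  */
static const char *
codepage_name (unsigned int acp)
{
  int n = snprintf (codepage, sizeof (codepage), "CP%u", acp);
  assert (n > 0 && (size_t) n < sizeof (codepage));
  return codepage;
}
```

Note that sizeof only yields the buffer size because codepage is an array in scope; applied to a pointer parameter it would give the pointer size instead.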
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Tom Tromey @ 2010-03-01 17:21 UTC
To: gdb-patches
>>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes:
Corinna> +void
Corinna> +set_auto_target_wide_charset (char *charset)
Argument should be const.
Corinna> +void
Corinna> +set_auto_target_charset (char *charset)
Likewise.
I think the documentation for these functions (in charset.h) should
mention the expected lifetime of the arguments.
Corinna> @@ -874,7 +917,7 @@ _initialize_charset (void)
[...]
Corinna> + auto_target_charset_name = auto_host_charset_name;
Corinna> _initialize_windows_nat (void)
[...]
Corinna> + set_auto_target_charset ("UTF-8");
This seems to introduce an ordering dependency between _initialize
functions. My understanding is that we don't allow this.
Tom
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-03-01 17:23 UTC
To: Tom Tromey; +Cc: gdb-patches
On Mon, Mar 01, 2010 at 10:21:27AM -0700, Tom Tromey wrote:
> This seems to introduce an ordering dependency between _initialize
> functions. My understanding is that we don't allow this.
Correct. That's why I suggested gdbarch.
--
Daniel Jacobowitz
CodeSourcery
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-01 17:27 UTC
To: gdb-patches; +Cc: Tom Tromey
On Mar 1 12:23, Daniel Jacobowitz wrote:
> On Mon, Mar 01, 2010 at 10:21:27AM -0700, Tom Tromey wrote:
> > This seems to introduce an ordering dependency between _initialize
> > functions. My understanding is that we don't allow this.
>
> Correct. That's why I suggested gdbarch.
Is there any chance to get a sane default on a native Windows GDB
without having to turn GDB upside down?
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-01 17:31 UTC
To: gdb-patches
On Mar 1 10:21, Tom Tromey wrote:
> >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes:
>
> Corinna> +void
> Corinna> +set_auto_target_wide_charset (char *charset)
>
> Argument should be const.
>
> Corinna> +void
> Corinna> +set_auto_target_charset (char *charset)
>
> Likewise.
>
> I think the documentation for these functions (in charset.h) should
> mention the expected lifetime of the arguments.
I don't understand. What do you mean by "expected lifetime"?!?
> Corinna> @@ -874,7 +917,7 @@ _initialize_charset (void)
> [...]
> Corinna> + auto_target_charset_name = auto_host_charset_name;
>
> Corinna> _initialize_windows_nat (void)
> [...]
> Corinna> + set_auto_target_charset ("UTF-8");
>
> This seems to introduce an ordering dependency between _initialize
> functions. My understanding is that we don't allow this.
It's just a default. The default of UTF-32 breaks wide char support for
UTF-16 targets. A native Cygwin or MingW GDB should have a sane default
for the target charsets. That's all I'm trying to accomplish.
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Tom Tromey @ 2010-03-01 19:25 UTC
To: gdb-patches
>>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes:
Tom> I think the documentation for these functions (in charset.h) should
Tom> mention the expected lifetime of the arguments.
Corinna> I don't understand. What do you mean by "expected lifetime"?!?
The argument string must not be freed by the caller.
Tom> This seems to introduce an ordering dependency between _initialize
Tom> functions. My understanding is that we don't allow this.
Corinna> It's just a default. The default of UTF-32 breaks wide char
Corinna> support for UTF-16 targets. A native Cygwin or MingW GDB
Corinna> should have a sane default for the target charsets. That's all
Corinna> I'm trying to accomplish.
Yes, I understand. The problem here is that if the init functions are
ordered one way, then your patch installs one default, but if they are
ordered another way, then your patch installs a different default.
Personally, I think that, given the existing nl_langinfo code (and
worse, Solaris hack) in _initialize_charset, it would be fine to add
some Cygwin change here. However, I won't approve a patch like that
over Daniel's objections.
Tom
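The two review points about the setter (a const argument, and a documented lifetime) can be sketched as follows. This is an illustration rather than the committed GDB code: the getter is hypothetical, and the initial "ISO-8859-1" value is a placeholder for whatever GDB_DEFAULT_TARGET_CHARSET expands to.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder initial value, assumed for this sketch only.  */
static const char *auto_target_charset_name = "ISO-8859-1";

/* Set the automatic target charset.  The pointer is stored, not
   copied, so CHARSET must remain valid for as long as the setting may
   be read: a string literal or a static buffer is fine, while a stack
   buffer or a string the caller later frees is not.  */
static void
set_auto_target_charset (const char *charset)
{
  assert (charset != NULL);
  auto_target_charset_name = charset;
}

/* Hypothetical accessor, added so the sketch can be exercised.  */
static const char *
get_auto_target_charset (void)
{
  return auto_target_charset_name;
}
```

This is why the codepage buffer in the windows-nat.c hunk is declared static: an automatic array would be gone by the time the stored pointer is next read.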
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-03-01 19:31 UTC
To: Tom Tromey; +Cc: gdb-patches
On Mon, Mar 01, 2010 at 12:25:47PM -0700, Tom Tromey wrote:
> Personally, I think that, given the existing nl_langinfo code (and
> worse, Solaris hack) in _initialize_charset, it would be fine to add
> some Cygwin change here. However, I won't approve a patch like that
> over Daniel's objections.
I don't mind hacks in general. I do mind hacks that are going to
break my currently working builds. If you build an i686-mingw32
to arm-linux debugger, and suddenly that defaults to UTF-16 instead of
UCS-4, then I've got a problem.
What do you suggest?
--
Daniel Jacobowitz
CodeSourcery
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-01 20:06 UTC
To: gdb-patches; +Cc: Tom Tromey
On Mar 1 14:31, Daniel Jacobowitz wrote:
> On Mon, Mar 01, 2010 at 12:25:47PM -0700, Tom Tromey wrote:
> > Personally, I think that, given the existing nl_langinfo code (and
> > worse, Solaris hack) in _initialize_charset, it would be fine to add
> > some Cygwin change here. However, I won't approve a patch like that
> > over Daniel's objections.
>
> I don't mind hacks in general. I do mind hacks that are going to
> break my currently working builds.
That's fine, it's just that the current default breaks my builds.
> If you build an i686-mingw32
> to arm-linux debugger, and suddenly that defaults to UTF-16 instead of
> UCS-4, then I've got a problem.
>
> What do you suggest?
That's why I added the default now to _initialize_windows_nat. This
initialization function is only included into the build if you create
a GDB debugger for native Windows debugging. I don't see anything wrong
setting the default target charsets to UTF-16 and UTF-8/GetACP() in this
special case.
After all, it's still just a default setting. It doesn't seriously
break anything which can't be changed by a user setting.
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Eli Zaretskii @ 2010-03-01 20:44 UTC
To: Daniel Jacobowitz; +Cc: tromey, gdb-patches
> Date: Mon, 1 Mar 2010 14:31:31 -0500
> From: Daniel Jacobowitz <dan@codesourcery.com>
> Cc: gdb-patches@sourceware.org
>
> If you build an i686-mingw32 to arm-linux debugger, and suddenly
> that defaults to UTF-16 instead of UCS-4, then I've got a problem.
The same is true in the other direction.
It sounds like, for cross debugging, the only fair default is ASCII.
That way, each user will have to set the right target charset, and no
one will feel they are children of a lesser god.
But for native Windows debugging what Corinna suggests is just about
right, I think.
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Daniel Jacobowitz @ 2010-03-01 21:50 UTC
To: Eli Zaretskii; +Cc: tromey, gdb-patches
On Mon, Mar 01, 2010 at 10:44:43PM +0200, Eli Zaretskii wrote:
> It sounds like, for cross debugging, the only fair default is ASCII.
> That way, each user will have to set the right target charset, and no
> one will feel they are children of a lesser god.
This is a target specific setting. Why is there so much resistance to
making the target set the property? This is how we handle lots of our
other global and target-related state.
There's a function, target_wide_charset. It is currently passed a
byte order, and consults various globals. It should be passed a
gdbarch instead of a byte order, which will require some refactoring
but nothing major from my inspection. That's one straightforward
patch.
target_wide_charset_name can not currently be set to auto. That
should be allowed, and the default; if it is "auto", then it should
be returned (and "show target-wide-charset" too) as
GDB_DEFAULT_TARGET_WIDE_CHARSET. Just like host_charset_name /
auto_host_charset_name. That's another straightforward patch.
Then, GDB_DEFAULT_TARGET_WIDE_CHARSET can become a gdbarch variable.
The gdbarch created in i386_cygwin_init_abi can set it to the right
thing. The default can be what it is now. This patch is less
straightforward because of the existing hacks, like PHONY_ICONV.
I'll even offer to help, but I can't promise to help in a timely
fashion except with review.
--
Daniel Jacobowitz
CodeSourcery
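The end state of the three patches outlined above might look roughly like the following sketch. Everything in it is an assumption made for illustration: struct arch_sketch is a stand-in for a gdbarch and does not reflect the real gdbarch interface, and wide_charset_for is a hypothetical name for a target_wide_charset that takes an architecture.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a gdbarch, holding only the two
   per-architecture defaults under discussion.  */
struct arch_sketch
{
  const char *default_charset;
  const char *default_wide_charset;
};

/* User-visible setting; "auto" defers to the architecture.  */
static const char *target_wide_charset_setting = "auto";

/* The caller passes the architecture, so no global default is needed
   and no _initialize function has to run before another.  */
static const char *
wide_charset_for (const struct arch_sketch *arch)
{
  assert (arch != NULL);
  if (strcmp (target_wide_charset_setting, "auto") == 0)
    return arch->default_wide_charset;
  return target_wide_charset_setting;
}
```

Under this shape a single debugger binary can answer UTF-16 for a Cygwin architecture and UTF-32 for an arm-linux one, and an explicit user setting still overrides both.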
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
From: Corinna Vinschen @ 2010-03-02 10:44 UTC
To: gdb-patches
On Mar 1 16:50, Daniel Jacobowitz wrote:
> On Mon, Mar 01, 2010 at 10:44:43PM +0200, Eli Zaretskii wrote:
> > It sounds like, for cross debugging, the only fair default is ASCII.
> > That way, each user will have to set the right target charset, and no
> > one will feel they are children of a lesser god.
>
> This is a target specific setting. Why is there so much resistance to
> making the target set the property? This is how we handle lots of our
> other global and target-related state.
>
> There's a function, target_wide_charset. It is currently passed a
> byte order, and consults various globals. It should be passed a
> gdbarch instead of a byte order, which will require some refactoring
> but nothing major from my inspection. That's one straightforward
> patch.
>
> target_wide_charset_name can not currently be set to auto. That
> should be allowed, and the default; if it is "auto", then it should
> be returned (and "show target-wide-charset" too) as
> GDB_DEFAULT_TARGET_WIDE_CHARSET. Just like host_charset_name /
> auto_host_charset_name. That's another straightforward patch.
This, as well as the same for target_charset_name was already in my
previous patch. But I don't see a reason to keep the
auto_target_wide_charset_name and auto_target_charset_name variables
since they would be entirely replaced by the gdbarch equivalent...
> Then, GDB_DEFAULT_TARGET_WIDE_CHARSET can become a gdbarch variable.
> The gdbarch created in i386_cygwin_init_abi can set it to the right
> thing. The default can be what it is now. This patch is less
> straightforward because of the existing hacks, like PHONY_ICONV.
The target specifies charset and wide charset. So we need two
target-specific variables. As I mentioned above, both variables would
replace the auto_target...charset variables.
I know how to change gdbarch.sh to add the variables, and I have all the
other stuff almost in place. But I have a problem.
How do I access the current gdbarch from show_target_charset_name,
show_target_wide_charset_name, and all the other functions in charset.c
which refer to the target charsets?
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 10:44 ` Corinna Vinschen
@ 2010-03-02 13:54 ` Daniel Jacobowitz
2010-03-02 14:13 ` Corinna Vinschen
0 siblings, 1 reply; 64+ messages in thread
From: Daniel Jacobowitz @ 2010-03-02 13:54 UTC (permalink / raw)
To: gdb-patches

On Tue, Mar 02, 2010 at 11:43:58AM +0100, Corinna Vinschen wrote:
> The target specifies charset and wide charset. So we need two
> target-specific variables. As I mentioned above, both variables would
> replace the auto_target...charset variables.

That makes sense.

> How do I access the current gdbarch from show_target_charset_name,
> show_target_wide_charset_name, and all the other functions in charset.c
> which refer to the target charsets?

From the show functions, which are called by the CLI infrastructure, I
think you're supposed to use get_current_arch (in arch-utils.h).

From other functions, mostly, you need the caller to pass an
architecture. An increasing portion of GDB is parameterized
this way. That'll work for e.g. target_wide_charset ().

I don't think you can do either from _initialize_charset. That may
have to happen lazily...

--
Daniel Jacobowitz
CodeSourcery
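The calling convention described in the message above (only command-level code looks up the current architecture; everything below it receives the architecture as a parameter) can be illustrated with a small C sketch. All names here are hypothetical: current_arch stands in for the role get_current_arch plays, and the struct is not GDB's struct gdbarch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an architecture object.  */
struct arch_sketch
{
  const char *charset;
};

/* Global "current architecture" state (hypothetical).  */
static struct arch_sketch current_arch_sketch = { "UTF-8" };

/* Plays the role of get_current_arch: it reads global state, so by
   the convention above only command implementations may call it.  */
static const struct arch_sketch *
current_arch (void)
{
  return &current_arch_sketch;
}

/* Lower-level helper: never consults globals; the caller must pass
   the architecture it is working with.  */
const char *
charset_for_arch (const struct arch_sketch *arch)
{
  return arch->charset;
}

/* Command-level "show" function: the one layer allowed to fetch the
   current architecture, which it then hands down explicitly.  */
int
show_charset_sketch (char *buf, size_t size)
{
  return snprintf (buf, size, "The target character set is \"%s\".\n",
                   charset_for_arch (current_arch ()));
}
```

The cost of this design is that an architecture argument has to be threaded through every intermediate caller; the benefit is that helpers give the right answer even when the value being examined belongs to an architecture other than the current one.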
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 13:54 ` Daniel Jacobowitz
@ 2010-03-02 14:13 ` Corinna Vinschen
2010-03-02 15:03 ` Daniel Jacobowitz
0 siblings, 1 reply; 64+ messages in thread
From: Corinna Vinschen @ 2010-03-02 14:13 UTC (permalink / raw)
To: gdb-patches

On Mar 2 08:54, Daniel Jacobowitz wrote:
> On Tue, Mar 02, 2010 at 11:43:58AM +0100, Corinna Vinschen wrote:
> > The target specifies charset and wide charset. So we need two
> > target-specific variables. As I mentioned above, both variables would
> > replace the auto_target...charset variables.
>
> That makes sense.
>
> > How do I access the current gdbarch from show_target_charset_name,
> > show_target_wide_charset_name, and all the other functions in charset.c
> > which refer to the target charsets?
>
> From the show functions, which are called by the CLI infrastructure, I
> think you're supposed to use get_current_arch (in arch-utils.h).
>
> From other functions, mostly, you need the caller to pass an
> architecture. An increasing portion of GDB is parameterized
> this way. That'll work for e.g. target_wide_charset ().
>
> I don't think you can do either from _initialize_charset. That may
> have to happen lazily...

It looks to me as if a call to get_current_arch() in target_charset()
is the way to go. Otherwise, I don't see how this can be set up.
The problem here are the various places which call the target_charset()
function. The one call in c-lang.c, function charset_for_string_type()
looks simple, but the calls in utils.c and python/py-utils.c look
pretty tricky. Or can I just refer to get_current_arch() in these
places?

Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 14:13 ` Corinna Vinschen
@ 2010-03-02 15:03 ` Daniel Jacobowitz
2010-03-02 17:43 ` Corinna Vinschen
0 siblings, 1 reply; 64+ messages in thread
From: Daniel Jacobowitz @ 2010-03-02 15:03 UTC (permalink / raw)
To: gdb-patches

On Tue, Mar 02, 2010 at 03:13:24PM +0100, Corinna Vinschen wrote:
> It looks to me as if a call to get_current_arch() in target_charset()
> is the way to go. Otherwise, I don't see how this can be set up.
> The problem here are the various places which call the target_charset()
> function. The one call in c-lang.c, function charset_for_string_type()
> looks simple, but the calls in utils.c and python/py-utils.c look
> pretty tricky. Or can I just refer to get_current_arch() in these
> places?

We're only supposed to call get_current_arch in the functions that
implement UI commands. Everything else has to be passed a relevant
architecture. This is a lot of hassle, but it's the way the design
worked out.

The one in c-lang.c is easy. charset_for_string_type takes a byte
order already; wherever its callers get a byte order, they will get it
from the gdbarch, so just pass the gdbarch instead. I checked, both
eventual callers have one available.

For utils, both calls come through parse_escape. Since this has more
calls, obviously, it's more of a mess :-(

* echo_command is a command and can use get_current_arch.

* Ditto for do_setshow_command.

* Um, I guess the same is true for mi_parse_argv, but at this point I
think something's wrong infrastructurally. Why do we care about the
target charset when decoding MI commands? C has c_parse_escape, which
always uses the host charset, and pushes conversion to the target
charset elsewhere... at least, this will make the problem visible at
the call site instead of buried.

* That leaves the three yylex definitions, for Java, ObjC, and Pascal.
For those, use parse_gdbarch.

--
Daniel Jacobowitz
CodeSourcery
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 15:03 ` Daniel Jacobowitz
@ 2010-03-02 17:43 ` Corinna Vinschen
0 siblings, 0 replies; 64+ messages in thread
From: Corinna Vinschen @ 2010-03-02 17:43 UTC (permalink / raw)
To: gdb-patches

On Mar 2 10:03, Daniel Jacobowitz wrote:
> On Tue, Mar 02, 2010 at 03:13:24PM +0100, Corinna Vinschen wrote:
> > It looks to me as if a call to get_current_arch() in target_charset()
> > is the way to go. Otherwise, I don't see how this can be set up.
> > The problem here are the various places which call the target_charset()
> > function. The one call in c-lang.c, function charset_for_string_type()
> > looks simple, but the calls in utils.c and python/py-utils.c look
> > pretty tricky. Or can I just refer to get_current_arch() in these
> > places?
>
> We're only supposed to call get_current_arch in the functions that
> implement UI commands. Everything else has to be passed a relevant
> architecture. This is a lot of hassle, but it's the way the design
> worked out.
>
> The one in c-lang.c is easy. charset_for_string_type takes a byte
> order already; wherever its callers get a byte order, they will get it
> from the gdbarch, so just pass the gdbarch instead. I checked, both
> eventual callers have one available.
>
> For utils, both calls come through parse_escape. Since this has more
> calls, obviously, it's more of a mess :-(
>
> * echo_command is a command and can use get_current_arch.
>
> * Ditto for do_setshow_command.
>
> * Um, I guess the same is true for mi_parse_argv, but at this point I
> think something's wrong infrastructurally. Why do we care about the
> target charset when decoding MI commands? C has c_parse_escape, which
> always uses the host charset, and pushes conversion to the target
> charset elsewhere... at least, this will make the problem visible at
> the call site instead of buried.
>
> * That leaves the three yylex definitions, for Java, ObjC, and Pascal.
> For those, use parse_gdbarch.

Sorry, but I have to step down from patching this. There's too much
code involved that I never had a look into, and this is getting way
over my head for the simple-seeming task of just setting a sane default
for a native Cygwin GDB.

Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-01 19:31 ` Daniel Jacobowitz
2010-03-01 20:06 ` Corinna Vinschen
2010-03-01 20:44 ` Eli Zaretskii
@ 2010-03-02 17:59 ` Tom Tromey
2010-03-02 18:22 ` Corinna Vinschen
2010-03-02 22:55 ` Tom Tromey
2 siblings, 2 replies; 64+ messages in thread
From: Tom Tromey @ 2010-03-02 17:59 UTC (permalink / raw)
To: gdb-patches

>>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes:

>> Personally, I think that, given the existing nl_langinfo code (and
>> worse, Solaris hack) in _initialize_charset, it would be fine to add
>> some Cygwin change here. However, I won't approve a patch like that
>> over Daniel's objections.

Daniel> I don't mind hacks in general. I do mind hacks that are going to
Daniel> break my currently working builds. If you build an i686-mingw32
Daniel> to arm-linux debugger, and suddenly that defaults to UTF-16 instead of
Daniel> UCS-4, then I've got a problem.

Yes, I see what you mean. Sorry for being so dense.

Additionally, you're currently exposed to this scenario in the situation
where Corinna decides to fix the problem by implementing
nl_langinfo(CODESET) in Cygwin ;-)

Daniel> What do you suggest?

I agree, we need a gdbarch method.

Perhaps we should also extend the current "-BE/-LE" hack to look at the
width of wchar_t in the user's program. That is, "set
target-wide-charset auto" (or perhaps "auto-utf") would select between
UTF-16/32/-BE-LE as appropriate.

I'll look into this today.

Tom
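The "auto-utf" idea floated in the message above, choosing the wide charset from the width of wchar_t and the target byte order, could be sketched as follows. This is a hypothetical helper written for illustration only; the enum, the function name and the width-in-bytes parameter are assumptions and do not correspond to existing GDB code.

```c
#include <stddef.h>

/* Hypothetical byte-order tag; GDB itself uses enum bfd_endian.  */
enum byte_order_sketch
{
  ORDER_BIG,
  ORDER_LITTLE
};

/* Pick a UTF encoding name from the size of the program's wchar_t
   (in bytes) and the target byte order.  Returns NULL when the width
   is neither 2 nor 4, in which case a caller would have to fall back
   to some other default.  */
const char *
auto_utf_wide_charset (int wchar_width, enum byte_order_sketch order)
{
  switch (wchar_width)
    {
    case 2:
      return order == ORDER_BIG ? "UTF-16BE" : "UTF-16LE";
    case 4:
      return order == ORDER_BIG ? "UTF-32BE" : "UTF-32LE";
    default:
      return NULL;
    }
}
```

Under this scheme a Windows or Cygwin program with a 2-byte wchar_t would select a UTF-16 variant without any per-OS special case, while a typical Linux target with a 4-byte wchar_t would keep a UTF-32 variant.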
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 17:59 ` Tom Tromey
@ 2010-03-02 18:22 ` Corinna Vinschen
2010-03-02 18:28 ` Corinna Vinschen
2010-03-02 22:55 ` Tom Tromey
1 sibling, 1 reply; 64+ messages in thread
From: Corinna Vinschen @ 2010-03-02 18:22 UTC (permalink / raw)
To: gdb-patches

On Mar 2 10:58, Tom Tromey wrote:
> >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes:
>
> >> Personally, I think that, given the existing nl_langinfo code (and
> >> worse, Solaris hack) in _initialize_charset, it would be fine to add
> >> some Cygwin change here. However, I won't approve a patch like that
> >> over Daniel's objections.
>
> Daniel> I don't mind hacks in general. I do mind hacks that are going to
> Daniel> break my currently working builds. If you build an i686-mingw32
> Daniel> to arm-linux debugger, and suddenly that defaults to UTF-16 instead of
> Daniel> UCS-4, then I've got a problem.
>
> Yes, I see what you mean. Sorry for being so dense.
>
> Additionally, you're currently exposed to this scenario in the situation
> where Corinna decides to fix the problem by implementing
> nl_langinfo(CODESET) in Cygwin ;-)

nl_langinfo(CODESET) exists in Cygwin.

Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 18:22 ` Corinna Vinschen
@ 2010-03-02 18:28 ` Corinna Vinschen
0 siblings, 0 replies; 64+ messages in thread
From: Corinna Vinschen @ 2010-03-02 18:28 UTC (permalink / raw)
To: gdb-patches

On Mar 2 19:22, Corinna Vinschen wrote:
> On Mar 2 10:58, Tom Tromey wrote:
> > >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes:
> >
> > >> Personally, I think that, given the existing nl_langinfo code (and
> > >> worse, Solaris hack) in _initialize_charset, it would be fine to add
> > >> some Cygwin change here. However, I won't approve a patch like that
> > >> over Daniel's objections.
> >
> > Daniel> I don't mind hacks in general. I do mind hacks that are going to
> > Daniel> break my currently working builds. If you build an i686-mingw32
> > Daniel> to arm-linux debugger, and suddenly that defaults to UTF-16 instead of
> > Daniel> UCS-4, then I've got a problem.
> >
> > Yes, I see what you mean. Sorry for being so dense.
> >
> > Additionally, you're currently exposed to this scenario in the situation
> > where Corinna decides to fix the problem by implementing
> > nl_langinfo(CODESET) in Cygwin ;-)
>
> nl_langinfo(CODESET) exists in Cygwin.

...but that doesn't fix the wide char problem. UTF-32 just doesn't
work on Windows/Cygwin.

Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-03-02 17:59 ` Tom Tromey
2010-03-02 18:22 ` Corinna Vinschen
@ 2010-03-02 22:55 ` Tom Tromey
2010-03-02 22:57 ` Tom Tromey
` (2 more replies)
1 sibling, 3 replies; 64+ messages in thread
From: Tom Tromey @ 2010-03-02 22:55 UTC (permalink / raw)
To: gdb-patches

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> I agree, we need a gdbarch method.
Tom> I'll look into this today.

Here's a patch, please let me know what you think.

This patch nearly works -- it regresses on a Python test that looks at
gdb.parameter('target-charset') directly and then gets confused by
"auto". I think the only solution for this is to add a new Python API.
(It would be easy enough to just hack the test case somehow -- but the
libstdc++ printers use this same idiom, so a real solution is needed.)

I wasn't sure where to put the Cygwin override; I took Daniel's
suggestion of i386_cygwin_init_abi; but Corinna's earlier patches
changed all Windows targets, not just Cygwin, and I didn't see a good
place to do that.

Also, given that Cygwin implements nl_langinfo(CODESET), I figured we
only needed to override the target wide charset default in this case.

I do think this patch is largely an improvement. In particular, I agree
with Daniel that the current MI use of parse_escape is suspect, so it
was good to flush that out.

Tom

2010-03-02 Tom Tromey <tromey@redhat.com>

* utils.c (host_char_to_target): Add 'gdbarch' argument.
(parse_escape): Likewise.
* python/py-utils.c (unicode_to_target_string): Update.
(unicode_to_target_python_string): Update.
(target_string_to_unicode): Update.
* printcmd.c (printf_command): Update.
* p-exp.y (yylex): Update.
* objc-exp.y (yylex): Update.
* mi/mi-parse.c: Include charset.h.
(mi_parse_escape): New function.
(mi_parse_argv): Use it.
* jv-exp.y (yylex): Update.
* i386-cygwin-tdep.c (i386_cygwin_auto_wide_charset): New function.
(i386_cygwin_init_abi): Call set_gdbarch_auto_wide_charset.
* gdbarch.sh (auto_charset, auto_wide_charset): New.
* gdbarch.c: Rebuild.
* gdbarch.h: Rebuild.
* defs.h (parse_escape): Update.
* cli/cli-setshow.c: Include arch-utils.h.
(do_setshow_command): Update.
* cli/cli-cmds.c (echo_command): Update.
* charset.h (target_charset, target_wide_charset): Update.
* charset.c: Include arch-utils.h.
(target_charset_name): Default to "auto".
(target_wide_charset_name): Likewise.
(show_target_charset_name): Handle "auto".
(show_target_wide_charset_name): Likewise.
(be_le_arch): New global.
(set_be_le_names): Add 'gdbarch' argument.
(validate): Likewise. Don't call set_be_le_names.
(set_charset_sfunc, set_host_charset_sfunc)
(set_target_charset_sfunc, set_target_wide_charset_sfunc): Update.
(target_charset): Add 'gdbarch' argument.
(target_wide_charset): Likewise. Remove 'byte_order' argument.
(auto_target_charset_name): New global.
(default_auto_charset, default_auto_wide_charset): New functions.
(_initialize_charset): Set auto_target_charset_name. Allow "auto"
for target charsets.
* c-lang.c (charset_for_string_type): Add 'gdbarch' argument,
remove 'byte_order' argument. Update.
(classify_type): Likewise.
(c_emit_char): Update.
(c_printchar): Update.
(c_printstr): Update.
(c_get_string): Update.
(evaluate_subexp_c): Update.
* arch-utils.h (default_auto_charset, default_auto_wide_charset):
Declare.

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h index d5d7f1d..dbeb67f 100644 --- a/gdb/arch-utils.h +++ b/gdb/arch-utils.h @@ -162,4 +162,7 @@ extern int default_fast_tracepoint_valid_at (struct gdbarch *gdbarch, extern void default_remote_breakpoint_from_pc (struct gdbarch *, CORE_ADDR *pcptr, int *kindptr); +extern const char *default_auto_charset (void); +extern const char *default_auto_wide_charset (void); + #endif diff --git a/gdb/c-lang.c b/gdb/c-lang.c index d620881..fefd675 100644 --- a/gdb/c-lang.c +++ b/gdb/c-lang.c @@ -43,23 +43,23 @@ extern void _initialize_c_language (void); static const char * charset_for_string_type (enum c_string_type str_type, - enum bfd_endian byte_order) + struct gdbarch *gdbarch) { switch (str_type & ~C_CHAR) { case C_STRING: - return target_charset (); + return target_charset (gdbarch); case C_WIDE_STRING: - return target_wide_charset (byte_order); + return target_wide_charset (gdbarch); case C_STRING_16: /* FIXME: UTF-16 is not always correct. */ - if (byte_order == BFD_ENDIAN_BIG) + if (gdbarch_byte_order (gdbarch) == BFD_ENDIAN_BIG) return "UTF-16BE"; else return "UTF-16LE"; case C_STRING_32: /* FIXME: UTF-32 is not always correct. */ - if (byte_order == BFD_ENDIAN_BIG) + if (gdbarch_byte_order (gdbarch) == BFD_ENDIAN_BIG) return "UTF-32BE"; else return "UTF-32LE"; @@ -73,7 +73,7 @@ charset_for_string_type (enum c_string_type str_type, characters of this type in target BYTE_ORDER to the host character set. 
*/ static enum c_string_type -classify_type (struct type *elttype, enum bfd_endian byte_order, +classify_type (struct type *elttype, struct gdbarch *gdbarch, const char **encoding) { struct type *saved_type; @@ -134,7 +134,7 @@ classify_type (struct type *elttype, enum bfd_endian byte_order, done: if (encoding) - *encoding = charset_for_string_type (result, byte_order); + *encoding = charset_for_string_type (result, gdbarch); return result; } @@ -264,7 +264,7 @@ c_emit_char (int c, struct type *type, struct ui_file *stream, int quoter) struct wchar_iterator *iter; int need_escape = 0; - classify_type (type, byte_order, &encoding); + classify_type (type, get_type_arch (type), &encoding); buf = alloca (TYPE_LENGTH (type)); pack_long (buf, type, c); @@ -340,7 +340,7 @@ c_printchar (int c, struct type *type, struct ui_file *stream) { enum c_string_type str_type; - str_type = classify_type (type, BFD_ENDIAN_UNKNOWN, NULL); + str_type = classify_type (type, get_type_arch (type), NULL); switch (str_type) { case C_CHAR: @@ -396,7 +396,8 @@ c_printstr (struct ui_file *stream, struct type *type, const gdb_byte *string, width, byte_order) == 0)) length--; - str_type = classify_type (type, byte_order, &type_encoding) & ~C_CHAR; + str_type = (classify_type (type, get_type_arch (type), &type_encoding) + & ~C_CHAR); switch (str_type) { case C_STRING: @@ -659,7 +660,7 @@ c_get_string (struct value *value, gdb_byte **buffer, int *length, if (! 
c_textual_element_type (element_type, 0)) goto error; kind = classify_type (element_type, - gdbarch_byte_order (get_type_arch (element_type)), + get_type_arch (element_type), charset); width = TYPE_LENGTH (element_type); @@ -942,7 +943,6 @@ evaluate_subexp_c (struct type *expect_type, struct expression *exp, struct value *result; enum c_string_type dest_type; const char *dest_charset; - enum bfd_endian byte_order; obstack_init (&output); cleanup = make_cleanup_obstack_free (&output); @@ -979,8 +979,7 @@ evaluate_subexp_c (struct type *expect_type, struct expression *exp, /* Ensure TYPE_LENGTH is valid for TYPE. */ check_typedef (type); - byte_order = gdbarch_byte_order (exp->gdbarch); - dest_charset = charset_for_string_type (dest_type, byte_order); + dest_charset = charset_for_string_type (dest_type, exp->gdbarch); ++*pos; while (*pos < limit) diff --git a/gdb/charset.c b/gdb/charset.c index 8977f27..f383b00 100644 --- a/gdb/charset.c +++ b/gdb/charset.c @@ -27,6 +27,7 @@ #include "charset-list.h" #include "vec.h" #include "environ.h" +#include "arch-utils.h" #include <stddef.h> #include "gdb_string.h" @@ -213,22 +214,34 @@ show_host_charset_name (struct ui_file *file, int from_tty, fprintf_filtered (file, _("The host character set is \"%s\".\n"), value); } -static const char *target_charset_name = GDB_DEFAULT_TARGET_CHARSET; +static const char *target_charset_name = "auto"; static void show_target_charset_name (struct ui_file *file, int from_tty, struct cmd_list_element *c, const char *value) { - fprintf_filtered (file, _("The target character set is \"%s\".\n"), - value); + if (!strcmp (value, "auto")) + fprintf_filtered (file, + _("The target character set is \"auto; " + "currently %s\".\n"), + gdbarch_auto_charset (get_current_arch ())); + else + fprintf_filtered (file, _("The target character set is \"%s\".\n"), + value); } -static const char *target_wide_charset_name = GDB_DEFAULT_TARGET_WIDE_CHARSET; +static const char *target_wide_charset_name = "auto"; 
static void show_target_wide_charset_name (struct ui_file *file, int from_tty, struct cmd_list_element *c, const char *value) { - fprintf_filtered (file, _("The target wide character set is \"%s\".\n"), - value); + if (!strcmp (value, "auto")) + fprintf_filtered (file, + _("The target wide character set is \"auto; " + "currently %s\".\n"), + gdbarch_auto_wide_charset (get_current_arch ())); + else + fprintf_filtered (file, _("The target wide character set is \"%s\".\n"), + value); } static const char *default_charset_names[] = @@ -245,21 +258,33 @@ static const char **charset_enum; static const char *target_wide_charset_be_name; static const char *target_wide_charset_le_name; -/* A helper function for validate which sets the target wide big- and - little-endian character set names, if possible. */ +/* The architecture for which the BE- and LE-names are valid. */ +static struct gdbarch *be_le_arch; + +/* A helper function which sets the target wide big- and little-endian + character set names, if possible. */ static void -set_be_le_names (void) +set_be_le_names (struct gdbarch *gdbarch) { int i, len; + const char *target_wide; + + if (be_le_arch == gdbarch) + return; + be_le_arch = gdbarch; target_wide_charset_le_name = NULL; target_wide_charset_be_name = NULL; - len = strlen (target_wide_charset_name); + target_wide = target_wide_charset_name; + if (!strcmp (target_wide, "auto")) + target_wide = gdbarch_auto_wide_charset (gdbarch); + + len = strlen (target_wide); for (i = 0; charset_enum[i]; ++i) { - if (strncmp (target_wide_charset_name, charset_enum[i], len)) + if (strncmp (target_wide, charset_enum[i], len)) continue; if ((charset_enum[i][len] == 'B' || charset_enum[i][len] == 'L') @@ -278,24 +303,29 @@ set_be_le_names (void) target-wide-charset', 'set charset' sfunc's. 
*/ static void -validate (void) +validate (struct gdbarch *gdbarch) { iconv_t desc; const char *host_cset = host_charset (); + const char *target_cset = target_charset (gdbarch); + const char *target_wide_cset = target_wide_charset_name; + if (!strcmp (target_wide_cset, "auto")) + target_wide_cset = gdbarch_auto_wide_charset (gdbarch); - desc = iconv_open (target_wide_charset_name, host_cset); + desc = iconv_open (target_wide_cset, host_cset); if (desc == (iconv_t) -1) error ("Cannot convert between character sets `%s' and `%s'", - target_wide_charset_name, host_cset); + target_wide_cset, host_cset); iconv_close (desc); - desc = iconv_open (target_charset_name, host_cset); + desc = iconv_open (target_cset, host_cset); if (desc == (iconv_t) -1) error ("Cannot convert between character sets `%s' and `%s'", - target_charset_name, host_cset); + target_cset, host_cset); iconv_close (desc); - set_be_le_names (); + /* Clear the cache. */ + be_le_arch = NULL; } /* This is the sfunc for the 'set charset' command. */ @@ -304,7 +334,7 @@ set_charset_sfunc (char *charset, int from_tty, struct cmd_list_element *c) { /* CAREFUL: set the target charset here as well. */ target_charset_name = host_charset_name; - validate (); + validate (get_current_arch ()); } /* 'set host-charset' command sfunc. We need a wrapper here because @@ -313,7 +343,7 @@ static void set_host_charset_sfunc (char *charset, int from_tty, struct cmd_list_element *c) { - validate (); + validate (get_current_arch ()); } /* Wrapper for the 'set target-charset' command. */ @@ -321,7 +351,7 @@ static void set_target_charset_sfunc (char *charset, int from_tty, struct cmd_list_element *c) { - validate (); + validate (get_current_arch ()); } /* Wrapper for the 'set target-wide-charset' command. */ @@ -329,7 +359,7 @@ static void set_target_wide_charset_sfunc (char *charset, int from_tty, struct cmd_list_element *c) { - validate (); + validate (get_current_arch ()); } /* sfunc for the 'show charset' command. 
*/ @@ -354,14 +384,19 @@ host_charset (void) } const char * -target_charset (void) +target_charset (struct gdbarch *gdbarch) { + if (!strcmp (target_charset_name, "auto")) + return gdbarch_auto_charset (gdbarch); return target_charset_name; } const char * -target_wide_charset (enum bfd_endian byte_order) +target_wide_charset (struct gdbarch *gdbarch) { + enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); + + set_be_le_names (gdbarch); if (byte_order == BFD_ENDIAN_BIG) { if (target_wide_charset_be_name) @@ -373,6 +408,9 @@ target_wide_charset (enum bfd_endian byte_order) return target_wide_charset_le_name; } + if (!strcmp (target_wide_charset_name, "auto")) + return gdbarch_auto_wide_charset (gdbarch); + return target_wide_charset_name; } @@ -851,13 +889,27 @@ find_charset_names (void) #endif /* HAVE_ICONVLIST || HAVE_LIBICONVLIST */ #endif /* PHONY_ICONV */ +/* The "auto" target charset used by default_auto_charset. */ +static const char *auto_target_charset_name = GDB_DEFAULT_TARGET_CHARSET; + +const char * +default_auto_charset (void) +{ + return auto_target_charset_name; +} + +const char * +default_auto_wide_charset (void) +{ + return GDB_DEFAULT_TARGET_WIDE_CHARSET; +} + void _initialize_charset (void) { struct cmd_list_element *new_cmd; - /* The first element is always "auto"; then we skip it for the - commands where it is not allowed. */ + /* The first element is always "auto". */ VEC_safe_push (char_ptr, charsets, xstrdup ("auto")); find_charset_names (); @@ -874,14 +926,12 @@ _initialize_charset (void) which GNU libiconv doesn't like (infinite loop). 
*/ if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) auto_host_charset_name = "ASCII"; - target_charset_name = auto_host_charset_name; - - set_be_le_names (); + auto_target_charset_name = auto_host_charset_name; #endif #endif add_setshow_enum_cmd ("charset", class_support, - &charset_enum[1], &host_charset_name, _("\ + charset_enum, &host_charset_name, _("\ Set the host and target character sets."), _("\ Show the host and target character sets."), _("\ The `host character set' is the one used by the system GDB is running on.\n\ @@ -909,7 +959,7 @@ To see a list of the character sets GDB supports, type `set host-charset <TAB>'. &setlist, &showlist); add_setshow_enum_cmd ("target-charset", class_support, - &charset_enum[1], &target_charset_name, _("\ + charset_enum, &target_charset_name, _("\ Set the target character set."), _("\ Show the target character set."), _("\ The `target character set' is the one used by the program being debugged.\n\ @@ -921,7 +971,7 @@ To see a list of the character sets GDB supports, type `set target-charset'<TAB> &setlist, &showlist); add_setshow_enum_cmd ("target-wide-charset", class_support, - &charset_enum[1], &target_wide_charset_name, + charset_enum, &target_wide_charset_name, _("\ Set the target wide character set."), _("\ Show the target wide character set."), _("\ diff --git a/gdb/charset.h b/gdb/charset.h index 7700168..bd93d01 100644 --- a/gdb/charset.h +++ b/gdb/charset.h @@ -33,8 +33,8 @@ result is owned by the charset module; the caller should not free it. */ const char *host_charset (void); -const char *target_charset (void); -const char *target_wide_charset (enum bfd_endian byte_order); +const char *target_charset (struct gdbarch *gdbarch); +const char *target_wide_charset (struct gdbarch *gdbarch); /* These values are used to specify the type of transliteration done by convert_between_encodings. 
*/ diff --git a/gdb/cli/cli-cmds.c b/gdb/cli/cli-cmds.c index 7400967..bc79258 100644 --- a/gdb/cli/cli-cmds.c +++ b/gdb/cli/cli-cmds.c @@ -618,7 +618,7 @@ echo_command (char *text, int from_tty) if (*p == 0) return; - c = parse_escape (&p); + c = parse_escape (get_current_arch (), &p); if (c >= 0) printf_filtered ("%c", c); } diff --git a/gdb/cli/cli-setshow.c b/gdb/cli/cli-setshow.c index c1dafb4..d8ec100 100644 --- a/gdb/cli/cli-setshow.c +++ b/gdb/cli/cli-setshow.c @@ -21,6 +21,7 @@ #include "value.h" #include <ctype.h> #include "gdb_string.h" +#include "arch-utils.h" #include "ui-out.h" @@ -152,7 +153,7 @@ do_setshow_command (char *arg, int from_tty, struct cmd_list_element *c) right before a newline. */ if (*p == 0) break; - ch = parse_escape (&p); + ch = parse_escape (get_current_arch (), &p); if (ch == 0) break; /* C loses */ else if (ch > 0) diff --git a/gdb/defs.h b/gdb/defs.h index b8973b3..cb15d0e 100644 --- a/gdb/defs.h +++ b/gdb/defs.h @@ -899,7 +899,7 @@ extern char *xstrvprintf (const char *format, va_list ap) extern int xsnprintf (char *str, size_t size, const char *format, ...) ATTR_FORMAT (printf, 3, 4); -extern int parse_escape (char **); +extern int parse_escape (struct gdbarch *, char **); /* Message to be printed before the error message, when an error occurs. 
*/ diff --git a/gdb/gdbarch.c b/gdb/gdbarch.c index cfb042b..f482572 100644 --- a/gdb/gdbarch.c +++ b/gdb/gdbarch.c @@ -253,6 +253,8 @@ struct gdbarch gdbarch_has_shared_address_space_ftype *has_shared_address_space; gdbarch_fast_tracepoint_valid_at_ftype *fast_tracepoint_valid_at; const char * qsupported; + gdbarch_auto_charset_ftype *auto_charset; + gdbarch_auto_wide_charset_ftype *auto_wide_charset; }; @@ -397,6 +399,8 @@ struct gdbarch startup_gdbarch = default_has_shared_address_space, /* has_shared_address_space */ default_fast_tracepoint_valid_at, /* fast_tracepoint_valid_at */ 0, /* qsupported */ + default_auto_charset, /* auto_charset */ + default_auto_wide_charset, /* auto_wide_charset */ /* startup_gdbarch() */ }; @@ -483,6 +487,8 @@ gdbarch_alloc (const struct gdbarch_info *info, gdbarch->target_signal_to_host = default_target_signal_to_host; gdbarch->has_shared_address_space = default_has_shared_address_space; gdbarch->fast_tracepoint_valid_at = default_fast_tracepoint_valid_at; + gdbarch->auto_charset = default_auto_charset; + gdbarch->auto_wide_charset = default_auto_wide_charset; /* gdbarch_alloc() */ return gdbarch; @@ -664,6 +670,8 @@ verify_gdbarch (struct gdbarch *gdbarch) /* Skip verify of has_shared_address_space, invalid_p == 0 */ /* Skip verify of fast_tracepoint_valid_at, invalid_p == 0 */ /* Skip verify of qsupported, invalid_p == 0 */ + /* Skip verify of auto_charset, invalid_p == 0 */ + /* Skip verify of auto_wide_charset, invalid_p == 0 */ buf = ui_file_xstrdup (log, &length); make_cleanup (xfree, buf); if (length > 0) @@ -720,6 +728,12 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file) "gdbarch_dump: adjust_breakpoint_address = <%s>\n", host_address_to_string (gdbarch->adjust_breakpoint_address)); fprintf_unfiltered (file, + "gdbarch_dump: auto_charset = <%s>\n", + host_address_to_string (gdbarch->auto_charset)); + fprintf_unfiltered (file, + "gdbarch_dump: auto_wide_charset = <%s>\n", + host_address_to_string 
(gdbarch->auto_wide_charset)); + fprintf_unfiltered (file, "gdbarch_dump: believe_pcc_promotion = %s\n", plongest (gdbarch->believe_pcc_promotion)); fprintf_unfiltered (file, @@ -3599,6 +3613,40 @@ set_gdbarch_qsupported (struct gdbarch *gdbarch, gdbarch->qsupported = qsupported; } +const char * +gdbarch_auto_charset (struct gdbarch *gdbarch) +{ + gdb_assert (gdbarch != NULL); + gdb_assert (gdbarch->auto_charset != NULL); + if (gdbarch_debug >= 2) + fprintf_unfiltered (gdb_stdlog, "gdbarch_auto_charset called\n"); + return gdbarch->auto_charset (); +} + +void +set_gdbarch_auto_charset (struct gdbarch *gdbarch, + gdbarch_auto_charset_ftype auto_charset) +{ + gdbarch->auto_charset = auto_charset; +} + +const char * +gdbarch_auto_wide_charset (struct gdbarch *gdbarch) +{ + gdb_assert (gdbarch != NULL); + gdb_assert (gdbarch->auto_wide_charset != NULL); + if (gdbarch_debug >= 2) + fprintf_unfiltered (gdb_stdlog, "gdbarch_auto_wide_charset called\n"); + return gdbarch->auto_wide_charset (); +} + +void +set_gdbarch_auto_wide_charset (struct gdbarch *gdbarch, + gdbarch_auto_wide_charset_ftype auto_wide_charset) +{ + gdbarch->auto_wide_charset = auto_wide_charset; +} + /* Keep a registry of per-architecture data-pointers required by GDB modules. */ diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h index 1353ab1..83281f4 100644 --- a/gdb/gdbarch.h +++ b/gdb/gdbarch.h @@ -928,6 +928,18 @@ extern void set_gdbarch_fast_tracepoint_valid_at (struct gdbarch *gdbarch, gdbar extern const char * gdbarch_qsupported (struct gdbarch *gdbarch); extern void set_gdbarch_qsupported (struct gdbarch *gdbarch, const char * qsupported); +/* Return the "auto" target charset. */ + +typedef const char * (gdbarch_auto_charset_ftype) (void); +extern const char * gdbarch_auto_charset (struct gdbarch *gdbarch); +extern void set_gdbarch_auto_charset (struct gdbarch *gdbarch, gdbarch_auto_charset_ftype *auto_charset); + +/* Return the "auto" target wide charset. 
*/ + +typedef const char * (gdbarch_auto_wide_charset_ftype) (void); +extern const char * gdbarch_auto_wide_charset (struct gdbarch *gdbarch); +extern void set_gdbarch_auto_wide_charset (struct gdbarch *gdbarch, gdbarch_auto_wide_charset_ftype *auto_wide_charset); + /* Definition for an unknown syscall, used basically in error-cases. */ #define UNKNOWN_SYSCALL (-1) diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh index 0eaa0ef..441c515 100755 --- a/gdb/gdbarch.sh +++ b/gdb/gdbarch.sh @@ -769,6 +769,11 @@ m:int:fast_tracepoint_valid_at:CORE_ADDR addr, int *isize, char **msg:addr, isiz # Not NULL if a target has additonal field for qSupported. v:const char *:qsupported:::0:0::0:gdbarch->qsupported + +# Return the "auto" target charset. +f:const char *:auto_charset:void::default_auto_charset:default_auto_charset::0 +# Return the "auto" target wide charset. +f:const char *:auto_wide_charset:void::default_auto_wide_charset:default_auto_wide_charset::0 EOF } diff --git a/gdb/i386-cygwin-tdep.c b/gdb/i386-cygwin-tdep.c index f0d8d4b..ef24670 100644 --- a/gdb/i386-cygwin-tdep.c +++ b/gdb/i386-cygwin-tdep.c @@ -205,6 +205,12 @@ i386_cygwin_skip_trampoline_code (struct frame_info *frame, CORE_ADDR pc) return i386_pe_skip_trampoline_code (frame, pc, NULL); } +static const char * +i386_cygwin_auto_wide_charset (void) +{ + return "UTF-16"; +} + static void i386_cygwin_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch) { @@ -227,6 +233,8 @@ i386_cygwin_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch) (gdbarch, i386_windows_regset_from_core_section); set_gdbarch_core_xfer_shared_libraries (gdbarch, windows_core_xfer_shared_libraries); + + set_gdbarch_auto_wide_charset (gdbarch, i386_cygwin_auto_wide_charset); } static enum gdb_osabi diff --git a/gdb/jv-exp.y b/gdb/jv-exp.y index f91e5bd..a904c32 100644 --- a/gdb/jv-exp.y +++ b/gdb/jv-exp.y @@ -898,7 +898,7 @@ yylex (void) lexptr++; c = *lexptr++; if (c == '\\') - c = parse_escape (&lexptr); + c = parse_escape 
(parse_gdbarch, &lexptr); else if (c == '\'') error (_("Empty character constant")); @@ -1061,7 +1061,7 @@ yylex (void) break; case '\\': tokptr++; - c = parse_escape (&tokptr); + c = parse_escape (parse_gdbarch, &tokptr); if (c == -1) { continue; diff --git a/gdb/mi/mi-parse.c b/gdb/mi/mi-parse.c index 8548b67..c3f5eeb 100644 --- a/gdb/mi/mi-parse.c +++ b/gdb/mi/mi-parse.c @@ -23,10 +23,83 @@ #include "defs.h" #include "mi-cmds.h" #include "mi-parse.h" +#include "charset.h" #include <ctype.h> #include "gdb_string.h" +/* Like parse_escape, but leave the results as a host char, not a + target char. */ + +static int +mi_parse_escape (char **string_ptr) +{ + int c = *(*string_ptr)++; + switch (c) + { + case '\n': + return -2; + case 0: + (*string_ptr)--; + return 0; + + case '0': + case '1': + case '2': + case '3': + case '4': + case '5': + case '6': + case '7': + { + int i = host_hex_value (c); + int count = 0; + while (++count < 3) + { + c = (**string_ptr); + if (isdigit (c) && c != '8' && c != '9') + { + (*string_ptr)++; + i *= 8; + i += host_hex_value (c); + } + else + { + break; + } + } + return i; + } + + case 'a': + c = '\a'; + break; + case 'b': + c = '\b'; + break; + case 'f': + c = '\f'; + break; + case 'n': + c = '\n'; + break; + case 'r': + c = '\r'; + break; + case 't': + c = '\t'; + break; + case 'v': + c = '\v'; + break; + + default: + break; + } + + return c; +} + static void mi_parse_argv (char *args, struct mi_parse *parse) { @@ -60,7 +133,7 @@ mi_parse_argv (char *args, struct mi_parse *parse) if (*chp == '\\') { chp++; - if (parse_escape (&chp) <= 0) + if (mi_parse_escape (&chp) <= 0) { /* Do not allow split lines or "\000" */ freeargv (argv); @@ -93,7 +166,7 @@ mi_parse_argv (char *args, struct mi_parse *parse) if (*chp == '\\') { chp++; - arg[len] = parse_escape (&chp); + arg[len] = mi_parse_escape (&chp); } else arg[len] = *chp++; diff --git a/gdb/objc-exp.y b/gdb/objc-exp.y index 9994ca2..888bd20 100644 --- a/gdb/objc-exp.y +++ b/gdb/objc-exp.y 
@@ -1276,7 +1276,7 @@ yylex () lexptr++; c = *lexptr++; if (c == '\\') - c = parse_escape (&lexptr); + c = parse_escape (parse_gdbarch, &lexptr); else if (c == '\'') error ("Empty character constant."); @@ -1506,7 +1506,7 @@ yylex () break; case '\\': tokptr++; - c = parse_escape (&tokptr); + c = parse_escape (parse_gdbarch, &tokptr); if (c == -1) { continue; diff --git a/gdb/p-exp.y b/gdb/p-exp.y index 481624a..c034fbb 100644 --- a/gdb/p-exp.y +++ b/gdb/p-exp.y @@ -1138,7 +1138,7 @@ yylex () lexptr++; c = *lexptr++; if (c == '\\') - c = parse_escape (&lexptr); + c = parse_escape (parse_gdbarch, &lexptr); else if (c == '\'') error ("Empty character constant."); @@ -1303,7 +1303,7 @@ yylex () break; case '\\': tokptr++; - c = parse_escape (&tokptr); + c = parse_escape (parse_gdbarch, &tokptr); if (c == -1) { continue; diff --git a/gdb/printcmd.c b/gdb/printcmd.c index c8cb35c..1e644d8 100644 --- a/gdb/printcmd.c +++ b/gdb/printcmd.c @@ -2372,7 +2372,7 @@ printf_command (char *arg, int from_tty) obstack_init (&output); inner_cleanup = make_cleanup_obstack_free (&output); - convert_between_encodings (target_wide_charset (byte_order), + convert_between_encodings (target_wide_charset (gdbarch), host_charset (), str, j, wcwidth, &output, translit_char); @@ -2404,7 +2404,7 @@ printf_command (char *arg, int from_tty) obstack_init (&output); inner_cleanup = make_cleanup_obstack_free (&output); - convert_between_encodings (target_wide_charset (byte_order), + convert_between_encodings (target_wide_charset (gdbarch), host_charset (), bytes, TYPE_LENGTH (valtype), TYPE_LENGTH (valtype), diff --git a/gdb/python/py-utils.c b/gdb/python/py-utils.c index 4ccf97f..aed8fff 100644 --- a/gdb/python/py-utils.c +++ b/gdb/python/py-utils.c @@ -129,7 +129,8 @@ unicode_to_encoded_python_string (PyObject *unicode_str, const char *charset) char * unicode_to_target_string (PyObject *unicode_str) { - return unicode_to_encoded_string (unicode_str, target_charset ()); + return 
unicode_to_encoded_string (unicode_str, + target_charset (python_gdbarch)); } /* Returns a PyObject with the contents of the given unicode string @@ -139,7 +140,8 @@ unicode_to_target_string (PyObject *unicode_str) PyObject * unicode_to_target_python_string (PyObject *unicode_str) { - return unicode_to_encoded_python_string (unicode_str, target_charset ()); + return unicode_to_encoded_python_string (unicode_str, + target_charset (python_gdbarch)); } /* Converts a python string (8-bit or unicode) to a target string in @@ -208,7 +210,7 @@ target_string_to_unicode (const gdb_byte *str, int length) if (length == -1) length = strlen (str); - return PyUnicode_Decode (str, length, target_charset (), NULL); + return PyUnicode_Decode (str, length, target_charset (python_gdbarch), NULL); } /* Return true if OBJ is a Python string or unicode object, false diff --git a/gdb/utils.c b/gdb/utils.c index 52596ca..22eefee 100644 --- a/gdb/utils.c +++ b/gdb/utils.c @@ -1652,7 +1652,7 @@ query (const char *ctlstr, ...) function returns 1. Otherwise, the function returns 0. */ static int -host_char_to_target (int c, int *target_c) +host_char_to_target (struct gdbarch *gdbarch, int c, int *target_c) { struct obstack host_data; char the_char = c; @@ -1662,7 +1662,7 @@ host_char_to_target (int c, int *target_c) obstack_init (&host_data); cleanups = make_cleanup_obstack_free (&host_data); - convert_between_encodings (target_charset (), host_charset (), + convert_between_encodings (target_charset (gdbarch), host_charset (), &the_char, 1, 1, &host_data, translit_none); if (obstack_object_size (&host_data) == 1) @@ -1691,7 +1691,7 @@ host_char_to_target (int c, int *target_c) after the zeros. A value of 0 does not mean end of string. 
*/ int -parse_escape (char **string_ptr) +parse_escape (struct gdbarch *gdbarch, char **string_ptr) { int target_char = -2; /* initialize to avoid GCC warnings */ int c = *(*string_ptr)++; @@ -1757,11 +1757,11 @@ parse_escape (char **string_ptr) break; } - if (!host_char_to_target (c, &target_char)) + if (!host_char_to_target (gdbarch, c, &target_char)) error ("The escape sequence `\%c' is equivalent to plain `%c', which" " has no equivalent\n" "in the `%s' character set.", c, c, - target_charset ()); + target_charset (gdbarch)); return target_char; } \f ^ permalink raw reply [flat|nested] 64+ messages in thread
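For readers skimming the patch, the gdbarch hook it introduces can be sketched in isolation as below. This is a minimal stand-alone model, not GDB code: struct mini_gdbarch and the mini_* helpers are hypothetical names, and the default of "UTF-32" is taken from the description at the top of the thread. It only illustrates the pattern of a per-architecture function pointer that starts at a default and is overridden from an init_abi-style routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function type for the hook, mirroring the shape of
   gdbarch_auto_wide_charset_ftype in the patch.  */
typedef const char *(auto_wide_charset_ftype) (void);

/* Hypothetical stand-in for struct gdbarch; only the one hook is modelled.  */
struct mini_gdbarch
{
  auto_wide_charset_ftype *auto_wide_charset;
};

/* Default used when no OS ABI overrides the hook (assumed "UTF-32").  */
static const char *
default_auto_wide_charset (void)
{
  return "UTF-32";
}

/* Override for targets whose wchar_t is UTF-16, as on Cygwin.  */
static const char *
cygwin_auto_wide_charset (void)
{
  return "UTF-16";
}

/* Install the default, as gdbarch_alloc does in the patch.  */
static void
mini_gdbarch_init (struct mini_gdbarch *arch)
{
  assert (arch != NULL);
  arch->auto_wide_charset = default_auto_wide_charset;
}

/* Setter, analogous to set_gdbarch_auto_wide_charset.  */
static void
set_mini_auto_wide_charset (struct mini_gdbarch *arch,
                            auto_wide_charset_ftype *fn)
{
  assert (arch != NULL && fn != NULL);
  arch->auto_wide_charset = fn;
}

/* Accessor, analogous to gdbarch_auto_wide_charset.  */
static const char *
mini_auto_wide_charset (const struct mini_gdbarch *arch)
{
  assert (arch != NULL && arch->auto_wide_charset != NULL);
  return arch->auto_wide_charset ();
}
```

A caller would initialise the structure, let the OS ABI code call the setter, and then resolve "auto" through the accessor instead of reading a compile-time macro.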
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-02 22:55 ` Tom Tromey @ 2010-03-02 22:57 ` Tom Tromey 2010-03-03 10:16 ` Corinna Vinschen 2010-03-03 10:16 ` Corinna Vinschen 2010-03-03 17:08 ` Daniel Jacobowitz 2 siblings, 1 reply; 64+ messages in thread From: Tom Tromey @ 2010-03-02 22:57 UTC (permalink / raw) To: gdb-patches >>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes: Tom> Here's a patch, please let me know what you think. Tom> 2010-03-02 Tom Tromey <tromey@redhat.com> Tom> * utils.c (host_char_to_target): Add 'gdbarch' argument. [...] Oops! I took some code from Corinna's earlier patch but then neglected to put her name on the ChangeLog entry. I've corrected that locally. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-02 22:57 ` Tom Tromey @ 2010-03-03 10:16 ` Corinna Vinschen 2010-03-03 10:41 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 10:16 UTC (permalink / raw) To: gdb-patches On Mar 2 15:57, Tom Tromey wrote: > >>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes: > > Tom> Here's a patch, please let me know what you think. > > Tom> 2010-03-02 Tom Tromey <tromey@redhat.com> > Tom> * utils.c (host_char_to_target): Add 'gdbarch' argument. > > [...] > > Oops! I took some code from Corinna's earlier patch but then neglected > to put her name on the ChangeLog entry. I've corrected that locally. No worries. I tested your code on the latest Cygwin release and it works fine. Thanks, Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 10:16 ` Corinna Vinschen @ 2010-03-03 10:41 ` Corinna Vinschen 2010-03-03 10:59 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 10:41 UTC (permalink / raw) To: gdb-patches On Mar 3 11:16, Corinna Vinschen wrote: > On Mar 2 15:57, Tom Tromey wrote: > > >>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes: > > > > Tom> Here's a patch, please let me know what you think. > > > > Tom> 2010-03-02 Tom Tromey <tromey@redhat.com> > > Tom> * utils.c (host_char_to_target): Add 'gdbarch' argument. > > > > [...] > > > > Oops! I took some code from Corinna's earlier patch but then neglected > > to put her name on the ChangeLog entry. I've corrected that locally. > > No worries. I tested your code on the latest Cygwin release and it > works fine. However, there's a problem in GDB which spoils the effort a bit. When GDB is started, setlocale() is called from captured_main(). Afterwards, _initialize_charset() and, in turn, nl_langinfo() is called, and the pointer returned by nl_langinfo(CODESET) is stored as the name of the codeset. So the codeset name is not copied into a safe storage area here, just the pointer is memorized. Later, while captured_main() is still initializing GDB, the interp_set() function is called, which calls tui_init(), which in turn initializes readline. Readline calls setlocale(). Oops. The old pointer to nl_langinfo(CODESET) might not be valid anymore afterwards. See the Linux manpage for nl_langinfo: RETURN VALUE [...] This pointer may point to static data that may be overwritten on the next call to nl_langinfo() or setlocale(3). So, does anything speak against copying the codeset name into some local storage, rather than just storing the pointer returned by nl_langinfo? Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 10:41 ` Corinna Vinschen @ 2010-03-03 10:59 ` Corinna Vinschen 2010-03-03 16:26 ` Tom Tromey 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 10:59 UTC (permalink / raw) To: gdb-patches On Mar 3 11:41, Corinna Vinschen wrote: > However, there's a problem in GDB which spoils the effort a bit. > > When GDB is started, setlocale() is called from captured_main(). > Afterwards, _initialize_charset() and, in turn, nl_langinfo() is > called, and the pointer returned by nl_langinfo(CODESET) is > stored as the name of the codeset. > So the codeset name is not copied into a safe storage area here, > just the pointer is memorized. > > Later, while captured_main() is still initializing GDB, the interp_set() > function is called, which calls tui_init(), which in turn initializes > readline. Readline calls setlocale(). Oops. The old pointer to > nl_langinfo(CODESET) might not be valid anymore afterwards. > > See the Linux manpage for nl_langinfo: > > RETURN VALUE > [...] > This pointer may point to static data that may be overwritten on the > next call to nl_langinfo() or setlocale(3). > > So, does anything speak against copying the codeset name into some > local storage, rather than just storing the pointer returned by > nl_langinfo? Accidentally omitted from my former mail: Since nl_langinfo is called only once here anyway, a simple auto_host_charset_name = strdup (nl_langinfo (CODESET)); would do. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 10:59 ` Corinna Vinschen @ 2010-03-03 16:26 ` Tom Tromey 0 siblings, 0 replies; 64+ messages in thread From: Tom Tromey @ 2010-03-03 16:26 UTC (permalink / raw) To: gdb-patches >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: Corinna> Accidentally omitted from my former mail: Corinna> Since nl_langinfo is called only once here anyway, a simple Corinna> auto_host_charset_name = strdup (nl_langinfo (CODESET)); Corinna> would do. Thanks for noticing this. I've added this to my patch locally. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
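The pointer-lifetime problem discussed above can be sketched as follows. This is an illustration under assumptions, not the code that went into GDB: copy_host_codeset is a hypothetical helper name, and the only point shown is that the string returned by nl_langinfo (CODESET) is duplicated immediately, so a later setlocale() or nl_langinfo() call cannot invalidate it.

```c
#define _XOPEN_SOURCE 700  /* for strdup and nl_langinfo under strict modes */

#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of the current locale's codeset name.
   The caller owns the result and must free it.  Returns NULL if the
   allocation fails.  (Hypothetical helper, not a GDB function.)  */
static char *
copy_host_codeset (void)
{
  /* nl_langinfo may return a pointer to static storage that the next
     nl_langinfo or setlocale call is allowed to overwrite, so copy the
     string right away instead of keeping the pointer.  */
  return strdup (nl_langinfo (CODESET));
}
```

Note that strdup can fail; inside GDB an allocating wrapper that aborts on failure would presumably be the more idiomatic choice, but that is outside what this thread shows.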
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-02 22:55 ` Tom Tromey 2010-03-02 22:57 ` Tom Tromey @ 2010-03-03 10:16 ` Corinna Vinschen 2010-03-03 16:25 ` Tom Tromey 2010-03-03 21:15 ` Tom Tromey 2010-03-03 17:08 ` Daniel Jacobowitz 2 siblings, 2 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 10:16 UTC (permalink / raw) To: gdb-patches On Mar 2 15:55, Tom Tromey wrote: > >>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes: > > Tom> I agree, we need a gdbarch method. > > Tom> I'll look into this today. > > Here's a patch, please let me know what you think. > > This patch nearly works -- it regresses on a Python test that looks at > gdb.parameter('target-charset') directly and then gets confused by > "auto". I think the only solution for this is to add a new Python API. > (It would be easy enough to just hack the test case somehow -- but the > libstdc++ printers use this same idiom, so a real solution is needed.) > > I wasn't sure where to put the Cygwin override; I took Daniel's > suggestion of i386_cygwin_init_abi; but Corinna's earlier patches > changed all Windows targets, not just Cygwin, and I didn't see a good > place to do that. Me neither. There seems to be no i386_windows_init_abi which would allow defining this for all Windows targets. Btw., in contrast to Cygwin, native Windows hosts don't have the nl_langinfo function. Of course, this affects the default host charset. ISO-8859-1 is not a sensible default, even if it mostly works on western language Windows systems since it's quite similar to CP1252. The most sensible default host charset for a native Windows GDB is the codepage returned by GetACP(), as Eli already pointed out for the target charset. 
Maybe something along these lines would help (untested!): --- charset.c 2010-03-03 11:09:43.000000000 +0100 +++ charset.c.new 2010-03-03 11:09:36.000000000 +0100 @@ -904,6 +904,10 @@ default_auto_wide_charset (void) return GDB_DEFAULT_TARGET_WIDE_CHARSET; } +#ifdef _WIN32 +static char w32_host_default_charset[16]; /* "CP" + x<=5 digits + paranoia. */ +#endif + void _initialize_charset (void) { @@ -927,6 +931,11 @@ _initialize_charset (void) if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) auto_host_charset_name = "ASCII"; auto_target_charset_name = auto_host_charset_name; +#elif defined (_WIN32) + snprintf (w32_host_default_charset, sizeof w32_host_default_charset, + "CP%d", GetACP()); + auto_host_charset_name = w32_host_default_charset; + auto_target_charset_name = auto_host_charset_name; #endif #endif Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 10:16 ` Corinna Vinschen @ 2010-03-03 16:25 ` Tom Tromey 2010-03-03 16:46 ` Corinna Vinschen 2010-03-03 21:15 ` Tom Tromey 1 sibling, 1 reply; 64+ messages in thread From: Tom Tromey @ 2010-03-03 16:25 UTC (permalink / raw) To: gdb-patches >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: Corinna> Btw., in contrast to Cygwin, native Windows hosts don't have Corinna> the nl_langinfo function. Of course, this affects the default Corinna> host charset. ISO-8859-1 is no sensible default, even if it Corinna> mostly works on western language Windows systems since it's Corinna> quite similar to CP1252. The most sensible default host Corinna> charset for a native Windows GDB is the codepage returned by Corinna> GetACP(), as Eli already pointed out for the target charset. Can we just check for GetACP in configure? Otherwise, should we use USE_WIN32API rather than _WIN32? It isn't clear to me which is preferred. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 16:25 ` Tom Tromey @ 2010-03-03 16:46 ` Corinna Vinschen 2010-03-05 8:55 ` Charles Wilson 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 16:46 UTC (permalink / raw) To: gdb-patches On Mar 3 09:25, Tom Tromey wrote: > >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: > > Corinna> Btw., in contrast to Cygwin, native Windows hosts don't have > Corinna> the nl_langinfo function. Of course, this affects the default > Corinna> host charset. ISO-8859-1 is no sensible default, even if it > Corinna> mostly works on western language Windows systems since it's > Corinna> quite similar to CP1252. The most sensible default host > Corinna> charset for a native Windows GDB is the codepage returned by > Corinna> GetACP(), as Eli already pointed out for the target charset. > > Can we just check for GetACP in configure? GetACP is (obviously?) also available on Cygwin through automatic linking against kernel32.dll. I would not really like the idea that GetACP gets used on Cygwin as well due to a configure test... I think that a configure test doesn't make sense here. GetACP is a well-defined API on Windows, and a nonexistent API on any other OS. It's not at all like, say, a POSIX function which may or may not exist on a platform, which would call for a configure test. > Otherwise, should we use USE_WIN32API rather than _WIN32? It isn't > clear to me which is preferred. I really don't know. I wasn't involved in porting GDB to MingW. I see USE_WIN32API mainly used in gdbserver.c and ser-tcp.c so far. Guessing from the comment in configure.ac I assume that USE_WIN32API is the way to go. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 16:46 ` Corinna Vinschen @ 2010-03-05 8:55 ` Charles Wilson 0 siblings, 0 replies; 64+ messages in thread From: Charles Wilson @ 2010-03-05 8:55 UTC (permalink / raw) To: gdb-patches Corinna Vinschen <vinschen <at> redhat.com> writes: > > Otherwise, should we use USE_WIN32API rather than _WIN32? It isn't > > clear to me which is preferred. > > I really don't know. I wasn't involved in porting GDB to MingW. > I see USE_WIN32API mainly used in gdbserver.c and ser-tcp.c so far. > Guessing from the comment in configure.ac I assume that USE_WIN32API > is the way to go. Remember that _WIN32 can be defined, even on __CYGWIN__, if <windows.h> is #included -- so _WIN32 is not, in general, a reliable indicator of "native win32"-ness. (Conversely, if you HAVEN'T included <windows.h>, then on cygwin _WIN32 is not defined -- which could be a problem if you really DO want to use a native win32 api from cygwin, but the code is using _WIN32 to determine whether to do so. 'Course, if you wanted to use a native win32 api, you really should have included <windows.h> to get the API's declaration, but that's another story). -- Chuck ^ permalink raw reply [flat|nested] 64+ messages in thread
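The ordering concern raised above (that _WIN32 may be defined on Cygwin once <windows.h> has been included) can be sketched with a small compile-time classification. The function name host_flavor and the returned strings are made up for illustration; the only point is that __CYGWIN__ is tested before _WIN32 so a Cygwin build is never mistaken for a native one. Whether GDB should key on USE_WIN32API instead is the open question in the thread and is not settled by this sketch.

```c
#include <assert.h>
#include <string.h>

/* Classify the build host from predefined macros only.  __CYGWIN__ is
   checked first because _WIN32 can also be defined on Cygwin after
   <windows.h> has been included.  (Hypothetical helper.)  */
static const char *
host_flavor (void)
{
#if defined (__CYGWIN__)
  return "cygwin";
#elif defined (_WIN32)
  return "native-windows";
#else
  return "other";
#endif
}
```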
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 10:16 ` Corinna Vinschen 2010-03-03 16:25 ` Tom Tromey @ 2010-03-03 21:15 ` Tom Tromey 2010-03-04 9:34 ` Corinna Vinschen 1 sibling, 1 reply; 64+ messages in thread From: Tom Tromey @ 2010-03-03 21:15 UTC (permalink / raw) To: gdb-patches >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: Corinna> Maybe something along these lines would help (untested!): I'm adding the appended to my local patch. This is just your patch with the new global put closer to the code, and a different preprocessor check. Let me know if this seems wrong somehow. Tom diff --git a/gdb/charset.c b/gdb/charset.c index 21c4306..9c1e7f4 100644 --- a/gdb/charset.c +++ b/gdb/charset.c @@ -930,6 +930,15 @@ _initialize_charset (void) if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) auto_host_charset_name = "ASCII"; auto_target_charset_name = auto_host_charset_name; +#elif defined (USE_WIN32API) + { + static char w32_host_default_charset[16]; /* "CP" + x<=5 digits + paranoia. */ + + snprintf (w32_host_default_charset, sizeof w32_host_default_charset, + "CP%d", GetACP()); + auto_host_charset_name = w32_host_default_charset; + auto_target_charset_name = auto_host_charset_name; + } #endif #endif ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 21:15 ` Tom Tromey @ 2010-03-04 9:34 ` Corinna Vinschen 2010-03-04 15:24 ` Christopher Faylor 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-04 9:34 UTC (permalink / raw) To: gdb-patches On Mar 3 14:15, Tom Tromey wrote: > >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: > > Corinna> Maybe something along these lines would help (untested!): > > I'm adding the appended to my local patch. > This is just your patch with the new global put closer to the code, and > a different preprocessor check. > > Let me know if this seems wrong somehow. > > Tom > > diff --git a/gdb/charset.c b/gdb/charset.c > index 21c4306..9c1e7f4 100644 > --- a/gdb/charset.c > +++ b/gdb/charset.c > @@ -930,6 +930,15 @@ _initialize_charset (void) > if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) > auto_host_charset_name = "ASCII"; > auto_target_charset_name = auto_host_charset_name; > +#elif defined (USE_WIN32API) > + { > + static char w32_host_default_charset[16]; /* "CP" + x<=5 digits + paranoia. */ > + > + snprintf (w32_host_default_charset, sizeof w32_host_default_charset, > + "CP%d", GetACP()); > + auto_host_charset_name = w32_host_default_charset; > + auto_target_charset_name = auto_host_charset_name; > + } > #endif > #endif Looks good to me. Thank you, Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-04 9:34 ` Corinna Vinschen @ 2010-03-04 15:24 ` Christopher Faylor 2010-03-04 15:38 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Christopher Faylor @ 2010-03-04 15:24 UTC (permalink / raw) To: gdb-patches On Thu, Mar 04, 2010 at 10:34:06AM +0100, Corinna Vinschen wrote: >On Mar 3 14:15, Tom Tromey wrote: >> >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: >> >> Corinna> Maybe something along these lines would help (untested!): >> >> I'm adding the appended to my local patch. >> This is just your patch with the new global put closer to the code, and >> a different preprocessor check. >> >> Let me know if this seems wrong somehow. >> >> Tom >> >> diff --git a/gdb/charset.c b/gdb/charset.c >> index 21c4306..9c1e7f4 100644 >> --- a/gdb/charset.c >> +++ b/gdb/charset.c >> @@ -930,6 +930,15 @@ _initialize_charset (void) >> if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) >> auto_host_charset_name = "ASCII"; >> auto_target_charset_name = auto_host_charset_name; >> +#elif defined (USE_WIN32API) >> + { >> + static char w32_host_default_charset[16]; /* "CP" + x<=5 digits + paranoia. */ >> + >> + snprintf (w32_host_default_charset, sizeof w32_host_default_charset, >> + "CP%d", GetACP()); >> + auto_host_charset_name = w32_host_default_charset; >> + auto_target_charset_name = auto_host_charset_name; >> + } >> #endif >> #endif > >Looks good to me. I don't know if I get to approve this but if it looks ok to Corinna, it's ok with me. I'm not too wide character set literate. cgf ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-04 15:24 ` Christopher Faylor @ 2010-03-04 15:38 ` Corinna Vinschen 0 siblings, 0 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-03-04 15:38 UTC (permalink / raw) To: gdb-patches On Mar 4 10:24, Christopher Faylor wrote: > On Thu, Mar 04, 2010 at 10:34:06AM +0100, Corinna Vinschen wrote: > >On Mar 3 14:15, Tom Tromey wrote: > >> >>>>> "Corinna" == Corinna Vinschen <vinschen@redhat.com> writes: > >> > >> Corinna> Maybe something along these lines would help (untested!): > >> > >> I'm adding the appended to my local patch. > >> This is just your patch with the new global put closer to the code, and > >> a different preprocessor check. > >> > >> Let me know if this seems wrong somehow. > >> > >> Tom > >> > >> diff --git a/gdb/charset.c b/gdb/charset.c > >> index 21c4306..9c1e7f4 100644 > >> --- a/gdb/charset.c > >> +++ b/gdb/charset.c > >> @@ -930,6 +930,15 @@ _initialize_charset (void) > >> if (!strcmp (auto_host_charset_name, "646") || !*auto_host_charset_name) > >> auto_host_charset_name = "ASCII"; > >> auto_target_charset_name = auto_host_charset_name; > >> +#elif defined (USE_WIN32API) > >> + { > >> + static char w32_host_default_charset[16]; /* "CP" + x<=5 digits + paranoia. */ > >> + > >> + snprintf (w32_host_default_charset, sizeof w32_host_default_charset, > >> + "CP%d", GetACP()); > >> + auto_host_charset_name = w32_host_default_charset; > >> + auto_target_charset_name = auto_host_charset_name; > >> + } > >> #endif > >> #endif > > > >Looks good to me. > > I don't know if I get to approve this but if it looks ok to Corinna, it's ok > with me. I'm not too wide character set literate. No worries, it only affects GDB for MingW hosts, and only if it gets built with iconv support. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
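The buffer handling in the GetACP snippet quoted above can be tried out on any host with the sketch below. It makes two assumptions that are not in the thread: the codepage is passed in as a plain parameter (on a native Windows build it would come from GetACP ()), and format_codepage_name is a hypothetical helper name. It shows why a 16-byte buffer is comfortable for "CP" plus a few digits, and how truncation can be detected from snprintf's return value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format CODEPAGE as an iconv-style name such as "CP1252" into BUF.
   Returns 0 on success, or -1 if BUF (of BUFLEN bytes) is too small
   or formatting failed.  CODEPAGE is a parameter here so that the
   sketch builds on non-Windows hosts.  (Hypothetical helper.)  */
static int
format_codepage_name (unsigned int codepage, char *buf, size_t buflen)
{
  int n = snprintf (buf, buflen, "CP%u", codepage);

  /* snprintf returns the length it wanted to write; a value that does
     not fit in BUFLEN (including the terminator) means truncation.  */
  if (n < 0 || (size_t) n >= buflen)
    return -1;
  return 0;
}
```

With a static 16-byte buffer, as in the quoted patch, the result stays valid for the lifetime of the program, which is what a long-lived charset name pointer needs.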
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-02 22:55 ` Tom Tromey 2010-03-02 22:57 ` Tom Tromey 2010-03-03 10:16 ` Corinna Vinschen @ 2010-03-03 17:08 ` Daniel Jacobowitz 2010-03-03 17:22 ` Corinna Vinschen 2010-03-03 21:08 ` Tom Tromey 2 siblings, 2 replies; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-03 17:08 UTC (permalink / raw) To: Tom Tromey; +Cc: gdb-patches On Tue, Mar 02, 2010 at 03:55:08PM -0700, Tom Tromey wrote: > This patch nearly works -- it regresses on a Python test that looks at > gdb.parameter('target-charset') directly and then gets confused by > "auto". I think the only solution for this is to add a new Python API. > (It would be easy enough to just hack the test case somehow -- but the > libstdc++ printers use this same idiom, so a real solution is needed.) I agree. This is a general problem: I had the same issue with gdb.parameter('endian'). It'd be nice to have a standard way to query the effective value of the parameter! > I wasn't sure where to put the Cygwin override; I took Daniel's > suggestion of i386_cygwin_init_abi; but Corinna's earlier patches > changed all Windows targets, not just Cygwin, and I didn't see a good > place to do that. This is an interesting mess. We have both a Windows target issue and a Windows native issue. Handling Windows has become a separate issue; I don't see any need to wait on this patch which fixes Cygwin. What I'd suggest: create an i386-windows-tdep.c. Define a new OSABI for Windows, and sniff for it. Then you have an i386_windows_init_abi and you can put things there. Unfortunately, the Cygwin OSABI sniffer is going to be a problem for this. It isn't actually sniffing for Cygwin binaries, it's sniffing for pei-i386 objects. Is there a practical way to do this right? 
If we can rely on the cygwin DLL being in the image header, we could have a single sniffer (rename it to i386_windows_osabi_sniffer, move it to i386-windows-tdep.c) that returns GDB_OSABI_WINDOWS if not linked to Cygwin and GDB_OSABI_CYGWIN if linked to Cygwin. I know it's easy to get this information out of a Windows executable. I don't know if there's an easy way to get BFD to tell it to you. It doesn't really look like there is. It's printed from objdump -p (bfd/peXXigen.c:pe_print_idata, I think). Or there may be some better way. Corinna, do you know? Is there some other marker to distinguish a Cygwin executable besides linking to the DLL? Anyway, once you've got sniffers that distinguish Windows from Cygwin binaries, the rest is easy. In the Windows implementation of auto_charset, if GetACP is available, call it. That's not 100% right, in that you could *theoretically* be debugging a Windows binary on a remote system with a different charset, but it's all the work I think we should do for a default. At this point, I think it's correct to call GetACP even for a Cygwin GDB. The GDB might be a Cygwin executable but the program being debugged might not be, and it will use the non-Cygwin Windows settings. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
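The sniffer idea above, reduced to its decision step, might look like the sketch below. Everything in it is an assumption for illustration: the enum and function names are hypothetical, the list of imported DLL names is presumed to have been extracted already (the message notes that BFD offers no obvious way to do that), and the Cygwin DLL is assumed to appear under the name "cygwin1.dll". It is not a proposal for the actual GDB sniffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for GDB_OSABI_WINDOWS and GDB_OSABI_CYGWIN.  */
enum sketch_osabi
{
  SKETCH_OSABI_WINDOWS,
  SKETCH_OSABI_CYGWIN
};

/* Case-insensitive string equality; DLL names in import tables are
   not case-sensitive on Windows.  */
static int
names_equal_nocase (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Classify a PE image from the names of the DLLs it imports.  An image
   that imports the Cygwin DLL is treated as a Cygwin binary; anything
   else is treated as a native Windows binary.  */
static enum sketch_osabi
classify_pe_imports (const char *const *dll_names, size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (names_equal_nocase (dll_names[i], "cygwin1.dll"))
      return SKETCH_OSABI_CYGWIN;
  return SKETCH_OSABI_WINDOWS;
}
```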
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:08 ` Daniel Jacobowitz @ 2010-03-03 17:22 ` Corinna Vinschen 2010-03-03 17:27 ` Corinna Vinschen 2010-03-03 17:33 ` Daniel Jacobowitz 2010-03-03 21:08 ` Tom Tromey 1 sibling, 2 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 17:22 UTC (permalink / raw) To: gdb-patches; +Cc: Tom Tromey On Mar 3 12:08, Daniel Jacobowitz wrote: > On Tue, Mar 02, 2010 at 03:55:08PM -0700, Tom Tromey wrote: > Or there may be some better way. Corinna, do you know? Is there some > other marker to distinguish a Cygwin executable besides linking to the > DLL? Not that I'm aware of, sorry. > Anyway, once you've got sniffers that distinguish Windows from Cygwin > binaries, the rest is easy. In the Windows implementation of > auto_charset, if GetACP is available, call it. That's not 100% right, > in that you could *theoretically* be debugging a Windows binary on a > remote system with a different charset, but it's all the work I think > we should do for a default. > > At this point, I think it's correct to call GetACP even for > a Cygwin GDB. The GDB might be a Cygwin executable but the program > being debugged might not be, and it will use the non-Cygwin > Windows settings. As one of the Cygwin maintainers I veto the notion to handle Cygwin as a Windows target in the first place. It's not valid to assume that Cygwin GDB is used to debug native apps and Cygwin apps are just an afterthought. And for Cygwin binaries the Windows default codepage has no meaning. The default codeset in Cygwin is UTF-8 and otherwise the same locale environment variables are used as other POSIX systems. Only the MingW GDB should default to the ANSI codepage. Cygwin should default to UTF-8. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:22 ` Corinna Vinschen @ 2010-03-03 17:27 ` Corinna Vinschen 2010-03-03 17:33 ` Daniel Jacobowitz 1 sibling, 0 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 17:27 UTC (permalink / raw) To: gdb-patches; +Cc: Tom Tromey On Mar 3 18:22, Corinna Vinschen wrote: > On Mar 3 12:08, Daniel Jacobowitz wrote: > > On Tue, Mar 02, 2010 at 03:55:08PM -0700, Tom Tromey wrote: > > Or there may be some better way. Corinna, do you know? Is there some > > other marker to distinguish a Cygwin executable besides linking to the > > DLL? > > Not that I'm aware of, sorry. > > > Anyway, once you've got sniffers that distinguish Windows from Cygwin > > binaries, the rest is easy. In the Windows implementation of > > auto_charset, if GetACP is available, call it. That's not 100% right, > > in that you could *theoretically* be debugging a Windows binary on a > > remote system with a different charset, but it's all the work I think > > we should do for a default. > > > > At this point, I think it's correct to call GetACP even for > > a Cygwin GDB. The GDB might be a Cygwin executable but the program > > being debugged might not be, and it will use the non-Cygwin > > Windows settings. > > As one of the Cygwin maintainers I veto the notion to handle Cygwin > as a Windows target in the first place. > > It's not valid to assume that Cygwin GDB is used to debug native apps > and Cygwin apps are just an afterthought. And for Cygwin binaries the > Windows default codepage has no meaning. The default codeset in Cygwin > is UTF-8 and otherwise the same locale environment variables are used as > other POSIX systems. > > Only the MingW GDB should default to the ANSI codepage. Cygwin should > default to UTF-8. ... or better, the default should be nl_langinfo (CODESET), just as it is today. 
Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
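For reference, the nl_langinfo (CODESET) default mentioned above amounts to something like the sketch below on a POSIX host. The wrapper name host_codeset is hypothetical; setlocale and nl_langinfo are the standard POSIX calls. It is not a copy of what charset.c does, only the general shape of the idea.

```c
#include <langinfo.h>
#include <locale.h>

/* Return the codeset of the locale this process was started in,
   for example "UTF-8".  The string belongs to the C library and may
   be overwritten by later locale calls.  */
const char *
host_codeset (void)
{
  /* Adopt the locale named by LC_ALL / LC_CTYPE / LANG.  Without
     this call the program stays in the "C" locale.  */
  setlocale (LC_CTYPE, "");
  return nl_langinfo (CODESET);
}
```

This reports the debugger's own environment, not the debuggee's, which is why it fits a Cygwin GDB debugging a Cygwin program but says nothing about a native Windows program. A MinGW build has no <langinfo.h>, so this sketch does not apply there as written.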
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:22 ` Corinna Vinschen 2010-03-03 17:27 ` Corinna Vinschen @ 2010-03-03 17:33 ` Daniel Jacobowitz 2010-03-03 17:42 ` Eli Zaretskii 2010-03-03 18:00 ` Corinna Vinschen 1 sibling, 2 replies; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-03 17:33 UTC (permalink / raw) To: gdb-patches, Tom Tromey On Wed, Mar 03, 2010 at 06:22:17PM +0100, Corinna Vinschen wrote: > As one of the Cygwin maintainers I veto the notion to handle Cygwin > as a Windows target in the first place. > > It's not valid to assume that Cygwin GDB is used to debug native apps > and Cygwin apps are just an afterthought. And for Cygwin binaries the > Windows default codepage has no meaning. The default codeset in Cygwin > is UTF-8 and otherwise the same locale environment variables are used as > other POSIX systems. > > Only the MingW GDB should default to the ANSI codepage. Cygwin should > default to UTF-8. I don't understand why there's so much conflict about this! I'm not trying to treat Cygwin as an afterthought. And for an implementation, it is impossible to completely ignore that Cygwin binaries run on a Windows system. The point is to detect and handle Cygwin binaries correctly. Then we can give them a Cygwin code page, no matter whether GDB was built for mingw or cygwin. Both debuggers will handle both cases automatically, robustly, and correctly. We can not decide this at configure time; that defeats the point of multi-arch, a long term goal of GDB that we've finally achieved. I don't see why we can't do it right. I can dig up some code Pedro wrote that decodes the import table to check linked DLLs. I was hoping that there was an easier way using BFD, but BFD does not want to play nice. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:33 ` Daniel Jacobowitz @ 2010-03-03 17:42 ` Eli Zaretskii 2010-03-03 18:00 ` Corinna Vinschen 1 sibling, 0 replies; 64+ messages in thread From: Eli Zaretskii @ 2010-03-03 17:42 UTC (permalink / raw) To: Daniel Jacobowitz; +Cc: gdb-patches, tromey > Date: Wed, 3 Mar 2010 12:33:26 -0500 > From: Daniel Jacobowitz <dan@codesourcery.com> > > I can dig up some code Pedro wrote that decodes the import table to > check linked DLLs. There's code to do that in Emacs, see w32proc.c:w32_executable_type. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:33 ` Daniel Jacobowitz 2010-03-03 17:42 ` Eli Zaretskii @ 2010-03-03 18:00 ` Corinna Vinschen 2010-03-03 18:05 ` Daniel Jacobowitz 1 sibling, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 18:00 UTC (permalink / raw) To: gdb-patches; +Cc: Tom Tromey On Mar 3 12:33, Daniel Jacobowitz wrote: > On Wed, Mar 03, 2010 at 06:22:17PM +0100, Corinna Vinschen wrote: > > As one of the Cygwin maintainers I veto the notion to handle Cygwin > > as a Windows target in the first place. > > > > It's not valid to assume that Cygwin GDB is used to debug native apps > > and Cygwin apps are just an afterthought. And for Cygwin binaries the > > Windows default codepage has no meaning. The default codeset in Cygwin > > is UTF-8 and otherwise the same locale environment variables are used as > > other POSIX systems. > > > > Only the MingW GDB should default to the ANSI codepage. Cygwin should > > default to UTF-8. > > I don't understand why there's so much conflict about this! > [...] > I don't see why we can't do it right. I didn't say anything against doing it right. But this: > At this point, I think it's correct to call GetACP even for > a Cygwin GDB. is *not* right. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 18:00 ` Corinna Vinschen @ 2010-03-03 18:05 ` Daniel Jacobowitz 2010-03-03 18:22 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-03 18:05 UTC (permalink / raw) To: gdb-patches, Tom Tromey On Wed, Mar 03, 2010 at 07:00:44PM +0100, Corinna Vinschen wrote: > I didn't say anything against doing it right. But this: > > > At this point, I think it's correct to call GetACP even for > > a Cygwin GDB. > > is *not* right. Can you explain it to me? This would only happen when we've established that the application we're debugging is not linked to the Cygwin DLL. Cygwin's nl_langinfo (CODESET) won't have any effect on a non-Cygwin application. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 18:05 ` Daniel Jacobowitz @ 2010-03-03 18:22 ` Corinna Vinschen 2010-03-03 19:39 ` Daniel Jacobowitz 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-03-03 18:22 UTC (permalink / raw) To: gdb-patches; +Cc: Tom Tromey On Mar 3 13:05, Daniel Jacobowitz wrote: > On Wed, Mar 03, 2010 at 07:00:44PM +0100, Corinna Vinschen wrote: > > I didn't say anything against doing it right. But this: > > > > > At this point, I think it's correct to call GetACP even for > > > a Cygwin GDB. > > > > is *not* right. > > Can you explain it to me? This would only happen when we've > established that the application we're debugging is not linked to the > Cygwin DLL. Cygwin's nl_langinfo (CODESET) won't have any effect on a > non-Cygwin application. You're right. I re-read your original mail and it seems I misunderstood what you were trying to say. What I understood was that GetACP() is the correct default value for Cygwin GDB independent of the recognition of the target binary. Apparently you didn't mean that. Sorry for the confusion. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 18:22 ` Corinna Vinschen @ 2010-03-03 19:39 ` Daniel Jacobowitz 2010-03-03 20:16 ` Eli Zaretskii 0 siblings, 1 reply; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-03 19:39 UTC (permalink / raw) To: gdb-patches On Wed, Mar 03, 2010 at 07:22:10PM +0100, Corinna Vinschen wrote: > You're right. I re-read your original mail and it seems I misunderstood > what you were trying to say. What I understood was that GetACP() is the > correct default value for Cygwin GDB independent of the recognition of > the target binary. Apparently you didn't mean that. Sorry for the > confusion. OK, whew. We're all on the same page then. I think the next thing is for Tom to install his patch, which will strictly improve the situation. The issue of making GDB_OSABI_CYGWIN really mean Cygwin, as opposed to Windows, will be tricky - we've agreed on exactly how to do it, someone just has to convert the image search code to work with BFD somehow. Or fread from the binary if that proves too frustrating :-) -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 19:39 ` Daniel Jacobowitz @ 2010-03-03 20:16 ` Eli Zaretskii 2010-03-03 20:26 ` Daniel Jacobowitz 0 siblings, 1 reply; 64+ messages in thread From: Eli Zaretskii @ 2010-03-03 20:16 UTC (permalink / raw) To: Daniel Jacobowitz; +Cc: gdb-patches > Date: Wed, 3 Mar 2010 14:39:37 -0500 > From: Daniel Jacobowitz <dan@codesourcery.com> > > The issue of making GDB_OSABI_CYGWIN really mean Cygwin, as opposed > to Windows, will be tricky - we've agreed on exactly how to do it, > someone just has to convert the image search code to work with BFD > somehow. Or fread from the binary if that proves too frustrating > :-) It may be easier to have a simple emulation of nl_langinfo for MinGW. Then both Cygwin and MinGW will call nl_langinfo, and each one will get the value it wants. Would that solve the problem? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 20:16 ` Eli Zaretskii @ 2010-03-03 20:26 ` Daniel Jacobowitz 0 siblings, 0 replies; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-03 20:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb-patches On Wed, Mar 03, 2010 at 10:16:58PM +0200, Eli Zaretskii wrote: > It may be easier to have a simple emulation of nl_langinfo for MinGW. > Then both Cygwin and MinGW will call nl_langinfo, and each one will > get the value it wants. Would that solve the problem? No, not quite - that has the problem I was just discussing with Corinna. The ideal (and achievable) choice for a Cygwin debugger running a non-Cygwin program is the system codeset. This is a common scenario, at least for me; I do it all the time. I can't promise when I'll get to it, not with a release coming up at work, but if no one else does, I'll try to; this would be useful to me. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-03 17:08 ` Daniel Jacobowitz 2010-03-03 17:22 ` Corinna Vinschen @ 2010-03-03 21:08 ` Tom Tromey 1 sibling, 0 replies; 64+ messages in thread From: Tom Tromey @ 2010-03-03 21:08 UTC (permalink / raw) To: gdb-patches >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes: >> This patch nearly works -- it regresses on a Python test that looks at >> gdb.parameter('target-charset') directly and then gets confused by >> "auto". I think the only solution for this is to add a new Python API. >> (It would be easy enough to just hack the test case somehow -- but the >> libstdc++ printers use this same idiom, so a real solution is needed.) Daniel> I agree. This is a general problem: I had the same issue with Daniel> gdb.parameter('endian'). It'd be nice to have a standard way to query Daniel> the effective value of the parameter! Yes, I agree. I don't know of a generic way to do that in gdb, so I suppose we would need a new method on set/show commands and then a bunch of new implementations all over. Daniel> This is an interesting mess. We have both a Windows target issue and Daniel> a Windows native issue. Handling Windows has become a separate issue; Daniel> I don't see any need to wait on this patch which fixes Cygwin. Ok, thanks. I will finish the patch soon. Daniel> What I'd suggest: create an i386-windows-tdep.c. Define a new OSABI Daniel> for Windows, and sniff for it. Then you have an i386_windows_init_abi Daniel> and you can put things there. Someone else will have to do these parts. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 22:27 ` Daniel Jacobowitz 2010-03-01 10:31 ` Corinna Vinschen @ 2010-03-01 17:12 ` Tom Tromey 2010-03-01 17:21 ` Daniel Jacobowitz 1 sibling, 1 reply; 64+ messages in thread From: Tom Tromey @ 2010-03-01 17:12 UTC (permalink / raw) To: gdb-patches >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes: Daniel> We're talking past each other. Compare show_host_charset_name Daniel> (which has an auto setting) to show_target_charset_name (which does Daniel> not). We actually set the default target charset name based on gdb's own environment. See charset.c:_initialize_charset. This is not an excellent default, but it does the right thing in the common case of "gdb foo; run". Daniel> If the default becomes dependent on the target, we need to distinguish Daniel> "user specified iso-8859-1" or "user didn't say anything, but now Daniel> we're debugging i686-mingw32, and that usually uses cp1252". I think the ideal would be to extract this information from the inferior. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-01 17:12 ` Tom Tromey @ 2010-03-01 17:21 ` Daniel Jacobowitz 2010-03-01 17:28 ` Tom Tromey 0 siblings, 1 reply; 64+ messages in thread From: Daniel Jacobowitz @ 2010-03-01 17:21 UTC (permalink / raw) To: Tom Tromey; +Cc: gdb-patches On Mon, Mar 01, 2010 at 10:12:00AM -0700, Tom Tromey wrote: > Daniel> If the default becomes dependent on the target, we need to distinguish > Daniel> "user specified iso-8859-1" or "user didn't say anything, but now > Daniel> we're debugging i686-mingw32, and that usually uses cp1252". > > I think the ideal would be to extract this information from the > inferior. I'm not sure I understand what you're suggesting... extract it how? -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-01 17:21 ` Daniel Jacobowitz @ 2010-03-01 17:28 ` Tom Tromey 2010-03-01 17:50 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Tom Tromey @ 2010-03-01 17:28 UTC (permalink / raw) To: gdb-patches >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes: Daniel> On Mon, Mar 01, 2010 at 10:12:00AM -0700, Tom Tromey wrote: Daniel> If the default becomes dependent on the target, we need to distinguish Daniel> "user specified iso-8859-1" or "user didn't say anything, but now Daniel> we're debugging i686-mingw32, and that usually uses cp1252". >> I think the ideal would be to extract this information from the >> inferior. Daniel> I'm not sure I understand what you're suggesting... extract it how? I don't know :-) The only ways I can think of seem pretty fragile -- e.g., for POSIXy systems, extract information from the inferior environment and reproduce the C library logic. FWIW I think target-charset and target-wide-charset should be per-inferior settings, like the environment and arguments. I haven't looked into how to do that, though. I'm also not sure how that would interact with an "auto" setting. Tom ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-03-01 17:28 ` Tom Tromey @ 2010-03-01 17:50 ` Corinna Vinschen 0 siblings, 0 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-03-01 17:50 UTC (permalink / raw) To: gdb-patches On Mar 1 10:27, Tom Tromey wrote: > >>>>> "Daniel" == Daniel Jacobowitz <dan@codesourcery.com> writes: > > Daniel> On Mon, Mar 01, 2010 at 10:12:00AM -0700, Tom Tromey wrote: > Daniel> If the default becomes dependent on the target, we need to distinguish > Daniel> "user specified iso-8859-1" or "user didn't say anything, but now > Daniel> we're debugging i686-mingw32, and that usually uses cp1252". > > >> I think the ideal would be to extract this information from the > >> inferior. > > Daniel> I'm not sure I understand what you're suggesting... extract it how? > > I don't know :-) > > The only ways I can think of seem pretty fragile -- e.g., for POSIXy > systems, extract information from the inferior environment and reproduce > the C library logic. > > FWIW I think target-charset and target-wide-charset should be > per-inferior settings, like the environment and arguments. I haven't > looked into how to do that, though. I'm also not sure how that would > interact with an "auto" setting. I can't see any way to do that for an arbitrary target. Just Windows alone would be enough to get headaches. The default multibyte charset depends on the target application being a native Win32 application, or a Cygwin application. In the first case, the charset is the default ANSI codepage, in the latter case it's UTF-8, or the charset determined by the LC_ALL/LC_CTYPE/LANG environment variables. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
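The POSIX-style half of the rule in this message (LC_ALL, then LC_CTYPE, then LANG, with UTF-8 as the Cygwin fallback) can be sketched as below. The function names and the explicit fallback parameter are hypothetical. A real C library does considerably more (locale aliases, language-dependent defaults when no codeset is given), which is part of why reproducing this logic inside the debugger was called fragile earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the locale string that governs LC_CTYPE, using the usual POSIX
   precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.  Empty
   values count as unset.  Returns NULL if none is set.  */
const char *
pick_ctype_locale (const char *lc_all, const char *lc_ctype,
                   const char *lang)
{
  if (lc_all != NULL && *lc_all != '\0')
    return lc_all;
  if (lc_ctype != NULL && *lc_ctype != '\0')
    return lc_ctype;
  if (lang != NULL && *lang != '\0')
    return lang;
  return NULL;
}

/* Extract the codeset from "language_TERRITORY.codeset@modifier".
   Returns FALLBACK when LOCALE is NULL, names no codeset, or the
   codeset does not fit in BUF.  */
const char *
locale_codeset (const char *locale, const char *fallback,
                char *buf, size_t size)
{
  const char *start, *end;
  size_t len;

  assert (buf != NULL && size > 0);
  if (locale == NULL)
    return fallback;
  start = strchr (locale, '.');
  if (start == NULL)
    return fallback;
  start++;
  end = strchr (start, '@');
  len = (end != NULL) ? (size_t) (end - start) : strlen (start);
  if (len == 0 || len >= size)
    return fallback;
  memcpy (buf, start, len);
  buf[len] = '\0';
  return buf;
}
```

For a Cygwin target the three arguments would come from the inferior's environment and the fallback would be "UTF-8". For a native Win32 target none of this applies and the ANSI codepage is the relevant value instead.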
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 13:05 [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds Corinna Vinschen 2010-02-28 14:29 ` Daniel Jacobowitz @ 2010-02-28 16:56 ` Eli Zaretskii 2010-02-28 17:05 ` Corinna Vinschen 1 sibling, 1 reply; 64+ messages in thread From: Eli Zaretskii @ 2010-02-28 16:56 UTC (permalink / raw) To: gdb-patches > Date: Sun, 28 Feb 2010 14:05:00 +0100 > From: Corinna Vinschen <vinschen@redhat.com> > > +/* The default output charset for Cygwin is UTF-8. The default output > + charset for Win32 is the default ANSI codepage of the system, which > + depends on the localization of the underlying Windows system. However, > + CP1252 is a good default replacement for ISO-8859-1 at least. */ > +#if defined (__CYGWIN__) > +#define GDB_DEFAULT_TARGET_CHARSET "UTF-8" > +#elif defined (_WIN32) > +#define GDB_DEFAULT_TARGET_CHARSET "CP1252" > +#endif Why cp1252? why not detect the ANSI codepage at run time, and make more non-Latin users happy? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 16:56 ` Eli Zaretskii @ 2010-02-28 17:05 ` Corinna Vinschen 2010-02-28 17:45 ` Eli Zaretskii 0 siblings, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-02-28 17:05 UTC (permalink / raw) To: gdb-patches On Feb 28 18:56, Eli Zaretskii wrote: > > Date: Sun, 28 Feb 2010 14:05:00 +0100 > > From: Corinna Vinschen <vinschen@redhat.com> > > > > +/* The default output charset for Cygwin is UTF-8. The default output > > + charset for Win32 is the default ANSI codepage of the system, which > > + depends on the localization of the underlying Windows system. However, > > + CP1252 is a good default replacement for ISO-8859-1 at least. */ > > +#if defined (__CYGWIN__) > > +#define GDB_DEFAULT_TARGET_CHARSET "UTF-8" > > +#elif defined (_WIN32) > > +#define GDB_DEFAULT_TARGET_CHARSET "CP1252" > > +#endif > > Why cp1252? why not detect the ANSI codepage at run time, and make > more non-Latin users happy? How? Is there somewhere a function which converts a Windows codepage number into a iconv compatible codeset string? Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds
2010-02-28 17:05 ` Corinna Vinschen
@ 2010-02-28 17:45 ` Eli Zaretskii
2010-02-28 17:52 ` Corinna Vinschen
0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2010-02-28 17:45 UTC (permalink / raw)
To: gdb-patches
> Date: Sun, 28 Feb 2010 18:05:23 +0100
> From: Corinna Vinschen <vinschen@redhat.com>
>
> > > +#define GDB_DEFAULT_TARGET_CHARSET "CP1252"
> > > +#endif
> >
> > Why cp1252? why not detect the ANSI codepage at run time, and make
> > more non-Latin users happy?
>
> How? Is there somewhere a function which converts a Windows codepage
> number into a iconv compatible codeset string?
Sorry, I'm not following: last time I looked, iconv supported cpNNNN
codepages out of the box. This is from an Ubuntu GNU/Linux system:
$ iconv --version
iconv (GNU libiconv 1.11)
$ iconv --list | egrep "^CP"
CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 L1 LATIN1 CSISOLATIN1
CP1250 MS-EE WINDOWS-1250
CP1251 MS-CYRL WINDOWS-1251
CP1252 MS-ANSI WINDOWS-1252
CP1253 MS-GREEK WINDOWS-1253
CP1254 MS-TURK WINDOWS-1254
CP1255 MS-HEBR WINDOWS-1255
CP1256 MS-ARAB WINDOWS-1256
CP1257 WINBALTRIM WINDOWS-1257
CP1258 WINDOWS-1258
CP154 CYRILLIC-ASIAN PT154 PTCP154 CSPTCP154
CP1133 IBM-CP1133
CP874 WINDOWS-874
CP932
CP936 MS936 WINDOWS-936
CP950
CP949 UHC
CP1361 JOHAB
What am I missing?
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 17:45 ` Eli Zaretskii @ 2010-02-28 17:52 ` Corinna Vinschen 2010-02-28 18:11 ` Corinna Vinschen 2010-02-28 18:59 ` Eli Zaretskii 0 siblings, 2 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-02-28 17:52 UTC (permalink / raw) To: gdb-patches On Feb 28 19:45, Eli Zaretskii wrote: > > Date: Sun, 28 Feb 2010 18:05:23 +0100 > > From: Corinna Vinschen <vinschen@redhat.com> > > > > > > +#define GDB_DEFAULT_TARGET_CHARSET "CP1252" > > > > +#endif > > > > > > Why cp1252? why not detect the ANSI codepage at run time, and make > > > more non-Latin users happy? > > > > How? Is there somewhere a function which converts a Windows codepage > > number into a iconv compatible codeset string? > > Sorry, I'm not following: last time I looked, iconv supported cpNNNN > codepages out of the box. This is from an Ubuntu GNU/Linux system: > > $ iconv --version > iconv (GNU libiconv 1.11) > $ iconv --list | egrep "^CP" > CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 L1 LATIN1 CSISOLATIN1 > CP1250 MS-EE WINDOWS-1250 > CP1251 MS-CYRL WINDOWS-1251 > CP1252 MS-ANSI WINDOWS-1252 > CP1253 MS-GREEK WINDOWS-1253 > CP1254 MS-TURK WINDOWS-1254 > CP1255 MS-HEBR WINDOWS-1255 > CP1256 MS-ARAB WINDOWS-1256 > CP1257 WINBALTRIM WINDOWS-1257 > CP1258 WINDOWS-1258 > CP154 CYRILLIC-ASIAN PT154 PTCP154 CSPTCP154 > CP1133 IBM-CP1133 > CP874 WINDOWS-874 > CP932 > CP936 MS936 WINDOWS-936 > CP950 > CP949 UHC > CP1361 JOHAB > > What am I missing? Windows has a function GetACP() which returns the current ANSI codepage. iconv wants a string which determines the source and target codesets. I don't know any function in Windows which converts a codepage number into an iconv-compatible codeset name. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 17:52 ` Corinna Vinschen @ 2010-02-28 18:11 ` Corinna Vinschen 2010-02-28 18:59 ` Eli Zaretskii 2010-02-28 18:59 ` Eli Zaretskii 1 sibling, 1 reply; 64+ messages in thread From: Corinna Vinschen @ 2010-02-28 18:11 UTC (permalink / raw) To: gdb-patches On Feb 28 18:52, Corinna Vinschen wrote: > On Feb 28 19:45, Eli Zaretskii wrote: > > > Date: Sun, 28 Feb 2010 18:05:23 +0100 > > > From: Corinna Vinschen <vinschen@redhat.com> > > > > > > > > +#define GDB_DEFAULT_TARGET_CHARSET "CP1252" > > > > > +#endif > > > > > > > > Why cp1252? why not detect the ANSI codepage at run time, and make > > > > more non-Latin users happy? > > > > > > How? Is there somewhere a function which converts a Windows codepage > > > number into a iconv compatible codeset string? > > > > Sorry, I'm not following: last time I looked, iconv supported cpNNNN > > codepages out of the box. This is from an Ubuntu GNU/Linux system: > > > > $ iconv --version > > iconv (GNU libiconv 1.11) > > $ iconv --list | egrep "^CP" > > CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 L1 LATIN1 CSISOLATIN1 > > CP1250 MS-EE WINDOWS-1250 > > CP1251 MS-CYRL WINDOWS-1251 > > CP1252 MS-ANSI WINDOWS-1252 > > CP1253 MS-GREEK WINDOWS-1253 > > CP1254 MS-TURK WINDOWS-1254 > > CP1255 MS-HEBR WINDOWS-1255 > > CP1256 MS-ARAB WINDOWS-1256 > > CP1257 WINBALTRIM WINDOWS-1257 > > CP1258 WINDOWS-1258 > > CP154 CYRILLIC-ASIAN PT154 PTCP154 CSPTCP154 > > CP1133 IBM-CP1133 > > CP874 WINDOWS-874 > > CP932 > > CP936 MS936 WINDOWS-936 > > CP950 > > CP949 UHC > > CP1361 JOHAB > > > > What am I missing? > > Windows has a function GetACP() which returns the current ANSI > codepage. iconv wants a string which determines the source and > target codesets. I don't know any function in Windows which > converts a codepage number into an iconv-compatible codeset name. Erm. Looks I wasn't thinking at all. 
Since every codepage seems to be simply available as "CPxxx", it can be
easily constructed, apparently. So this:
#define GDB_DEFAULT_TARGET_CHARSET get_cp()
char *
get_cp ()
{
  char buf[32];
  snprintf (buf, 32, "CP%u", GetACP ());
}
should do it, right?
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
^ permalink raw reply [flat|nested] 64+ messages in thread
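As posted, get_cp () formats into a local buffer and never returns it, so a caller would get an indeterminate pointer. A minimal corrected sketch is below. The helper codepage_to_charset is a hypothetical name introduced here so the formatting can be exercised without Windows; GetACP is the real Win32 call and is only compiled under _WIN32. This addresses the buffer lifetime and the missing return only. Whether a function call can be used everywhere GDB_DEFAULT_TARGET_CHARSET is expanded is a separate question that this sketch does not settle.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: format a Windows codepage number as an
   iconv-style "CPnnn" name in BUF.  Returns BUF, or NULL if the name
   did not fit or formatting failed.  */
const char *
codepage_to_charset (unsigned int cp, char *buf, size_t size)
{
  int n;

  assert (buf != NULL && size > 0);
  n = snprintf (buf, size, "CP%u", cp);
  if (n < 0 || (size_t) n >= size)
    return NULL;
  return buf;
}

#ifdef _WIN32
/* Name of the system's default ANSI codepage.  The result points to
   static storage, so it stays valid after the function returns.  */
const char *
get_cp (void)
{
  static char buf[32];

  return codepage_to_charset (GetACP (), buf, sizeof buf);
}
#endif
```

Using static storage keeps the char-pointer-returning shape of the original snippet; a caller that needs the value across later calls would copy it.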
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 18:11 ` Corinna Vinschen @ 2010-02-28 18:59 ` Eli Zaretskii 2010-02-28 19:23 ` Corinna Vinschen 0 siblings, 1 reply; 64+ messages in thread From: Eli Zaretskii @ 2010-02-28 18:59 UTC (permalink / raw) To: gdb-patches > Date: Sun, 28 Feb 2010 19:11:05 +0100 > From: Corinna Vinschen <vinschen@redhat.com> > > #define GDB_DEFAULT_TARGET_CHARSET get_cp() > > char * > get_cp () > { > char buf[32]; > > snprintf (buf, 32, "CP%u", GetACP ()); > } > > should do it, right? Yes, I think so. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 18:59 ` Eli Zaretskii @ 2010-02-28 19:23 ` Corinna Vinschen 0 siblings, 0 replies; 64+ messages in thread From: Corinna Vinschen @ 2010-02-28 19:23 UTC (permalink / raw) To: gdb-patches On Feb 28 20:59, Eli Zaretskii wrote: > > Date: Sun, 28 Feb 2010 19:11:05 +0100 > > From: Corinna Vinschen <vinschen@redhat.com> > > > > #define GDB_DEFAULT_TARGET_CHARSET get_cp() > > > > char * > > get_cp () > > { > > char buf[32]; > > > > snprintf (buf, 32, "CP%u", GetACP ()); > > } > > > > should do it, right? > > Yes, I think so. Apparently not, see http://sourceware.org/ml/gdb-patches/2010-02/msg00694.html Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [RFA] defs.h: Define GDB_DEFAULT_TARGET_[WIDE_]CHARSET for Cygwin and MingW builds 2010-02-28 17:52 ` Corinna Vinschen 2010-02-28 18:11 ` Corinna Vinschen @ 2010-02-28 18:59 ` Eli Zaretskii 1 sibling, 0 replies; 64+ messages in thread From: Eli Zaretskii @ 2010-02-28 18:59 UTC (permalink / raw) To: gdb-patches > Date: Sun, 28 Feb 2010 18:52:39 +0100 > From: Corinna Vinschen <vinschen@redhat.com> > > Windows has a function GetACP() which returns the current ANSI > codepage. iconv wants a string which determines the source and > target codesets. I don't know any function in Windows which > converts a codepage number into an iconv-compatible codeset name. How about `sprintf (str, "CP%u", GetACP ())' ? (I'm probably still missing something.) ^ permalink raw reply [flat|nested] 64+ messages in thread