* Iconv / Solaris
@ 2009-08-27 2:22 Daniel Jacobowitz
  2009-08-27 17:09 ` Tom Tromey
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2009-08-27 2:22 UTC
To: Tom Tromey, gdb-patches

Hi Tom,

Just confirming what I said on IRC: __STDC_ISO_10646__ is not defined
anywhere, but with __sun__ your patch from here:

  http://sourceware.org/ml/gdb-patches/2009-07/msg00434.html

fixes an otherwise broken GDB on Solaris. I think UCS-4 is right, but
the Solaris documentation isn't clear; maybe it would be clear if I
understood wide characters better.

Actually, poking around the internet, it looks like wchar_t's contents
may be locale-dependent...

--
Daniel Jacobowitz
CodeSourcery

* Re: Iconv / Solaris
  2009-08-27 2:22 Iconv / Solaris Daniel Jacobowitz
@ 2009-08-27 17:09 ` Tom Tromey
  2009-08-27 17:45 ` Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2009-08-27 17:09 UTC
To: gdb-patches

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> Just confirming what I said on IRC: __STDC_ISO_10646__ is not defined
Daniel> anywhere, but with __sun__ your patch from here:
Daniel> http://sourceware.org/ml/gdb-patches/2009-07/msg00434.html
Daniel> fixes an otherwise broken GDB on Solaris. I think UCS-4 is right, but
Daniel> the Solaris documentation isn't clear; maybe it would be clear if I
Daniel> understood wide characters better.

Daniel> Actually, poking around the internet, it looks like wchar_t's contents
Daniel> may be locale-dependent...

Yeah :-(

I did a little digging on various sun.com sites, and all I could find
is that wchar_t will be UCS-4 if the current locale is using UTF-8.

That is not a very strong guarantee -- it means that GDB might work for
some users but not others, if there are locales where wchar_t is not
UCS-4.

The very safest thing to do is disable use of wchar_t on a platform
like this. E.g., the appended hack can be used to try it. This will
provide a user experience similar to GDB 6.8.

Alternatively, you could try using the __sun__ variant and running gdb
in a non-UTF-8 locale. If it works we could go with (a variant of) this
approach.

Finally, I looked at libiconv-1.13. It also supports converting to the
"wchar_t" encoding using mbrtowc, if that is available. I guess it
must perform an intermediate conversion. Maybe using libiconv on
Solaris would provide a better experience for your users.

Tom

*** gdb_wchar.h.~1.2.~ 2009-04-15 15:59:05.000000000 -0600
--- gdb_wchar.h 2009-08-27 10:31:01.000000000 -0600
***************
*** 47,53 ****

  /* We use "btowc" as a sentinel to detect functioning wchar_t
     support. */
! #if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC)

  #include <wchar.h>
  #include <wctype.h>
--- 47,53 ----

  /* We use "btowc" as a sentinel to detect functioning wchar_t
     support. */
! #if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC) && !defined (__sun__)

  #include <wchar.h>
  #include <wctype.h>

* Re: Iconv / Solaris
  2009-08-27 17:09 ` Tom Tromey
@ 2009-08-27 17:45 ` Daniel Jacobowitz
  2009-08-27 20:36 ` Tom Tromey
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2009-08-27 17:45 UTC
To: gdb-patches

On Thu, Aug 27, 2009 at 11:00:41AM -0600, Tom Tromey wrote:
> Alternatively, you could try using the __sun__ variant and running gdb
> in a non-UTF-8 locale. If it works we could go with (a variant of) this
> approach.

What do we look for? That is, how would I know if it was working or
not? I can easily try an ISO-8859-1 locale, but otherwise I'm a bit
out of my depth.

--
Daniel Jacobowitz
CodeSourcery

* Re: Iconv / Solaris
  2009-08-27 17:45 ` Daniel Jacobowitz
@ 2009-08-27 20:36 ` Tom Tromey
  2009-08-27 20:38 ` Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2009-08-27 20:36 UTC
To: gdb-patches

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> On Thu, Aug 27, 2009 at 11:00:41AM -0600, Tom Tromey wrote:
>> Alternatively, you could try using the __sun__ variant and running gdb
>> in a non-UTF-8 locale. If it works we could go with (a variant of) this
>> approach.

Daniel> What do we look for? That is, how would I know if it was working or
Daniel> not? I can easily try an ISO-8859-1 locale, but otherwise I'm a bit
Daniel> out of my depth.

Hmm, good question.

For ISO-8859-1, it is tricky, because that is a subset of UCS-4.

I think you could do a test in other ISO-8859 locales: take a narrow
character not appearing in ISO-8859-1, convert it to a wchar_t using
btowc, and then print the value. If the value is the same as the
UCS-4 value, you probably have UCS-4 wchar_t.

E.g., in ISO-8859-15, 0xA4 is the euro currency sign. In UCS-4 this
is 0x20AC.

The cases I was more concerned about were locales using encodings like
SJIS or EUC. I'm not sure what wchar_t encoding these might use.

So, I dug through the OpenSolaris source a little and I think UCS-4 is
not always used. In particular it looks like mbtowc can call:

  http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libbc/libc/gen/common/euc.multibyte.c#_mbtowc_euc

... which looks like it uses an ad hoc flattened EUC encoding.

The initial problem here is that iconv will not accept "wchar_t" as an
encoding on this platform. I see we only have one AC_TRY_RUN in gdb
... am I right in assuming that these are not ok?

If they are ok, we can test this at configure time.

If they are not ok, I think we can just add a new setting to
configure.host. This is simpler to implement.

Tom

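The btowc check described in the message above could be tried with a
small C program along these lines. This is only a sketch: the helper
name wide_value_of_byte and the STANDALONE_DEMO guard are made up for
illustration and are not part of GDB, and the locale name to pass on
the command line (some ISO-8859-15 locale) depends on what the host has
installed. If the value printed for byte 0xA4 is 0x20AC, wchar_t
probably holds UCS-4 in that locale; any other value suggests a
different wide encoding.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not part of GDB): switch LC_CTYPE to LOCALE_NAME
   and return the wchar_t value that btowc yields for BYTE.  Returns -1
   if the locale cannot be selected or if BYTE is not a complete
   single-byte character in that locale.  */
long
wide_value_of_byte (const char *locale_name, unsigned char byte)
{
  wint_t w;

  if (setlocale (LC_CTYPE, locale_name) == NULL)
    return -1;

  w = btowc (byte);
  if (w == WEOF)
    return -1;

  return (long) w;
}

#ifdef STANDALONE_DEMO
/* Usage: ./a.out LOCALE_NAME
   With no argument, the locale is taken from the environment.  */
int
main (int argc, char **argv)
{
  const char *name = argc > 1 ? argv[1] : "";
  long value = wide_value_of_byte (name, 0xA4);

  if (value < 0)
    printf ("locale unavailable, or 0xA4 is not a character in it\n");
  else
    printf ("btowc (0xA4) = 0x%lX\n", (unsigned long) value);
  return 0;
}
#endif
```

Build with -DSTANDALONE_DEMO to get the command-line driver. Running it
in an ISO-8859-1 locale is not informative, for the reason given in the
message: that charset already coincides with the low UCS-4 code points.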
* Re: Iconv / Solaris
  2009-08-27 20:36 ` Tom Tromey
@ 2009-08-27 20:38 ` Daniel Jacobowitz
  2009-08-27 20:50 ` Daniel Jacobowitz
  2009-08-28 1:24 ` Tom Tromey
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Jacobowitz @ 2009-08-27 20:38 UTC
To: gdb-patches

On Thu, Aug 27, 2009 at 02:30:30PM -0600, Tom Tromey wrote:
> The initial problem here is that iconv will not accept "wchar_t" as an
> encoding on this platform. I see we only have one AC_TRY_RUN in gdb
> ... am I right in assuming that these are not ok?

They are not OK. Please don't add another if you can avoid it. I
think the one that's there is for long long printf? I used to have
to override the cache variable... haven't checked lately.

> If they are not ok, I think we can just add a new setting to
> configure.host. This is simpler to implement.

I'm not sure how to do this without hardcoding it by platform. If the
user has external libiconv, do we still want a change?

--
Daniel Jacobowitz
CodeSourcery

* Re: Iconv / Solaris
  2009-08-27 20:38 ` Daniel Jacobowitz
@ 2009-08-27 20:50 ` Daniel Jacobowitz
  2009-08-27 22:03 ` Daniel Jacobowitz
  2009-08-28 1:01 ` Tom Tromey
  2009-08-28 1:24 ` Tom Tromey
  1 sibling, 2 replies; 10+ messages in thread
From: Daniel Jacobowitz @ 2009-08-27 20:50 UTC
To: gdb-patches

On Thu, Aug 27, 2009 at 04:36:09PM -0400, Daniel Jacobowitz wrote:
> > If they are not ok, I think we can just add a new setting to
> > configure.host. This is simpler to implement.
>
> I'm not sure how to do this without hardcoding it by platform. If the
> user has external libiconv, do we still want a change?

Maybe I'm thinking about this wrong... can we determine the encoding
of wchar_t somehow that works on Solaris? Something like what we do
now with nl_langinfo? Or is it not guaranteed to have any known
encoding?

I'm lost in the configure maze, but if we don't define PHONY_ICONV,
then INTERMEDIATE_ENCODING ought to be host_charset anyway. So the
fact that your patch made a difference implies that PHONY_ICONV is
defined. So what's failing? Isn't it our *dummy* iconv_open?

--
Daniel Jacobowitz
CodeSourcery

* Re: Iconv / Solaris
  2009-08-27 20:50 ` Daniel Jacobowitz
@ 2009-08-27 22:03 ` Daniel Jacobowitz
  2009-08-28 1:01 ` Tom Tromey
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel Jacobowitz @ 2009-08-27 22:03 UTC
To: gdb-patches

On Thu, Aug 27, 2009 at 04:42:58PM -0400, Daniel Jacobowitz wrote:
> Maybe I'm thinking about this wrong... can we determine the encoding
> of wchar_t somehow that works on Solaris? Something like what we do
> now with nl_langinfo? Or is it not guaranteed to have any known
> encoding?
>
> I'm lost in the configure maze, but if we don't define PHONY_ICONV,
> then INTERMEDIATE_ENCODING ought to be host_charset anyway. So the
> fact that your patch made a difference implies that PHONY_ICONV is
> defined. So what's failing? Isn't it our *dummy* iconv_open?

No, HAVE_ICONV is defined. So how does changing the default
definition of INTERMEDIATE_ENCODING make a difference? All of
HAVE_ICONV, HAVE_WCHAR_H, HAVE_BTOWC are defined.

Oh. We use host_charset if gdb_wchar_t is char. I misread the
#if's... I don't see how you use iconv and wchar_t together,
otherwise. Are you just not supposed to?

--
Daniel Jacobowitz
CodeSourcery

* Re: Iconv / Solaris
  2009-08-27 20:50 ` Daniel Jacobowitz
  2009-08-27 22:03 ` Daniel Jacobowitz
@ 2009-08-28 1:01 ` Tom Tromey
  1 sibling, 0 replies; 10+ messages in thread
From: Tom Tromey @ 2009-08-28 1:01 UTC
To: gdb-patches

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> Maybe I'm thinking about this wrong... can we determine the
Daniel> encoding of wchar_t somehow that works on Solaris? Something
Daniel> like what we do now with nl_langinfo?

Nope.

For the "narrow" charset you can use nl_langinfo(CODESET).
There is no equivalent for the wide charset :-(

Many systems let you pass "wchar_t" to iconv_open, instead.
However, Solaris doesn't.

Daniel> Or is it not guaranteed to have any known encoding?

If the system defines __STDC_ISO_10646__, then you know it uses Unicode.
Otherwise, all bets are off.

Daniel> I'm lost in the configure maze

Yeah. The way it works:

* If you have a fully working iconv + wchar_t suite, then:
  - All the gdb_* macros are defined as their wide counterparts,
    e.g. gdb_iswprint == iswprint
  The intermediate charset is wchar_t.

* You might have iconv but no working wchar_t support. In this case
  we use the narrow forms for everything, e.g., gdb_iswprint == isprint
  (and gdb_wchar_t == char). However we still use iconv for recoding.
  The intermediate charset is host_charset.

  This is the scenario I propose we make Solaris use, preferably by
  using AC_TRY_RUN to test iconv_open.

  The impact on the user is that if he tries to print a string with
  non-host-charset characters, he will get escapes -- basically what
  GDB 6.8 does.

* You might have nothing at all. This is the PHONY_ICONV case.
  In this scenario we use the narrow forms for everything and basically
  just fail unless host_charset == target_charset.

Tom

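Two of the facts in the message above can be checked directly on a
given host: whether iconv_open accepts "wchar_t" as an encoding name,
and whether the compiler defines __STDC_ISO_10646__. A minimal sketch
in C follows. The function names and the STANDALONE_DEMO guard are
hypothetical and are not GDB code, "UTF-8" as the source charset is an
arbitrary choice, and some systems need -liconv at link time.

```c
#include <iconv.h>
#include <stdio.h>

/* Hypothetical probe: return 1 if iconv_open accepts "wchar_t" as the
   target encoding when converting from FROM_CHARSET, 0 otherwise.  */
int
iconv_accepts_wchar_t (const char *from_charset)
{
  iconv_t cd = iconv_open ("wchar_t", from_charset);

  if (cd == (iconv_t) -1)
    return 0;

  iconv_close (cd);
  return 1;
}

/* Return 1 if the implementation promises that wchar_t values are
   ISO 10646 (Unicode) code points, 0 if it makes no such promise.  */
int
wchar_t_is_known_unicode (void)
{
#ifdef __STDC_ISO_10646__
  return 1;
#else
  return 0;
#endif
}

#ifdef STANDALONE_DEMO
int
main (void)
{
  printf ("iconv_open (\"wchar_t\", \"UTF-8\"): %s\n",
          iconv_accepts_wchar_t ("UTF-8") ? "accepted" : "rejected");
  printf ("__STDC_ISO_10646__: %s\n",
          wchar_t_is_known_unicode () ? "defined" : "not defined");
  return 0;
}
#endif
```

Going by the thread, a Solaris host of that era would be expected to
report "rejected" and "not defined". A run-time probe like this is
essentially what an AC_TRY_RUN test would do, which is also why it does
not help when cross-compiling.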
* Re: Iconv / Solaris
  2009-08-27 20:38 ` Daniel Jacobowitz
  2009-08-27 20:50 ` Daniel Jacobowitz
@ 2009-08-28 1:24 ` Tom Tromey
  2009-08-28 17:00 ` Tom Tromey
  1 sibling, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2009-08-28 1:24 UTC
To: gdb-patches

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> On Thu, Aug 27, 2009 at 02:30:30PM -0600, Tom Tromey wrote:
>> The initial problem here is that iconv will not accept "wchar_t" as an
>> encoding on this platform. I see we only have one AC_TRY_RUN in gdb
>> ... am I right in assuming that these are not ok?

Daniel> They are not OK. Please don't add another if you can avoid it. I
Daniel> think the one that's there is for long long printf? I used to have
Daniel> to override the cache variable... haven't checked lately.

Oops, I read your notes out-of-order.

Ok, no new AC_TRY_RUN.

The existing one is some obscure old Linux thing:

    dnl For Linux/i386, glibc 2.1.3 was released with a bogus
    dnl prfpregset_t type (it's a typedef for the pointer to a struct
    dnl instead of the struct itself). We detect this here, and work
    dnl around it in gdb_proc_service.h.

>> If they are not ok, I think we can just add a new setting to
>> configure.host. This is simpler to implement.

Daniel> I'm not sure how to do this without hardcoding it by platform. If the
Daniel> user has external libiconv, do we still want a change?

What we want is to skip case #1 in gdb_wchar.h (the full support case)
and let configure choose between case #2 (iconv only) and case #3
(nothing) depending on whether iconv was found.

I can write a patch tomorrow.

Tom

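The three cases, and the effect of skipping case #1 on a host such as
Solaris, can be summarised as a single preprocessor decision. The
sketch below is not the gdb_wchar.h code; it only mirrors the
conditions discussed in this thread. HAVE_ICONV, HAVE_WCHAR_H and
HAVE_BTOWC are the configure results named in the thread, and
DISABLE_ICONV is the per-host switch used by the patch posted later in
the thread. The function name describe_wchar_mode and the
STANDALONE_DEMO guard are invented for illustration.

```c
#include <stdio.h>

/* Illustrative only: report which of the three modes a given set of
   configure macros would select.  */
const char *
describe_wchar_mode (void)
{
#if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC) \
    && !defined (DISABLE_ICONV)
  /* Full support: wide functions, iconv to and from "wchar_t".  */
  return "case 1: iconv + wchar_t, intermediate charset is wchar_t";
#elif defined (HAVE_ICONV)
  /* iconv is usable for recoding, but wide characters are not.  */
  return "case 2: iconv only, intermediate charset is the host charset";
#else
  /* No iconv at all: this is the PHONY_ICONV situation.  */
  return "case 3: phony iconv, host charset only";
#endif
}

#ifdef STANDALONE_DEMO
int
main (void)
{
  puts (describe_wchar_mode ());
  return 0;
}
#endif
```

Compiling with -DSTANDALONE_DEMO -DHAVE_ICONV -DHAVE_WCHAR_H
-DHAVE_BTOWC selects case 1; adding -DDISABLE_ICONV drops to case 2,
and leaving HAVE_ICONV undefined gives case 3.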
* Re: Iconv / Solaris
  2009-08-28 1:24 ` Tom Tromey
@ 2009-08-28 17:00 ` Tom Tromey
  0 siblings, 0 replies; 10+ messages in thread
From: Tom Tromey @ 2009-08-28 17:00 UTC
To: gdb-patches

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> I can write a patch tomorrow.

Please try the appended. You will have to re-run autoheader and
autoconf.

I took the opportunity to add more text to the comment in gdb_wchar.h,
in the hopes that this would make it more clear.

If this works for you, I will check it in.

Tom

2009-08-28 Tom Tromey <tromey@redhat.com>

    * gdb_wchar.h: Update comments. Use DISABLE_ICONV.
    * configure, config.in: Rebuild.
    * configure.ac (DISABLE_ICONV): Define when needed.
    * configure.host: Set gdb_host_wchar_iconv.

Index: configure.ac
===================================================================
RCS file: /cvs/src/src/gdb/configure.ac,v
retrieving revision 1.105
diff -u -r1.105 configure.ac
--- configure.ac 22 Aug 2009 17:08:09 -0000 1.105
+++ configure.ac 28 Aug 2009 16:15:22 -0000
@@ -788,6 +788,12 @@
   ttrace wborder setlocale iconvlist libiconvlist btowc])
 AM_LANGINFO_CODESET

+# An additional check for whether iconv is ok for us to use.
+if test "$gdb_host_wchar_iconv" = no; then
+  AC_DEFINE([DISABLE_ICONV], 1,
+            [Define if host iconv is unsuitable for use by gdb])
+fi
+
 # Check the return and argument types of ptrace. No canned test for
 # this, so roll our own.
 gdb_ptrace_headers='
Index: configure.host
===================================================================
RCS file: /cvs/src/src/gdb/configure.host,v
retrieving revision 1.103
diff -u -r1.103 configure.host
--- configure.host 11 Jan 2009 13:15:56 -0000 1.103
+++ configure.host 28 Aug 2009 16:15:22 -0000
@@ -8,6 +8,8 @@
 # gdb_host_double_format host's double floatformat, or 0
 # gdb_host_long_double_format host's long double floatformat, or 0
 # gdb_host_obs host-specific .o files to include
+# gdb_host_wchar_iconv "no" if iconv_open will not accept
+# "wchar_t" as an argument.

 # Map host cpu into the config cpu subdirectory name.
 # The default is $host_cpu.
@@ -207,3 +209,15 @@
     gdb_host_long_double_format=0
     ;;
 esac
+
+
+# Per-host iconv setting.
+
+# Default to "yes".
+gdb_host_wchar_iconv=yes
+
+case "${host}" in
+*-*-solaris*)
+    gdb_host_wchar_iconv=no
+    ;;
+esac
Index: gdb_wchar.h
===================================================================
RCS file: /cvs/src/src/gdb/gdb_wchar.h,v
retrieving revision 1.2
diff -u -r1.2 gdb_wchar.h
--- gdb_wchar.h 15 Apr 2009 22:20:31 -0000 1.2
+++ gdb_wchar.h 28 Aug 2009 16:15:22 -0000
@@ -21,18 +21,33 @@

 /* We handle three different modes here.

-   Capable systems have the full suite: wchar_t support and iconv
+   1. Capable systems have the full suite: wchar_t support and iconv
    (perhaps via GNU libiconv). On these machines, full functionality
-   is available.
+   is available. In this case, the intermediate character set is
+   "wchar_t".

-   DJGPP is known to have libiconv but not wchar_t support. On
+   2. DJGPP is known to have libiconv but not wchar_t support. On
    systems like this, we use the narrow character functions. The full
    functionality is available to the user, but many characters (those
-   outside the narrow range) will be displayed as escapes.
+   outside the narrow range) will be displayed as escapes. In this
+   case, the intermediate character set uses the host encoding.

-   Finally, some systems do not have iconv. Here we provide a phony
-   iconv which only handles a single character set, and we provide
-   wrappers for the wchar_t functionality we use. */
+   We also end up in this scenario if the host iconv is not fully
+   suitable. Solaris falls into this category both because iconv_open
+   does not accept "wchar_t" as an argument, but also because wchar_t
+   does not have a fixed encoding.
+
+   3. Finally, some systems do not have iconv. Here we provide a
+   phony iconv which only handles a single character set, and we
+   provide wrappers for the wchar_t functionality we use. This is
+   what the "PHONY_ICONV" define means, below. In this case, the
+   intermediate character set uses the host encoding, and limited
+   functionality is available to the user.
+
+   It is possible that you may run into a system that does not support
+   "wchar_t" as an argument to iconv_open, but where
+   __STDC_ISO_10646__ is defined. If you have such a system, we can
+   add a fourth case where we fix the intermediate encoding. */

 #define INTERMEDIATE_ENCODING "wchar_t"

@@ -40,14 +55,17 @@
 #if defined (HAVE_ICONV)
 #include <iconv.h>
 #else
-/* This define is used elsewhere so we don't need to duplicate the
-   same checking logic in multiple places. */
+/* Case 3. This define is used elsewhere so we don't need to
+   duplicate the same checking logic in multiple places. */
 #define PHONY_ICONV
 #endif

 /* We use "btowc" as a sentinel to detect functioning wchar_t
    support. */
-#if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC)
+#if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC) \
+  && !defined (DISABLE_ICONV)
+
+/* Case 1. */

 #include <wchar.h>
 #include <wctype.h>
@@ -65,6 +83,8 @@

 #else

+/* Case 2 and case 3, depending on whether PHONY_ICONV is defined. */
+
 typedef char gdb_wchar_t;
 typedef int gdb_wint_t;
