Mirror of the gdb-patches mailing list
* [RFC] Make string printing work on NetBSD (iconv issue)
@ 2010-05-04 19:43 Paul Koning
  2010-05-04 20:01 ` Andreas Schwab
  2010-05-07 16:40 ` Tom Tromey
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Koning @ 2010-05-04 19:43 UTC
  To: gdb-patches

On NetBSD (and, from what I've heard, on some other systems such as
Solaris) a full version of iconv exists, but it lacks the codeset name
"wchar_t".  Currently GDB assumes without checking that this codeset
exists.  The result is that on NetBSD, any attempt to print a string
fails with an uninformative error like "Error reading variable".

The attached patch fixes this by having configure pick a suitable
codeset name to use.  "wchar_t" is used if available, otherwise ucs-2
or ucs-4 with the appropriate byte order suffix is used instead.

Tested on i386 NetBSD and i386 Linux.  I don't have write privs but I
do have the paperwork in place.

   paul

2010-05-04  Paul Koning  <paul_koning@dell.com>

	* configure.ac: Define ICONV_INTERMEDIATE_ENCODING as iconv
	codeset name for wchar_t data.
	* gdb_wchar.h: Use ICONV_INTERMEDIATE_ENCODING.
	* configure: Regenerate.
	* config.in: Regenerate.

Index: gdb/config.in
===================================================================
RCS file: /cvs/src/src/gdb/config.in,v
retrieving revision 1.116
diff -u -p -r1.116 config.in
--- gdb/config.in	10 Mar 2010 18:37:22 -0000	1.116
+++ gdb/config.in	4 May 2010 19:18:33 -0000
@@ -599,6 +599,9 @@
 /* Define as const if the declaration of iconv() needs const. */
 #undef ICONV_CONST
 
+/* Codeset name that corresponds to wchar_t encoding. */
+#undef ICONV_INTERMEDIATE_ENCODING
+
 /* Define if you want to use new multi-fd /proc interface (replaces
    HAVE_MULTIPLE_PROC_FDS as well as other macros). */
 #undef NEW_PROC_API
Index: gdb/configure
===================================================================
RCS file: /cvs/src/src/gdb/configure,v
retrieving revision 1.302
diff -u -p -r1.302 configure
--- gdb/configure	23 Apr 2010 18:07:27 -0000	1.302
+++ gdb/configure	4 May 2010 19:18:34 -0000
@@ -13325,6 +13325,40 @@ $as_echo "#define HAVE_PERSONALITY 1" >>
 
 fi
 
+if test "x$am_cv_func_iconv" = "xyes"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking iconv codeset for wchar_t" >&5
+$as_echo_n "checking iconv codeset for wchar_t... " >&6; }
+  if iconv -f ascii -t wchar_t /dev/null; then
+    { $as_echo "$as_me:${as_lineno-$LINENO}: result: wchar_t" >&5
+$as_echo "wchar_t" >&6; }
+
+$as_echo "#define ICONV_INTERMEDIATE_ENCODING \"wchar_t\"" >>confdefs.h
+
+  else
+    if test "$ac_cv_c_bigendian" = "yes"; then
+      if test "$gl_cv_bitsizeof_wchar_t" = "32"; then
+
+$as_echo "#define ICONV_INTERMEDIATE_ENCODING \"ucs-4be\"" >>confdefs.h
+
+      else
+
+$as_echo "#define ICONV_INTERMEDIATE_ENCODING \"ucs-2be\"" >>confdefs.h
+
+      fi
+    else
+      if test "$gl_cv_bitsizeof_wchar_t" = "32"; then
+
+$as_echo "#define ICONV_INTERMEDIATE_ENCODING \"ucs-4le\"" >>confdefs.h
+
+      else
+
+$as_echo "#define ICONV_INTERMEDIATE_ENCODING \"ucs-2le\"" >>confdefs.h
+
+      fi
+    fi
+  fi
+fi
+
 
 target_sysroot_reloc=0
 
Index: gdb/configure.ac
===================================================================
RCS file: /cvs/src/src/gdb/configure.ac,v
retrieving revision 1.117
diff -u -p -r1.117 configure.ac
--- gdb/configure.ac	23 Apr 2010 18:07:26 -0000	1.117
+++ gdb/configure.ac	4 May 2010 19:18:34 -0000
@@ -1480,6 +1480,34 @@ then
 	      [Define if you support the personality syscall.])
 fi
 
+dnl Check if iconv handles wchar_t, and if not, what to use instead.
+if test "x$am_cv_func_iconv" = "xyes"; then
+  AC_MSG_CHECKING(iconv codeset for wchar_t)
+  if iconv -f ascii -t wchar_t /dev/null; then
+    AC_MSG_RESULT(wchar_t)
+    AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["wchar_t"],
+              [Codeset name that corresponds to wchar_t encoding.])
+  else
+    if test "$ac_cv_c_bigendian" = "yes"; then
+      if test "$gl_cv_bitsizeof_wchar_t" = "32"; then
+        AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["ucs-4be"],
+                  [Codeset name that corresponds to wchar_t encoding.])
+      else
+        AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["ucs-2be"],
+                  [Codeset name that corresponds to wchar_t encoding.])
+      fi
+    else
+      if test "$gl_cv_bitsizeof_wchar_t" = "32"; then
+        AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["ucs-4le"],
+                  [Codeset name that corresponds to wchar_t encoding.])
+      else
+        AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["ucs-2le"],
+                  [Codeset name that corresponds to wchar_t encoding.])
+      fi
+    fi
+  fi
+fi
+
 dnl Handle optional features that can be enabled.
 
 target_sysroot_reloc=0
Index: gdb/gdb_wchar.h
===================================================================
RCS file: /cvs/src/src/gdb/gdb_wchar.h,v
retrieving revision 1.3
diff -u -p -r1.3 gdb_wchar.h
--- gdb/gdb_wchar.h	1 Jan 2010 07:31:32 -0000	1.3
+++ gdb/gdb_wchar.h	4 May 2010 19:18:34 -0000
@@ -35,7 +35,7 @@
    wrappers for the wchar_t functionality we use.  */
 
 
-#define INTERMEDIATE_ENCODING "wchar_t"
+#define INTERMEDIATE_ENCODING ICONV_INTERMEDIATE_ENCODING 
 
 #if defined (HAVE_ICONV)
 #include <iconv.h>



* Re: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-04 19:43 [RFC] Make string printing work on NetBSD (iconv issue) Paul Koning
@ 2010-05-04 20:01 ` Andreas Schwab
  2010-05-04 20:06   ` Paul Koning
  2010-05-07 16:40 ` Tom Tromey
  1 sibling, 1 reply; 11+ messages in thread
From: Andreas Schwab @ 2010-05-04 20:01 UTC
  To: Paul Koning; +Cc: gdb-patches

Paul Koning <Paul_Koning@dell.com> writes:

> +dnl Check if iconv handles wchar_t, and if not, what to use instead.
> +if test "x$am_cv_func_iconv" = "xyes"; then
> +  AC_MSG_CHECKING(iconv codeset for wchar_t)
> +  if iconv -f ascii -t wchar_t /dev/null; then
> +    AC_MSG_RESULT(wchar_t)
> +    AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["wchar_t"],
> +              [Codeset name that corresponds to wchar_t encoding.])

That tests iconv on the build system, but gdb uses iconv on the host
system.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* RE: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-04 20:01 ` Andreas Schwab
@ 2010-05-04 20:06   ` Paul Koning
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Koning @ 2010-05-04 20:06 UTC
  To: Andreas Schwab; +Cc: gdb-patches

> Paul Koning <Paul_Koning@dell.com> writes:
> 
> > +dnl Check if iconv handles wchar_t, and if not, what to use instead.
> > +if test "x$am_cv_func_iconv" = "xyes"; then
> > +  AC_MSG_CHECKING(iconv codeset for wchar_t)
> > +  if iconv -f ascii -t wchar_t /dev/null; then
> > +    AC_MSG_RESULT(wchar_t)
> > +    AC_DEFINE([ICONV_INTERMEDIATE_ENCODING],["wchar_t"],
> > +              [Codeset name that corresponds to wchar_t encoding.])
> 
> That tests iconv on the build system, but gdb uses iconv on the host
> system.

True.  Is there a way to test the host iconv?  I can make the check
conditional on "not a cross build".  Then what about cross builds?  If
there isn't a test that gives the answer, I could do something based on
host OS type.  Or that plus libiconv present/absent?
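
One way to sidestep the build/host distinction is to probe iconv at run
time, on the machine where gdb actually executes.  The following is only
a minimal sketch under that assumption, not something proposed in this
thread: the helper name sketch_first_usable_codeset is hypothetical, and
the only library calls used are the standard iconv_open and iconv_close.
Note that iconv accepting a name does not prove the name matches the
host's wchar_t encoding.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: return the first entry of CANDIDATES that
   iconv_open accepts as a target codeset when converting from FROM,
   or NULL if none is accepted.  */
static const char *
sketch_first_usable_codeset (const char *const *candidates, size_t count,
                             const char *from)
{
  for (size_t i = 0; i < count; i++)
    {
      iconv_t cd = iconv_open (candidates[i], from);

      if (cd != (iconv_t) -1)
        {
          /* The name is known to this iconv; release the descriptor.  */
          iconv_close (cd);
          return candidates[i];
        }
    }
  return NULL;
}
```

A caller could pass a list such as "wchar_t" followed by the UCS
fallbacks and treat a NULL result as "no wide intermediate encoding
available".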

	paul



* Re: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-04 19:43 [RFC] Make string printing work on NetBSD (iconv issue) Paul Koning
  2010-05-04 20:01 ` Andreas Schwab
@ 2010-05-07 16:40 ` Tom Tromey
  2010-05-07 17:26   ` Paul Koning
  1 sibling, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2010-05-07 16:40 UTC
  To: Paul Koning; +Cc: gdb-patches

>>>>> "Paul" == Paul Koning <Paul_Koning@dell.com> writes:

Paul> The attached patch fixes this by having configure pick a suitable
Paul> codeset name to use.  "wchar_t" is used if available, otherwise ucs-2
Paul> or ucs-4 with the appropriate byte order suffix is used instead.

This will yield incorrect results unless the chosen intermediate charset
is actually the one used for wchar_t.

Note that if this is the case for UCS-4, then your platform headers
ought to define __STDC_ISO_10646__.  So, you could test that in
gdb_wchar.h rather than do any configury.
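
A minimal sketch of such a header-level test, assuming that a defined
__STDC_ISO_10646__ means wchar_t holds ISO 10646 code points.  The
function names are hypothetical and the codeset strings simply mirror
the ones in the patch; this is not gdb's actual code.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: nonzero if the host stores the most
   significant byte first.  */
static int
sketch_host_is_big_endian (void)
{
  const unsigned short probe = 1;

  return *(const unsigned char *) &probe == 0;
}

/* Hypothetical helper: return an iconv codeset name for wchar_t when
   the platform promises ISO 10646 values, or NULL so that the caller
   can fall back to narrow intermediate characters.  */
static const char *
sketch_intermediate_encoding (void)
{
  int big = sketch_host_is_big_endian ();

#if defined (__STDC_ISO_10646__)
  if (sizeof (wchar_t) == 4)
    return big ? "ucs-4be" : "ucs-4le";
  if (sizeof (wchar_t) == 2)
    return big ? "ucs-2be" : "ucs-2le";
#endif
  (void) big;
  return NULL;
}
```

On a platform that does not define the macro, the function returns NULL
regardless of what wchar_t really contains.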

Alternatively, it is always safe to fall back to the code that uses
narrow intermediate characters and host_charset for the intermediate
encoding.

Perhaps this "wchar_t" thing is not the best way for us to go.  Maybe
better would be to test __STDC_ISO_10646__ and fall back to narrow chars
in all other cases.

Other approaches are available too, but they are generally more work
than simply using GNU libiconv.

Tom



* RE: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-07 16:40 ` Tom Tromey
@ 2010-05-07 17:26   ` Paul Koning
  2010-05-07 22:46     ` Tom Tromey
  2010-05-19 15:11     ` Tom Tromey
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Koning @ 2010-05-07 17:26 UTC
  To: tromey; +Cc: gdb-patches

> >>>>> "Paul" == Paul Koning <Paul_Koning@dell.com> writes:
> 
> Paul> The attached patch fixes this by having configure pick a suitable
> Paul> codeset name to use.  "wchar_t" is used if available, otherwise ucs-2
> Paul> or ucs-4 with the appropriate byte order suffix is used instead.
> 
> This will yield incorrect results unless the chosen intermediate charset
> is actually the one used for wchar_t.

Tom, thanks for your feedback.

Yes, it clearly depends on picking the correct codeset.  If there were a
foolproof way to determine what that codeset is, that would be the best
answer.  I could not find one.  My reasoning is that UCS-n for n byte
wchar_t is a likely answer, so while it may be wrong for some platforms
(at least in theory) it will also be right for some, hopefully for most.
It clearly can't make matters worse, because any platform that doesn't
have the codeset name "wchar_t" currently doesn't work at all. 
 
> Note that if this is the case for UCS-4, then your platform headers
> ought to define __STDC_ISO_10646__.  So, you could test that in
> gdb_wchar.h rather than do any configury.

NetBSD clearly is using UCS-4 for wchar_t, but it does not define that
symbol.
 
> Alternatively, it is always safe to fall back to the code that uses
> narrow intermediate characters and host_charset for the intermediate
> encoding.

Yes, but doesn't that mean you end up not being able to accurately print
a wide string if one occurs in your program -- because it gets mapped to
the intermediate encoding first and with narrow chars for intermediate
coding you have a lossy translation?
 
> Perhaps this "wchar_t" thing is not the best way for us to go.  Maybe
> better would be to test __STDC_ISO_10646__ and fall back to narrow chars
> in all other cases.

That sounds attractive.  But given that __STDC_ISO_10646__ isn't defined
in NetBSD even though it clearly supports wide chars and knows about
ucs-4, it doesn't seem to be workable.

> Other approaches are available too, but they are generally more work
> than simply using GNU libiconv.

Right, if you use libiconv then the issue goes away, and the patch I
wrote should handle that case cleanly.  I wanted to offer a solution to
people who don't want to install libiconv because they have a functional
iconv in libc, as is the case for NetBSD.

	paul



* Re: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-07 17:26   ` Paul Koning
@ 2010-05-07 22:46     ` Tom Tromey
  2010-05-10 21:25       ` Paul Koning
  2010-05-19 15:11     ` Tom Tromey
  1 sibling, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2010-05-07 22:46 UTC
  To: Paul Koning; +Cc: gdb-patches

>>>>> "Paul" == Paul Koning <Paul_Koning@Dell.com> writes:

Paul> It clearly can't make matters worse, because any platform that doesn't
Paul> have the codeset name "wchar_t" currently doesn't work at all. 

I think it is worse in one sense, which is that the results can be
incorrect with no warning.  In particular I believe this could happen on
Solaris in some locales.

I do see your point though.  Completely failing is terrible.

Paul> NetBSD clearly is using UCS-4 for wchar_t, but it does not define that
Paul> symbol.

Bummer.

Yet another idea would be host-specific overrides for INTERMEDIATE_ENCODING.

Tom> Alternatively, it is always safe to fall back to the code that uses
Tom> narrow intermediate characters and host_charset for the intermediate
Tom> encoding.

Paul> Yes, but doesn't that mean you end up not being able to accurately print
Paul> a wide string if one occurs in your program -- because it gets mapped to
Paul> the intermediate encoding first and with narrow chars for intermediate
Paul> coding you have a lossy translation?

Nope, there is code in gdb to handle this.  You will see escape
sequences for any unprintable or untranslatable characters.  The only
problem is just that more characters are considered to be unprintable in
this scenario; roughly speaking this gets you back to what gdb 6.8 would
do.
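
To illustrate the general idea of the narrow fallback, here is a rough
sketch that copies printable ASCII through and renders every other byte
as an octal escape.  The function name, the three-digit octal format and
the "printable" range are all assumptions made for the example; gdb's
real code path and escape rules are not shown in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy LEN bytes from IN to OUT, writing
   printable ASCII as-is and every other byte (and the backslash) as a
   three-digit octal escape.  Returns the number of characters written,
   excluding the terminating NUL, or (size_t) -1 if OUT is too small.  */
static size_t
sketch_escape_narrow (const unsigned char *in, size_t len,
                      char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return (size_t) -1;

  for (size_t i = 0; i < len; i++)
    {
      int printable = in[i] >= 0x20 && in[i] <= 0x7e && in[i] != '\\';
      size_t need = printable ? 1 : 4;

      /* Always keep one byte in reserve for the terminating NUL.  */
      if (used + need + 1 > out_size)
        return (size_t) -1;

      if (printable)
        out[used] = (char) in[i];
      else
        snprintf (out + used, out_size - used, "\\%03o",
                  (unsigned int) in[i]);
      used += need;
    }

  out[used] = '\0';
  return used;
}
```

With this sketch the two bytes of a UTF-8 encoded e-acute (0xc3 0xa9)
come out as \303\251, which is the kind of result meant by "more
characters are considered to be unprintable".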

Tom> Other approaches are available too, but they are generally more work
Tom> than simply using GNU libiconv.

Paul> Right, if you use libiconv then the issue goes away, and the patch I
Paul> wrote should handle that case cleanly.  I wanted to offer a solution to
Paul> people who don't want to install libiconv because they have a functional
Paul> iconv in libc, as is the case for NetBSD.

Another possible approach is to choose a known intermediate
representation, say UTF-32.  Then, provide gdb-specific replacements for
the various wide functions that are needed: wcslen, iswprint, iswdigit,
btowc.
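
A sketch of what such replacements might look like with a fixed 32-bit
intermediate character type.  Everything named sketch_* or SKETCH_* is
hypothetical, and the character classification is deliberately limited
to ASCII; a real replacement for iswprint in particular would need
Unicode property data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width intermediate character: one UTF-32 code
   unit, independent of the host's wchar_t.  */
typedef uint32_t sketch_wchar;

/* Hypothetical stand-in for WEOF.  */
#define SKETCH_WEOF ((sketch_wchar) 0xFFFFFFFFu)

/* Stand-in for wcslen: count code units before the terminating 0.  */
static size_t
sketch_wcslen (const sketch_wchar *s)
{
  size_t n = 0;

  while (s[n] != 0)
    n++;
  return n;
}

/* Stand-in for iswdigit: ASCII digits only.  */
static int
sketch_iswdigit (sketch_wchar c)
{
  return c >= 0x30 && c <= 0x39;
}

/* Stand-in for btowc: only 7-bit ASCII bytes map to a character.  */
static sketch_wchar
sketch_btowc (int c)
{
  return (c >= 0 && c < 0x80) ? (sketch_wchar) c : SKETCH_WEOF;
}
```

Because the type and encoding are fixed, the iconv target name would no
longer depend on what the host calls its wchar_t codeset.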

Tom



* RE: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-07 22:46     ` Tom Tromey
@ 2010-05-10 21:25       ` Paul Koning
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Koning @ 2010-05-10 21:25 UTC
  To: tromey; +Cc: gdb-patches

> Paul> It clearly can't make matters worse, because any platform that doesn't
> Paul> have the codeset name "wchar_t" currently doesn't work at all.
> 
> I think it is worse in one sense, which is that the results can be
> incorrect with no warning.  In particular I believe this could happen on
> Solaris in some locales.
> 
> I do see your point though.  Completely failing is terrible.
> 
> Paul> NetBSD clearly is using UCS-4 for wchar_t, but it does not define that
> Paul> symbol.
> 
> Bummer.
> 
> Yet another idea would be host-specific overrides for INTERMEDIATE_ENCODING.

That would also have the advantage of working right in cross builds.  I
think I'll rewrite the patch to do that, and make it specifically
override for NetBSD where the right answer is known.  That way people
familiar with other hosts can supply the corresponding override for
those systems.

	paul



* Re: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-07 17:26   ` Paul Koning
  2010-05-07 22:46     ` Tom Tromey
@ 2010-05-19 15:11     ` Tom Tromey
  2010-05-19 15:25       ` Paul Koning
  1 sibling, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2010-05-19 15:11 UTC
  To: Paul Koning; +Cc: gdb-patches

>>>>> "Paul" == Paul Koning <Paul_Koning@Dell.com> writes:

Paul> NetBSD clearly is using UCS-4 for wchar_t, but it does not define that
Paul> symbol.
 
I have been wondering about this again recently.

I dug through the NetBSD libc a little, looking for this, but I couldn't
find it.  Then I ran across this page today:

    http://www.gnu.org/software/libunistring/manual/libunistring.html#The-wchar_005ft-mess

This claims that, like Solaris, NetBSD uses a locale-dependent wchar_t
encoding.  If this is the case then disabling the wide character code is
probably the simplest approach that will do something sensible for you.

BTW, libunistring seems like a nice approach.  It is close to what I
wish wchar_t support actually looked like.  I am tempted to say we
should go that route, but of course that would mean adding another
dependency :-(

Tom



* RE: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-19 15:11     ` Tom Tromey
@ 2010-05-19 15:25       ` Paul Koning
  2010-05-19 15:30         ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Koning @ 2010-05-19 15:25 UTC
  To: tromey; +Cc: gdb-patches

That gnu.org webpage talks about FreeBSD which is a different OS, and
what it says may or may not apply to NetBSD.  I guess I will have to ask
about this, since looking at the newlib sources (which is what libc is
in NetBSD) only got me lost.

	paul




* Re: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-19 15:25       ` Paul Koning
@ 2010-05-19 15:30         ` Tom Tromey
  2010-05-19 18:23           ` Paul Koning
  0 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2010-05-19 15:30 UTC
  To: Paul Koning; +Cc: gdb-patches

>>>>> "Paul" == Paul Koning <Paul_Koning@Dell.com> writes:

Paul> That gnu.org webpage talks about FreeBSD which is a different OS, and
Paul> what it says may or may not apply to NetBSD.

Oops, sorry.

Paul> I guess I will have to ask
Paul> about this, since looking at the newlib sources (which is what libc is
Paul> in NetBSD) only got me lost.

I looked through the NetBSD libc for quite some time online, but got
lost as well.  At one point I concluded that the answer wasn't in the
code but in some kind of resource file I couldn't find.

Tom



* RE: [RFC] Make string printing work on NetBSD (iconv issue)
  2010-05-19 15:30         ` Tom Tromey
@ 2010-05-19 18:23           ` Paul Koning
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Koning @ 2010-05-19 18:23 UTC
  To: tromey; +Cc: gdb-patches

I sent off a query... we'll see what answer I get.

	paul


