* iconv returning byte order marks for Solaris 2.9
From: Andrew @ 2009-07-15 18:28 UTC
To: gdb-patches
Hi
I found a problem printing strings with the gdb 6.8 weekly snapshot
(2009-07-07) on Solaris 2.9.
I eventually found that changing INTERMEDIATE_ENCODING
in gdb_wchar.h to "UCS-4" and applying the following
patch worked. Any comments?
I'm not sure how to handle the INTERMEDIATE_ENCODING
change, since it's probably system dependent.
Andrew
[-- Attachment #2: patch_charset.txt --]
[-- Type: text/plain, Size: 671 bytes --]
diff -rau src.original/gdb/charset.c src/gdb/charset.c
--- src.original/gdb/charset.c 2009-07-15 13:05:42.000896000 -0400
+++ src/gdb/charset.c 2009-07-15 13:09:23.000013000 -0400
@@ -646,6 +646,20 @@
*out_chars = iter->out;
*ptr = orig_inptr;
*len = orig_in - iter->bytes;
+
+ if (num > 1) {
+ if ( (iter->out[0] == (gdb_wchar_t) 0xfffe) ||
+ (iter->out[0] == (gdb_wchar_t) 0xfeff) ) {
+
+ /* iconv returned byte order marks, skip those */
+ int mov;
+ for (mov = 0; mov < (num - 1); mov ++)
+ iter->out[mov] = iter->out[mov + 1];
+
+ num -= 1;
+ }
+ }
+
return num;
}
Only in src/gdb: charset.c.~1.24.~
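The BOM-skipping idea in the patch above can be tried outside gdb. What follows is a minimal standalone sketch, not gdb code: `strip_leading_bom` is a hypothetical helper, plain `wchar_t` stands in for `gdb_wchar_t`, and the two marker values are the ones the patch compares against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of gdb.  If BUF holds more than one
   character and starts with 0xfeff or 0xfffe (the two values tested in
   the patch), shift the remaining characters down by one and return
   the new count.  Otherwise return NUM unchanged.  */
size_t
strip_leading_bom (wchar_t *buf, size_t num)
{
  assert (buf != NULL);

  if (num > 1 && (buf[0] == (wchar_t) 0xfeff || buf[0] == (wchar_t) 0xfffe))
    {
      /* memmove is used because source and destination overlap.  */
      memmove (buf, buf + 1, (num - 1) * sizeof buf[0]);
      num -= 1;
    }

  return num;
}
```

Like the patch, this leaves a buffer consisting of a single BOM untouched, and it strips the marker regardless of the host's wchar_t encoding.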
* Re: iconv returning byte order marks for Solaris 2.9
From: Tom Tromey @ 2009-07-15 18:57 UTC
To: ke; +Cc: gdb-patches
>>>>> "Andrew" == Andrew <ke@alum.bu.edu> writes:
Andrew> I found a problem printing strings with the gdb 6.8 weekly snapshot
Andrew> (2009-07-07) on Solaris 2.9.
Thanks for finding and diagnosing this.
Andrew> I eventually found that changing INTERMEDIATE_ENCODING
Andrew> in gdb_wchar.h to "UCS-4" and applying the following
Andrew> patch worked. Any comments?
Andrew> I'm not sure how to handle the INTERMEDIATE_ENCODING
Andrew> change, since it's probably system dependent.
I don't have access to Solaris. If I understand correctly, the
situation is:
* wchar_t on Solaris is encoded using UCS-4
* iconv_open accepts "wchar_t" as an encoding name
* in this case, iconv emits a BOM
First, this seems like it must be a Solaris bug, just because I can't
imagine how this would be useful.
I don't think we can use your patch as-is. It does the BOM elimination
unconditionally, but really I think we can only do it on platforms where
we know that wchar_t is UCS-4 (or UCS-2 I suppose).
Does Solaris 9 support a full suite of conversions? If not, one option
would be to use libiconv, and find a way to disable most of this code by
default on Solaris.
Failing that, the simplest fix would be if there is an encoding
(compatible with wchar_t) we can use on Solaris which does not insert
the BOM. For example, maybe "UCS-4BE" or "UCS-4LE", depending on the
architecture. I think a fix like this could be done entirely in
gdb_wchar.h. Could you try that?
As far as the host dependency, we can probably just check
__STDC_ISO_10646__.
Tom
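A sketch of how the last two suggestions in this message could combine, under stated assumptions: `pick_intermediate_encoding` and `host_is_big_endian` are hypothetical functions, not gdb code, and byte order is detected at run time here instead of through a configure-time macro. When `__STDC_ISO_10646__` is not defined, the sketch falls back to "wchar_t".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return nonzero if the host stores the most significant byte first.  */
int
host_is_big_endian (void)
{
  const uint32_t probe = 0x01020304;
  unsigned char first;

  memcpy (&first, &probe, 1);
  /* Mixed-endian hosts are outside the scope of this sketch.  */
  assert (first == 0x01 || first == 0x04);
  return first == 0x01;
}

/* Hypothetical helper: name an iconv encoding that matches wchar_t.
   The explicit BE/LE names are the ones suggested in the message
   above; whether a given iconv accepts them is an assumption.  */
const char *
pick_intermediate_encoding (void)
{
#ifdef __STDC_ISO_10646__
  return host_is_big_endian () ? "UCS-4BE" : "UCS-4LE";
#else
  return "wchar_t";
#endif
}
```

Keeping the choice behind one function (or one macro in gdb_wchar.h) means the rest of the conversion code never needs to know which name was picked.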
* Re: iconv returning byte order marks for Solaris 2.9
From: Andrew @ 2009-07-16 2:29 UTC
To: gdb-patches
On the system I'm working on, iconv_open doesn't accept "wchar_t" as an
encoding name. It failed when INTERMEDIATE_ENCODING was set to that.
But setting INTERMEDIATE_ENCODING to "UCS-4BE" eliminated the BOM at the
beginning.
Andrew
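One way to check a candidate encoding name on a given host is a small probe like the one below. It is a sketch, not gdb code: `iconv_emits_bom` is a hypothetical helper, it only looks for a 4-byte BOM (so it suits UCS-4 style encodings), and the source charset name "ASCII" is an assumption that may need changing on hosts that spell it differently. Some hosts also declare the second argument of `iconv` as `const char **`, in which case a cast would be needed.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe.  Returns 1 if converting "A" to ENCODING yields
   a leading U+FEFF byte order mark (in either byte order), 0 if it
   does not, and -1 if iconv_open rejects the name or the conversion
   fails or produces fewer than four bytes.  */
int
iconv_emits_bom (const char *encoding)
{
  static const unsigned char bom_be[4] = { 0x00, 0x00, 0xfe, 0xff };
  static const unsigned char bom_le[4] = { 0xff, 0xfe, 0x00, 0x00 };
  char input[] = "A";
  char output[16];
  char *in = input;
  char *out = output;
  size_t in_left = 1;
  size_t out_left = sizeof output;
  size_t result;
  iconv_t cd;

  assert (encoding != NULL);

  cd = iconv_open (encoding, "ASCII");
  if (cd == (iconv_t) -1)
    return -1;

  result = iconv (cd, &in, &in_left, &out, &out_left);
  iconv_close (cd);
  if (result == (size_t) -1 || sizeof output - out_left < 4)
    return -1;

  return (memcmp (output, bom_be, 4) == 0
          || memcmp (output, bom_le, 4) == 0);
}
```

Calling it with "wchar_t", "UCS-4", and "UCS-4BE" in turn would separate the two cases reported in this thread: a name that iconv_open rejects (-1) and a name that is accepted but prepends a BOM (1).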
* Re: iconv returning byte order marks for Solaris 2.9
From: Tom Tromey @ 2009-07-17 19:19 UTC
To: ke; +Cc: gdb-patches
>>>>> "Andrew" == Andrew <ke@alum.bu.edu> writes:
Andrew> On the system I'm working on, iconv_open doesn't accept "wchar_t"
Andrew> as an encoding name. It failed when INTERMEDIATE_ENCODING was set
Andrew> to that.
Ah, thanks.
Andrew> But setting INTERMEDIATE_ENCODING to "UCS-4BE" eliminated the
Andrew> BOM at the beginning.
Great. Could you try the appended patch?
I'm testing it on Linux.
Tom
diff --git a/gdb/gdb_wchar.h b/gdb/gdb_wchar.h
index 07a6c87..241e051 100644
--- a/gdb/gdb_wchar.h
+++ b/gdb/gdb_wchar.h
@@ -35,8 +35,6 @@
wrappers for the wchar_t functionality we use. */
-#define INTERMEDIATE_ENCODING "wchar_t"
-
#if defined (HAVE_ICONV)
#include <iconv.h>
#else
@@ -63,6 +61,20 @@ typedef wint_t gdb_wint_t;
#define LCST(X) L ## X
+#ifdef __STDC_ISO_10646__
+/* On Solaris 9, iconv_open does not accept "wchar_t". So, on this
+ platform, and other platforms where wchar_t is known to use
+ ISO-10646, choose an appropriate explicit charset name. Also,
+ UCS-4 on Solaris will emit a BOM, which we don't want. So, we
+ choose an explicit little- or big-endian variant, depending on the
+ host. */
+#if WORDS_BIGENDIAN
+#define INTERMEDIATE_ENCODING "UCS-4BE"
+#else
+#define INTERMEDIATE_ENCODING "UCS-4LE"
+#endif
+#endif
+
#else
typedef char gdb_wchar_t;
@@ -87,4 +99,8 @@ typedef int gdb_wint_t;
#endif
+#ifndef INTERMEDIATE_ENCODING
+#define INTERMEDIATE_ENCODING "wchar_t"
+#endif
+
#endif /* GDB_WCHAR_H */
* Re: iconv returning byte order marks for Solaris 2.9
From: Andrew @ 2009-07-21 20:18 UTC
To: gdb-patches
Thanks for the patch. I'm actually generating Solaris binaries
using a cross compiler (from a Linux box), and in my current
configuration it doesn't work: __STDC_ISO_10646__ is not defined.
I will try installing gcc locally and see how that works.
Andrew
* Re: iconv returning byte order marks for Solaris 2.9
From: Tom Tromey @ 2009-07-24 21:58 UTC
To: ke; +Cc: gdb-patches
>>>>> "Andrew" == Andrew <ke@alum.bu.edu> writes:
Andrew> Thanks for the patch. I'm actually generating Solaris binaries
Andrew> using a cross compiler (from a Linux box), and in my current
Andrew> configuration it doesn't work: __STDC_ISO_10646__ is not defined.
Ouch.
This seems like a quality-of-implementation (QoI) issue in the cross
compiler. But it is hard to call it a bug exactly.
I don't know the best thing to do in this case. I suppose we could make
a new variable used by configure.host that would override the name of
the wchar_t encoding.
Tom
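The override idea might look like this on the C side. This is only a sketch: `GDB_HOST_WCHAR_ENCODING` is a made-up macro name that a configure script could pass with `-D`, and `intermediate_encoding` is a hypothetical accessor; neither exists in gdb.

```c
#include <assert.h>
#include <string.h>

/* GDB_HOST_WCHAR_ENCODING is hypothetical.  A build could define it,
   for example with -DGDB_HOST_WCHAR_ENCODING='"UCS-4BE"', when the
   compiler does not describe the host's wchar_t encoding itself.  */
#ifdef GDB_HOST_WCHAR_ENCODING
#define INTERMEDIATE_ENCODING GDB_HOST_WCHAR_ENCODING
#else
#define INTERMEDIATE_ENCODING "wchar_t"
#endif

/* Return the encoding name selected at build time.  */
const char *
intermediate_encoding (void)
{
  const char *name = INTERMEDIATE_ENCODING;

  /* An empty override would be a configuration mistake.  */
  assert (strlen (name) > 0);
  return name;
}
```

Built without the override, the accessor returns "wchar_t", which matches the existing default; a cross build for a host like the one in this thread would supply the explicit name instead.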
* Re: iconv returning byte order marks for Solaris 2.9
From: Tom Tromey @ 2009-08-14 20:13 UTC
To: ke; +Cc: gdb-patches
>>>>> "Andrew" == Andrew <ke@alum.bu.edu> writes:
Andrew> Thanks for the patch. I'm actually generating Solaris binaries
Andrew> using a cross compiler (from a Linux box), and in my current
Andrew> configuration it doesn't work: __STDC_ISO_10646__ is not defined.
Andrew> I will try installing gcc locally and see how that works.
Hi, what is the status of this?
I'm asking because if I need to make some changes to gdb's configury,
I'd like to have plenty of time before 7.0.
thanks,
Tom