From: Elena Zannoni <ezannoni@redhat.com>
To: gdb@sources.redhat.com, jimb@redhat.com
Subject: charset.c problem with non-en_US locales
Date: Tue, 22 Apr 2003 19:59:00 -0000
Message-ID: <16037.41011.517603.566953@localhost.redhat.com>
I got a bug report against the gdb in RH Linux which I found
interesting, and since it occurs in FSF gdb as well...
Problem: If you set the locale to Turkish, gdb errors out, complaining
that it cannot find the ISO-8859-1 charset.
[ezannoni@localhost gdb]$ LC_ALL=tr_TR.UTF-8 ./gdb
GDB doesn't know of any character set named `ISO-8859-1'.
No display number 0.
Disabling display 0 to avoid infinite recursion.
[ezannoni@localhost gdb]$
[ezannoni@localhost gdb]$ LC_ALL=tr_TR.ISO-8859-9 ./gdb
GDB doesn't know of any character set named `ISO-8859-1'.
No display number 0.
Disabling display 0 to avoid infinite recursion.
[ezannoni@localhost gdb]$
The real problem is the use of the tolower() function in charset.c
to do a case-insensitive comparison between the two strings
"ISO-8859-1" and "iso-8859-1".
/* Character set names are always compared ignoring case. */
static int
strcmp_case_insensitive (const char *p, const char *q)
{
  while (*p && *q && tolower (*p) == tolower (*q))
    p++, q++;
  return tolower (*p) - tolower (*q);
}
When the locale is set to Turkish (or any other locale whose case
mapping differs from English), the tolower/toupper functions don't work
as they would in English. The lowercase version of 'I' is not 'i', for
instance, but some other character ('i' w/o the dot). Indeed, the man
pages for tolower/toupper warn about this. strcasecmp() also has the
problem.
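For illustration only, here is a minimal sketch of a comparison that
folds case without consulting the locale. It is not what the patch
below does (the patch drops case folding altogether). The names
ascii_tolower and strcmp_ascii_case_insensitive are hypothetical and do
not exist in charset.c, and the arithmetic assumes an ASCII-compatible
host character set, so it would not be right on an EBCDIC host.

```c
/* Hypothetical helper, not part of charset.c: lower-case only the
   letters 'A'..'Z' and leave every other value untouched.  Unlike
   tolower(), the result does not depend on the current locale.
   Assumes an ASCII-compatible host character set.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical replacement with the same contract as
   strcmp_case_insensitive: returns zero if the names match ignoring
   ASCII case, otherwise a negative or positive value.  */
static int
strcmp_ascii_case_insensitive (const char *p, const char *q)
{
  while (*p && *q
         && (ascii_tolower ((unsigned char) *p)
             == ascii_tolower ((unsigned char) *q)))
    p++, q++;
  return (ascii_tolower ((unsigned char) *p)
          - ascii_tolower ((unsigned char) *q));
}
```

With this sketch, comparing "ISO-8859-1" against "iso-8859-1" yields 0
whatever LC_ALL is set to, because no locale-dependent function is
called.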
So, I think the whole case-insensitive approach for the names of the
charsets and the translation tables should probably be removed. What
was the reason behind it? Was it that the user could type upper/lower
case charset names at the command line? After all, the official name is
'ISO', not 'iso'.
This patch works, but I am not confident that it's enough.
elena
Index: charset.c
===================================================================
RCS file: /cvs/uberbaum/gdb/charset.c,v
retrieving revision 1.3
diff -u -p -r1.3 charset.c
--- charset.c 14 Jan 2003 00:49:03 -0000 1.3
+++ charset.c 22 Apr 2003 19:56:53 -0000
@@ -160,10 +160,13 @@ struct translation {
static int
strcmp_case_insensitive (const char *p, const char *q)
{
- while (*p && *q && tolower (*p) == tolower (*q))
+#if 0
+ while (*p && *q && tolower (*p) == tolower (*q))
p++, q++;
return tolower (*p) - tolower (*q);
+#endif
+ return strcmp (p, q);
}
@@ -1207,24 +1210,24 @@ _initialize_charset (void)
register_charset (simple_charset ("ascii", 1,
ascii_print_literally, 0,
ascii_to_control, 0));
- register_charset (iso_8859_family_charset ("iso-8859-1"));
+ register_charset (iso_8859_family_charset ("ISO-8859-1"));
register_charset (ebcdic_family_charset ("ebcdic-us"));
register_charset (ebcdic_family_charset ("ibm1047"));
register_iconv_charsets ();
{
struct { char *from; char *to; int *table; } tlist[] = {
- { "ascii", "iso-8859-1", ascii_to_iso_8859_1_table },
+ { "ascii", "ISO-8859-1", ascii_to_iso_8859_1_table },
{ "ascii", "ebcdic-us", ascii_to_ebcdic_us_table },
{ "ascii", "ibm1047", ascii_to_ibm1047_table },
- { "iso-8859-1", "ascii", iso_8859_1_to_ascii_table },
- { "iso-8859-1", "ebcdic-us", iso_8859_1_to_ebcdic_us_table },
- { "iso-8859-1", "ibm1047", iso_8859_1_to_ibm1047_table },
+ { "ISO-8859-1", "ascii", iso_8859_1_to_ascii_table },
+ { "ISO-8859-1", "ebcdic-us", iso_8859_1_to_ebcdic_us_table },
+ { "ISO-8859-1", "ibm1047", iso_8859_1_to_ibm1047_table },
{ "ebcdic-us", "ascii", ebcdic_us_to_ascii_table },
- { "ebcdic-us", "iso-8859-1", ebcdic_us_to_iso_8859_1_table },
+ { "ebcdic-us", "ISO-8859-1", ebcdic_us_to_iso_8859_1_table },
{ "ebcdic-us", "ibm1047", ebcdic_us_to_ibm1047_table },
{ "ibm1047", "ascii", ibm1047_to_ascii_table },
- { "ibm1047", "iso-8859-1", ibm1047_to_iso_8859_1_table },
+ { "ibm1047", "ISO-8859-1", ibm1047_to_iso_8859_1_table },
{ "ibm1047", "ebcdic-us", ibm1047_to_ebcdic_us_table }
};