Date: Thu, 19 Sep 2002 17:36:00 -0000
From: Kevin Buettner
Message-Id: <1020920003625.ZM23109@localhost.localdomain>
In-Reply-To: Kevin Buettner "[PATCH RFC] Character set support" (Sep 12, 5:30pm)
References: <1020913003056.ZM15701@localhost.localdomain>
To: gdb-patches@sources.redhat.com
Subject: Re: [PATCH RFC] Character set support
Cc: tromey@redhat.com, Jim Blandy
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2002-09/txt/msg00507.txt.bz2

On Sep 12, 5:30pm, Kevin Buettner wrote:

> I'll wait a week for comments after which time, if there are no
> objections, I'll commit it.
>
> Kevin
>
> gdb/ChangeLog:
> 2002-MM-DD  Jim Blandy
>
>         Add support for distinct host and target character sets.
>         * charset.c, charset.h: New files.
>         * c-exp.y: #include "charset.h".
>         (yylex): Convert character and string literals to the target
>         character set, before returning them as the semantic value of the
>         token.
>         * c-lang.c: #include "charset.h".
>         (c_emit_char): Use charset-specific methods to recognize
>         characters with backslash escape forms, to decide which characters
>         to print literally and which to print using numeric escape
>         sequences, and to convert target characters to host characters
>         before printing.
>         * utils.c: #include "charset.h".
>         (no_control_char_error): New function.
>         (parse_escape): Use charset-specific methods to recognize
>         backslash escapes, parse `control character' notation, and convert
>         characters from the host character set to the target character set.
>         * configure.in: Set the default host character set.
>         Check where to find iconv, and what its argument types might be.
>         * acinclude.m4 (AM_ICONV): New macro, borrowed from GCC.
>         * Makefile.in (SFILES): List charset.c.
>         (COMMON_OBS): List charset.o.
>         (charset.o): New rule.
>         (charset_h): New header dependency variable.
>         (c-lang.o, utils.o, c-exp.tab.o): Note dependency on $(charset_h).
>         (LIBICONV): New variable, set by configure.
>         (CLIBS): Include $(LIBICONV) here.
>         * aclocal.m4, config.in, configure: Regenerated.
>
> gdb/testsuite/ChangeLog:
> 2002-MM-DD  Jim Blandy
>
>         * gdb.base/charset.exp, gdb.base/charset.c: New files.

I've committed the portions above.  I'll wait on the docs portion of
the patch until Eli has had a chance to look it over.

I ended up with a conflict when I updated c-lang.c.  I'm reposting
that portion of the patch since it is somewhat different from what I
originally posted.  The conflict was due to Tom Tromey's addition of
the embedded \0 disambiguation code in c_emit_char().

I'd appreciate it if both Tom and Jim would look over this patch to
see if it looks sensible.  I've run the testsuite and have done some
testing by hand and the results look good to me...
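
For anyone who wants to poke at the decision order outside of GDB, here is a minimal standalone sketch of what the reposted c_emit_char() does. It is not the committed code. Every sketch_* name is hypothetical, the helpers are simplified stand-ins for c_target_char_has_backslash_escape(), target_char_to_host() and host_char_print_literally(), and the sketch assumes that both host and target use ASCII. It writes into a buffer rather than a ui_file so the result is easy to inspect.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for c_target_char_has_backslash_escape():
   return the text that follows the backslash, or NULL if the target
   character has no escape form.  Assumes an ASCII target.  */
static const char *
sketch_backslash_escape (int c)
{
  switch (c)
    {
    case '\n': return "n";
    case '\b': return "b";
    case '\t': return "t";
    case '\f': return "f";
    case '\r': return "r";
    case '\013': return "v";
    case '\033': return "e";
    case '\007': return "a";
    case '\0': return "0";
    default: return NULL;
    }
}

/* Hypothetical stand-in for target_char_to_host(): an identity
   mapping, i.e. host and target are both assumed to be ASCII.
   Returns nonzero on success.  */
static int
sketch_target_char_to_host (int target_c, int *host_c)
{
  *host_c = target_c;
  return 1;
}

/* Hypothetical stand-in for host_char_print_literally(): printable
   ASCII only.  */
static int
sketch_print_literally (int host_c)
{
  return host_c >= 0x20 && host_c < 0x7f;
}

/* Same decision order as the patched c_emit_char, but writing into
   BUF (of SIZE bytes) instead of a ui_file.  Returns what snprintf
   returns.  */
int
sketch_emit_char (int c, int quoter, char *buf, size_t size)
{
  const char *escape;
  int host_char;

  c &= 0xFF;                    /* Avoid sign bit follies */

  escape = sketch_backslash_escape (c);
  if (escape)
    {
      /* A NUL inside a double-quoted string is written as \000 so
         that a following digit cannot be read as part of the escape.  */
      if (quoter == '"' && strcmp (escape, "0") == 0)
        return snprintf (buf, size, "\\000");
      return snprintf (buf, size, "\\%s", escape);
    }

  if (sketch_target_char_to_host (c, &host_char)
      && sketch_print_literally (host_char))
    {
      /* Backslash and the active quote character need a backslash.  */
      if (host_char == '\\' || host_char == quoter)
        return snprintf (buf, size, "\\%c", host_char);
      return snprintf (buf, size, "%c", host_char);
    }

  /* Everything else falls back to a three-digit octal escape.  */
  return snprintf (buf, size, "\\%.3o", (unsigned int) c);
}
```

Calling sketch_emit_char ('\0', '"', buf, sizeof buf) leaves \000 in buf, while the same call with a single-quote quoter leaves \0, which is the disambiguation Tom added. In a real build the three helpers would come from charset.h and could behave differently for non-ASCII character sets.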

Index: c-lang.c
===================================================================
RCS file: /cvs/src/src/gdb/c-lang.c,v
retrieving revision 1.14
diff -u -p -r1.14 c-lang.c
--- c-lang.c	17 Sep 2002 17:01:47 -0000	1.14
+++ c-lang.c	20 Sep 2002 00:22:40 -0000
@@ -29,6 +29,7 @@
 #include "valprint.h"
 #include "macroscope.h"
 #include "gdb_assert.h"
+#include "charset.h"
 
 extern void _initialize_c_language (void);
 static void c_emit_char (int c, struct ui_file * stream, int quoter);
@@ -40,55 +41,30 @@ static void c_emit_char (int c, struct u
 static void
 c_emit_char (register int c, struct ui_file *stream, int quoter)
 {
+  const char *escape;
+  int host_char;
+
   c &= 0xFF;                    /* Avoid sign bit follies */
 
-  if (PRINT_LITERAL_FORM (c))
+  escape = c_target_char_has_backslash_escape (c);
+  if (escape)
     {
-      if (c == '\\' || c == quoter)
-        {
-          fputs_filtered ("\\", stream);
-        }
-      fprintf_filtered (stream, "%c", c);
+      if (quoter == '"' && strcmp (escape, "0") == 0)
+        /* Print nulls embedded in double quoted strings as \000 to
+           prevent ambiguity.  */
+        fprintf_filtered (stream, "\\000");
+      else
+        fprintf_filtered (stream, "\\%s", escape);
     }
-  else
+  else if (target_char_to_host (c, &host_char)
+           && host_char_print_literally (host_char))
     {
-      switch (c)
-        {
-        case '\n':
-          fputs_filtered ("\\n", stream);
-          break;
-        case '\b':
-          fputs_filtered ("\\b", stream);
-          break;
-        case '\t':
-          fputs_filtered ("\\t", stream);
-          break;
-        case '\f':
-          fputs_filtered ("\\f", stream);
-          break;
-        case '\r':
-          fputs_filtered ("\\r", stream);
-          break;
-        case '\013':
-          fputs_filtered ("\\v", stream);
-          break;
-        case '\033':
-          fputs_filtered ("\\e", stream);
-          break;
-        case '\007':
-          fputs_filtered ("\\a", stream);
-          break;
-        case '\0':
-          if (quoter == '\'')
-            fputs_filtered ("\\0", stream);
-          else
-            fprintf_filtered (stream, "\\%.3o", (unsigned int) c);
-          break;
-        default:
-          fprintf_filtered (stream, "\\%.3o", (unsigned int) c);
-          break;
-        }
+      if (host_char == '\\' || host_char == quoter)
+        fputs_filtered ("\\", stream);
+      fprintf_filtered (stream, "%c", host_char);
     }
+  else
+    fprintf_filtered (stream, "\\%.3o", (unsigned int) c);
 }
 
 void