From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 118417 invoked by alias); 9 Oct 2015 13:31:45 -0000
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-owner@sourceware.org
Received: (qmail 118401 invoked by uid 89); 9 Oct 2015 13:31:44 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: Yes, score=5.5 required=5.0 tests=AWL,BAYES_95,BODY_8BITS,GARBLED_BODY,RCVD_IN_DNSWL_NONE,SPF_SOFTFAIL autolearn=no version=3.3.2
X-HELO: mtaout26.012.net.il
Received: from mtaout26.012.net.il (HELO mtaout26.012.net.il) (80.179.55.182)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 09 Oct 2015 13:31:43 +0000
Received: from conversion-daemon.mtaout26.012.net.il by mtaout26.012.net.il
 (HyperSendmail v2007.08) id <0NVY00600G6CWD00@mtaout26.012.net.il> for gdb@sourceware.org; Fri, 09 Oct 2015 16:34:40 +0300 (IDT)
Received: from HOME-C4E4A596F7 ([84.94.185.246]) by mtaout26.012.net.il
 (HyperSendmail v2007.08) with ESMTPA id <0NVY00588GDS8620@mtaout26.012.net.il>; Fri, 09 Oct 2015 16:34:40 +0300 (IDT)
Date: Fri, 09 Oct 2015 13:31:00 -0000
From: Eli Zaretskii
Subject: Re: GDB/MI reporting non-ASCII file names
In-reply-to: <5617A0FD.3020208@redhat.com>
To: Pedro Alves
Cc: gdb@sourceware.org
Reply-to: Eli Zaretskii
Message-id: <83si5ktbgf.fsf@gnu.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-transfer-encoding: 8BIT
References: <83a8s5d1nw.fsf@gnu.org> <560BCCF9.2040202@redhat.com> <83bnckasck.fsf@gnu.org> <560BF686.1030400@redhat.com> <83a8s3c3c3.fsf@gnu.org> <5617A0FD.3020208@redhat.com>
X-IsSubscribed: yes
X-SW-Source: 2015-10/txt/msg00036.txt.bz2

> Date: Fri, 09 Oct 2015 12:11:57 +0100
> From: Pedro Alves
> CC: gdb@sourceware.org
>
> On 09/30/2015 04:51 PM, Eli Zaretskii wrote:
>
> > If you compile a program from a source file whose name includes
> > non-ASCII characters, then debug that program with -i=mi, do you see
> > the file names correctly, after turning 7 bits off?
> >
>
> Looks like I see the same as you. With a file named "çêá.c":
>
> (gdb)
> set print sevenbit-strings on
> &"set print sevenbit-strings on\n"
> =cmd-param-changed,param="print sevenbit-strings",value="on"
> ^done
> (gdb) start
> ...
> *stopped,reason="breakpoint-hit",disp="del",bkptno="2",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd838"}],file="\303\247\303\252\303\241.c",fullname="/home/pedro/gdb/tests/\303\247\303\252\303\241.c",line="5"},thread-id="1",stopped-threads="all",core="2"
> (gdb)
>
> (gdb)
> set print sevenbit-strings off
> &"set print sevenbit-strings off\n"
> =cmd-param-changed,param="print sevenbit-strings",value="off"
> ^done
> (gdb) start
> ...
> *stopped,reason="breakpoint-hit",disp="del",bkptno="3",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd838"}],file="çêá.c",fullname="/home/pedro/gdb/tests/çêá.c",line="5"},thread-id="1",stopped-threads="all",core="2"
>
>
>
> But with a file named "γλώσσα.c" + "set print sevenbit-strings off":
>
> *stopped,reason="breakpoint-hit",disp="del",bkptno="1",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd808"}],file="γλ�\216�\203�\203α.c",fullname="/home/pedro/gdb/tests/γλ�\216�\203�\203α.c",line="5"},thread-id="1",stopped-threads="all",core="3"
> =breakpoint-deleted,id="1"
> (gdb)

I think the 0x7F..0xA0 range is a left-over from the Latin-N era, and
is a bad idea with the current UTF-8 default.  Would something like
the following be acceptable (if accompanied with the suitable changes
to NEWS and the manual)?

diff --git a/gdb/utils.c b/gdb/utils.c
index afeff12..56eb9d5 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -1509,12 +1509,11 @@ printchar (int c, void (*do_fputs) (const char *, struct ui_file *),
            void (*do_fprintf) (struct ui_file *, const char *, ...)
            ATTRIBUTE_FPTR_PRINTF_2, struct ui_file *stream, int quoter)
 {
-  c &= 0xFF;                   /* Avoid sign bit follies */
+  c &= 0xFF;                           /* Avoid sign bit follies */
 
-  if (c < 0x20 ||              /* Low control chars */
-      (c >= 0x7F && c < 0xA0) ||       /* DEL, High controls */
-      (sevenbit_strings && c >= 0x80))
-    {                          /* high order bit set */
+  if (c < 0x20 ||                      /* Low control chars */
+      (sevenbit_strings && c >= 0x80)) /* High order bit set */
+    {
       switch (c)
         {
         case '\n':