From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-43948-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 5555 invoked by alias); 5 May 2006 18:24:07 -0000
Received: (qmail 5546 invoked by uid 22791); 5 May 2006 18:24:06 -0000
X-Spam-Check-By: sourceware.org
Received: from nile.gnat.com (HELO nile.gnat.com) (205.232.38.5)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 05 May 2006 18:23:59 +0000
Received: from localhost (localhost [127.0.0.1]) 	by filtered-nile.gnat.com (Postfix) with ESMTP id 6610D48CBDB 	for <gdb-patches@sources.redhat.com>; Fri,  5 May 2006 14:23:53 -0400 (EDT)
Received: from nile.gnat.com ([127.0.0.1])  by localhost (nile.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP  id 27388-01-8 for <gdb-patches@sources.redhat.com>;  Fri,  5 May 2006 14:23:53 -0400 (EDT)
Received: from takamaka.act-europe.fr (s142-179-108-108.bc.hsia.telus.net [142.179.108.108]) 	by nile.gnat.com (Postfix) with ESMTP id CD35448CBDA 	for <gdb-patches@sources.redhat.com>; Fri,  5 May 2006 14:23:52 -0400 (EDT)
Received: by takamaka.act-europe.fr (Postfix, from userid 507) 	id 09D4647E7F; Fri,  5 May 2006 11:23:52 -0700 (PDT)
Date: Fri, 05 May 2006 18:24:00 -0000
From: Joel Brobecker <brobecker@adacore.com>
To: gdb-patches@sources.redhat.com
Subject: [RFC/RFA] Cleaner handling of character entities ?
Message-ID: <20060505182351.GK1109@adacore.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="BOKacYhQ+x31HxR3"
Content-Disposition: inline
User-Agent: Mutt/1.4i
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-05/txt/msg00077.txt.bz2


--BOKacYhQ+x31HxR3
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 2000

Hello,

We are currently working on transitioning from GCC 3.4 to GCC 4.1,
and we found an issue with character types. For a program like this:

        procedure P is
           A : Character := 'A';
        begin
           A := 'B'; --  START
        end;

The debugging info produced for the character is:

        .uleb128 0x2    # (DIE (0x62) DW_TAG_base_type)
        .long   .LASF0  # DW_AT_name: "__unknown__"
        .byte   0x1     # DW_AT_byte_size
        .byte   0x7     # DW_AT_encoding

The DW_AT_name used to be "character", and ada-lang.c was using
the name to identify character types.

I had a short discussion with the engineer at our company who is
responsible for GCC debug info production. He agreed that the name
is wrong and should be changed back.

However, he also suggested that the debugger should avoid relying
on the type name, and use the encoding if available. In the case
above, 0x7 is DW_ATE_unsigned but it should be DW_ATE_unsigned_char
(0x8), so he will change that.
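
To make the suggestion concrete, here is a minimal standalone sketch of
classifying a base type from its DW_AT_encoding value alone, with no
reference to the type name. It is not GDB code: the enc_type struct, the
ENC_CODE_* constants and classify_base_type are hypothetical names used
only for illustration. The DW_ATE_* values are the ones defined by the
DWARF standard.

```c
/* DW_ATE_* base type encodings, as numbered by the DWARF standard.  */
enum dwarf_encoding
{
  DW_ATE_signed = 0x5,
  DW_ATE_signed_char = 0x6,
  DW_ATE_unsigned = 0x7,
  DW_ATE_unsigned_char = 0x8
};

/* Hypothetical, simplified stand-ins for the debugger's type codes.  */
enum enc_type_code
{
  ENC_CODE_INT,
  ENC_CODE_CHAR
};

/* Hypothetical result of classifying one base type DIE.  */
struct enc_type
{
  enum enc_type_code code;
  int is_unsigned;
};

/* Classify a base type using only its encoding.  The two *_char
   encodings yield a character type code; everything else handled
   here stays an integer.  */
static struct enc_type
classify_base_type (unsigned int encoding)
{
  struct enc_type result = { ENC_CODE_INT, 0 };

  switch (encoding)
    {
    case DW_ATE_signed:
      break;
    case DW_ATE_signed_char:
      result.code = ENC_CODE_CHAR;
      break;
    case DW_ATE_unsigned:
      result.is_unsigned = 1;
      break;
    case DW_ATE_unsigned_char:
      result.code = ENC_CODE_CHAR;
      result.is_unsigned = 1;
      break;
    default:
      /* Other encodings (float, boolean, ...) are out of scope
         for this sketch.  */
      break;
    }
  return result;
}
```

With this mapping, the DIE shown above (encoding 0x7) is classified as a
plain unsigned integer, and the same DIE with encoding 0x8 is classified
as an unsigned character.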

I looked at the GDB side and came up with a few changes here and
there that implement his suggestion. I discovered that we rely
quite a bit on the type name to identify characters. My guess is
that this is historical, a consequence of shortcomings in older
debugging formats (stabs, for instance).

I think the attached patch improves the situation by making things
cleaner in the case of dwarf2, without impacting targets that still
use older debugging formats such as stabs.
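
The idea of trusting the type code first and keeping the name check as
a fallback for older formats can be sketched as follows. This is a
simplified standalone illustration, not the actual ada-lang.c code: the
chk_type struct, the CHK_CODE_* constants and chk_is_character_type are
hypothetical, and only the two names visible in the patch context are
compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the debugger's type codes.  */
enum chk_type_code
{
  CHK_CODE_INT,
  CHK_CODE_RANGE,
  CHK_CODE_CHAR,
  CHK_CODE_FLT
};

/* Hypothetical minimal type description: a code and an optional name.  */
struct chk_type
{
  enum chk_type_code code;
  const char *name;   /* May be NULL when the debug info has no name.  */
};

/* Return nonzero if TYPE looks like a character type.  */
static int
chk_is_character_type (const struct chk_type *type)
{
  /* If the type code says it is a character, believe it and do not
     look at the name at all.  */
  if (type->code == CHK_CODE_CHAR)
    return 1;

  /* Fallback for debugging formats that cannot mark characters:
     accept integer or range types that carry a known name.  */
  return type->name != NULL
    && (type->code == CHK_CODE_INT || type->code == CHK_CODE_RANGE)
    && (strcmp (type->name, "character") == 0
        || strcmp (type->name, "wide_character") == 0);
}
```

Under this scheme a type whose code is already a character code is
accepted even when it is named "__unknown__", while an integer type
named "character" is still recognized through the name fallback.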

2006-05-05  Joel Brobecker  <brobecker@adacore.com>

        * dwarf2read.c (read_base_type): Set code to TYPE_CODE_CHAR
        for char and unsigned char types.
        * ada-lang.c (ada_is_character_type): Always return true if
        the type code is TYPE_CODE_CHAR.
        * c-valprint.c (c_val_print): Print arrays whose element type
        code is TYPE_CODE_CHAR as strings.

Tested on x86-linux, with GCC 3.4 (dwarf2, stabs+), GCC 4.1 (dwarf2).
No regression.

What do you guys think? Wouldn't that be a step forward?

Thanks,
-- 
Joel

--BOKacYhQ+x31HxR3
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="char.diff"
Content-length: 2725

Index: dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.194
diff -u -p -r1.194 dwarf2read.c
--- dwarf2read.c	21 Apr 2006 20:26:07 -0000	1.194
+++ dwarf2read.c	5 May 2006 17:31:53 -0000
@@ -4728,10 +4728,15 @@ read_base_type (struct die_info *die, st
 	  code = TYPE_CODE_FLT;
 	  break;
 	case DW_ATE_signed:
+	  break;
 	case DW_ATE_signed_char:
+	  code = TYPE_CODE_CHAR;
 	  break;
 	case DW_ATE_unsigned:
+	  type_flags |= TYPE_FLAG_UNSIGNED;
+	  break;
 	case DW_ATE_unsigned_char:
+	  code = TYPE_CODE_CHAR;
 	  type_flags |= TYPE_FLAG_UNSIGNED;
 	  break;
 	default:
Index: ada-lang.c
===================================================================
RCS file: /cvs/src/src/gdb/ada-lang.c,v
retrieving revision 1.84
diff -u -p -r1.84 ada-lang.c
--- ada-lang.c	12 Jan 2006 08:36:29 -0000	1.84
+++ ada-lang.c	5 May 2006 17:30:55 -0000
@@ -7145,10 +7145,15 @@ int
 ada_is_character_type (struct type *type)
 {
   const char *name = ada_type_name (type);
+
+  /* If the type code says it's a character, then assume it really is,
+     and don't check any further.  */
+  if (TYPE_CODE (type) == TYPE_CODE_CHAR)
+    return 1;
+
   return
     name != NULL
-    && (TYPE_CODE (type) == TYPE_CODE_CHAR
-        || TYPE_CODE (type) == TYPE_CODE_INT
+    && (TYPE_CODE (type) == TYPE_CODE_INT
         || TYPE_CODE (type) == TYPE_CODE_RANGE)
     && (strcmp (name, "character") == 0
         || strcmp (name, "wide_character") == 0
Index: c-valprint.c
===================================================================
RCS file: /cvs/src/src/gdb/c-valprint.c,v
retrieving revision 1.39
diff -u -p -r1.39 c-valprint.c
--- c-valprint.c	18 Jan 2006 21:24:19 -0000	1.39
+++ c-valprint.c	5 May 2006 17:31:16 -0000
@@ -96,9 +96,8 @@ c_val_print (struct type *type, const gd
 	    }
 	  /* For an array of chars, print with string syntax.  */
 	  if (eltlen == 1 &&
-	      ((TYPE_CODE (elttype) == TYPE_CODE_INT)
-	       || ((current_language->la_language == language_m2)
-		   && (TYPE_CODE (elttype) == TYPE_CODE_CHAR)))
+	      (TYPE_CODE (elttype) == TYPE_CODE_INT
+	       || TYPE_CODE (elttype) == TYPE_CODE_CHAR)
 	      && (format == 0 || format == 's'))
 	    {
 	      /* If requested, look for the first null char and only print
@@ -192,7 +191,8 @@ c_val_print (struct type *type, const gd
 	  /* FIXME: need to handle wchar_t here... */
 
 	  if (TYPE_LENGTH (elttype) == 1
-	      && TYPE_CODE (elttype) == TYPE_CODE_INT
+	      && (TYPE_CODE (elttype) == TYPE_CODE_INT
+		  || TYPE_CODE (elttype) == TYPE_CODE_CHAR)
 	      && (format == 0 || format == 's')
 	      && addr != 0)
 	    {

--BOKacYhQ+x31HxR3--