Mirror of the gdb mailing list
From: Daniel Jacobowitz <drow@false.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: dewar@adacore.com, nickrob@snap.net.nz,
	jan.kratochvil@redhat.com, 	Mathieu.Lacage@sophia.inria.fr,
	gdb@sourceware.org
Subject: Re: [RFC] Signed/unsigned character arrays are not strings
Date: Tue, 27 Feb 2007 22:12:00 -0000
Message-ID: <20070227215316.GA26262@caradoc.them.org>
In-Reply-To: <ulkij2tva.fsf@gnu.org>

On Tue, Feb 27, 2007 at 11:06:17PM +0200, Eli Zaretskii wrote:
> Doesn't a similar situation exist with "unsigned int" and "int", or
> with "unsigned long" and "long"?  And yet we don't treat them
> differently.
> 
> IOW, I think it's quite expected that explicit signedness is
> relatively rare, since in the vast majority of cases it is simply not
> needed.  Interpreting this phenomenon as saying something about what
> kind of data is stored is not necessarily a good idea.

I feel that this is different for two reasons.  One is that the
situation for int and long is not the same, because "int" and "signed
int" are the same type in C - but "char" and "signed char" are not.
Char is explicitly of indeterminate sign.  The other is that there is
a widespread use of "char" for string data and "signed char" or
"unsigned char" for non-string data.

Of course, the first reason is a matter of standards and the second is
only a matter of my feeling and fumbling around with search engines.
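The standards point can be checked at compile time. What follows is a minimal sketch, assuming a C11 compiler; the macro names TYPE_NAME and IS_TYPE are made up for illustration and are not part of GDB or of this proposal. It relies on _Generic, which selects a branch by exact type, so it only compiles because plain, signed and unsigned char are three distinct types.

```c
#include <assert.h>  /* static_assert (C11) */

/* Illustrative macro: name the type of an expression.  Listing char,
   signed char and unsigned char side by side is only legal because they
   are three distinct types; adding "signed int" next to "int" would be
   rejected as a duplicate association. */
#define TYPE_NAME(x) _Generic((x),        \
    char:          "char",                \
    signed char:   "signed char",         \
    unsigned char: "unsigned char",       \
    int:           "int",                 \
    default:       "other")

/* Illustrative macro: 1 if expr has exactly type T, else 0. */
#define IS_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

/* "int" and "signed int" name the same type. */
static_assert(IS_TYPE((signed int) 0, int),
              "int and signed int are one type");

/* Plain char matches neither explicitly signed nor unsigned char,
   whichever signedness the implementation picks for it. */
static_assert(!IS_TYPE((signed char) 0, char),
              "signed char is distinct from char");
static_assert(!IS_TYPE((unsigned char) 0, char),
              "unsigned char is distinct from char");
```

With these definitions, TYPE_NAME applied to a "signed int" value yields "int", while a "signed char" value yields "signed char" rather than "char".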

> > I know that as a GDB developer, debugging GDB, I'd want explicitly
> > signed or unsigned characters to be printed as data
> 
> That is indeed one reason to use unsigned char.  But there is another,
> as demonstrated by Emacs's Lisp_String type: to store non-ASCII
> characters whose upper bit might be set.  And in those latter cases,
> we do want the data displayed as text, not as numeric codes.

Yes, there are good counterexamples.  Though I believe Emacs also
stores some numeric non-character data in its strings (isn't there a
length or kind byte?).  Plus, this gets dangerously close to support
for explicitly printing strings of different character sets and
encodings - UTF-8 support is requested once a year or so.

Anyway, that's a project for another week :-)

> > This is user interface, not core
> > functionality.  It's more like clarifying the text of one of GCC's
> > warning messages than changing the dialect of C it accepts.  I think
> > we have a lot of freedom to adapt our default output to be more useful
> > to our users, especially when we provide a way to get the old
> > behavior.
> 
> The issue is precisely that it is controversial whether the proposed
> output is necessarily more useful to the user.  It is clearly more
> useful in some cases, but not in the others.

Yes, this is the part I think is really important.  I've provided what
information I can to support the fact that the new behavior is more
useful:

  - it's right for debugging GDB itself
  - it's right for processor vector registers supported by GDB
  - it seems to be right more often than not, given my abuse of
    CodeSearch.

But this is fuzzy.  I don't know how to find out more.  We have no way
to poll users.  I tried polling my coworkers, and got only agreement
that strings are usually stored in char * and not in unsigned char *.

There are some concrete reasons to do that, too.  GCC in some
configurations warns about passing an unsigned char * to a function
expecting a char * - functions like strlen.  As you know, some
applications have disabled this warning because they disagree or
because they agree but it would be too much labor to clean up; but I
know of plenty of projects that are fine with the new warning.
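As a rough illustration of that warning, here is a small sketch. The helper name ubuf_length is hypothetical, and I am assuming the diagnostic in question is the one GCC controls with -Wpointer-sign, which the GCC manual documents for C and Objective-C. Passing the unsigned char pointer straight to strlen is what draws the warning; the cast states the intent explicitly.

```c
#include <string.h>

/* Hypothetical helper: length of a NUL-terminated buffer that happens
   to be declared as unsigned char. */
size_t ubuf_length(const unsigned char *buf)
{
    /* Writing strlen(buf) here would pass an unsigned char pointer where
       a char pointer is expected, which is the mismatch the warning is
       about.  The explicit cast avoids it without changing the bytes
       that strlen examines. */
    return strlen((const char *) buf);
}
```

Code that keeps its strings in plain char * never needs the cast, which is one practical pull toward char for text.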

> > In this case that method is even completely backwards compatible.
> 
> ??? Now _I_ would like to ask for explanations.  Do you mean the cast
> to "char *" trick? if so, that's not backward compatibility, because
> existing scripts in .gdbinit files need to be modified to get back
> past behavior.

Other way round: using (char *) would result in a backwards
compatible .gdbinit, because it would work with both old and new
versions of GDB.

Anyway, if we end up leaving the change in I will try to clarify the
manual.  There is almost nothing in it now about printing strings
using "print".

-- 
Daniel Jacobowitz
CodeSourcery
