From: Tom Tromey <tromey@redhat.com>
To: "Alexey Feldgendler" <alexeyf@opera.com>
Cc: gdb-patches@sourceware.org
Subject: Re: Default target wide character set
Date: Wed, 16 Sep 2009 18:56:00 -0000
Message-ID: <m3ljkepvlz.fsf@fleche.redhat.com>
In-Reply-To: <op.u0axp6t456e9f9@xman.oslo.opera.com> (Alexey Feldgendler's message of "Tue, 15 Sep 2009 16:11:56 +0200")
>>>>> "Alexey" == Alexey Feldgendler <alexeyf@opera.com> writes:
Alexey> I got assigned part-time to contribute to gdb, mostly by fixing
Alexey> bugs that affect us, but also to implement new features.
Welcome to GDB.
I don't know your copyright assignment situation, but if you are
planning to submit patches, it doesn't hurt to get started on that
early. Send me email off-list if you want to do this.
Alexey> A. Have the default target wide character set depend on the size of
Alexey> the type named wchar_t.
Alexey> Side question: how does gdb figure out sizeof(wchar_t)? Does it come
Alexey> from the symbol table or from elsewhere?
Yeah, look in c-lang.c for a call to lookup_typename with an argument of
"wchar_t". The resulting type can be queried for its attributes.
Alexey> B. Have charset_for_string_type() check after calling
Alexey> target_wide_charset() whether the width of the returned character set
Alexey> matches the width of the actual string type, and use fallback similar
Alexey> to what's done for C_STRING_16 and C_STRING_32 if it doesn't.
Alexey> What do you think of options A and B? Or is there maybe another
Alexey> possibility that I'm overlooking?
Yeah, I think there is another solution. It is pretty similar to these,
though.
The general problem is that the relevant standards put very few
constraints on wchar_t. There is no guarantee that it holds any form of
UCS, and there are systems where it in fact does not.
Therefore, if the user picks some arbitrary target wide charset, I think
we ought to honor that request.
Another wrinkle is that there is no good way to determine the
characteristics of a character set. This simply isn't part of any
standardized API (we could of course implement our own database for
this... but I was not motivated to do so). What this means is that we
can also do very little error checking in practice -- if the target uses
UCS-4, but the user says "set target-wide-charset SJIS", well, they will
get nonsense in response, with no warning from GDB.
What I would propose doing is adding a new charset named "UCS". If this
is selected as the target wide charset, then we would automatically pick
UCS-2 or UCS-4 depending on sizeof(target wchar_t). This would probably
mean having a few special cases in the code (like we do for the -BE and
-LE variants). We would then make this the default target wide charset.
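As a rough sketch of the idea (this is not existing GDB code; the
function name, its signature, and the choice to return NULL for an
unexpected wchar_t size are all assumptions made for illustration), the
"UCS" special case could look something like this:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an existing GDB function.  REQUESTED is the
   user-visible target wide charset name; WCHAR_SIZE is the size in
   bytes of the target's wchar_t.  Any name other than "UCS" is
   returned unchanged, so an explicit user choice is always honored.  */
static const char *
resolve_target_wide_charset (const char *requested, size_t wchar_size)
{
  if (strcmp (requested, "UCS") != 0)
    return requested;

  switch (wchar_size)
    {
    case 2:
      return "UCS-2";
    case 4:
      return "UCS-4";
    default:
      /* Assumption: no sensible UCS variant for this width, so let
         the caller decide how to fall back.  */
      return NULL;
    }
}
```

With this, a 4-byte wchar_t and a setting of "UCS" would yield "UCS-4",
while an explicit "SJIS" would pass through untouched. The -BE and -LE
handling is left out of the sketch.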
What do you think of that?
Tom