From: Tom Tromey
Reply-To: tromey@redhat.com
To: Thiago Jung Bauermann
Cc: gdb ml
Subject: Re: [RFC] string handling in python
Date: Mon, 07 Jul 2008 23:31:00 -0000
References: <1215408302.1795.38.camel@localhost.localdomain>
In-Reply-To: <1215408302.1795.38.camel@localhost.localdomain> (Thiago Jung Bauermann's message of "Mon\, 07 Jul 2008 02\:25\:02 -0300")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Sender: gdb-owner@sourceware.org
X-SW-Source: 2008-07/txt/msg00047.txt.bz2

>>>>> "Thiago" == Thiago Jung Bauermann writes:

Thiago> So, in my opinion for GDB's Python bindings we should always
Thiago> use Unicode strings, and convert to/from desired encodings as
Thiago> necessary.  Strings provided by the user would be assumed to
Thiago> have host_charset () encoding, and strings coming from/going
Thiago> to the inferior would be assumed to have target_charset ()
Thiago> encoding.

Sounds reasonable to me.  I thought we already did some of this...
search for host_charset in the python directory.

Thiago> So for example, to create a value object of char * type using
Thiago> a string provided by the user or coming from Python code, GDB
Thiago> would first convert the Python string object (assumed to be in
Thiago> the host charset) to a unicode object (this process is called
Thiago> "decoding", in python parlance), and then convert it from
Thiago> unicode to a string in the target charset.

This sounds like a good candidate for convenience functions, one for
each direction.

Tom
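
For illustration, here is a minimal sketch of what such a pair of
convenience functions might look like on the Python side.  The names
to_target and from_target are hypothetical, not existing GDB API, and
the charset names are passed in as plain arguments rather than taken
from host_charset () and target_charset ().

```python
# Hypothetical helpers; neither name exists in GDB's Python bindings.
# Charsets are passed explicitly here; in GDB they would presumably
# come from host_charset () and target_charset ().

def to_target(user_string, host_charset, target_charset):
    """Convert a user-supplied string to bytes in the target charset.

    Byte strings are assumed to be in the host charset and are first
    decoded to Unicode; Unicode strings are encoded directly.
    """
    if isinstance(user_string, bytes):
        user_string = user_string.decode(host_charset)
    return user_string.encode(target_charset)


def from_target(raw_bytes, target_charset):
    """Convert bytes read from the inferior into a Unicode string."""
    return raw_bytes.decode(target_charset)
```

With a UTF-8 host and a Latin-1 target, to_target(b"caf\xc3\xa9",
"utf-8", "latin-1") gives b"caf\xe9", and from_target turns those
bytes back into the Unicode string.  Both raise UnicodeError on
characters the charset cannot represent; how GDB should report that
is left open here.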