From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 91875 invoked by alias); 19 Apr 2019 22:22:09 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 91866 invoked by uid 89); 19 Apr 2019 22:22:09 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-14.4 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=Break, 004, Breakpoint, skills X-HELO: mail-wm1-f68.google.com Received: from mail-wm1-f68.google.com (HELO mail-wm1-f68.google.com) (209.85.128.68) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 19 Apr 2019 22:22:07 +0000 Received: by mail-wm1-f68.google.com with SMTP id z6so9734100wmi.0 for ; Fri, 19 Apr 2019 15:22:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=OeEvanjKGUXNkwmoSOlyUptRPrPp5nsYUWO62bgsAP0=; b=FjvOc6UT1sleIkU/u+nChq0CuA05fYgnzH5TBFLUn2SQCJVeSIrEM+jsQG/uDElwDn ZRhq80LxvlQe9iXr9FPiHMzMR2Y4TjKYmIFMf6QPPLTv9xJ/ufs+Ju3BYIoOcIvhZDQv WePeLXpGdjreR2M3qIbVLFHGTfWr8Q8GwVHtLcTiSyfgbTl4yJ6wAGXV71Z8pUYmfJQl pL+O+cUT8GzJuGZM7rUMXHHaF05FyS/L5eUztOzw6do1Xs8gbA+Ci8uLMP56T8bV9qBj ej1/RQ1RrVkwOhQo7jv08ppwwd1lxQAv7Vs74WgJSWMVYZ94xXIKU+Qb9pbqWZeNyzxv vCMg== Return-Path: Received: from localhost (host109-148-134-137.range109-148.btcentralplus.com. [109.148.134.137]) by smtp.gmail.com with ESMTPSA id a11sm4369350wrx.5.2019.04.19.15.22.04 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 19 Apr 2019 15:22:04 -0700 (PDT) Date: Fri, 19 Apr 2019 22:22:00 -0000 From: Andrew Burgess To: Tom Tromey Cc: gdb-patches@sourceware.org Subject: Re: [PATCHv2 4/5] gdb: Introduce new language field la_is_string_type_p Message-ID: <20190419222203.GW2737@embecosm.com> References: <878sw6dpg0.fsf@tromey.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <878sw6dpg0.fsf@tromey.com> X-Fortune: Sauron is alive in Argentina! X-Editor: GNU Emacs [ http://www.gnu.org/software/emacs ] User-Agent: Mutt/1.9.2 (2017-12-15) X-IsSubscribed: yes X-SW-Source: 2019-04/txt/msg00392.txt.bz2 * Tom Tromey [2019-04-19 08:45:03 -0600]: > >>>>> "Andrew" == Andrew Burgess writes: > > Andrew> Some languages already have a "is this a string" predicate that I was > Andrew> able to reuse, while for other languages I've had to add a new > Andrew> predicate. In this case I took inspiration from the value printing > Andrew> code for that language - what different conditions would result in > Andrew> printing something as a string. > > This looks essentially fine, but I had some questions. > > Andrew> +bool > Andrew> +c_is_string_type_p (struct type *type) > Andrew> +{ > Andrew> + type = check_typedef (type); > Andrew> + while (TYPE_CODE (type) == TYPE_CODE_REF) > Andrew> + { > Andrew> + type = TYPE_TARGET_TYPE (type); > Andrew> + type = check_typedef (type); > Andrew> + } > Andrew> + > Andrew> + switch (TYPE_CODE (type)) > Andrew> + { > Andrew> + case TYPE_CODE_ARRAY: > Andrew> + { > Andrew> + /* See if target type looks like a string. */ > Andrew> + struct type *array_target_type = TYPE_TARGET_TYPE (type); > Andrew> + return (TYPE_LENGTH (type) > 0 > Andrew> + && TYPE_LENGTH (array_target_type) > 0 > Andrew> + && c_textual_element_type (array_target_type, 0)); > Andrew> + } > Andrew> + case TYPE_CODE_STRING: > Andrew> + return true; > > It seems to me that a "char *" should be considered a string in C; > and probably a "wchar_t *" as well. Maybe see c-lang.c:classify_type. Thanks, I'll take a look at these cases. > > Andrew> +/* Return true if TYPE is a string type. */ > Andrew> +static bool > Andrew> +rust_is_string_type_p (struct type *type) > Andrew> +{ > Andrew> + LONGEST low_bound, high_bound; > Andrew> + > Andrew> + type = check_typedef (type); > Andrew> + return ((TYPE_CODE (type) == TYPE_CODE_STRING) > Andrew> + || (TYPE_CODE (type) == TYPE_CODE_PTR > Andrew> + && (TYPE_CODE (TYPE_TARGET_TYPE (type)) == TYPE_CODE_ARRAY > Andrew> + && rust_u8_type_p (TYPE_TARGET_TYPE (TYPE_TARGET_TYPE (type))) > Andrew> + && get_array_bounds (TYPE_TARGET_TYPE (type), &low_bound, > Andrew> + &high_bound))) > Andrew> + || ((TYPE_CODE (type) == TYPE_CODE_UNION > Andrew> + || (TYPE_CODE (type) == TYPE_CODE_STRUCT > Andrew> + && !rust_enum_p (type))) > Andrew> + && rust_slice_type_p (type) > Andrew> + && strcmp (TYPE_NAME (type), "&str") == 0)); > > I didn't understand the reason for TYPE_CODE_UNION here. Most of the _is_string functions were created by looking through the language's value printing code looking for places where the language identifies something as a string, then taking all of the conditions that led to that point and placing them in the _is_string function. For rust, rust_val_print_str is called from val_print_struct, which is called for TYPE_CODE_STRUCT and TYPE_CODE_UNION, hence the above... ...but, after prompting, I took a closer look, and rust_slice_type_p is only true for TYPE_CODE_STRUCT, which makes the union check redundant - so its gone! > > Also, I think an array or slice of 'char' should probably be considered > a string in Rust. See rust_chartype_p. That sounds sensible, but .... I don't believe these things that you describe are currently printed as strings. Here's what I tried (excuse my poor rust skills, maybe I'm not trying what you're suggesting): $ cat str.rs #![allow(dead_code)] #![allow(unused_variables)] #![allow(unused_assignments)] fn main () { let str1: &str = "str1"; let str2: String = "str2".to_string(); let str3: [char; 4] = [ 's', 't', 'r', '3' ]; let str4: &[char] = &str3; println!("hello world"); // Break here. } $ rustc -g -o str str.rs $ gdb str <... snip ...> (gdb) b 11 Breakpoint 1 at 0x3d3b: file str.rs, line 11. (gdb) r <... snip ...> Breakpoint 1, str::main () at str.rs:11 11 println!("hello world"); // Break here. (gdb) p str1 $1 = "str1" (gdb) p str2 $2 = alloc::string::String {vec: alloc::vec::Vec {buf: alloc::raw_vec::RawVec {ptr: core::ptr::Unique {pointer: core::nonzero::NonZero<*const u8> (0x555555784a40 "str2\000"), _marker: core::marker::PhantomData}, cap: 4, a: alloc::alloc::Global}, len: 4}} (gdb) p str3 $3 = [115 's', 116 't', 114 'r', 51 '3'] (gdb) p str4 $4 = &[char] {data_ptr: 0x7fffffffcd90 "s\000\000\000t\000\000\000r\000\000\000\063\000\000\000\xffffcd90\x7fff\004\000\000\000", length: 4} So I think that currently only the '&str' (String slice?) is printed as a string. Like I said, my rust-foo is weak, so if you have an example of a different type that is currently printed as a string then I'm happy to fold this info a rust test and make any fixes for rust's _is_string function as needed. You'll have noticed (maybe?) that the original patch didn't include a rust test at all. This was because some of rusts value printing seems a little broken right now. For example, printing an array slice (&str) variable works fine, but place this inside a struct and it no longer works, for example: $ cat nested.rs #![allow(dead_code)] #![allow(unused_variables)] #![allow(unused_assignments)] pub struct Bad { pub field1: i32, pub field2: &'static str, } fn main () { let s = Bad { field1: 1, field2: "hello" }; println!("hello world"); // Break here. } $ rustc -g -o nested nested.rs $ gdb nested <... snip ...> (gdb) b 12 Breakpoint 1 at 0x3d00: file nested.rs, line 12. (gdb) r <... snip ...> Breakpoint 1, nested::main () at nested.rs:12 12 println!("hello world"); // Break here. (gdb) p s $1 = nested::Bad {field1: 1, field2: } (gdb) However, I fixed that with this patch: diff --git a/gdb/rust-lang.c b/gdb/rust-lang.c index 2fada465d65..597986a4b34 100644 --- a/gdb/rust-lang.c +++ b/gdb/rust-lang.c @@ -378,6 +378,8 @@ val_print_struct (struct type *type, int embedded_offset, if (rust_slice_type_p (type) && strcmp (TYPE_NAME (type), "&str") == 0) { + if (embedded_offset != 0) + val = value_at_lazy (type, value_address (val) + embedded_offset); rust_val_print_str (stream, val, options); return; } So, I think the summary is, I'm happy to fix rust_is_string_p to cover any cases that currently print as a string, but I think that if something _doesn't_ currently print as a string then rust_is_string_p shouldn't identify it as a string - if it did then we'd end up printing a structure at a depth when it should have been replaced with ellipsis. Thanks, Andrew