From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-158648-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 40239 invoked by alias); 7 Sep 2019 20:30:54 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 40229 invoked by uid 89); 7 Sep 2019 20:30:53 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-6.6 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.1 spammy=pm, sk:DW_AT_s, sk:dw_at_s, impact
X-HELO: simark.ca
Received: from simark.ca (HELO simark.ca) (158.69.221.121) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 07 Sep 2019 20:30:51 +0000
Received: from [10.0.0.11] (unknown [192.222.164.54])	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)	(No client certificate requested)	by simark.ca (Postfix) with ESMTPSA id AC1AF1ED17;	Sat,  7 Sep 2019 16:30:49 -0400 (EDT)
Subject: Re: [PATCH v2 1/4] DWARF 5 support: Handle dwo_id
To: Ali Tamur <tamur@google.com>, gdb-patches@sourceware.org
References: <d8a1fa51-d7bc-6f54-729d-acaeaa9cf079@simark.ca> <20190906215249.161246-1-tamur@google.com>
From: Simon Marchi <simark@simark.ca>
Message-ID: <4caf107e-8766-8a6d-df57-b824c287fa70@simark.ca>
Date: Sat, 07 Sep 2019 20:30:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <20190906215249.161246-1-tamur@google.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-SW-Source: 2019-09/txt/msg00115.txt.bz2

Hi Ali,

Thanks for the new version.

On 2019-09-06 5:52 p.m., Ali Tamur via gdb-patches wrote:
>> Could you also precise (here and in the commit message) how you have tested this? Â Which
>> compiler, which version, which test case?
> I compiled 1) synced master, and 2) that plus this patch. I used gcc as the Â 
> compiler. I ran the gdb tests with "make check" and the same set of tests Â  Â 
> failed in each case. Until it comes to a point where gdb actually starts to Â 
> handle 'hello world' compiled for DWARF 5, I don't know if I can do better.

Running the testsuite before and after the patch and comparing the results (as you did)
is indeed the recommended way, as the testsuite is not passing at 100% (especially given
the number of different possible configurations.

When you say you ran "make check", I understand it was without any special arguments, so
with DWARF 4 (gcc's default today) and non-split-DWARF?

Since you're touching the split-DWARF code, it might be a good idea to run the whole
testsuite with -gsplit-dwarf, with DWARF 4 (to verify there are no regressions) and
DWARF 5 (to check how things improve).

To illustrate, I ran this without your patch applied:

  $ make check TESTS="gdb.base/break.exp" RUNTESTFLAGS="CFLAGS_FOR_TARGET='-gdwarf-4 -gsplit-dwarf'"
  # of expected passes		105

  $ make check TESTS="gdb.base/break.exp" RUNTESTFLAGS="CFLAGS_FOR_TARGET='-gdwarf-5 -gsplit-dwarf'"
  # of expected passes		15
  # of unexpected failures	44

And this with your patch applied:

  $ make check TESTS="gdb.base/break.exp" RUNTESTFLAGS="CFLAGS_FOR_TARGET='-gdwarf-4 -gsplit-dwarf'"
  # of expected passes		105

  (good, no regressions with DWARF 4)

  $ make check TESTS="gdb.base/break.exp" RUNTESTFLAGS="CFLAGS_FOR_TARGET='-gdwarf-5 -gsplit-dwarf'"
  # of expected passes		60
  # of unexpected failures	45

  (not perfect, but the test went further before giving up, good news!)

This is just one test though.  Running the whole gdb.base directory (with TESTS="gdb.base/*.exp")
for each patch you add would give good coverage without being too time-consuming.  And maybe one
full run (just omit the TESTS variable) at the end to compare before/after your patch series.

And since DWARF 5 is relatively new stuff, results can vary greatly if using different versions of the
same compiler to run the tests.  So if you could mention in the commit message which gcc version the
tests were ran against, I think it would be useful.

>>> + Â  Â  DW_UT_skeleton, DW_UT_split_compile also contain a signature. Â */
>>> Â  Â ULONGEST signature;
>>
>> Does DW_UT_partial contain a signature? Â In section 7.5.1.1 of the DWARF5 spec, it doesn't seem so.
> My bad, corrected the comment. (Code needed no change).
> 
>> I think the code will be confusing if we keep the field named simply "signature", when it could
>> actually mean "dwo_id". Â Somebody reading the code with the DWARF 5 spec on the side will have some
>> trouble making the correlation between both. Â Maybe we should start using a union to describe that
>> different fields are used with different kinds of units?
> Please see the definition of struct dwo_unit, 'signature' is already used Â  Â 
> there. Renamed the field as "signature_or_dwo_id". If that is ok with you, Â  Â 
> I'd rather not introduce a union; the code is already complicated enough and Â 
> I think it's not a good idea to mix such refactoring with behaviour changes Â 
> in the same patch.

I would say that the same change should be done in dwo_unit then.  But I totally
agree that we should not mix up cleanups with adding new features.  So here's
what I suggest: revert the field in comp_unit_head to be just `signature` for
now, it will avoid having too many unrelated changes in this patch (updated all
the users of that field).  We'll make a following cleanup patch to either rename
the field or introduce a union.

Sorry you went through the trouble of doing this rename only for me to suggest
you revert it, but as you say I think it's better to to mix refactoring with
behavior changes.

>> Here too, should we do something based on the CU's DWARF version? Â I presume we don't want
>> to accept a DW_AT_GNU_dwo_name attribute in a DWARF 5 CU.
> Â clang (and maybe others) still uses DW_AT_GNU_dwo_name in DWARF 5.

Ok.  Given that DWARF 5 introduces the feature to supersede DW_AT_GNU_dwo_name,
I would have thought compilers would not emit it when using DWARF 5.  But indeed,
clang does emit it (using today's master):

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000024 version = 0x0005 unit_type = e abbr_offset = 0x0000 addr_size = 0x08 DWO_id = 0xe23aa1245e312b00 (next unit at 0x00000028)

0x00000014: DW_TAG_compile_unit
              DW_AT_stmt_list	(0x00000000)
              DW_AT_str_offsets_base	(0x00000008)
              DW_AT_comp_dir	("/home/simark/build/binutils-gdb/gdb")
              DW_AT_GNU_pubnames	(true)
              DW_AT_GNU_dwo_name	("test.dwo")
              DW_AT_addr_base	(0x00000008)
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x0000000000000035)

Not that clang still uses DW_TAG_compile_unit (which is the DWARF 4 way of doing it,
I presume), unlike gcc 9.1.0 who uses DW_TAG_skeleton_unit + DW_AT_dwo_name:

0x00000014: DW_TAG_skeleton_unit
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x000000000000002a)
              DW_AT_stmt_list	(0x00000000)
              DW_AT_dwo_name	("test.dwo")
              DW_AT_comp_dir	("/home/simark/build/binutils-gdb/gdb")
              DW_AT_GNU_pubnames	(true)
              DW_AT_addr_base	(0x00000008)

When clang switches to use DW_TAG_skeleton_unit, I think they'll have to switch to using
DW_AT_dwo_name.  My understanding of DWARF 5 (section 3.1.2) is that this attribute is
mandatory in a DW_TAG_skeleton_unit.

It doesn't really impact us though, we have to support both.

> Thanks for the detailed review. I am attaching the updated commit message, Changelog and changes below.

Thanks, that looks nice, I added a few comments below, nothing major.

> +static const char *dwarf_unit_type_name(int unit_type);

Space before parenthesis.  Applies to the rest of the patch, for both
function signatures and calls.

> @@ -6390,18 +6397,27 @@ read_comp_unit_head (struct comp_unit_head *cu_header,
>        switch (cu_header->unit_type)
>  	{
>  	case DW_UT_compile:
> +	case DW_UT_partial:
> +	case DW_UT_skeleton:
> +	case DW_UT_split_compile:
>  	  if (section_kind != rcuh_kind::COMPILE)
>  	    error (_("Dwarf Error: wrong unit_type in compilation unit header "
> -		   "(is DW_UT_compile, should be DW_UT_type) [in module %s]"),
> -		   filename);
> +		   "(is %s, should be DW_UT_type) [in module %s]"),
> +		   dwarf_unit_type_name(cu_header->unit_type), filename);

Let's use %s and dwarf_unit_type_name to print DW_UT_type here, as you did below.

>  	  break;
>  	case DW_UT_type:
> +	case DW_UT_split_type:
>  	  section_kind = rcuh_kind::TYPE;
>  	  break;
>  	default:
>  	  error (_("Dwarf Error: wrong unit_type in compilation unit header "
> -		 "(is %d, should be %d or %d) [in module %s]"),
> -		 cu_header->unit_type, DW_UT_compile, DW_UT_type, filename);
> +		 "(is %#04x, should be one of: %s, %s, %s, %s or %s) "
> +		 "[in module %s]"), cu_header->unit_type,
> +		 dwarf_unit_type_name(DW_UT_compile),
> +		 dwarf_unit_type_name(DW_UT_skeleton),
> +		 dwarf_unit_type_name(DW_UT_split_compile),
> +		 dwarf_unit_type_name(DW_UT_type),
> +		 dwarf_unit_type_name(DW_UT_split_type), filename);

[snip]

> @@ -7297,47 +7319,58 @@ read_cutu_die_from_dwo (struct dwarf2_per_cu_data *this_cu,
>    return 1;
>  }
>  
> +/* Return the signature of the compile unit. In DWARF 4 and before, the
> +   signature is in the DW_AT_GNU_dwo_id attribute. In DWARF 5 and later, the
> +   signature is part of the header. Returns 0 if the signature is not found.  */

This comment needs to be updated (the last part).

> +static gdb::optional<ULONGEST>
> +lookup_dwo_id (struct dwarf2_cu *cu, struct die_info* comp_unit_die)
> +{
> +  if (cu->header.version >= 5)
> +    return cu->header.signature_or_dwo_id;
> +  struct attribute *attr;
> +  attr = dwarf2_attr (comp_unit_die, DW_AT_GNU_dwo_id, cu);
> +  if (!attr)

Use either of:

  if (attr == NULL)
  if (attr == nullptr)

This happens a few times in the patch.  In the opposite case, if (attr) should be
replaced with either of:

  if (attr != NULL)
  if (attr != nullptr)

Ref: https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#Comparison_With_NULL_And_Zero

> +    return gdb::optional<ULONGEST>();
> +  return DW_UNSND (attr);
> +}
> +
>  /* Subroutine of init_cutu_and_read_dies to simplify it.
>     Look up the DWO unit specified by COMP_UNIT_DIE of THIS_CU.
>     Returns NULL if the specified DWO unit cannot be found.  */
>  
>  static struct dwo_unit *
> -lookup_dwo_unit (struct dwarf2_per_cu_data *this_cu,
> +lookup_dwo_unit (const struct die_reader_specs *reader,
>  		 struct die_info *comp_unit_die)

Is there a reason for this function to now take `reader` rather than `this_cu`?  At first
sight, I don't see why this change is needed or why it would help.

> @@ -22812,6 +22854,23 @@ dwarf_attr_name (unsigned attr)
>    return name;
>  }
>  
> +/* Convert a unit type to corresponding DW_UT name.  */
> +
> +static const char *
> +dwarf_unit_type_name(int unit_type) {
> +  switch (unit_type) {
> +    case 0x01: return "DW_UT_compile (0x01)";
> +    case 0x02: return "DW_UT_type (0x02)";
> +    case 0x03: return "DW_UT_partial (0x03)";
> +    case 0x04: return "DW_UT_skeleton (0x04)";
> +    case 0x05: return "DW_UT_split_compile (0x05)";
> +    case 0x06: return "DW_UT_split_type (0x06)";
> +    case 0x80: return "DW_UT_lo_user (0x80)";
> +    case 0xff: return "DW_UT_hi_user (0xff)";
> +    default: return nullptr;
> +  }
> +}

Format the switch-case like:

  switch (...)
    {
    case 1:
      break;
    break:
      break;
    }

Thanks,

Simon