From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-159439-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 4925 invoked by alias); 10 Oct 2019 03:14:50 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 4706 invoked by uid 89); 10 Oct 2019 03:14:31 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-15.8 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.1 spammy=(unknown), H*r:112
X-HELO: smtp.polymtl.ca
Received: from smtp.polymtl.ca (HELO smtp.polymtl.ca) (132.207.4.11) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 10 Oct 2019 03:14:29 +0000
Received: from simark.ca (simark.ca [158.69.221.121])	(authenticated bits=0)	by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id x9A3EKQ0007821	(version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT)	for <gdb-patches@sourceware.org>; Wed, 9 Oct 2019 23:14:25 -0400
DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca x9A3EKQ0007821
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=polymtl.ca;	s=default; t=1570677266;	bh=sopv6totQRvQd7T7c6X+dvvAXLxQCJRcuwp6AY/flfo=;	h=Date:From:To:Cc:Subject:In-Reply-To:References:From;	b=m9/kMgjedsSjuHRoCOptKcf2ZYAxYj7sxID8d/fVgnSF3FAcN22rx8jv5yONgf+pv	 Wgyk/ld/41g+34YIhA/QB6FPM4a1458LPDlOBFx7pXA1eqWnMkJPschMc75Sa/yjJV	 YeMbIMSVrFF6GZtbHS+Gr0Io6dHLVb0YXsRB4iVo=
Received: by simark.ca (Postfix, from userid 112)	id A12221E8E4; Wed,  9 Oct 2019 23:14:20 -0400 (EDT)
Received: from simark.ca (localhost [127.0.0.1])	by simark.ca (Postfix) with ESMTP id 5FDCB1E581;	Wed,  9 Oct 2019 23:14:17 -0400 (EDT)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Date: Thu, 10 Oct 2019 03:14:00 -0000
From: Simon Marchi <simon.marchi@polymtl.ca>
To: Ali Tamur <tamur@google.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] DWARF 5 support: Handle line table and file indexes
In-Reply-To: <20191002015214.219674-1-tamur@google.com>
References: <20191002015214.219674-1-tamur@google.com>
Message-ID: <bd686edb3ae4bd37daa033efaa3ff63a@polymtl.ca>
X-Sender: simon.marchi@polymtl.ca
User-Agent: Roundcube Webmail/1.3.10
X-IsSubscribed: yes
X-SW-Source: 2019-10/txt/msg00271.txt.bz2

Hi Ali,

Thanks for the patch.  I noted a few minor comments below.

On 2019-10-01 21:52, Ali Tamur via gdb-patches wrote:
> [Sorry, I forgot to add gdb/ChangeLog, resending]
> 
> *  Fix handling of file and directory indexes in line tables; in DWARF 
> 5 the
> indexes are zero-based. Make file_names field private to abstract this 
> detail
> from the clients. Introduce a new file_names_size method. Reflect these 
> changes
> in clients.

I think it's a good idea to encapsulate this.  Should include_dirs be 
private as well, since it is in the same situation?

> *  Pass whether the file index is zero or one based to a few methods.

Hmm, I am not a big fan of that, I had a hard time understanding why 
that was necessary. From what I understand now, it's only necessary so 
that some code that is agnostic to DWARF 4 vs 5 can iterate on all file 
entries.

Instead, maybe we could have the line_header object provide a way for 
clients to iterate on file entries, without exposing the vector.  Maybe 
a range adapter that would let clients do:

for (const file_entry &fe : lh->file_names ())
   {
     ...
   }

Hopefully, that is enough for all cases that pass is_zero_based = true 
in your patch.

> *  Handle DW_FORM_data16 in read_formatted_entries; it is used to 
> record MD5
> of the file entries in DWARF 5.
> *  Fix a bug in line header parsing that calculates the length of the 
> header
> incorrectly. (Seemingly this manifests itself only in DWARF 5).
> *  Introduce a new method, is_valid_file_index that takes into account 
> whether
> it is DWARF 5. Use it in related clients.
> 
> Tested with CC=/usr/bin/gcc (version 8.3.0) against master branch (also 
> with
> -gsplit-dwarf and -gdwarf-4 flags) and there was no increase in the set 
> of
> tests that fails. (gdb still cannot debug a 'hello world' program with 
> DWARF 5,
> so for the time being, this is all we care about).
> 
> This is part of an effort to support DWARF 5 in gdb.
> 
> gdb/ChangeLog:
> 
> 	* dwarf2read.c (dir_index): Change type.
> 	(file_name_index): Likewise.
> 	(line_header::include_dir_at): Change comment and implementation on
> 	whether it is DWARF 5.
> 	(line_header::file_name_at): Change comment, API and implementation.
> 	(line_header::file_names_size): New method.
> 	(line_header::file_names): Change to private field.
> 	(file_full_name): Change API.
> 	(dw2_get_file_names_reader): Initialize loval variable. Use new 
> methods.
> 	(dwarf2_cu::setup_type_unit_groups): Change implementation and use new
> 	methods.
> 	(process_structure_scope): Reflect API change.
> 	(line_header::add_file_name): Likewise.
> 	(read_formatted_entries): Handle DW_FORM_data16.
> 	(dwarf_decode_line_header): Fix line header length calculation.
> 	(psymtab_include_file_name): Reflect API change.
> 	(lnp_state_machine::current_file): Likewise.
> 	(lnp_state_machine::m_file): Update comment.
> 	(lnp_state_machine::record_line): Reflect type change.
> 	(dwarf_decode_lines): Reflect API change.
> 	(new_symbol): Likewise.
> 	(is_valid_file_index): New function.
> 	(file_file_name): Change API and implementation to take care of DWARF 
> 5.
> 	(file_full_name): Likewise.
> 	(macro_start_file): Reflect API change.
> ---
>  gdb/dwarf2read.c | 161 +++++++++++++++++++++++++++--------------------
>  1 file changed, 94 insertions(+), 67 deletions(-)
> 
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index 53e7393a7c..c474bdcd8b 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -922,13 +922,13 @@ typedef void (die_reader_func_ftype) (const
> struct die_reader_specs *reader,
>  				      int has_children,
>  				      void *data);
> 
> -/* A 1-based directory index.  This is a strong typedef to prevent
> -   accidentally using a directory index as a 0-based index into an
> -   array/vector.  */
> -enum class dir_index : unsigned int {};
> +/* dir_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 
> 5 and
> +   later.  */
> +typedef int dir_index;
> 
> -/* Likewise, a 1-based file name index.  */
> -enum class file_name_index : unsigned int {};
> +/* file_name_index is 1-based in DWARF 4 and before, and is 0-based in 
> DWARF 5
> +   and later.  */
> +typedef int file_name_index;

Is there a reason you changed these?  The reason why these were of enum 
class type (or "strong typedef") still seems valid to me.  We don't want 
some code accessing the vectors blindy using these indices, so it 
encourages people to use the right accessor method instead.

> 
>  struct file_entry
>  {
> @@ -980,26 +980,30 @@ struct line_header
>    void add_file_name (const char *name, dir_index d_index,
>  		      unsigned int mod_time, unsigned int length);
> 
> -  /* Return the include dir at INDEX (1-based).  Returns NULL if INDEX
> -     is out of bounds.  */
> +  /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based 
> before).
> +     Returns NULL if INDEX is out of bounds.  */
>    const char *include_dir_at (dir_index index) const
>    {
> -    /* Convert directory index number (1-based) to vector index
> -       (0-based).  */
> -    size_t vec_index = to_underlying (index) - 1;
> +    size_t vec_index;
> +    if (version <= 4)
> +      vec_index = index - 1;
> +    else
> +      vec_index = index;
> 
>      if (vec_index >= include_dirs.size ())
>        return NULL;
>      return include_dirs[vec_index];
>    }
> 
> -  /* Return the file name at INDEX (1-based).  Returns NULL if INDEX
> -     is out of bounds.  */
> -  file_entry *file_name_at (file_name_index index)
> +  /* Return the file name at INDEX (0-based in DWARF 5 and 1-based 
> before).
> +     Returns NULL if INDEX is out of bounds.  */
> +  file_entry *file_name_at (file_name_index index, bool 
> is_zero_indexed)
>    {
> -    /* Convert file name index number (1-based) to vector index
> -       (0-based).  */
> -    size_t vec_index = to_underlying (index) - 1;
> +    size_t vec_index;
> +    if (is_zero_indexed || version >= 5)
> +        vec_index = index;

Remove one indentation level.

> +    else
> +      vec_index = index - 1;
> 
>      if (vec_index >= file_names.size ())
>        return NULL;
> @@ -1032,12 +1036,20 @@ struct line_header
>       pointers.  The memory is owned by debug_line_buffer.  */
>    std::vector<const char *> include_dirs;
> 
> -  /* The file_names table.  */
> -  std::vector<file_entry> file_names;
> +  int file_names_size () {
> +    return file_names.size();
> +  }

Format it like this:

   int file_names_size ()
   {
     return file_names.size();
   }

or we also accept this for short/trivial methods:

   int file_names_size ()
   { return file_names.size(); }

> 
>    /* The start and end of the statement program following this
>       header.  These point into dwarf2_per_objfile->line_buffer.  */
>    const gdb_byte *statement_program_start {}, *statement_program_end 
> {};
> +
> + private:
> +  /* The file_names table. This is private because the meaning of
> indexes differ

differ -> differs

> +     among DWARF versions (The first valid index is 1 in DWARF 4 and 
> before,
> +     and is 0 in DWARF 5 and later). So the client should use 
> file_name_at

Two spaces after period.

> +     method for access.  */
> +  std::vector<file_entry> file_names;

You can rename this to m_file_names (that's the style we use for private 
members).

> @@ -3638,7 +3650,7 @@ dw2_get_file_names_reader (const struct
> die_reader_specs *reader,
>    struct objfile *objfile = dwarf2_per_objfile->objfile;
>    struct dwarf2_per_cu_data *lh_cu;
>    struct attribute *attr;
> -  int i;
> +  int i = 0;

Unnecessary change?  In fact, I'd take the opportunity to move that 
declaration in the for loop that uses it.

> @@ -20629,18 +20648,18 @@ dwarf_decode_line_header (sect_offset
> sect_off, struct dwarf2_cu *cu)
>     Returns NULL if FILE_INDEX should be ignored, i.e., it is 
> pst->filename.  */
> 
>  static const char *
> -psymtab_include_file_name (const struct line_header *lh, int 
> file_index,
> +psymtab_include_file_name (struct line_header *lh, int file_index,
>  			   const struct partial_symtab *pst,
>  			   const char *comp_dir,
>  			   gdb::unique_xmalloc_ptr<char> *name_holder)
>  {
> -  const file_entry &fe = lh->file_names[file_index];
> -  const char *include_name = fe.name;
> +  file_entry *fe = lh->file_name_at (file_index, true);

Hmm it's not great to have to lose the const here.  Can you try adding a 
const version of file_name_at, that returns a const file_entry *?

> @@ -24154,22 +24173,29 @@ dwarf_alloc_die (struct dwarf2_cu *cu, int 
> num_attrs)
>     *LH's file name table.  The result is allocated using xmalloc; the 
> caller is
>     responsible for freeing it.  */

The comment above applies to the `file_file_name` function, so it should 
stay right on top of it.  Also, can you please document the new 
IS_ZERO_INDEXED parameter?

> 
> +bool is_valid_file_index (int file_index, struct line_header *lh,

This function should be static.  Also, place the return type on its own 
line:

static bool
is_valid_file_index (...

Simon