From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-159381-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 25046 invoked by alias); 8 Oct 2019 01:05:39 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 25038 invoked by uid 89); 8 Oct 2019 01:05:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-32.4 required=5.0 tests=AWL,BAYES_00,ENV_AND_HDR_SPF_MATCH,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS,USER_IN_DEF_SPF_WL autolearn=ham version=3.3.1 spammy=Ali, Tamur, U*tamur, tamur@google.com
X-HELO: mail-wm1-f48.google.com
Received: from mail-wm1-f48.google.com (HELO mail-wm1-f48.google.com) (209.85.128.48) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 08 Oct 2019 01:05:36 +0000
Received: by mail-wm1-f48.google.com with SMTP id r17so1197675wme.0        for <gdb-patches@sourceware.org>; Mon, 07 Oct 2019 18:05:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=google.com; s=20161025;        h=mime-version:references:in-reply-to:from:date:message-id:subject:to;        bh=j+sKgjmUOdCRa/oBNdrHGOc52Px5GOkxjClJZwhH+jM=;        b=bHtxu2VAR6+5t1vPZWI11t6Tvon3h8/HeMKI30VnOwUvaE3RBUIxYImNB+MxPqCdYB         J4mGqeRcJknPnMY26f1pMrouZbt6En75nwscTgmDy/aq8IefZEm7KdQQk2U4DYEw9r44         mbzE8yNUzXUF6G6tJtoaw15ulo77xfxcEYpmhkMjfuiR+56Luh7SlzeEID0gG1Tu7Cr+         6N6cqXvfyXOWeVQ6bQh2qRWLaeHbF6mIPMcu4RXmz9ialKdX21BtSk+AodcneUy40i6e         udid4ILzNLQWNniefrzJIuky5h2aFcSeskC+V4N8+Ms3ldBHv8F00EkG8d5pGL0J4N8W         XY2A==
MIME-Version: 1.0
References: <20191001045219.141577-1-tamur@google.com> <20191002015214.219674-1-tamur@google.com>
In-Reply-To: <20191002015214.219674-1-tamur@google.com>
From: "Ali Tamur via gdb-patches" <gdb-patches@sourceware.org>
Reply-To: Ali Tamur <tamur@google.com>
Date: Tue, 08 Oct 2019 01:05:00 -0000
Message-ID: <CAH=Am=5H34Wdjw36US3o+YsUiH49B4KnvzHF-3g5huwG9RcXxg@mail.gmail.com>
Subject: Re: [PATCH] DWARF 5 support: Handle line table and file indexes
To: gdb-patches@sourceware.org
Content-Type: text/plain; charset="UTF-8"
X-IsSubscribed: yes
X-SW-Source: 2019-10/txt/msg00213.txt.bz2

Hi. Friendly ping?

On Tue, Oct 1, 2019 at 6:52 PM Ali Tamur <tamur@google.com> wrote:
>
> [Sorry, I forgot to add gdb/ChangeLog, resending]
>
> *  Fix handling of file and directory indexes in line tables; in DWARF 5 the
> indexes are zero-based. Make file_names field private to abstract this detail
> from the clients. Introduce a new file_names_size method. Reflect these changes
> in clients.
> *  Pass whether the file index is zero or one based to a few methods.
> *  Handle DW_FORM_data16 in read_formatted_entries; it is used to record MD5
> of the file entries in DWARF 5.
> *  Fix a bug in line header parsing that calculates the length of the header
> incorrectly. (Seemingly this manifests itself only in DWARF 5).
> *  Introduce a new method, is_valid_file_index that takes into account whether
> it is DWARF 5. Use it in related clients.
>
> Tested with CC=/usr/bin/gcc (version 8.3.0) against master branch (also with
> -gsplit-dwarf and -gdwarf-4 flags) and there was no increase in the set of
> tests that fails. (gdb still cannot debug a 'hello world' program with DWARF 5,
> so for the time being, this is all we care about).
>
> This is part of an effort to support DWARF 5 in gdb.
>
> gdb/ChangeLog:
>
>         * dwarf2read.c (dir_index): Change type.
>         (file_name_index): Likewise.
>         (line_header::include_dir_at): Change comment and implementation on
>         whether it is DWARF 5.
>         (line_header::file_name_at): Change comment, API and implementation.
>         (line_header::file_names_size): New method.
>         (line_header::file_names): Change to private field.
>         (file_full_name): Change API.
>         (dw2_get_file_names_reader): Initialize loval variable. Use new methods.
>         (dwarf2_cu::setup_type_unit_groups): Change implementation and use new
>         methods.
>         (process_structure_scope): Reflect API change.
>         (line_header::add_file_name): Likewise.
>         (read_formatted_entries): Handle DW_FORM_data16.
>         (dwarf_decode_line_header): Fix line header length calculation.
>         (psymtab_include_file_name): Reflect API change.
>         (lnp_state_machine::current_file): Likewise.
>         (lnp_state_machine::m_file): Update comment.
>         (lnp_state_machine::record_line): Reflect type change.
>         (dwarf_decode_lines): Reflect API change.
>         (new_symbol): Likewise.
>         (is_valid_file_index): New function.
>         (file_file_name): Change API and implementation to take care of DWARF 5.
>         (file_full_name): Likewise.
>         (macro_start_file): Reflect API change.
> ---
>  gdb/dwarf2read.c | 161 +++++++++++++++++++++++++++--------------------
>  1 file changed, 94 insertions(+), 67 deletions(-)
>
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index 53e7393a7c..c474bdcd8b 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -922,13 +922,13 @@ typedef void (die_reader_func_ftype) (const struct die_reader_specs *reader,
>                                       int has_children,
>                                       void *data);
>
> -/* A 1-based directory index.  This is a strong typedef to prevent
> -   accidentally using a directory index as a 0-based index into an
> -   array/vector.  */
> -enum class dir_index : unsigned int {};
> +/* dir_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 5 and
> +   later.  */
> +typedef int dir_index;
>
> -/* Likewise, a 1-based file name index.  */
> -enum class file_name_index : unsigned int {};
> +/* file_name_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 5
> +   and later.  */
> +typedef int file_name_index;
>
>  struct file_entry
>  {
> @@ -980,26 +980,30 @@ struct line_header
>    void add_file_name (const char *name, dir_index d_index,
>                       unsigned int mod_time, unsigned int length);
>
> -  /* Return the include dir at INDEX (1-based).  Returns NULL if INDEX
> -     is out of bounds.  */
> +  /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based before).
> +     Returns NULL if INDEX is out of bounds.  */
>    const char *include_dir_at (dir_index index) const
>    {
> -    /* Convert directory index number (1-based) to vector index
> -       (0-based).  */
> -    size_t vec_index = to_underlying (index) - 1;
> +    size_t vec_index;
> +    if (version <= 4)
> +      vec_index = index - 1;
> +    else
> +      vec_index = index;
>
>      if (vec_index >= include_dirs.size ())
>        return NULL;
>      return include_dirs[vec_index];
>    }
>
> -  /* Return the file name at INDEX (1-based).  Returns NULL if INDEX
> -     is out of bounds.  */
> -  file_entry *file_name_at (file_name_index index)
> +  /* Return the file name at INDEX (0-based in DWARF 5 and 1-based before).
> +     Returns NULL if INDEX is out of bounds.  */
> +  file_entry *file_name_at (file_name_index index, bool is_zero_indexed)
>    {
> -    /* Convert file name index number (1-based) to vector index
> -       (0-based).  */
> -    size_t vec_index = to_underlying (index) - 1;
> +    size_t vec_index;
> +    if (is_zero_indexed || version >= 5)
> +        vec_index = index;
> +    else
> +      vec_index = index - 1;
>
>      if (vec_index >= file_names.size ())
>        return NULL;
> @@ -1032,12 +1036,20 @@ struct line_header
>       pointers.  The memory is owned by debug_line_buffer.  */
>    std::vector<const char *> include_dirs;
>
> -  /* The file_names table.  */
> -  std::vector<file_entry> file_names;
> +  int file_names_size () {
> +    return file_names.size();
> +  }
>
>    /* The start and end of the statement program following this
>       header.  These point into dwarf2_per_objfile->line_buffer.  */
>    const gdb_byte *statement_program_start {}, *statement_program_end {};
> +
> + private:
> +  /* The file_names table. This is private because the meaning of indexes differ
> +     among DWARF versions (The first valid index is 1 in DWARF 4 and before,
> +     and is 0 in DWARF 5 and later). So the client should use file_name_at
> +     method for access.  */
> +  std::vector<file_entry> file_names;
>  };
>
>  typedef std::unique_ptr<line_header> line_header_up;
> @@ -1962,7 +1974,7 @@ static file_and_directory find_file_and_directory (struct die_info *die,
>                                                    struct dwarf2_cu *cu);
>
>  static char *file_full_name (int file, struct line_header *lh,
> -                            const char *comp_dir);
> +                            const char *comp_dir, bool is_zero_indexed);
>
>  /* Expected enum dwarf_unit_type for read_comp_unit_head.  */
>  enum class rcuh_kind { COMPILE, TYPE };
> @@ -3638,7 +3650,7 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
>    struct objfile *objfile = dwarf2_per_objfile->objfile;
>    struct dwarf2_per_cu_data *lh_cu;
>    struct attribute *attr;
> -  int i;
> +  int i = 0;
>    void **slot;
>    struct quick_file_names *qfn;
>
> @@ -3697,13 +3709,13 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
>    if (strcmp (fnd.name, "<unknown>") != 0)
>      ++offset;
>
> -  qfn->num_file_names = offset + lh->file_names.size ();
> +  qfn->num_file_names = offset + lh->file_names_size ();
>    qfn->file_names =
>      XOBNEWVEC (&objfile->objfile_obstack, const char *, qfn->num_file_names);
>    if (offset != 0)
>      qfn->file_names[0] = xstrdup (fnd.name);
> -  for (i = 0; i < lh->file_names.size (); ++i)
> -    qfn->file_names[i + offset] = file_full_name (i + 1, lh.get (), fnd.comp_dir);
> +  for (i = 0; i < lh->file_names_size (); ++i)
> +    qfn->file_names[i + offset] = file_full_name (i + 1, lh.get (), fnd.comp_dir, true);
>    qfn->real_names = NULL;
>
>    lh_cu->v.quick->file_names = qfn;
> @@ -11696,16 +11708,16 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
>          process_full_type_unit still needs to know if this is the first
>          time.  */
>
> -      tu_group->num_symtabs = line_header->file_names.size ();
> +      tu_group->num_symtabs = line_header->file_names_size ();
>        tu_group->symtabs = XNEWVEC (struct symtab *,
> -                                  line_header->file_names.size ());
> +                                  line_header->file_names_size ());
>
> -      for (i = 0; i < line_header->file_names.size (); ++i)
> +      for (i = 0; i < line_header->file_names_size (); ++i)
>         {
> -         file_entry &fe = line_header->file_names[i];
> +         file_entry *fe = line_header->file_name_at (i, true);
>
> -         dwarf2_start_subfile (this, fe.name,
> -                               fe.include_dir (line_header));
> +         dwarf2_start_subfile (this, fe->name,
> +                               fe->include_dir (line_header));
>           buildsym_compunit *b = get_builder ();
>           if (b->get_current_subfile ()->symtab == NULL)
>             {
> @@ -11718,8 +11730,8 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
>                 = allocate_symtab (cust, b->get_current_subfile ()->name);
>             }
>
> -         fe.symtab = b->get_current_subfile ()->symtab;
> -         tu_group->symtabs[i] = fe.symtab;
> +         fe->symtab = b->get_current_subfile ()->symtab;
> +         tu_group->symtabs[i] = fe->symtab;
>         }
>      }
>    else
> @@ -11732,11 +11744,10 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
>                         compunit_language (cust),
>                         0, cust));
>
> -      for (i = 0; i < line_header->file_names.size (); ++i)
> +      for (i = 0; i < line_header->file_names_size (); ++i)
>         {
> -         file_entry &fe = line_header->file_names[i];
> -
> -         fe.symtab = tu_group->symtabs[i];
> +         file_entry *fe = line_header->file_name_at (i, true);
> +         fe->symtab = tu_group->symtabs[i];
>         }
>      }
>
> @@ -16209,7 +16220,7 @@ process_structure_scope (struct die_info *die, struct dwarf2_cu *cu)
>             {
>               /* Any related symtab will do.  */
>               symtab
> -               = cu->line_header->file_name_at (file_name_index (1))->symtab;
> +               = cu->line_header->file_name_at (0, true)->symtab;
>             }
>           else
>             {
> @@ -20277,7 +20288,7 @@ line_header::add_file_name (const char *name,
>  {
>    if (dwarf_line_debug >= 2)
>      fprintf_unfiltered (gdb_stdlog, "Adding file %u: %s\n",
> -                       (unsigned) file_names.size () + 1, name);
> +                       (unsigned) file_names_size () + 1, name);
>
>    file_names.emplace_back (name, d_index, mod_time, length);
>  }
> @@ -20393,6 +20404,11 @@ read_formatted_entries (struct dwarf2_per_objfile *dwarf2_per_objfile,
>               buf += 8;
>               break;
>
> +           case DW_FORM_data16:
> +             /*  This is used for MD5, but file_entry does not record MD5s. */
> +             buf += 16;
> +             break;
> +
>             case DW_FORM_udata:
>               uint.emplace (read_unsigned_leb128 (abfd, buf, &bytes_read));
>               buf += bytes_read;
> @@ -20493,12 +20509,15 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
>      read_checked_initial_length_and_offset (abfd, line_ptr, &cu->header,
>                                             &bytes_read, &offset_size);
>    line_ptr += bytes_read;
> +
> +  const gdb_byte *start_here = line_ptr;
> +
>    if (line_ptr + lh->total_length > (section->buffer + section->size))
>      {
>        dwarf2_statement_list_fits_in_line_number_section_complaint ();
>        return 0;
>      }
> -  lh->statement_program_end = line_ptr + lh->total_length;
> +  lh->statement_program_end = start_here + lh->total_length;
>    lh->version = read_2_bytes (abfd, line_ptr);
>    line_ptr += 2;
>    if (lh->version > 5)
> @@ -20528,6 +20547,7 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
>      }
>    lh->header_length = read_offset_1 (abfd, line_ptr, offset_size);
>    line_ptr += offset_size;
> +  lh->statement_program_start = line_ptr + lh->header_length;
>    lh->minimum_instruction_length = read_1_byte (abfd, line_ptr);
>    line_ptr += 1;
>    if (lh->version >= 4)
> @@ -20612,7 +20632,6 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
>         }
>        line_ptr += bytes_read;
>      }
> -  lh->statement_program_start = line_ptr;
>
>    if (line_ptr > (section->buffer + section->size))
>      complaint (_("line number info header doesn't "
> @@ -20629,18 +20648,18 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
>     Returns NULL if FILE_INDEX should be ignored, i.e., it is pst->filename.  */
>
>  static const char *
> -psymtab_include_file_name (const struct line_header *lh, int file_index,
> +psymtab_include_file_name (struct line_header *lh, int file_index,
>                            const struct partial_symtab *pst,
>                            const char *comp_dir,
>                            gdb::unique_xmalloc_ptr<char> *name_holder)
>  {
> -  const file_entry &fe = lh->file_names[file_index];
> -  const char *include_name = fe.name;
> +  file_entry *fe = lh->file_name_at (file_index, true);
> +  const char *include_name = fe->name;
>    const char *include_name_to_compare = include_name;
>    const char *pst_filename;
>    int file_is_pst;
>
> -  const char *dir_name = fe.include_dir (lh);
> +  const char *dir_name = fe->include_dir (lh);
>
>    gdb::unique_xmalloc_ptr<char> hold_compare;
>    if (!IS_ABSOLUTE_PATH (include_name)
> @@ -20712,7 +20731,7 @@ public:
>    {
>      /* lh->file_names is 0-based, but the file name numbers in the
>         statement program are 1-based.  */
> -    return m_line_header->file_name_at (m_file);
> +    return m_line_header->file_name_at (m_file, false);
>    }
>
>    /* Record the line in the state machine.  END_SEQUENCE is true if
> @@ -20809,8 +20828,8 @@ private:
>       and initialized according to the DWARF spec.  */
>
>    unsigned char m_op_index = 0;
> -  /* The line table index (1-based) of the current file.  */
> -  file_name_index m_file = (file_name_index) 1;
> +  /* The line table index of the current file.  */
> +  file_name_index m_file = 1;
>    unsigned int m_line = 1;
>
>    /* These are initialized in the constructor.  */
> @@ -21002,7 +21021,7 @@ lnp_state_machine::record_line (bool end_sequence)
>        fprintf_unfiltered (gdb_stdlog,
>                           "Processing actual line %u: file %u,"
>                           " address %s, is_stmt %u, discrim %u\n",
> -                         m_line, to_underlying (m_file),
> +                         m_line, m_file,
>                           paddress (m_gdbarch, m_address),
>                           m_is_stmt, m_discriminator);
>      }
> @@ -21345,8 +21364,8 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
>
>        /* Now that we're done scanning the Line Header Program, we can
>           create the psymtab of each included file.  */
> -      for (file_index = 0; file_index < lh->file_names.size (); file_index++)
> -        if (lh->file_names[file_index].included_p == 1)
> +      for (file_index = 0; file_index < lh->file_names_size (); file_index++)
> +        if (lh->file_name_at (file_index, true)->included_p == 1)
>            {
>             gdb::unique_xmalloc_ptr<char> name_holder;
>             const char *include_name =
> @@ -21365,11 +21384,11 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
>        struct compunit_symtab *cust = builder->get_compunit_symtab ();
>        int i;
>
> -      for (i = 0; i < lh->file_names.size (); i++)
> +      for (i = 0; i < lh->file_names_size (); i++)
>         {
> -         file_entry &fe = lh->file_names[i];
> +         file_entry *fe = lh->file_name_at (i, true);
>
> -         dwarf2_start_subfile (cu, fe.name, fe.include_dir (lh));
> +         dwarf2_start_subfile (cu, fe->name, fe->include_dir (lh));
>
>           if (builder->get_current_subfile ()->symtab == NULL)
>             {
> @@ -21377,7 +21396,7 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
>                 = allocate_symtab (cust,
>                                    builder->get_current_subfile ()->name);
>             }
> -         fe.symtab = builder->get_current_subfile ()->symtab;
> +         fe->symtab = builder->get_current_subfile ()->symtab;
>         }
>      }
>  }
> @@ -21596,7 +21615,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
>           struct file_entry *fe;
>
>           if (cu->line_header != NULL)
> -           fe = cu->line_header->file_name_at (file_index);
> +           fe = cu->line_header->file_name_at (file_index, false);
>           else
>             fe = NULL;
>
> @@ -24154,22 +24173,29 @@ dwarf_alloc_die (struct dwarf2_cu *cu, int num_attrs)
>     *LH's file name table.  The result is allocated using xmalloc; the caller is
>     responsible for freeing it.  */
>
> +bool is_valid_file_index (int file_index, struct line_header *lh,
> +                         bool is_zero_indexed) {
> +  if (is_zero_indexed || lh->version >= 5)
> +    return 0 <= file_index && file_index < lh->file_names_size ();
> +  return 1 <= file_index && file_index <= lh->file_names_size ();
> +}
> +
>  static char *
> -file_file_name (int file, struct line_header *lh)
> +file_file_name (int file, struct line_header *lh, bool is_zero_indexed)
>  {
>    /* Is the file number a valid index into the line header's file name
>       table?  Remember that file numbers start with one, not zero.  */
> -  if (1 <= file && file <= lh->file_names.size ())
> +  if (is_valid_file_index (file, lh, is_zero_indexed))
>      {
> -      const file_entry &fe = lh->file_names[file - 1];
> +      const file_entry *fe = lh->file_name_at (file, is_zero_indexed);
>
> -      if (!IS_ABSOLUTE_PATH (fe.name))
> +      if (!IS_ABSOLUTE_PATH (fe->name))
>         {
> -         const char *dir = fe.include_dir (lh);
> +         const char *dir = fe->include_dir (lh);
>           if (dir != NULL)
> -           return concat (dir, SLASH_STRING, fe.name, (char *) NULL);
> +           return concat (dir, SLASH_STRING, fe->name, (char *) NULL);
>         }
> -      return xstrdup (fe.name);
> +      return xstrdup (fe->name);
>      }
>    else
>      {
> @@ -24193,13 +24219,14 @@ file_file_name (int file, struct line_header *lh)
>     compilation.  The result is allocated using xmalloc; the caller is
>     responsible for freeing it.  */
>  static char *
> -file_full_name (int file, struct line_header *lh, const char *comp_dir)
> +file_full_name (int file, struct line_header *lh, const char *comp_dir,
> +               bool is_zero_indexed)
>  {
>    /* Is the file number a valid index into the line header's file name
>       table?  Remember that file numbers start with one, not zero.  */
> -  if (1 <= file && file <= lh->file_names.size ())
> +  if (is_valid_file_index (file, lh, is_zero_indexed))
>      {
> -      char *relative = file_file_name (file, lh);
> +      char *relative = file_file_name (file, lh, is_zero_indexed);
>
>        if (IS_ABSOLUTE_PATH (relative) || comp_dir == NULL)
>         return relative;
> @@ -24207,7 +24234,7 @@ file_full_name (int file, struct line_header *lh, const char *comp_dir)
>                        relative, (char *) NULL);
>      }
>    else
> -    return file_file_name (file, lh);
> +    return file_file_name (file, lh, is_zero_indexed);
>  }
>
>
> @@ -24218,7 +24245,7 @@ macro_start_file (struct dwarf2_cu *cu,
>                    struct line_header *lh)
>  {
>    /* File name relative to the compilation directory of this source file.  */
> -  char *file_name = file_file_name (file, lh);
> +  char *file_name = file_file_name (file, lh, false);
>
>    if (! current_file)
>      {
> --
> 2.23.0.444.g18eeb5a265-goog
>