From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-159811-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 52373 invoked by alias); 18 Oct 2019 19:31:24 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 52363 invoked by uid 89); 18 Oct 2019 19:31:24 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-24.6 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=(unknown), sk:get_com, owned
X-HELO: mail-pf1-f202.google.com
Received: from mail-pf1-f202.google.com (HELO mail-pf1-f202.google.com) (209.85.210.202) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 18 Oct 2019 19:31:21 +0000
Received: by mail-pf1-f202.google.com with SMTP id z4so5393812pfn.0        for <gdb-patches@sourceware.org>; Fri, 18 Oct 2019 12:31:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=google.com; s=20161025;        h=date:in-reply-to:message-id:mime-version:references:subject:from:to         :cc:content-transfer-encoding;        bh=H50A7x2EHUZASj+bZ113F6hYbBkmFbP+U2aNo9ytbJM=;        b=DwWoIFvrT+JoogSNtO8ve5X2075PhQfe39og6180SkBNigLz0jQKSJtZVjPRHbU/ux         PEBfZu7088jZs+Dy49acjy1OwUA5/Z21v/TPv4odoZeh9hcQcRa2t3vIyo8h/24kqQiO         HjpKwa8sxBop0m9m5uqTptTzTVugCKQo5FwFXfrEEHYBZRD8NxyoItdGiJ3CiN6kpT8l         BUo1gYbRI8ZAvV4cslu9en36+vUiBFmWQ5jz2Ez9s57CBdO6jERu5MQCbXKNlNIPqBn/         udjHu1Es5+LUvh4O+8DgKZFRfueq28J91Nt/tpaMT+2L+g8Jx4je4kOkdhrClubCBNgp         j/zA==
Date: Fri, 18 Oct 2019 19:31:00 -0000
In-Reply-To: <99f8963b76221d747d0564919217a2eb@polymtl.ca>
Message-Id: <20191018193116.57020-1-tamur@google.com>
Mime-Version: 1.0
References: <99f8963b76221d747d0564919217a2eb@polymtl.ca>
Subject: [PATCH] DWARF 5 support: Handle line table and file indexes
From: "Ali Tamur via gdb-patches" <gdb-patches@sourceware.org>
Reply-To: Ali Tamur <tamur@google.com>
To: Simon Marchi <simon.marchi@polymtl.ca>
Cc: gdb-patches@sourceware.org, Ali Tamur <tamur@google.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
X-SW-Source: 2019-10/txt/msg00643.txt.bz2

I've done everything asked, except this bit:
> I'd still suggest adding a gdb_assert to verify that in the case of
> DWARF <= 4, index should be > 0.

Adding that assert caused failures in multiple tests. I think the correct
behaviour is to return nullptr when DWARF <= 4 and index == 0, so that is
what this version does.
---
*  Fix handling of file and directory indexes in line tables; in DWARF 5 the
indexes are zero-based. Make the file_names field private to hide this detail
from clients. Introduce file_names and file_names_size methods. Update
clients accordingly.
*  Handle DW_FORM_data16 in read_formatted_entries; it is used to record the
MD5 of file entries in DWARF 5.
*  Fix a bug in line header parsing that calculated the length of the header
incorrectly. (This seems to manifest itself only in DWARF 5.)
*  Introduce a new function, is_valid_file_index, that takes into account
whether the line header is DWARF 5. Use it in related clients.
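
For readers who want the index convention in isolation, here is a minimal
standalone sketch of the mapping the first and last items describe. It is not
gdb code: toy_line_header is a made-up stand-in for line_header, and it stores
plain strings instead of file_entry objects. It only illustrates that index 0
is a valid file in DWARF 5 and yields nullptr (rather than an assert) in
DWARF 4 and earlier.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for gdb's line_header; only the parts needed to
// illustrate the index mapping are modelled.
struct toy_line_header
{
  int version;                          // DWARF version of the line table
  std::vector<std::string> file_names;  // always stored 0-based

  // Map a DWARF file index to a vector slot.  Returns nullptr when the
  // index is out of range, which includes index 0 for DWARF <= 4.
  const std::string *file_name_at (int index) const
  {
    // DWARF 5 indexes are already 0-based; earlier versions are 1-based.
    int vec_index = version >= 5 ? index : index - 1;
    if (vec_index < 0
        || static_cast<std::size_t> (vec_index) >= file_names.size ())
      return nullptr;
    return &file_names[vec_index];
  }
};
```

Callers that only need to visit every entry can iterate over the vector
directly and never touch the version-dependent index, which is the reason for
exposing file_names () alongside file_name_at ().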

Tested with CC=/usr/bin/gcc (version 8.3.0) against the master branch (also
with the -gsplit-dwarf and -gdwarf-4 flags); the set of failing tests did not
grow. (gdb still cannot debug a 'hello world' program with DWARF 5, so for
the time being, this is all we care about.)

This is part of an effort to support DWARF 5 in gdb.
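
As a rough illustration of the header-length fix, the sketch below computes
where the line number program starts and ends from the two length fields
alone, measuring each length from the byte just past its own field. The names
program_bounds and line_program_bounds are made up for this example, and the
offsets are relative to the start of a single line table unit; treat it as a
sketch of the arithmetic, not as the code in dwarf_decode_line_header.

```cpp
#include <cstddef>

// Hypothetical result type: byte offsets from the start of the unit.
struct program_bounds
{
  std::size_t start;  // first byte of the line number program
  std::size_t end;    // one past the last byte of the unit
};

// unit_length counts from just past the initial-length field, and
// header_length counts from just past the header_length field itself.
program_bounds
line_program_bounds (bool is_64bit, std::size_t unit_length,
		     std::size_t header_length, int version)
{
  // 32-bit DWARF: 4-byte length.  64-bit DWARF: 4-byte escape + 8 bytes.
  std::size_t pos = is_64bit ? 12 : 4;
  std::size_t end = pos + unit_length;

  pos += 2;			// version
  if (version >= 5)
    pos += 2;			// address_size and segment_selector_size
  pos += is_64bit ? 8 : 4;	// the header_length field itself

  return { pos + header_length, end };
}
```

Deriving the start from header_length in this way does not depend on having
walked every directory and file entry correctly, so unknown entry formats
cannot shift the program start.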
---
 gdb/dwarf2read.c | 177 +++++++++++++++++++++++++++++------------------
 1 file changed, 108 insertions(+), 69 deletions(-)

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index ee9df34ed2..fd241bf780 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -922,13 +922,13 @@ typedef void (die_reader_func_ftype) (const struct di=
e_reader_specs *reader,
 				      int has_children,
 				      void *data);
=20
-/* A 1-based directory index.  This is a strong typedef to prevent
-   accidentally using a directory index as a 0-based index into an
-   array/vector.  */
-enum class dir_index : unsigned int {};
+/* dir_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 5 a=
nd
+   later.  */
+typedef int dir_index;
=20
-/* Likewise, a 1-based file name index.  */
-enum class file_name_index : unsigned int {};
+/* file_name_index is 1-based in DWARF 4 and before, and is 0-based in DWA=
RF 5
+   and later.  */
+typedef int file_name_index;
=20
 struct file_entry
 {
@@ -980,32 +980,40 @@ struct line_header
   void add_file_name (const char *name, dir_index d_index,
 		      unsigned int mod_time, unsigned int length);
=20
-  /* Return the include dir at INDEX (1-based).  Returns NULL if INDEX
-     is out of bounds.  */
+  /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based befor=
e).
+     Returns NULL if INDEX is out of bounds.  */
   const char *include_dir_at (dir_index index) const
   {
-    /* Convert directory index number (1-based) to vector index
-       (0-based).  */
-    size_t vec_index =3D to_underlying (index) - 1;
-
-    if (vec_index >=3D include_dirs.size ())
+    int vec_index;
+    if (version >=3D 5)
+      vec_index =3D index;
+    else
+      vec_index =3D index - 1;
+    if (vec_index < 0 || vec_index >=3D m_include_dirs.size ())
       return NULL;
-    return include_dirs[vec_index];
+    return m_include_dirs[vec_index];
   }
=20
-  /* Return the file name at INDEX (1-based).  Returns NULL if INDEX
-     is out of bounds.  */
+  /* Return the file name at INDEX (0-based in DWARF 5 and 1-based before).
+     Returns NULL if INDEX is out of bounds.  */
   file_entry *file_name_at (file_name_index index)
   {
-    /* Convert file name index number (1-based) to vector index
-       (0-based).  */
-    size_t vec_index =3D to_underlying (index) - 1;
-
-    if (vec_index >=3D file_names.size ())
+    int vec_index;
+    if (version >=3D 5)
+      vec_index =3D index;
+    else
+      vec_index =3D index - 1;
+    if (vec_index < 0 || vec_index >=3D m_file_names.size ())
       return NULL;
-    return &file_names[vec_index];
+    return &m_file_names[vec_index];
   }
=20
+  /* The indexes are 0-based in DWARF 5 and 1-based in DWARF 4. Therefore,
+     this method should only be used to iterate through all file entries i=
n an
+     index-agnostic manner.  */
+  std::vector<file_entry> &file_names ()
+  { return m_file_names; }
+
   /* Offset of line number information in .debug_line section.  */
   sect_offset sect_off {};
=20
@@ -1032,12 +1040,23 @@ struct line_header
      pointers.  The memory is owned by debug_line_buffer.  */
   std::vector<const char *> include_dirs;
=20
-  /* The file_names table.  */
-  std::vector<file_entry> file_names;
+  int file_names_size ()
+  { return m_file_names.size(); }
=20
   /* The start and end of the statement program following this
      header.  These point into dwarf2_per_objfile->line_buffer.  */
   const gdb_byte *statement_program_start {}, *statement_program_end {};
+
+ private:
+  /* The include_directories table.  Note these are observing
+     pointers.  The memory is owned by debug_line_buffer.  */
+  std::vector<const char *> m_include_dirs;
+
+  /* The file_names table. This is private because the meaning of indexes
+     differs among DWARF versions (The first valid index is 1 in DWARF 4 a=
nd
+     before, and is 0 in DWARF 5 and later).  So the client should use
+     file_name_at method for access.  */
+  std::vector<file_entry> m_file_names;
 };
=20
 typedef std::unique_ptr<line_header> line_header_up;
@@ -3644,7 +3663,6 @@ dw2_get_file_names_reader (const struct die_reader_sp=
ecs *reader,
   struct objfile *objfile =3D dwarf2_per_objfile->objfile;
   struct dwarf2_per_cu_data *lh_cu;
   struct attribute *attr;
-  int i;
   void **slot;
   struct quick_file_names *qfn;
=20
@@ -3703,12 +3721,12 @@ dw2_get_file_names_reader (const struct die_reader_=
specs *reader,
   if (strcmp (fnd.name, "<unknown>") !=3D 0)
     ++offset;
=20
-  qfn->num_file_names =3D offset + lh->file_names.size ();
+  qfn->num_file_names =3D offset + lh->file_names_size ();
   qfn->file_names =3D
     XOBNEWVEC (&objfile->objfile_obstack, const char *, qfn->num_file_name=
s);
   if (offset !=3D 0)
     qfn->file_names[0] =3D xstrdup (fnd.name);
-  for (i =3D 0; i < lh->file_names.size (); ++i)
+  for (int i =3D 0; i < lh->file_names_size (); ++i)
     qfn->file_names[i + offset] =3D file_full_name (i + 1, lh.get (), fnd.=
comp_dir);
   qfn->real_names =3D NULL;
=20
@@ -11717,14 +11735,14 @@ dwarf2_cu::setup_type_unit_groups (struct die_inf=
o *die)
 	 process_full_type_unit still needs to know if this is the first
 	 time.  */
=20
-      tu_group->num_symtabs =3D line_header->file_names.size ();
+      tu_group->num_symtabs =3D line_header->file_names_size ();
       tu_group->symtabs =3D XNEWVEC (struct symtab *,
-				   line_header->file_names.size ());
+				   line_header->file_names_size ());
=20
-      for (i =3D 0; i < line_header->file_names.size (); ++i)
+      auto &file_names =3D line_header->file_names ();
+      for (i =3D 0; i < file_names.size (); ++i)
 	{
-	  file_entry &fe =3D line_header->file_names[i];
-
+	  file_entry &fe =3D file_names[i];
 	  dwarf2_start_subfile (this, fe.name,
 				fe.include_dir (line_header));
 	  buildsym_compunit *b =3D get_builder ();
@@ -11753,10 +11771,10 @@ dwarf2_cu::setup_type_unit_groups (struct die_inf=
o *die)
 			compunit_language (cust),
 			0, cust));
=20
-      for (i =3D 0; i < line_header->file_names.size (); ++i)
+      auto &file_names =3D line_header->file_names ();
+      for (i =3D 0; i < file_names.size (); ++i)
 	{
-	  file_entry &fe =3D line_header->file_names[i];
-
+	  file_entry &fe =3D file_names[i];
 	  fe.symtab =3D tu_group->symtabs[i];
 	}
     }
@@ -16230,7 +16248,7 @@ process_structure_scope (struct die_info *die, stru=
ct dwarf2_cu *cu)
 	    {
 	      /* Any related symtab will do.  */
 	      symtab
-		=3D cu->line_header->file_name_at (file_name_index (1))->symtab;
+		=3D cu->line_header->file_names ()[0].symtab;
 	    }
 	  else
 	    {
@@ -20285,10 +20303,16 @@ void
 line_header::add_include_dir (const char *include_dir)
 {
   if (dwarf_line_debug >=3D 2)
-    fprintf_unfiltered (gdb_stdlog, "Adding dir %zu: %s\n",
-			include_dirs.size () + 1, include_dir);
-
-  include_dirs.push_back (include_dir);
+    {
+      size_t new_size;
+      if (version >=3D 5)
+        new_size =3D m_include_dirs.size ();
+      else
+        new_size =3D m_include_dirs.size () + 1;
+      fprintf_unfiltered (gdb_stdlog, "Adding dir %zu: %s\n",
+			  new_size, include_dir);
+    }
+  m_include_dirs.push_back (include_dir);
 }
=20
 void
@@ -20298,10 +20322,16 @@ line_header::add_file_name (const char *name,
 			    unsigned int length)
 {
   if (dwarf_line_debug >=3D 2)
-    fprintf_unfiltered (gdb_stdlog, "Adding file %u: %s\n",
-			(unsigned) file_names.size () + 1, name);
-
-  file_names.emplace_back (name, d_index, mod_time, length);
+    {
+      size_t new_size;
+      if (version >=3D 5)
+        new_size =3D file_names_size ();
+      else
+        new_size =3D file_names_size () + 1;
+      fprintf_unfiltered (gdb_stdlog, "Adding file %zu: %s\n",
+			  new_size, name);
+    }
+  m_file_names.emplace_back (name, d_index, mod_time, length);
 }
=20
 /* A convenience function to find the proper .debug_line section for a CU.=
  */
@@ -20415,6 +20445,11 @@ read_formatted_entries (struct dwarf2_per_objfile =
*dwarf2_per_objfile,
 	      buf +=3D 8;
 	      break;
=20
+	    case DW_FORM_data16:
+	      /*  This is used for MD5, but file_entry does not record MD5s. */
+	      buf +=3D 16;
+	      break;
+
 	    case DW_FORM_udata:
 	      uint.emplace (read_unsigned_leb128 (abfd, buf, &bytes_read));
 	      buf +=3D bytes_read;
@@ -20515,12 +20550,15 @@ dwarf_decode_line_header (sect_offset sect_off, s=
truct dwarf2_cu *cu)
     read_checked_initial_length_and_offset (abfd, line_ptr, &cu->header,
 					    &bytes_read, &offset_size);
   line_ptr +=3D bytes_read;
+
+  const gdb_byte *start_here =3D line_ptr;
+
   if (line_ptr + lh->total_length > (section->buffer + section->size))
     {
       dwarf2_statement_list_fits_in_line_number_section_complaint ();
       return 0;
     }
-  lh->statement_program_end =3D line_ptr + lh->total_length;
+  lh->statement_program_end =3D start_here + lh->total_length;
   lh->version =3D read_2_bytes (abfd, line_ptr);
   line_ptr +=3D 2;
   if (lh->version > 5)
@@ -20550,6 +20588,7 @@ dwarf_decode_line_header (sect_offset sect_off, str=
uct dwarf2_cu *cu)
     }
   lh->header_length =3D read_offset_1 (abfd, line_ptr, offset_size);
   line_ptr +=3D offset_size;
+  lh->statement_program_start =3D line_ptr + lh->header_length;
   lh->minimum_instruction_length =3D read_1_byte (abfd, line_ptr);
   line_ptr +=3D 1;
   if (lh->version >=3D 4)
@@ -20634,7 +20673,6 @@ dwarf_decode_line_header (sect_offset sect_off, str=
uct dwarf2_cu *cu)
 	}
       line_ptr +=3D bytes_read;
     }
-  lh->statement_program_start =3D line_ptr;
=20
   if (line_ptr > (section->buffer + section->size))
     complaint (_("line number info header doesn't "
@@ -20651,12 +20689,11 @@ dwarf_decode_line_header (sect_offset sect_off, s=
truct dwarf2_cu *cu)
    Returns NULL if FILE_INDEX should be ignored, i.e., it is pst->filename=
.  */
=20
 static const char *
-psymtab_include_file_name (const struct line_header *lh, int file_index,
+psymtab_include_file_name (const struct line_header *lh, const file_entry =
&fe,
 			   const struct partial_symtab *pst,
 			   const char *comp_dir,
 			   gdb::unique_xmalloc_ptr<char> *name_holder)
 {
-  const file_entry &fe =3D lh->file_names[file_index];
   const char *include_name =3D fe.name;
   const char *include_name_to_compare =3D include_name;
   const char *pst_filename;
@@ -20831,8 +20868,8 @@ private:
      and initialized according to the DWARF spec.  */
=20
   unsigned char m_op_index =3D 0;
-  /* The line table index (1-based) of the current file.  */
-  file_name_index m_file =3D (file_name_index) 1;
+  /* The line table index of the current file.  */
+  file_name_index m_file =3D 1;
   unsigned int m_line =3D 1;
=20
   /* These are initialized in the constructor.  */
@@ -21024,7 +21061,7 @@ lnp_state_machine::record_line (bool end_sequence)
       fprintf_unfiltered (gdb_stdlog,
 			  "Processing actual line %u: file %u,"
 			  " address %s, is_stmt %u, discrim %u\n",
-			  m_line, to_underlying (m_file),
+			  m_line, m_file,
 			  paddress (m_gdbarch, m_address),
 			  m_is_stmt, m_discriminator);
     }
@@ -21363,17 +21400,15 @@ dwarf_decode_lines (struct line_header *lh, const=
 char *comp_dir,
=20
   if (decode_for_pst_p)
     {
-      int file_index;
-
       /* Now that we're done scanning the Line Header Program, we can
          create the psymtab of each included file.  */
-      for (file_index =3D 0; file_index < lh->file_names.size (); file_ind=
ex++)
-        if (lh->file_names[file_index].included_p =3D=3D 1)
+      for (auto &file_entry : lh->file_names ())
+        if (file_entry.included_p =3D=3D 1)
           {
 	    gdb::unique_xmalloc_ptr<char> name_holder;
 	    const char *include_name =3D
-	      psymtab_include_file_name (lh, file_index, pst, comp_dir,
-					 &name_holder);
+	      psymtab_include_file_name (lh, file_entry, pst,
+					 comp_dir, &name_holder);
 	    if (include_name !=3D NULL)
               dwarf2_create_include_psymtab (include_name, pst, objfile);
           }
@@ -21385,14 +21420,10 @@ dwarf_decode_lines (struct line_header *lh, const=
 char *comp_dir,
 	 line numbers).  */
       buildsym_compunit *builder =3D cu->get_builder ();
       struct compunit_symtab *cust =3D builder->get_compunit_symtab ();
-      int i;
=20
-      for (i =3D 0; i < lh->file_names.size (); i++)
+      for (auto &fe : lh->file_names ())
 	{
-	  file_entry &fe =3D lh->file_names[i];
-
 	  dwarf2_start_subfile (cu, fe.name, fe.include_dir (lh));
-
 	  if (builder->get_current_subfile ()->symtab =3D=3D NULL)
 	    {
 	      builder->get_current_subfile ()->symtab
@@ -24190,6 +24221,14 @@ dwarf_alloc_die (struct dwarf2_cu *cu, int num_att=
rs)
 =0C
 /* Macro support.  */
=20
+static bool
+is_valid_file_index (int file_index, struct line_header *lh)
+{
+  if (lh->version >=3D 5)
+    return 0 <=3D file_index && file_index < lh->file_names_size ();
+  return 1 <=3D file_index && file_index <=3D lh->file_names_size ();
+}
+
 /* Return file name relative to the compilation directory of file number I=
 in
    *LH's file name table.  The result is allocated using xmalloc; the call=
er is
    responsible for freeing it.  */
@@ -24199,17 +24238,17 @@ file_file_name (int file, struct line_header *lh)
 {
   /* Is the file number a valid index into the line header's file name
      table?  Remember that file numbers start with one, not zero.  */
-  if (1 <=3D file && file <=3D lh->file_names.size ())
+  if (is_valid_file_index (file, lh))
     {
-      const file_entry &fe =3D lh->file_names[file - 1];
+      const file_entry *fe =3D lh->file_name_at (file);
=20
-      if (!IS_ABSOLUTE_PATH (fe.name))
+      if (!IS_ABSOLUTE_PATH (fe->name))
 	{
-	  const char *dir =3D fe.include_dir (lh);
+	  const char *dir =3D fe->include_dir (lh);
 	  if (dir !=3D NULL)
-	    return concat (dir, SLASH_STRING, fe.name, (char *) NULL);
+	    return concat (dir, SLASH_STRING, fe->name, (char *) NULL);
 	}
-      return xstrdup (fe.name);
+      return xstrdup (fe->name);
     }
   else
     {
@@ -24237,7 +24276,7 @@ file_full_name (int file, struct line_header *lh, c=
onst char *comp_dir)
 {
   /* Is the file number a valid index into the line header's file name
      table?  Remember that file numbers start with one, not zero.  */
-  if (1 <=3D file && file <=3D lh->file_names.size ())
+  if (is_valid_file_index (file, lh))
     {
       char *relative =3D file_file_name (file, lh);
=20
--=20
2.23.0.866.gb869b98d4c-goog