From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Burgess <aburgess@redhat.com>
To: gdb-patches@sourceware.org
Cc: Andrew Burgess <aburgess@redhat.com>
Subject: [PATCHv2 3/7] gdb: split dwarf line table parsing in two
Date: Fri,  1 Aug 2025 09:58:16 +0100
Message-ID: <7a7d13968eaf6e454910c2ed0bab70e0d3b045a1.1754038556.git.aburgess@redhat.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <cover.1754038556.git.aburgess@redhat.com>
References: <cover.1753006060.git.aburgess@redhat.com>
 <cover.1754038556.git.aburgess@redhat.com>
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-MFC-PROC-ID: v-bm4saJhckmjh60NnXM2PgYCHWG6q6AmIlh6q88uJo_1754038715
X-Mimecast-Originator: redhat.com
Content-Transfer-Encoding: 8bit
content-type: text/plain; charset="US-ASCII"; x-default=true
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Errors-To: gdb-patches-bounces~public-inbox=simark.ca@sourceware.org

A later commit in this series, which improves GDB's ability to debug
optimised code, wants to use the line table information in order to
"fix" inline blocks with a truncated address range.  For the reasoning
behind wanting to do that, please read ahead in the series.

Assuming that we accept for now the need to use the line table
information to adjust the block ranges, then why is this commit
needed?

GDB splits the line table data into different symtabs, adding end of
sequence markers as we move between symtabs.  This seems to work fine
for GDB, but causes a problem for me in this case.

What I will want to do is this: scan the line table and spot line
table entries that correspond to the end addresses of an inline
block's address range.  If the block meets certain requirements, then
the end address of the block is adjusted to be that of the next line
table entry.

The way that GDB currently splits the line table entries between
symtabs makes this harder.  I will have the set of block end
addresses which I know might be fixable, but to find the line table
entry corresponding to that address requires searching through all the
symtabs.  Having found the entry for the end address, I then need to
find the next line table entry.  For some blocks this is easy, it's
the next entry in the same symtab.  But for other blocks the next
entry might be in a different symtab, which requires yet another full
search.

I did try implementing this approach, but the number of full symtab
searches is large, and it had a significant impact on GDB's debug
parsing performance.  The impact was such that an operation that
currently takes ~7 seconds would take ~3 minutes or more.  Now I could
possibly improve that 3 minute figure by optimising the code some,
but I think that would add unnecessary complexity.

By deferring building the line table until after we have parsed the
DIEs it becomes simple to spot when a line table entry corresponds to
a block end address, and finding the next entry is always trivial, as,
at this point, the next entry is just the next entry which we will
process.  With this approach I see no noticeable impact on DWARF
parsing performance.

This patch is just the refactoring.  There's no finding block end
addresses and "fixing" being done here.  This just sets things up for
the later commits.

There should be no user visible changes after this commit.
---
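
To illustrate the deferred approach described above, here is a minimal
standalone sketch.  It is not code from this series.  It assumes the
line table is available as one address-sorted sequence with strictly
increasing addresses, and that the candidate blocks have already been
selected.  The eligibility requirements mentioned above are not
modelled, as they belong to later patches.  All of the names
(line_entry, inline_block, extend_block_ends) are hypothetical and are
not GDB types or functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a line table entry; not a GDB type.
struct line_entry
{
  std::uint64_t addr;	// Address at which this entry starts.
  int line;		// Source line number.
};

// Hypothetical stand-in for an inline block; not a GDB type.
struct inline_block
{
  std::uint64_t start;
  std::uint64_t end;	// Exclusive end address.
};

// Walk ENTRIES once.  ENTRIES is assumed to be a single sequence sorted
// by strictly increasing address.  When an entry's address equals the
// end address of a block in BLOCKS, move that block's end to the address
// of the following entry.  Return the number of blocks adjusted.
int
extend_block_ends (const std::vector<line_entry> &entries,
		   std::vector<inline_block> &blocks)
{
  // Index the blocks by their original end address so that each line
  // table entry needs only a hash lookup.
  std::unordered_multimap<std::uint64_t, inline_block *> by_end;
  for (inline_block &b : blocks)
    by_end.emplace (b.end, &b);

  int adjusted = 0;

  // Stop one short of the end: the final entry has no successor.
  for (std::size_t i = 0; i + 1 < entries.size (); ++i)
    {
      auto range = by_end.equal_range (entries[i].addr);
      for (auto it = range.first; it != range.second; ++it)
	{
	  // The next entry is simply the next one we will process.
	  it->second->end = entries[i + 1].addr;
	  ++adjusted;
	}
    }

  return adjusted;
}
```

Because the index is keyed on each block's original end address, a
block is adjusted at most once in this sketch.  The point is only that,
with an unsplit table, finding the next entry needs no search across
symtabs.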
 gdb/dwarf2/read.c | 111 +++++++++++++++++++++++++---------------------
 1 file changed, 61 insertions(+), 50 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 6f3bd97e5bc..a68b1434e53 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -787,9 +787,7 @@ static line_header_up dwarf_decode_line_header (sect_offset sect_off,
 						struct dwarf2_cu *cu,
 						const char *comp_dir);
 
-static void dwarf_decode_lines (struct line_header *,
-				struct dwarf2_cu *,
-				unrelocated_addr, int decode_mapping);
+static void dwarf_decode_lines (struct dwarf2_cu *cu, unrelocated_addr lowpc);
 
 static void dwarf2_start_subfile (dwarf2_cu *cu, const file_entry &fe,
 				  const line_header &lh);
@@ -5866,29 +5864,55 @@ find_file_and_directory (struct die_info *die, struct dwarf2_cu *cu)
   return *cu->per_cu->fnd;
 }
 
-/* Handle DW_AT_stmt_list for a compilation unit.
-   DIE is the DW_TAG_compile_unit die for CU.
-   COMP_DIR is the compilation directory.  LOWPC is passed to
-   dwarf_decode_lines.  See dwarf_decode_lines comments about it.  */
+/* Ensure that every file_entry within the line_table of CU has a symtab
+   allocated for it.  */
 
 static void
-handle_DW_AT_stmt_list (struct die_info *die, struct dwarf2_cu *cu,
-			const file_and_directory &fnd, unrelocated_addr lowpc,
-			bool have_code) /* ARI: editCase function */
+create_symtabs_from_cu_line_table (struct dwarf2_cu *cu)
 {
-  dwarf2_per_objfile *per_objfile = cu->per_objfile;
-  struct attribute *attr;
-  hashval_t line_header_local_hash;
-  void **slot;
-  int decode_mapping;
+  /* Make sure a symtab is created for every file, even files
+     which contain only variables (i.e. no code with associated
+     line numbers).  */
+  buildsym_compunit *builder = cu->get_builder ();
+  struct compunit_symtab *cust = builder->get_compunit_symtab ();
 
-  gdb_assert (! cu->per_cu->is_debug_types);
+  struct line_header *lh = cu->line_header;
+  gdb_assert (lh != nullptr);
 
-  attr = dwarf2_attr (die, DW_AT_stmt_list, cu);
+  for (auto &fe : lh->file_names ())
+    {
+      dwarf2_start_subfile (cu, fe, *lh);
+      subfile *sf = builder->get_current_subfile ();
+
+      if (sf->symtab == nullptr)
+	sf->symtab = allocate_symtab (cust, sf->name.c_str (),
+				      sf->name_for_id.c_str ());
+
+      fe.symtab = sf->symtab;
+    }
+}
+
+
+/* Handle DW_AT_stmt_list for a compilation unit.  DIE is the
+   DW_TAG_compile_unit die for CU.  FND is used to access the compilation
+   directory.  This function will decode the line table header and create
+   symtab objects for the files referenced in the line table.  The line
+   table itself though is not processed by this function.  If there is no
+   line table, or there's a problem decoding the header, then CU will not
+   be updated.  */
+
+static void
+decode_line_header_for_cu (struct die_info *die, struct dwarf2_cu *cu,
+			   const file_and_directory &fnd)
+{
+  gdb_assert (!cu->per_cu->is_debug_types);
+
+  struct attribute *attr = dwarf2_attr (die, DW_AT_stmt_list, cu);
   if (attr == NULL || !attr->form_is_unsigned ())
     return;
 
   sect_offset line_offset = (sect_offset) attr->as_unsigned ();
+  dwarf2_per_objfile *per_objfile = cu->per_objfile;
 
   /* The line header hash table is only created if needed (it exists to
      prevent redundant reading of the line table for partial_units).
@@ -5906,8 +5930,9 @@ handle_DW_AT_stmt_list (struct die_info *die, struct dwarf2_cu *cu,
 				   xcalloc, xfree));
     }
 
+  void **slot;
   line_header line_header_local (line_offset, cu->per_cu->is_dwz);
-  line_header_local_hash = line_header_hash (&line_header_local);
+  hashval_t line_header_local_hash = line_header_hash (&line_header_local);
   if (per_objfile->line_header_hash != NULL)
     {
       slot = htab_find_slot_with_hash (per_objfile->line_header_hash.get (),
@@ -5960,12 +5985,8 @@ handle_DW_AT_stmt_list (struct die_info *die, struct dwarf2_cu *cu,
 	 then this is what we want as well.  */
       gdb_assert (die->tag != DW_TAG_partial_unit);
     }
-  decode_mapping = (die->tag != DW_TAG_partial_unit);
-  /* The have_code check is here because, if LOWPC and HIGHPC are both 0x0,
-     then there won't be any interesting code in the CU, but a check later on
-     (in lnp_state_machine::check_line_address) will fail to properly exclude
-     an entry that was removed via --gc-sections.  */
-  dwarf_decode_lines (cu->line_header, cu, lowpc, decode_mapping && have_code);
+
+  create_symtabs_from_cu_line_table (cu);
 }
 
 /* Process DW_TAG_compile_unit or DW_TAG_partial_unit.  */
@@ -6017,10 +6038,12 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
   scoped_restore restore_sym_cu
     = make_scoped_restore (&per_objfile->sym_cu, cu);
 
-  /* Decode line number information if present.  We do this before
-     processing child DIEs, so that the line header table is available
-     for DW_AT_decl_file.  */
-  handle_DW_AT_stmt_list (die, cu, fnd, unrel_low, unrel_low != unrel_high);
+  /* Decode the line header if present.  We do this before processing child
+     DIEs, so that information is available for DW_AT_decl_file.  We defer
+     parsing the actual line table until after processing the child DIEs;
+     this allows us to fix up some of the inline function blocks as the
+     line table is read.  */
+  decode_line_header_for_cu (die, cu, fnd);
 
   /* Process all dies in compilation unit.  */
   for (die_info *child_die : die->children ())
@@ -6028,6 +6051,12 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 
   per_objfile->sym_cu = nullptr;
 
+  /* If we actually have code, then read the line table now.  */
+  if (unrel_low != unrel_high
+      && die->tag != DW_TAG_partial_unit
+      && cu->line_header != nullptr)
+    dwarf_decode_lines (cu, unrel_low);
+
   /* Decode macro information, if present.  Dwarf 2 macro information
      refers to information in the line number info statement program
      header, so we can only read it if we've read the header
@@ -16565,29 +16594,11 @@ dwarf_decode_lines_1 (struct line_header *lh, struct dwarf2_cu *cu,
    table is read in.  */
 
 static void
-dwarf_decode_lines (struct line_header *lh, struct dwarf2_cu *cu,
-		    unrelocated_addr lowpc, int decode_mapping)
+dwarf_decode_lines (struct dwarf2_cu *cu, unrelocated_addr lowpc)
 {
-  if (decode_mapping)
-    dwarf_decode_lines_1 (lh, cu, lowpc);
+  gdb_assert (cu->line_header != nullptr);
 
-  /* Make sure a symtab is created for every file, even files
-     which contain only variables (i.e. no code with associated
-     line numbers).  */
-  buildsym_compunit *builder = cu->get_builder ();
-  struct compunit_symtab *cust = builder->get_compunit_symtab ();
-
-  for (auto &fe : lh->file_names ())
-    {
-      dwarf2_start_subfile (cu, fe, *lh);
-      subfile *sf = builder->get_current_subfile ();
-
-      if (sf->symtab == nullptr)
-	sf->symtab = allocate_symtab (cust, sf->name.c_str (),
-				      sf->name_for_id.c_str ());
-
-      fe.symtab = sf->symtab;
-    }
+  dwarf_decode_lines_1 (cu->line_header, cu, lowpc);
 }
 
 /* Start a subfile for DWARF.  FILENAME is the name of the file and
@@ -16852,7 +16863,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 	      if (file_cu->line_header == nullptr)
 		{
 		  file_and_directory fnd (nullptr, nullptr);
-		  handle_DW_AT_stmt_list (file_cu->dies, file_cu, fnd, {}, false);
+		  decode_line_header_for_cu (file_cu->dies, file_cu, fnd);
 		}
 
 	      if (file_cu->line_header != nullptr)
-- 
2.47.1