* [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
@ 2025-10-12 21:15 William Ferreira
2025-10-15 20:50 ` Tom Tromey
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: William Ferreira @ 2025-10-12 21:15 UTC (permalink / raw)
To: gdb-patches; +Cc: William Ferreira
PR testsuite/32261 requests a script that could convert old .S-based
tests (that were made before dwarf.exp existed) into the new
Dwarf::assemble calls in Tcl. This commit is an initial implementation
of such a script. Python was chosen for convenience, and the script
relies on only a single external library (pyelftools).
Usage: the script operates not on .S files, but on ELF files with DWARF
information. To convert an old test, one must run said test via
`make check-gdb TESTS=testname` in their build directory. This will
produce, as a side effect, an ELF file the test used as an inferior, at
gdb/testsuite/outputs/testname/testname. This ELF file is this script's
input.
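As a rough illustration of where that input file lives, the sketch below
builds the expected path from a build directory and a test name. The helper
name inferior_elf_path is hypothetical, and it assumes the executable shares
its name with the last component of the test name (as with
gdb.dwarf2/dynarr-ptr); check the outputs directory if a test names its
binary differently.

```python
from pathlib import Path


def inferior_elf_path(build_dir: str, test_name: str) -> Path:
    """Guess where `make check-gdb` leaves the inferior for test_name.

    Assumed layout: <build_dir>/gdb/testsuite/outputs/<test_name>/<binary>,
    where <binary> is the last path component of test_name.
    """
    test = Path(test_name)
    return Path(build_dir) / "gdb" / "testsuite" / "outputs" / test / test.name


if __name__ == "__main__":
    # Prints the candidate path; it does not check that the file exists.
    print(inferior_elf_path("build", "gdb.dwarf2/dynarr-ptr"))
```

The resulting path is what gets passed to the script as its only argument.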
Reliability: not counting the limitations listed below, the script seems
functional enough to be worthy of discussion on the mailing list. I have
been testing it with different tests that already use Dwarf::assemble,
to see if the script can produce a similar call to it. Indeed, in most
cases that I've tested (including some more complex ones, marked with an
asterisk below) it is able to produce comparable output to the original
exp file. Of course, it can't reproduce the complex code *before* the
Dwarf::assemble call. Values calculated then are simply inlined.
The following .exp files have been tried in this way and their outputs
highly resemble the original:
- gdb.dwarf2/dynarr-ptr
- gdb.dwarf2/void-type
- gdb.dwarf2/ada-thick-pointer
- gdb.dwarf2/atomic-type
- gdb.dwarf2/dw2-entry-points (*)
- gdb.dwarf2/main-subprogram
The following .exp files work, with caveats addressed in the limitations
section below.
- gdb.dwarf2/cpp-linkage-name
- Works correctly except for one attribute of the form SPECIAL_expr.
- gdb.dwarf2/dw2-unusual-field-names
- Same as above, with two instances of SPECIAL_expr.
- gdb.dwarf2/implptr-optimized-out
- Same as above, with two instances of SPECIAL_expr.
- gdb.dwarf2/negative-data-member-location
- Same as above, with one instance of SPECIAL_expr.
The following .exp files FAIL, but that is expected:
- gdb.dwarf2/staticvirtual.exp
- high_pc and low_pc of subprogram "~S" are hardcoded in the original
.exp file, but they get replaced with a get_func_info call. Since
the function S::~S is not present in the original, get_func_info
will fail.
The following .exp files DO NOT WORK with this script:
- gdb.dwarf2/cu-no-addrs
- `aranges` not supported.
- Compile unit high_pc and low_pc hardcoded, prone to user error
due to forgetting to replace with variables.
- gdb.dwarf2/dw2-inline-stepping
- Same as above.
- gdb.dwarf2/fission-relative-dwo
- `fission` not supported.
- gdb.dwarf2/dw2-prologue-end and gdb.dwarf2/dw2-prologue-end-2
- Line tables not supported.
Currently the script has the following known limitations:
- It does not support line tables.
- It does not use $srcfile and other variables in the call to
Dwarf::assemble (since it can't know where it is safe to substitute).
- It does not support "fission" type DWARFs (in fact I still have no
clue what those are).
- It does not support cu {label LABEL} {} CUs, mostly because I couldn't
find the information using pyelftools.
- It sometimes outputs empty CUs at the start and end of the call. This
might be a problem with my machine, but I've checked with DWARF dumps
and they are indeed in the input ELF files generated by
`make check-gdb`.
- It does not support attributes with the forms DW_FORM_block* and
DW_FORM_exprloc. The concern is less the difficulty of the
implementation than the fact that the feature would necessarily be
incomplete and, thus, more susceptible to users forgetting to correct
its mistakes or unfinished values (please see the discussion started by
Guinevere at comment 23,
https://sourceware.org/bugzilla/show_bug.cgi?id=32261#c23).
The incompleteness of this feature is easy to demonstrate: any call to
gdb_target_symbol, a common tool used in these attributes, needs a
symbol name that is erased after compilation. There is no way to guess
where that address being referenced in a DW_OP_addr comes from, and it
can't be hard coded since it can change depending on the machine
compiling it.
Please point out any other ways in which this script falls short of your
expectations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32261
---
gdb/contrib/dwarf-to-dwarf-assembler.py | 680 ++++++++++++++++++++++++
1 file changed, 680 insertions(+)
create mode 100755 gdb/contrib/dwarf-to-dwarf-assembler.py
diff --git a/gdb/contrib/dwarf-to-dwarf-assembler.py b/gdb/contrib/dwarf-to-dwarf-assembler.py
new file mode 100755
index 00000000000..4d545cf4b13
--- /dev/null
+++ b/gdb/contrib/dwarf-to-dwarf-assembler.py
@@ -0,0 +1,680 @@
+#!/usr/bin/env python3
+# pyright: strict
+
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Due to the pyelftools dependency, this script requires Python version
+# 3.10 or greater to run.
+
+"""A utility to convert ELF files with DWARF info to Dwarf::assemble code.
+
+Usage:
+ python ./asm_to_dwarf_assembler.py <path/to/elf/file>
+
+Dependencies:
+ Python >= 3.10
+ pyelftools >= 0.31
+
+Notes:
+- Line tables are not currently supported.
+- Non-contiguous subprograms are not currently supported.
+- If you want to use $srcfile or similar, you must edit the references to the
+ file name manually, including DW_AT_name attributes on compile units.
+- If run with binaries generated by make check-gdb, it may include an
+ additional compile_unit before and after the actual compile units. This is
+ an artifact of the normal compilation process, as these CUs are indeed in
+ the generated DWARF in some cases.
+"""
+
+import sys
+import re
+import errno
+
+from io import BytesIO, IOBase
+from typing import Optional, Annotated
+from functools import cache
+from dataclasses import dataclass
+from copy import copy
+from datetime import datetime
+from logging import getLogger
+
+from elftools.elf.elffile import ELFFile
+from elftools.dwarf.compileunit import CompileUnit as RawCompileUnit
+from elftools.dwarf.die import DIE as RawDIE, AttributeValue
+
+
+logger = getLogger(__file__)
+
+
+# While these aren't supported, their detection is important for replacing them
+# with SPECIAL_expr and for writing the placeholder {MANUAL} expr list.
+EXPR_ATTRIBUTE_FORMS = [
+ "DW_FORM_exprloc",
+ "DW_FORM_block",
+ "DW_FORM_block1",
+ "DW_FORM_block2",
+ "DW_FORM_block4",
+]
+
+
+# Workaround for my editor not to freak out over unclosed braces.
+lbrace, rbrace = "{", "}"
+
+
+@cache
+def get_indent_str(indent_count: int) -> str:
+ """Get whitespace string to prepend to another for indenting."""
+ indent = (indent_count // 2) * "\t"
+ if indent_count % 2 == 1:
+ indent += " "
+ return indent
+
+
+def indent(line: str, indent_count: int) -> str:
+ """Indent line by indent_count levels."""
+ return get_indent_str(indent_count) + line
+
+
+def labelify_str(s: str) -> str:
+ """Make s appropriate for a label name."""
+ # Replace "*" with the literal word "ptr".
+ s = s.replace("*", "ptr")
+
+ # Replace any non-"word" characters by "_".
+ s = re.sub(r"\W", "_", s)
+
+ # Remove consecutive "_"s.
+ s = re.sub(r"__+", "_", s)
+
+ return s
+
+
+class DWARFAttribute:
+ """Storage unit for a single DWARF attribute.
+
+ All its values are strings that are usually passed on
+ directly to format. The exceptions to this are attributes
+ with int values with DW_FORM_ref4 or DW_FORM_ref_addr form.
+ Their values are interpreted as the global offset of the DIE
+ being referenced, which are looked up dynamically to fetch
+ their labels.
+ """
+ def __init__(
+ self,
+ die_offset: int,
+ name: str,
+ value: str | bytes | int | bool,
+ form=None,
+ ):
+ self.die_offset = die_offset
+ self.name = name
+ self.value = value
+ self.form = form
+
+ def _format_expr_value(self) -> str:
+ self.form = "SPECIAL_expr"
+ return "{ MANUAL: Fill expr list }"
+
+ def _needs_escaping(self, str_value: str) -> bool:
+ charset = set(str_value)
+ return bool(charset.intersection({"{", "}", " ", "\t"}))
+
+ def _format_str(self, str_value: str) -> str:
+ if self._needs_escaping(str_value):
+ escaped_str = str(str_value)
+ # Double every backslash. The pattern r"\\" matches one literal
+ # backslash; the replacement must be r"\\\\" because re.sub also
+ # processes escapes in the replacement string.
+ escaped_str = re.sub(r"\\", r"\\\\", escaped_str)
+ escaped_str = re.sub("([{}])", r"\\\1", escaped_str)
+ return "{" + escaped_str + "}"
+ else:
+ return str_value
+
+ def _format_value(
+ self,
+ offset_die_lookup: dict[int, "DWARFDIE"],
+ indent_count: int = 0
+ ) -> str:
+ if self.form in EXPR_ATTRIBUTE_FORMS:
+ return self._format_expr_value()
+ elif isinstance(self.value, bool):
+ return str(int(self.value))
+ elif isinstance(self.value, int):
+ if self.form == "DW_FORM_ref4":
+ # ref4-style referencing label.
+ die = offset_die_lookup[self.value]
+ return ":$" + die.tcl_label
+ elif self.form == "DW_FORM_ref_addr":
+ # ref_addr-style referencing label.
+ die = offset_die_lookup[self.value]
+ return "%$" + die.tcl_label
+ else:
+ return str(self.value)
+ elif isinstance(self.value, bytes):
+ return self._format_str(self.value.decode("ascii"))
+ elif isinstance(self.value, str):
+ return self._format_str(self.value)
+ else:
+ raise NotImplementedError(
+ f"Unknown data type: {type(self.value)}"
+ )
+
+ def format(
+ self,
+ offset_die_lookup: dict[int, "DWARFDIE"],
+ indent_count: int = 0
+ ) -> str:
+ """Format the attribute in the form {name value form}.
+
+ If form is DW_FORM_exprloc or DW_FORM_block, see next section on
+ DWARFOperations.
+
+ If it isn't, value is formatted as follows:
+ If bool, use "1" if True, "0" if False.
+ If int:
+ If form is DW_FORM_ref4, use ":$label" where label is the
+ tcl_label of the DWARFDIE at offset "value".
+ If form is DW_FORM_ref_addr, use "%$label" where label is
+ the tcl_label of the DWARFDIE at offset "value".
+ Else, use value directly.
+ If bytes, use value.decode("ascii")
+ If str, use value directly.
+ Any other type results in a NotImplementedError being raised.
+
+ Regarding DW_FORM_exprloc and DW_FORM_block:
+ The form is replaced with SPECIAL_expr.
+ The entries in the value are interpreted and decoded using the
+ dwarf_operations dictionary, and replaced with their names where
+ applicable.
+ """
+ s = lbrace
+ s += self.name + " "
+ s += self._format_value(offset_die_lookup)
+
+ # Only explicitly state form if it's not a reference.
+ if self.form not in [None, "DW_FORM_ref4", "DW_FORM_ref_addr"]:
+ s += " " + self.form
+
+ s += rbrace
+ return indent(s, indent_count)
+
+
+class DWARFDIE:
+ """This script's parsed version of a RawDIE."""
+ def __init__(
+ self,
+ offset: int,
+ tag: str,
+ attrs: dict[str, DWARFAttribute],
+ tcl_label: Optional[str] = None
+ ):
+ self.offset: Annotated[
+ int,
+ "Global offset of the DIE."
+ ] = offset
+ self.tag: Annotated[
+ str,
+ "DWARF tag for this DIE."
+ ] = tag
+ self.attrs: Annotated[
+ dict[str, DWARFAttribute],
+ "Dict of attributes for this DIE."
+ ] = copy(attrs)
+ self.children: Annotated[
+ list[DWARFDIE],
+ "List of child DIEs of this DIE."
+ ] = []
+ self.tcl_label: Annotated[
+ str,
+ "Label used by the Tcl code to reference this DIE, if any. These "
+ 'take the form of "label: " before the actual DIE definition.'
+ ] = tcl_label
+
+ def format_lines(
+ self,
+ offset_die_lookup: dict[int, "DWARFDIE"],
+ indent_count: int = 0
+ ) -> list[str]:
+ """Get the list of lines that represent this DIE in Dwarf assembler."""
+ die_lines = []
+
+ # Prepend label to first line, if it's set.
+ if self.tcl_label:
+ first_line_start = self.tcl_label + ": "
+ else:
+ first_line_start = ""
+
+ # First line, including label.
+ first_line = indent(
+ first_line_start + self.tag + " " + lbrace,
+ indent_count
+ )
+ die_lines.append(first_line)
+
+ # Format attributes, if any.
+ if self.attrs:
+ for attr_name, attr in self.attrs.items():
+ attr_line = attr.format(
+ offset_die_lookup,
+ indent_count=indent_count + 1
+ )
+ die_lines.append(attr_line)
+ die_lines.append(indent(rbrace, indent_count))
+ else:
+ # Don't create a new line, just append and immediately close the
+ # brace on the last line.
+ die_lines[-1] += rbrace
+
+ # Format children, if any.
+ if self.children:
+ # Only open a new brace if there are any children for the
+ # current DIE.
+ die_lines[-1] += " " + lbrace
+ for child in self.children:
+ child_lines = child.format_lines(
+ offset_die_lookup,
+ indent_count=indent_count + 1
+ )
+ die_lines.extend(child_lines)
+ die_lines.append(indent(rbrace, indent_count))
+
+ return die_lines
+
+ def format(
+ self,
+ offset_die_lookup: dict[int, "DWARFDIE"],
+ indent_count: int = 0
+ ) -> str:
+ """Join result from format_lines into a single str."""
+ return "\n".join(self.format_lines(offset_die_lookup, indent_count))
+
+ def name(self) -> Optional[str]:
+ """Get DW_AT_name (if present) decoded as ASCII."""
+ raw_value = self.attrs.get("DW_AT_name")
+ if raw_value is None:
+ return None
+ else:
+ return raw_value.value.decode("ascii")
+
+ def type_name(self) -> str:
+ """Name of Dwarf tag, with the "DW_TAG_" prefix removed."""
+ return re.sub("DW_TAG_", "", self.tag)
+
+
+class DWARFCompileUnit(DWARFDIE):
+ """Wrapper subclass for CU DIEs.
+
+ This is necessary due to the special format CUs take in Dwarf::assemble.
+
+ Instead of simply:
+ DW_TAG_compile_unit {
+ <attributes>
+ } {
+ <children>
+ }
+
+ CUs are formatted as:
+ cu { <cu_special_vars> } {
+ DW_TAG_compile_unit {
+ <attributes>
+ } {
+ <children>
+ }
+ }
+ """
+
+ # Default value for parameter is_64 defined in dwarf.exp line 1553.
+ # This value is converted to 0/1 automatically when emitting
+ # Dwarf::assemble code.
+ default_is_64 = False
+
+ # Default value for parameter dwarf_version defined in dwarf.exp line 1552.
+ default_dwarf_version = 4
+
+ # Default value for parameter is_fission defined in dwarf.exp line 1556.
+ # Currently not implemented, see comment below.
+ # default_is_fission = False
+
+ # Tag that signifies a DIE is a compile unit.
+ compile_unit_tag = "DW_TAG_compile_unit"
+
+ def __init__(
+ self,
+ raw_die: RawDIE,
+ raw_cu: RawCompileUnit,
+ attrs: dict[str, DWARFAttribute],
+ ):
+ """Initialize additional instance variables for CU encoding.
+
+ The additional instance variables are:
+ - is_64_bit: bool
+ Whether this CU is 64 bit or not.
+ - dwarf_version: int
+ default DWARFCompileUnit.default_dwarf_version
+ Version of DWARF this CU is using.
+ - addr_size: Optional[int]
+ default None
+ Size of an address in bytes.
+
+ These variables are used to configure the first parameter of the cu
+ proc (which contains calls to the compile_unit proc in the body of
+ Dwarf::assemble).
+ """
+ super().__init__(
+ raw_die.offset,
+ DWARFCompileUnit.compile_unit_tag,
+ attrs
+ )
+ self.raw_cu = raw_cu
+ self.dwarf_version: int = raw_cu.header.get(
+ "version",
+ DWARFCompileUnit.default_dwarf_version
+ )
+ self.addr_size: Optional[int] = raw_cu.header.get("address_size")
+ self.is_64_bit: bool = raw_cu.dwarf_format() == 64
+
+ # Fission is not currently implemented because I don't know where to
+ # fetch this information from.
+ # self.is_fission: bool = self.default_is_fission
+
+ # CU labels are not currently implemented because I haven't found where
+ # pyelftools exposes this information.
+ # self.cu_label: Optional[str] = None
+
+ def format_lines(
+ self,
+ offset_die_lookup: dict[int, DWARFDIE],
+ indent_count: int = 0,
+ ) -> list[str]:
+ lines = []
+ lines.append(self._get_header(indent_count))
+ inner_lines = super().format_lines(
+ offset_die_lookup,
+ indent_count + 1
+ )
+ lines += inner_lines
+ lines.append(indent(rbrace, indent_count))
+ return lines
+
+ def _get_header(self, indent_count: int = 0) -> str:
+ """Assemble the first line of the surrounding 'cu {} {}' proc call."""
+ header = indent("cu " + lbrace, indent_count)
+ cu_params = []
+
+ if self.is_64_bit != DWARFCompileUnit.default_is_64:
+ # Convert from True/False to 1/0.
+ param_value = int(self.is_64_bit)
+ cu_params += ["is_64", str(param_value)]
+
+ if self.dwarf_version != DWARFCompileUnit.default_dwarf_version:
+ cu_params += ["version", str(self.dwarf_version)]
+
+ if self.addr_size is not None:
+ cu_params += ["addr_size", str(self.addr_size)]
+
+ # Fission is not currently implemented, see comment above.
+ # if self.is_fission != DWARFCompileUnit.default_is_fission:
+ # # Same as is_64_bit conversion, True/False -> 1/0.
+ # param_value = int(self.is_fission)
+ # cu_params += ["fission", str(param_value)]
+
+ # CU labels are not currently implemented, see comment above.
+ # if self.cu_label is not None:
+ # cu_params += ["label", self.cu_label]
+
+ if cu_params:
+ header += " ".join(cu_params)
+
+ header += rbrace + " " + lbrace
+ return header
+
+
+class DWARFParser:
+ """Converter from pyelftools's DWARF representation to this script's."""
+
+ def __init__(self, elf_file: IOBase):
+ """Init parser with file opened in binary mode.
+
+ File can be closed after this function is called.
+ """
+ self.raw_data = BytesIO(elf_file.read())
+ self.elf_data = ELFFile(self.raw_data)
+ self.dwarf_info = self.elf_data.get_dwarf_info()
+ self.offset_to_die: dict[int, DWARFDIE] = {}
+ self.label_to_die: dict[str, DWARFDIE] = {}
+ self.referenced_offsets: Annotated[
+ set[int],
+ "The set of all offsets that were referenced by some DIE."
+ ] = set()
+ self.raw_cu_list: list[RawCompileUnit] = []
+ self.top_level_dies: list[DWARFDIE] = []
+ self.subprograms: list[DWARFDIE] = []
+ self.taken_labels: set[str] = set()
+
+ self._read_all_cus()
+ self._create_necessary_labels()
+
+ def _read_all_cus(self):
+ """Populate self.raw_cu_list with all CUs in self.dwarf_info."""
+ for cu in self.dwarf_info.iter_CUs():
+ self._read_cu(cu)
+
+ def _read_cu(self, raw_cu: RawCompileUnit):
+ """Read a compile_unit into self.raw_cu_list and parse its DIEs."""
+ self.raw_cu_list.append(raw_cu)
+ for raw_die in raw_cu.iter_DIEs():
+ if not raw_die.is_null():
+ self._parse_die(raw_cu, raw_die)
+
+ def _parse_die(self, die_cu: RawCompileUnit, raw_die: RawDIE) -> DWARFDIE:
+ """Process a single DIE and add it to offset_to_die.
+
+ Look for DW_FORM_ref4 and DW_FORM_ref_addr form attributes, replace
+ their values with the global offset of the referenced DIE, and add that
+ offset to a set. This will be used later to assign and use labels only
+ for DIEs that need them.
+
+ In case the DIE is a top-level DIE, add it to self.top_level_dies.
+
+ In case the DIE is a subprogram, add it to self.subprograms and call
+ self._use_vars_for_low_and_high_pc_attr with it.
+ """
+ processed_attrs = {}
+ attr_value: AttributeValue
+ for attr_name, attr_value in raw_die.attributes.items():
+ actual_value = attr_value.value
+ if attr_value.form in ("DW_FORM_ref4", "DW_FORM_ref_addr"):
+ referenced_die = raw_die.get_DIE_from_attribute(attr_name)
+ actual_value = referenced_die.offset
+ self.referenced_offsets.add(referenced_die.offset)
+
+ processed_attrs[attr_name] = DWARFAttribute(
+ raw_die.offset,
+ attr_name,
+ actual_value,
+ attr_value.form
+ )
+
+ if raw_die.tag == DWARFCompileUnit.compile_unit_tag:
+ processed_die = DWARFCompileUnit(raw_die, die_cu, processed_attrs)
+ else:
+ processed_die = DWARFDIE(
+ raw_die.offset,
+ raw_die.tag,
+ processed_attrs,
+ None
+ )
+
+ if raw_die.get_parent() is None:
+ # Top level DIE
+ self.top_level_dies.append(processed_die)
+ else:
+ # Setting the parent here assumes the parent was already processed
+ # prior to this DIE being found.
+ # As far as I'm aware, this is always true in DWARF.
+ processed_parent = self.offset_to_die[raw_die.get_parent().offset]
+ processed_parent.children.append(processed_die)
+
+ if processed_die.tag == "DW_TAG_subprogram":
+ self.subprograms.append(processed_die)
+ self._use_vars_for_low_and_high_pc_attr(processed_die)
+
+ self.offset_to_die[processed_die.offset] = processed_die
+ return processed_die
+
+ def _create_necessary_labels(self):
+ """Create labels to DIEs that were referenced by others."""
+ for offset in self.referenced_offsets:
+ die = self.offset_to_die[offset]
+ self._create_label_for_die(die)
+
+ def _use_vars_for_low_and_high_pc_attr(self, subprogram: DWARFDIE) -> None:
+ """Replace existing PC attributes with Tcl variables.
+
+ If DW_AT_low_pc exists for this DIE, replace it with accessing the
+ variable whose name is given by self.subprogram_start_var(subprogram).
+
+ If DW_AT_high_pc exists for this DIE, replace it with accessing the
+ variable whose name is given by self.subprogram_end_var(subprogram).
+ """
+ low_pc_attr_name = "DW_AT_low_pc"
+ if low_pc_attr_name in subprogram.attrs:
+ start = self.subprogram_start_var(subprogram)
+ subprogram.attrs[low_pc_attr_name].value = start
+
+ high_pc_attr_name = "DW_AT_high_pc"
+ if high_pc_attr_name in subprogram.attrs:
+ end = self.subprogram_end_var(subprogram)
+ subprogram.attrs[high_pc_attr_name].value = end
+
+ def _create_label_for_die(self, die: DWARFDIE) -> None:
+ """Set tcl_label to a unique string among other DIEs for this parser.
+
+ As a first attempt, use labelify_str(die.name()). If the DIE does not
+ have a name, use labelify_str(die.type_name()).
+
+ If the chosen initial label is already taken, try again appending "_2".
+ While the attempt is still taken, try again replacing it with "_3", then
+ "_4", and so on.
+
+ This function also creates an entry on self.label_to_die.
+ """
+ if die.tcl_label is not None:
+ return
+
+ label = labelify_str(die.name() or die.type_name())
+
+ # Deduplicate label in case of collision
+ if label in self.taken_labels:
+ suffix_nr = 2
+
+ # Walrus operator to prevent writing the assembled label_suffix
+ # string literal twice. This could be rewritten by copying the
+ # string literal to the line after the end of the while loop,
+ # but I deemed it would be too frail in case one of them needs
+ # to be changed and the other is forgotten.
+ while (new_label := f"{label}_{suffix_nr}") in self.taken_labels:
+ suffix_nr += 1
+ label = new_label
+
+ die.tcl_label = label
+ self.label_to_die[label] = die
+ self.taken_labels.add(label)
+
+ def subprogram_start_var(self, subprogram: DWARFDIE) -> str:
+ """Name of the Tcl variable that holds the low PC for a subprogram."""
+ return f"${subprogram.name()}_start"
+
+ def subprogram_end_var(self, subprogram: DWARFDIE) -> str:
+ """Name of the Tcl variable that holds the high PC for a subprogram."""
+ return f"${subprogram.name()}_end"
+
+ def all_labels(self) -> set[str]:
+ """Get a copy of the set of all labels known to the parser so far."""
+ return copy(self.taken_labels)
+
+
+class DWARFAssemblerGenerator:
+ """Class that generates Dwarf::assemble code out of a DWARFParser."""
+
+ def __init__(self, dwarf_parser: DWARFParser, output=sys.stdout):
+ self.dwarf_parser = dwarf_parser
+ self.output = output
+
+ def emit(self, line: str, indent_count: int) -> None:
+ """Print a single line indented indent_count times to self.output.
+
+ If line is empty, it will always print an empty line, even with nonzero
+ indent_count.
+ """
+ if line:
+ line = get_indent_str(indent_count) + line
+ print(line, file=self.output)
+
+ def generate_die(self, die: DWARFDIE, indent_count: int):
+ """Generate the lines that represent a DIE."""
+ die_lines = die.format(self.dwarf_parser.offset_to_die, indent_count)
+ self.emit(die_lines, 0)
+
+ def generate(self):
+ indent_count = 0
+
+ self.emit("Dwarf::assemble $asm_file {", indent_count)
+
+ # Begin Dwarf::assemble body.
+ indent_count += 1
+ self.emit("global srcdir subdir srcfile", indent_count)
+
+ all_labels = self.dwarf_parser.all_labels()
+ if all_labels:
+ self.emit("declare_labels " + " ".join(all_labels), indent_count)
+
+ self.emit("", 0)
+ for subprogram in self.dwarf_parser.subprograms:
+ self.emit(f"get_func_info {subprogram.name()}", indent_count)
+
+ for die in self.dwarf_parser.top_level_dies:
+ self.generate_die(die, indent_count)
+
+ # TODO: line table, if it's within scope (it probably isn't).
+
+ # End Dwarf::assemble body.
+ indent_count -= 1
+ self.emit(rbrace, indent_count)
+
+
+def main(argv):
+ try:
+ filename = argv[1]
+ except IndexError:
+ print("Usage:", file=sys.stderr)
+ print("python ./asm_to_dwarf_assembler.py <path/to/elf/file>", file=sys.stderr)
+ sys.exit(errno.EOPNOTSUP)
+
+ try:
+ with open(filename, "rb") as elf_file:
+ parser = DWARFParser(elf_file)
+ except Exception as e:
+ print("Error parsing ELF file. Does it contain DWARF information?", file=sys.stderr)
+ print(str(e), file=sys.stderr)
+ sys.exit(errno.ENODATA)
+ generator = DWARFAssemblerGenerator(parser)
+ generator.generate()
+
+
+if __name__ == "__main__":
+ main(sys.argv)
--
2.49.0
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-12 21:15 [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls William Ferreira
@ 2025-10-15 20:50 ` Tom Tromey
2025-10-15 21:10 ` William Ferreira
2025-10-16 6:47 ` Tom de Vries
2025-10-16 7:14 ` Tom de Vries
2 siblings, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2025-10-15 20:50 UTC (permalink / raw)
To: William Ferreira; +Cc: gdb-patches
>>>>> William Ferreira <wqferr@gmail.com> writes:
> PR testsuite/32261 requests a script that could convert old .S-based
> tests (that were made before dwarf.exp existed) into the new
> Dwarf::assemble calls in Tcl. This commit is an initial implementation
> of such a script. Python was chosen for convenience, and only relies on
> a single external library.
Thank you for doing this.
> The following .exp files have been tried in this way and their outputs
> highly resemble the original:
> - gdb.dwarf2/dynarr-ptr
> - gdb.dwarf2/void-type
> - gdb.dwarf2/ada-thick-pointer
> - gdb.dwarf2/atomic-type
> - gdb.dwarf2/dw2-entry-points (*)
> - gdb.dwarf2/main-subprogram
Are you planning to try to convert these tests? If so that would be
fantastic. If not, that's totally fine, maybe we should file a bug for
this. It'd be nice to get rid of the old .s tests. Anyway let me know.
> Currently the script has the following known limitations:
[...]
FWIW I think basically any limitations are fine. The DWARF assembler
itself had many at the start, people tend to add things as needed.
> + s = lbrace
> + s += self.name + " "
> + s += self._format_value(offset_die_lookup)
I somewhat suspect this output will not work now, because there were
some recent changes to make attributes Tcl code rather than
specially-parsed data.
However that's yet another thing we can fix in situ.
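For context, the three quoted lines are the start of the v3 formatter that
emits each attribute as a brace-delimited {name value form} list. A minimal
standalone sketch of that output shape follows. The function names are
illustrative rather than the patch's own, the backslash doubling follows what
the patch's comment describes, and the sketch says nothing about the newer
attribute syntax mentioned above.

```python
import re
from typing import Optional

# Forms whose values are emitted as label references, without a form name.
REFERENCE_FORMS = ("DW_FORM_ref4", "DW_FORM_ref_addr")


def format_str(value: str) -> str:
    """Brace-quote value if it contains braces, spaces or tabs."""
    if not set(value) & {"{", "}", " ", "\t"}:
        return value
    # Double backslashes first, then backslash-escape each brace.
    escaped = value.replace("\\", "\\\\")
    escaped = re.sub(r"([{}])", r"\\\1", escaped)
    return "{" + escaped + "}"


def format_attribute(name: str, value: str, form: Optional[str] = None) -> str:
    """Build "{name value form}", omitting the form for references."""
    parts = [name, format_str(value)]
    if form is not None and form not in REFERENCE_FORMS:
        parts.append(form)
    return "{" + " ".join(parts) + "}"


if __name__ == "__main__":
    print(format_attribute("DW_AT_name", "unsigned int", "DW_FORM_string"))
    print(format_attribute("DW_AT_type", ":$int_label", "DW_FORM_ref4"))
```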
Approved-By: Tom Tromey <tom@tromey.com>
I forget what your copyright situation is. If you are all set up, we
can land this. If you plan to submit more gdb contributions, let me
know and we can set up write-after-approval access for you.
Tom
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-15 20:50 ` Tom Tromey
@ 2025-10-15 21:10 ` William Ferreira
2025-10-16 15:52 ` Tom Tromey
0 siblings, 1 reply; 10+ messages in thread
From: William Ferreira @ 2025-10-15 21:10 UTC (permalink / raw)
To: Tom Tromey; +Cc: gdb-patches
Copyright is already assigned, everything is ready. I can convert the tests
in a following commit, along with a few others.
I will have some free time starting on the 27th, so if you have any other
comments please submit a proper bug report before then. If you do so,
please send its URL to me either directly or in this thread.
Regarding your concerns as to whether that section will work properly,
could you please point me to a DWARF assembler test that uses this new
syntax? I might be able to fix this implementation before the rebase.
Thank you,
William Ferreira.
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-12 21:15 [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls William Ferreira
2025-10-15 20:50 ` Tom Tromey
@ 2025-10-16 6:47 ` Tom de Vries
2025-10-16 7:14 ` Tom de Vries
2 siblings, 0 replies; 10+ messages in thread
From: Tom de Vries @ 2025-10-16 6:47 UTC (permalink / raw)
To: William Ferreira, gdb-patches
On 10/12/25 11:15 PM, William Ferreira wrote:
> + sys.exit(errno.EOPNOTSUP)
Hi,
I ran into:
...
$ ./gdb/contrib/dwarf-to-dwarf-assembler.py
Usage:
python ./asm_to_dwarf_assembler.py <path/to/elf/file>
Traceback (most recent call last):
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 662, in main
filename = argv[1]
~~~~^^^
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 680, in <module>
main(sys.argv)
~~~~^^^^^^^^^^
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 666, in main
sys.exit(errno.EOPNOTSUP)
^^^^^^^^^^^^^^^
AttributeError: module 'errno' has no attribute 'EOPNOTSUP'. Did you
mean: 'EOPNOTSUPP'?
$
...
Fixed by:
...
diff --git a/gdb/contrib/dwarf-to-dwarf-assembler.py
b/gdb/contrib/dwarf-to-dwarf-assembler.py
index 4d545cf4b13..24d8e83932e 100755
--- a/gdb/contrib/dwarf-to-dwarf-assembler.py
+++ b/gdb/contrib/dwarf-to-dwarf-assembler.py
@@ -663,7 +663,7 @@ def main(argv):
except IndexError:
print("Usage:", file=sys.stderr)
print("python ./asm_to_dwarf_assembler.py <path/to/elf/file>",
file=sys.stderr)
- sys.exit(errno.EOPNOTSUP)
+ sys.exit(errno.EOPNOTSUPP)
try:
with open(filename, "rb") as elf_file:
...
after which I get:
...
$ ./gdb/contrib/dwarf-to-dwarf-assembler.py
Usage:
python ./asm_to_dwarf_assembler.py <path/to/elf/file>
$
...
Thanks,
- Tom
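Editorial sketch of the fix above: Python's errno module only exposes the
constants the platform defines, so a misspelled name such as EOPNOTSUP
fails with AttributeError at the moment the usage path runs, not at import
time. The helper below is hypothetical (parse_filename is not a function in
the script; the real code does this inline in main()), and it checks the
argument count up front instead of catching IndexError, which is an
assumption about style rather than what the patch does.

```python
import errno
import sys


def parse_filename(argv):
    """Return the ELF path from argv, or exit with a usage message.

    Hypothetical helper for illustration only; the script itself
    handles this inline in main() via try/except IndexError.
    """
    if len(argv) < 2:
        print("Usage:", file=sys.stderr)
        print("python ./asm_to_dwarf_assembler.py <path/to/elf/file>",
              file=sys.stderr)
        # Correctly spelled constant: EOPNOTSUPP, with two P's.
        sys.exit(errno.EOPNOTSUPP)
    return argv[1]
```

Calling parse_filename(sys.argv) with no extra argument prints the usage
text and exits with the EOPNOTSUPP status instead of raising a second
exception.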
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-12 21:15 [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls William Ferreira
2025-10-15 20:50 ` Tom Tromey
2025-10-16 6:47 ` Tom de Vries
@ 2025-10-16 7:14 ` Tom de Vries
2025-10-16 15:55 ` Tom Tromey
2 siblings, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2025-10-16 7:14 UTC (permalink / raw)
To: William Ferreira, gdb-patches
On 10/12/25 11:15 PM, William Ferreira wrote:
> + Regarding DW_FORM_exprloc and DW_FORM_block:
> + The form is replaced with SPECIAL_expr.
> + The entries in the value are interpreted and decoded using the
> + dwarf_operations dictionary, and replaced with their names where
> + applicable.
> + """
> + s = lbrace
> + s += self.name + " "
> + s += self._format_value(offset_die_lookup)
I tried out the script with a test-case I'm about to commit, and ran into:
...
Traceback (most recent call last):
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 680, in <module>
main(sys.argv)
~~~~^^^^^^^^^^
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 676, in main
generator.generate()
~~~~~~~~~~~~~~~~~~^^
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 651, in generate
self.generate_die(die, indent_count)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 630, in generate_die
die_lines = die.format(self.dwarf_parser.offset_to_die, indent_count)
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 303, in format
return "\n".join(self.format_lines(offset_die_lookup, indent_count))
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 405, in format_lines
inner_lines = super().format_lines(
offset_die_lookup,
indent_count + 1
)
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 271, in format_lines
attr_line = attr.format(
offset_die_lookup,
indent_count=indent_count + 1
)
File
"/data/vries/gdb/binutils-gdb.git/./gdb/contrib/dwarf-to-dwarf-assembler.py",
line 205, in format
s += self.name + " "
~~~~~~~~~~^~~~~
TypeError: unsupported operand type(s) for +: 'int' and 'str'
...
The problem is that gcc 15 is using dwarf6 attribute DW_AT_language_name
with value 144 (0x90), which is unknown to the script.
Using this patch:
...
diff --git a/gdb/contrib/dwarf-to-dwarf-assembler.py
b/gdb/contrib/dwarf-to-dwarf-assembler.py
index 4d545cf4b13..13df3c78345 100755
--- a/gdb/contrib/dwarf-to-dwarf-assembler.py
+++ b/gdb/contrib/dwarf-to-dwarf-assembler.py
@@ -202,7 +202,11 @@ class DWARFAttribute:
applicable.
"""
s = lbrace
- s += self.name + " "
+ if isinstance(self.name, int):
+ s += "DW_AT_" + str(self.name)
+ else:
+ s += self.name
+ s += " "
s += self._format_value(offset_die_lookup)
# Only explicitly state form if it's not a reference.
...
I get:
...
{DW_AT_language 33 DW_FORM_data1}
{DW_AT_144 4 DW_FORM_data1}
{DW_AT_145 201703 DW_FORM_data4}
{DW_AT_name
/data/vries/gdb/src/gdb/testsuite/gdb.cp/pretty-print.cc DW_FORM_line_strp}
...
I haven't looked at whether that output makes sense with the dwarf
assembler, perhaps there's a better way, but at least this allows the
script to finish and makes it reasonably clear what the situation is.
Thanks,
- Tom
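Editorial sketch of the fallback in the patch above, pulled out into a
standalone function so it can be tried in isolation. The function name
format_attr_name is hypothetical; in the script the logic lives inside
DWARFAttribute.format and reads self.name. The assumption, taken from the
traceback, is that the name is an int when the attribute code is unknown
to the script and a str otherwise.

```python
def format_attr_name(name):
    """Return a printable attribute name.

    Hypothetical standalone version of the patched logic: an unknown
    attribute arrives as its raw integer code, a known one as a string
    such as "DW_AT_language".
    """
    if isinstance(name, int):
        # Unknown attribute: synthesize a name from the decimal code.
        return "DW_AT_" + str(name)
    return name


print(format_attr_name("DW_AT_language"))
print(format_attr_name(144))
```

With the inputs from the message above, the second call yields DW_AT_144,
matching the output shown there.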
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-15 21:10 ` William Ferreira
@ 2025-10-16 15:52 ` Tom Tromey
0 siblings, 0 replies; 10+ messages in thread
From: Tom Tromey @ 2025-10-16 15:52 UTC (permalink / raw)
To: William Ferreira; +Cc: Tom Tromey, gdb-patches
>>>>> William Ferreira <wqferr@gmail.com> writes:
> Copyright is already assigned, everything is ready. I can convert the
> tests in a following commit, along with a few others.
Cool, thanks.
> Regarding your concerns as to whether that section will work properly,
> could you please point me to a DWARF assembler test that uses this new
> syntax? I might be able to fix this implementation before the rebase.
Basically all of them. See commit f78ff4d0.
Tom
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-16 7:14 ` Tom de Vries
@ 2025-10-16 15:55 ` Tom Tromey
2025-10-20 20:52 ` William Ferreira
0 siblings, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2025-10-16 15:55 UTC (permalink / raw)
To: Tom de Vries; +Cc: William Ferreira, gdb-patches
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
Tom> The problem is that gcc 15 is using dwarf6 attribute
Tom> DW_AT_language_name with value 144 (0x90), which is unknown to the
Tom> script.
FWIW my inclination is to land the script so that we can apply fixes
like yours directly.
Tom> {DW_AT_144 4 DW_FORM_data1}
Hex here would be nicer since then it would be easier to find in
dwarf2.def. Though I guess not in this particular case since the DWARF
6 patch hasn't apparently been merged over from GCC.
Tom
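Editorial sketch of the hex suggestion above, so the synthesized name lines
up with the hexadecimal codes used in dwarf2.def. The function name
format_unknown_attr and its parameters are hypothetical and not part of the
script. Whether the DWARF assembler accepts a name of this shape is not
established in this thread, so treat the output as a readable placeholder
only.

```python
def format_unknown_attr(code, value, form):
    """Format an attribute whose integer code has no known name.

    Hypothetical helper: renders the code in hex (144 -> 0x90) and
    wraps the result in braces like the script's other attribute lines.
    """
    return "{" + f"DW_AT_{code:#x} {value} {form}" + "}"


print(format_unknown_attr(144, 4, "DW_FORM_data1"))
print(format_unknown_attr(145, 201703, "DW_FORM_data4"))
```

For the two unknown attributes reported earlier in the thread, this gives
names ending in 0x90 and 0x91 instead of 144 and 145.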
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-16 15:55 ` Tom Tromey
@ 2025-10-20 20:52 ` William Ferreira
2025-10-21 17:17 ` Tom Tromey
0 siblings, 1 reply; 10+ messages in thread
From: William Ferreira @ 2025-10-20 20:52 UTC (permalink / raw)
To: Tom Tromey; +Cc: Tom de Vries, gdb-patches
I agree we should merge this as soon as possible, but I've been told
by a friend who also works on GDB to say that "I do not have write
after approval".
William.
On Thu, Oct 16, 2025 at 12:55 PM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>
> Tom> The problem is that gcc 15 is using dwarf6 attribute
> Tom> DW_AT_language_name with value 144 (0x90), which is unknown to the
> Tom> script.
>
> FWIW my inclination is to land the script so that we can apply fixes
> like yours directly.
>
> Tom> {DW_AT_144 4 DW_FORM_data1}
>
> Hex here would be nicer since then it would be easier to find in
> dwarf2.def. Though I guess not in this particular case since the DWARF
> 6 patch hasn't apparently been merged over from GCC.
>
> Tom
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-20 20:52 ` William Ferreira
@ 2025-10-21 17:17 ` Tom Tromey
2025-10-21 19:18 ` Tom de Vries
0 siblings, 1 reply; 10+ messages in thread
From: Tom Tromey @ 2025-10-21 17:17 UTC (permalink / raw)
To: William Ferreira; +Cc: Tom Tromey, Tom de Vries, gdb-patches
William> I agree we should merge this as soon as possible, but I've been told
William> by a friend who also works on GDB to say that "I do not have write
William> after approval".
I'm checking it in now. 'pre-commit' found some formatting issues but
also fixed them up, so I'm actually committing the updated version.
Tom, could you then check in your follow-up patches? Thanks. Normally
I wouldn't land something in a state requiring immediate fixes, but I
think in this case it seems a little simpler, and also harmless since
this is a contrib script.
Tom
* Re: [PATCH v3] [gdb] Create script to convert old tests into Dwarf::assemble calls.
2025-10-21 17:17 ` Tom Tromey
@ 2025-10-21 19:18 ` Tom de Vries
0 siblings, 0 replies; 10+ messages in thread
From: Tom de Vries @ 2025-10-21 19:18 UTC (permalink / raw)
To: Tom Tromey, William Ferreira; +Cc: gdb-patches
On 10/21/25 7:17 PM, Tom Tromey wrote:
> William> I agree we should merge this as soon as possible, but I've been told
> William> by a friend who also works on GDB to say that "I do not have write
> William> after approval".
>
> I'm checking it in now. 'pre-commit' found some formatting issues but
> also fixed them up, so I'm actually committing the updated version.
>
> Tom, could you then check in your follow-up patches? Thanks. Normally
> I wouldn't land something in a state requiring immediate fixes, but I
> think in this case it seems a little simpler, and also harmless since
> this is a contrib script.
Agreed, will do.
Thanks,
- Tom