[PATCH v2 11/28] Rewrite the .gdb_index reader

From: Tom Tromey <tom@tromey.com>
To: gdb-patches@sourceware.org
Cc: Tom Tromey <tom@tromey.com>
Subject: [PATCH v2 11/28] Rewrite the .gdb_index reader
Date: Wed, 02 Apr 2025 17:45:10 -0600
Message-ID: <20250402-search-in-psyms-v2-11-ea91704487cb@tromey.com>
In-Reply-To: <20250402-search-in-psyms-v2-0-ea91704487cb@tromey.com>

This patch rewrites the .gdb_index reader to create the same data
structures that are created by the cooked indexer and the .debug_names
reader.

This is done in support of this series, but also because, from what I
can tell, the "templates.exp" change didn't really work properly with
this reader.

In addition to fixing that problem, this patch removes a lot of code.

Implementing this required a few hacks and compromises, as .gdb_index
does not contain all the information that's used by the cooked index
implementation.

* The index-searching code likes to differentiate between the various
  DWARF tags when matching, but .gdb_index lumps many things into a
  single "other" category.  To handle this, we introduce a phony tag
  that's used so that the match method can match on multiple domains.

* Similarly, .gdb_index doesn't distinguish between the type and
  struct domains, so another phony tag is used for this.

* Support for older versions of .gdb_index is removed entirely.

* The reader must attempt to guess the language of various symbols.
  This is somewhat finicky.  "Plain" (unqualified) symbols are marked
  as language_unknown and then a couple of hacks are used to handle
  these -- one in expand_symtabs_matching and another when recognizing
  "main".
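
As a rough illustration of the guessing described in the last item,
here is a standalone sketch.  It is not the code from the patch: the
"guessed_lang" enum and the "guess_language" function are names made
up for this example, and the '.' check is a simplification of the
component-splitting the reader actually does.  It only shows the order
in which the spelling of a name is examined.

```cpp
#include <cstring>

/* Hypothetical stand-in for gdb's "enum language"; only the values
   needed for this illustration are listed.  */
enum class guessed_lang { unknown, cplus, ada, go };

/* Guess the language of NAME from its spelling alone, checking the
   most specific markers first.  */
guessed_lang
guess_language (const char *name)
{
  /* "::" is taken as a C++-style scope separator.  */
  if (std::strstr (name, "::") != nullptr)
    return guessed_lang::cplus;

  /* '<' suggests a template, so again assume C++.  */
  if (std::strchr (name, '<') != nullptr)
    return guessed_lang::cplus;

  /* "__" suggests an Ada encoded name.  */
  if (std::strstr (name, "__") != nullptr)
    return guessed_lang::ada;

  /* Simplification for this sketch: any '.' is treated as a
     separator in a dotted (e.g. Go) name.  */
  if (std::strchr (name, '.') != nullptr)
    return guessed_lang::go;

  /* A plain name carries no hint at all.  */
  return guessed_lang::unknown;
}
```

With this sketch a plain "main" comes back as unknown, which is the
case the two hacks mentioned above have to deal with.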

For what it's worth, I consider .gdb_index to be near the end of its
life.  While .debug_names is not perfect -- we found a number of bugs
in the standard while implementing it -- it is better than .gdb_index
and also better documented.

After this patch, we could conceivably remove dwarf_scanner_base.
However, I have not done this.

Finally, this patch also changes this reader to dump the content of
the index, as the other DWARF readers do.  This can be handy when
debugging gdb.
---
 gdb/dwarf2/cooked-index-shard.c  |   11 +-
 gdb/dwarf2/cooked-index-worker.h |   10 +-
 gdb/dwarf2/read-gdb-index.c      | 1270 +++++++-------------------------------
 gdb/dwarf2/read-gdb-index.h      |   14 +
 gdb/dwarf2/read.c                |   58 +-
 gdb/dwarf2/tag.h                 |    8 +
 6 files changed, 302 insertions(+), 1069 deletions(-)

diff --git a/gdb/dwarf2/cooked-index-shard.c b/gdb/dwarf2/cooked-index-shard.c
index 29a8aea513786e4c1c1ed77dee8610fc329d1c8a..888a0fa345b9124e2814aa722e71a9d2fd5adf8e 100644
--- a/gdb/dwarf2/cooked-index-shard.c
+++ b/gdb/dwarf2/cooked-index-shard.c
@@ -86,7 +86,16 @@ cooked_index_shard::add (sect_offset die_offset, enum dwarf_tag tag,
      implicit "main" discovery.  */
   if ((flags & IS_MAIN) != 0)
     m_main = result;
-  else if ((flags & IS_PARENT_DEFERRED) == 0
+  /* The language check here is subtle: it exists solely to work
+     around a bug in .gdb_index.  That index does not record
+     languages, but it might emit an entry for "main".  However,
+     recognizing this "main" as being the main program would be wrong
+     -- for example, an Ada program has a C "main" but this is not the
+     desired target of the "start" command.  Requiring the language to
+     be set here avoids over-eagerly setting the "main" when using
+     .gdb_index.  Should .gdb_index ever be removed (PR symtab/31363),
+     the language_unknown check here could also be removed.  */
+  else if (lang != language_unknown
 	   && parent_entry.resolved == nullptr
 	   && m_main == nullptr
 	   && language_may_use_plain_main (lang)
diff --git a/gdb/dwarf2/cooked-index-worker.h b/gdb/dwarf2/cooked-index-worker.h
index c6e005c14dea1bb891680969c17172d494a7de46..a6b4780373c54a51760e46997a24aca57c340969 100644
--- a/gdb/dwarf2/cooked-index-worker.h
+++ b/gdb/dwarf2/cooked-index-worker.h
@@ -108,6 +108,12 @@ class cooked_index_worker_result
     return &m_addrmap;
   }
 
+  /* Set the mutable addrmap.  */
+  void set_addrmap (addrmap_mutable new_map)
+  {
+    m_addrmap = std::move (new_map);
+  }
+
   /* Return the parent_map that is currently being created.  */
   parent_map *get_parent_map ()
   {
@@ -203,10 +209,10 @@ enum class cooked_state
 /* An object of this type controls the scanning of the DWARF.  It
    schedules the worker tasks and tracks the current state.  Once
    scanning is done, this object is discarded.
-
+   
    This is an abstract base class that defines the basic behavior of
    scanners.  Separate concrete implementations exist for scanning
-   .debug_names and .debug_info.  */
+   .debug_names, .gdb_index, and .debug_info.  */
 
 class cooked_index_worker
 {
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 7bcf50d347609f0b27068228cd78633dd57e1556..1d7704b8c5bb7b5487b2373ea064c11ab1630e72 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -27,16 +27,68 @@
 #include "event-top.h"
 #include "gdb/gdb-index.h"
 #include "gdbsupport/gdb-checked-static-cast.h"
-#include "mapped-index.h"
+#include "cooked-index.h"
 #include "read.h"
 #include "extract-store-integer.h"
 #include "cp-support.h"
 #include "symtab.h"
 #include "gdbsupport/selftest.h"
+#include "tag.h"
+#include "gdbsupport/gdb-safe-ctype.h"
 
 /* When true, do not reject deprecated .gdb_index sections.  */
 static bool use_deprecated_index_sections = false;
 
+struct dwarf2_gdb_index : public cooked_index_functions
+{
+  /* This dumps minimal information about the index.
+     It is called via "mt print objfiles".
+     One use is to verify .gdb_index has been loaded by the
+     gdb.dwarf2/gdb-index.exp testcase.  */
+  void dump (struct objfile *objfile) override;
+};
+
+/* This is like a cooked index, but as it has been ingested from
+   .gdb_index, it can't be used to write out an index.  */
+
+class cooked_gdb_index : public cooked_index
+{
+public:
+
+  cooked_gdb_index (std::unique_ptr<cooked_index_worker> worker,
+		    int version)
+    : cooked_index (std::move (worker)),
+      version (version)
+  { }
+
+  cooked_index *index_for_writing () override
+  { return nullptr; }
+
+  quick_symbol_functions_up make_quick_functions () const override
+  { return quick_symbol_functions_up (new dwarf2_gdb_index); }
+
+  bool version_check () const override
+  {
+    return version >= 8;
+  }
+
+  /* Index data format version.  */
+  int version;
+};
+
+/* See above.  */
+
+void
+dwarf2_gdb_index::dump (struct objfile *objfile)
+{
+  dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
+
+  cooked_gdb_index *index = (gdb::checked_static_cast<cooked_gdb_index *>
+			     (per_objfile->per_bfd->index_table.get ()));
+  gdb_printf (".gdb_index: version %d\n", index->version);
+  cooked_index_functions::dump (objfile);
+}
+
 /* This is a view into the index that converts from bytes to an
    offset_type, and allows indexing.  Unaligned bytes are specifically
    allowed here, and handled via unpacking.  */
@@ -77,43 +129,11 @@ class offset_view
   gdb::array_view<const gdb_byte> m_bytes;
 };
 
-/* An index into a (C++) symbol name component in a symbol name as
-   recorded in the mapped_index's symbol table.  For each C++ symbol
-   in the symbol table, we record one entry for the start of each
-   component in the symbol in a table of name components, and then
-   sort the table, in order to be able to binary search symbol names,
-   ignoring leading namespaces, both completion and regular look up.
-   For example, for symbol "A::B::C", we'll have an entry that points
-   to "A::B::C", another that points to "B::C", and another for "C".
-   Note that function symbols in GDB index have no parameter
-   information, just the function/method names.  You can convert a
-   name_component to a "const char *" using the
-   'mapped_index::symbol_name_at(offset_type)' method.  */
-
-struct name_component
-{
-  /* Offset in the symbol name where the component starts.  Stored as
-     a (32-bit) offset instead of a pointer to save memory and improve
-     locality on 64-bit architectures.  */
-  offset_type name_offset;
-
-  /* The symbol's index in the symbol and constant pool tables of a
-     mapped_index.  */
-  offset_type idx;
-};
-
-/* A description of .gdb_index index.  The file format is described in
-   a comment by the code that writes the index.  */
+/* A worker for reading .gdb_index.  The file format is described in
+   the manual.  */
 
-struct mapped_gdb_index : public dwarf_scanner_base
+struct mapped_gdb_index
 {
-  /* The name_component table (a sorted vector).  See name_component's
-     description above.  */
-  std::vector<name_component> name_components;
-
-  /* How NAME_COMPONENTS is sorted.  */
-  enum case_sensitivity name_components_casing;
-
   /* Index data format version.  */
   int version = 0;
 
@@ -132,6 +152,14 @@ struct mapped_gdb_index : public dwarf_scanner_base
   /* An address map that maps from PC to dwarf2_per_cu.  */
   addrmap_fixed *index_addrmap = nullptr;
 
+  /* The name of 'main', or nullptr if not known.  */
+  const char *main_name = nullptr;
+  /* The language of 'main', if known.  */
+  enum language main_lang = language_minimal;
+
+  /* The result we're constructing.  */
+  cooked_index_worker_result result;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -148,221 +176,41 @@ struct mapped_gdb_index : public dwarf_scanner_base
 
   /* Return whether the name at IDX in the symbol table should be
      ignored.  */
-  virtual bool symbol_name_slot_invalid (offset_type idx) const
+  bool symbol_name_slot_invalid (offset_type idx) const
   {
-    return (symbol_name_index (idx) == 0
-	    && symbol_vec_index (idx) == 0);
+    return symbol_name_index (idx) == 0 && symbol_vec_index (idx) == 0;
   }
 
   /* Convenience method to get at the name of the symbol at IDX in the
      symbol table.  */
-  virtual const char *symbol_name_at
-    (offset_type idx, dwarf2_per_objfile *per_objfile) const
+  const char *symbol_name_at (offset_type idx,
+			      dwarf2_per_objfile *per_objfile) const
   {
     return (const char *) (this->constant_pool.data ()
 			   + symbol_name_index (idx));
   }
 
-  virtual size_t symbol_name_count () const
+  size_t symbol_name_count () const
   { return this->symbol_table.size () / 2; }
 
+  /* Set the name and language of the main function from the shortcut
+     table.  */
+  void set_main_name (dwarf2_per_objfile *per_objfile);
+
   /* Build the symbol name component sorted vector, if we haven't
      yet.  */
   void build_name_components (dwarf2_per_objfile *per_objfile);
-
-  /* Returns the lower (inclusive) and upper (exclusive) bounds of the
-     possible matches for LN_NO_PARAMS in the name component
-     vector.  */
-  std::pair<std::vector<name_component>::const_iterator,
-	    std::vector<name_component>::const_iterator>
-    find_name_components_bounds (const lookup_name_info &ln_no_params,
-				 enum language lang,
-				 dwarf2_per_objfile *per_objfile) const;
-
-  quick_symbol_functions_up make_quick_functions () const override;
-
-  bool version_check () const override
-  {
-    return version >= 8;
-  }
-
-  dwarf2_per_cu *lookup (unrelocated_addr addr) override
-  {
-    if (index_addrmap == nullptr)
-      return nullptr;
-
-    void *obj = index_addrmap->find (static_cast<CORE_ADDR> (addr));
-    return static_cast<dwarf2_per_cu *> (obj);
-  }
-
-  cooked_index *index_for_writing () override
-  { return nullptr; }
 };
 
-
-/* Starting from a search name, return the string that finds the upper
-   bound of all strings that start with SEARCH_NAME in a sorted name
-   list.  Returns the empty string to indicate that the upper bound is
-   the end of the list.  */
-
-static std::string
-make_sort_after_prefix_name (const char *search_name)
-{
-  /* When looking to complete "func", we find the upper bound of all
-     symbols that start with "func" by looking for where we'd insert
-     the closest string that would follow "func" in lexicographical
-     order.  Usually, that's "func"-with-last-character-incremented,
-     i.e. "fund".  Mind non-ASCII characters, though.  Usually those
-     will be UTF-8 multi-byte sequences, but we can't be certain.
-     Especially mind the 0xff character, which is a valid character in
-     non-UTF-8 source character sets (e.g. Latin1 'ÿ'), and we can't
-     rule out compilers allowing it in identifiers.  Note that
-     conveniently, strcmp/strcasecmp are specified to compare
-     characters interpreted as unsigned char.  So what we do is treat
-     the whole string as a base 256 number composed of a sequence of
-     base 256 "digits" and add 1 to it.  I.e., adding 1 to 0xff wraps
-     to 0, and carries 1 to the following more-significant position.
-     If the very first character in SEARCH_NAME ends up incremented
-     and carries/overflows, then the upper bound is the end of the
-     list.  The string after the empty string is also the empty
-     string.
-
-     Some examples of this operation:
-
-       SEARCH_NAME  => "+1" RESULT
-
-       "abc"              => "abd"
-       "ab\xff"           => "ac"
-       "\xff" "a" "\xff"  => "\xff" "b"
-       "\xff"             => ""
-       "\xff\xff"         => ""
-       ""                 => ""
-
-     Then, with these symbols for example:
-
-      func
-      func1
-      fund
-
-     completing "func" looks for symbols between "func" and
-     "func"-with-last-character-incremented, i.e. "fund" (exclusive),
-     which finds "func" and "func1", but not "fund".
-
-     And with:
-
-      funcÿ     (Latin1 'ÿ' [0xff])
-      funcÿ1
-      fund
-
-     completing "funcÿ" looks for symbols between "funcÿ" and "fund"
-     (exclusive), which finds "funcÿ" and "funcÿ1", but not "fund".
-
-     And with:
-
-      ÿÿ        (Latin1 'ÿ' [0xff])
-      ÿÿ1
-
-     completing "ÿ" or "ÿÿ" looks for symbols between between "ÿÿ" and
-     the end of the list.
-  */
-  std::string after = search_name;
-  while (!after.empty () && (unsigned char) after.back () == 0xff)
-    after.pop_back ();
-  if (!after.empty ())
-    after.back () = (unsigned char) after.back () + 1;
-  return after;
-}
-
-/* See declaration.  */
-
-std::pair<std::vector<name_component>::const_iterator,
-	  std::vector<name_component>::const_iterator>
-mapped_gdb_index::find_name_components_bounds
-  (const lookup_name_info &lookup_name_without_params, language lang,
-   dwarf2_per_objfile *per_objfile) const
-{
-  auto *name_cmp
-    = this->name_components_casing == case_sensitive_on ? strcmp : strcasecmp;
-
-  const char *lang_name
-    = lookup_name_without_params.language_lookup_name (lang);
-
-  /* Comparison function object for lower_bound that matches against a
-     given symbol name.  */
-  auto lookup_compare_lower = [&] (const name_component &elem,
-				   const char *name)
-    {
-      const char *elem_qualified = this->symbol_name_at (elem.idx, per_objfile);
-      const char *elem_name = elem_qualified + elem.name_offset;
-      return name_cmp (elem_name, name) < 0;
-    };
-
-  /* Comparison function object for upper_bound that matches against a
-     given symbol name.  */
-  auto lookup_compare_upper = [&] (const char *name,
-				   const name_component &elem)
-    {
-      const char *elem_qualified = this->symbol_name_at (elem.idx, per_objfile);
-      const char *elem_name = elem_qualified + elem.name_offset;
-      return name_cmp (name, elem_name) < 0;
-    };
-
-  auto begin = this->name_components.begin ();
-  auto end = this->name_components.end ();
-
-  /* Find the lower bound.  */
-  auto lower = [&] ()
-    {
-      if (lookup_name_without_params.completion_mode () && lang_name[0] == '\0')
-	return begin;
-      else
-	return std::lower_bound (begin, end, lang_name, lookup_compare_lower);
-    } ();
-
-  /* Find the upper bound.  */
-  auto upper = [&] ()
-    {
-      if (lookup_name_without_params.completion_mode ())
-	{
-	  /* In completion mode, we want UPPER to point past all
-	     symbols names that have the same prefix.  I.e., with
-	     these symbols, and completing "func":
-
-	      function        << lower bound
-	      function1
-	      other_function  << upper bound
-
-	     We find the upper bound by looking for the insertion
-	     point of "func"-with-last-character-incremented,
-	     i.e. "fund".  */
-	  std::string after = make_sort_after_prefix_name (lang_name);
-	  if (after.empty ())
-	    return end;
-	  return std::lower_bound (lower, end, after.c_str (),
-				   lookup_compare_lower);
-	}
-      else
-	return std::upper_bound (lower, end, lang_name, lookup_compare_upper);
-    } ();
-
-  return {lower, upper};
-}
-
 /* See declaration.  */
 
 void
 mapped_gdb_index::build_name_components (dwarf2_per_objfile *per_objfile)
 {
-  if (!this->name_components.empty ())
-    return;
-
-  this->name_components_casing = case_sensitivity;
-  auto *name_cmp
-    = this->name_components_casing == case_sensitive_on ? strcmp : strcasecmp;
+  std::vector<std::pair<std::string_view, std::vector<cooked_index_entry *>>>
+    need_parents;
+  gdb::unordered_map<std::string_view, cooked_index_entry *> by_name;
 
-  /* The code below only knows how to break apart components of C++
-     symbol names (and other languages that use '::' as
-     namespace/module separator) and Ada symbol names.  */
   auto count = this->symbol_name_count ();
   for (offset_type idx = 0; idx < count; idx++)
     {
@@ -371,815 +219,186 @@ mapped_gdb_index::build_name_components (dwarf2_per_objfile *per_objfile)
 
       const char *name = this->symbol_name_at (idx, per_objfile);
 
-      /* Add each name component to the name component table.  */
-      unsigned int previous_len = 0;
+      /* This code only knows how to break apart components of C++
+	 symbol names (and other languages that use '::' as
+	 namespace/module separator) and Ada symbol names.
 
+	 It's unfortunate that we need the language, but since it is
+	 really only used to rebuild full names, pairing it with the
+	 split method is fine.  */
+      enum language lang;
+      std::vector<std::string_view> components;
       if (strstr (name, "::") != nullptr)
 	{
-	  for (unsigned int current_len = cp_find_first_component (name);
-	       name[current_len] != '\0';
-	       current_len += cp_find_first_component (name + current_len))
-	    {
-	      gdb_assert (name[current_len] == ':');
-	      this->name_components.push_back ({previous_len, idx});
-	      /* Skip the '::'.  */
-	      current_len += 2;
-	      previous_len = current_len;
-	    }
+	  components = split_name (name, split_style::CXX);
+	  lang = language_cplus;
 	}
-      else
+      else if (strchr (name, '<') != nullptr)
 	{
-	  /* Handle the Ada encoded (aka mangled) form here.  */
-	  for (const char *iter = strstr (name, "__");
-	       iter != nullptr;
-	       iter = strstr (iter, "__"))
-	    {
-	      this->name_components.push_back ({previous_len, idx});
-	      iter += 2;
-	      previous_len = iter - name;
-	    }
+	  /* Guess that this is a template and so a C++ name.  */
+	  components.emplace_back (name);
+	  lang = language_cplus;
 	}
-
-      this->name_components.push_back ({previous_len, idx});
-    }
-
-  /* Sort name_components elements by name.  */
-  auto name_comp_compare = [&] (const name_component &left,
-				const name_component &right)
-    {
-      const char *left_qualified
-	= this->symbol_name_at (left.idx, per_objfile);
-      const char *right_qualified
-	= this->symbol_name_at (right.idx, per_objfile);
-
-      const char *left_name = left_qualified + left.name_offset;
-      const char *right_name = right_qualified + right.name_offset;
-
-      return name_cmp (left_name, right_name) < 0;
-    };
-
-  std::sort (this->name_components.begin (),
-	     this->name_components.end (),
-	     name_comp_compare);
-}
-
-/* Helper for dw2_expand_symtabs_matching that works with a
-   mapped_index_base instead of the containing objfile.  This is split
-   to a separate function in order to be able to unit test the
-   name_components matching using a mock mapped_index_base.  For each
-   symbol name that matches, calls MATCH_CALLBACK, passing it the
-   symbol's index in the mapped_index_base symbol table.  */
-
-static bool
-dw2_expand_symtabs_matching_symbol
-  (mapped_gdb_index &index,
-   const lookup_name_info &lookup_name_in,
-   expand_symtabs_symbol_matcher symbol_matcher,
-   gdb::function_view<bool (offset_type)> match_callback,
-   dwarf2_per_objfile *per_objfile,
-   expand_symtabs_lang_matcher lang_matcher)
-{
-  lookup_name_info lookup_name_without_params
-    = lookup_name_in.make_ignore_params ();
-
-  /* Build the symbol name component sorted vector, if we haven't
-     yet.  */
-  index.build_name_components (per_objfile);
-
-  /* The same symbol may appear more than once in the range though.
-     E.g., if we're looking for symbols that complete "w", and we have
-     a symbol named "w1::w2", we'll find the two name components for
-     that same symbol in the range.  To be sure we only call the
-     callback once per symbol, we first collect the symbol name
-     indexes that matched in a temporary vector and ignore
-     duplicates.  */
-  std::vector<offset_type> matches;
-
-  struct name_and_matcher
-  {
-    symbol_name_matcher_ftype *matcher;
-    const char *name;
-
-    bool operator== (const name_and_matcher &other) const
-    {
-      return matcher == other.matcher && strcmp (name, other.name) == 0;
-    }
-  };
-
-  /* A vector holding all the different symbol name matchers, for all
-     languages.  */
-  std::vector<name_and_matcher> matchers;
-
-  for (int i = 0; i < nr_languages; i++)
-    {
-      enum language lang_e = (enum language) i;
-      if (lang_matcher != nullptr && !lang_matcher (lang_e))
-	continue;
-
-      const language_defn *lang = language_def (lang_e);
-      symbol_name_matcher_ftype *name_matcher
-	= lang->get_symbol_name_matcher (lookup_name_without_params);
-
-      name_and_matcher key {
-	 name_matcher,
-	 lookup_name_without_params.language_lookup_name (lang_e)
-      };
-
-      /* Don't insert the same comparison routine more than once.
-	 Note that we do this linear walk.  This is not a problem in
-	 practice because the number of supported languages is
-	 low.  */
-      if (std::find (matchers.begin (), matchers.end (), key)
-	  != matchers.end ())
-	continue;
-      matchers.push_back (std::move (key));
-
-      auto bounds
-	= index.find_name_components_bounds (lookup_name_without_params,
-					     lang_e, per_objfile);
-
-      /* Now for each symbol name in range, check to see if we have a name
-	 match, and if so, call the MATCH_CALLBACK callback.  */
-
-      for (; bounds.first != bounds.second; ++bounds.first)
+      else if (strstr (name, "__") != nullptr)
 	{
-	  const char *qualified
-	    = index.symbol_name_at (bounds.first->idx, per_objfile);
-
-	  if (!name_matcher (qualified, lookup_name_without_params, NULL)
-	      || (symbol_matcher != NULL && !symbol_matcher (qualified)))
-	    continue;
-
-	  matches.push_back (bounds.first->idx);
+	  /* The Ada case is handled during finalization, because gdb
+	     does not write the synthesized package names into the
+	     index.  */
+	  components.emplace_back (name);
+	  lang = language_ada;
 	}
-    }
-
-  std::sort (matches.begin (), matches.end ());
-
-  /* Finally call the callback, once per match.  */
-  ULONGEST prev = -1;
-  bool result = true;
-  for (offset_type idx : matches)
-    {
-      if (prev != idx)
+      else
 	{
-	  if (!match_callback (idx))
-	    {
-	      result = false;
-	      break;
-	    }
-	  prev = idx;
+	  components = split_name (name, split_style::DOT_STYLE);
+	  /* Mark ordinary names as having an unknown language.  This
+	     is a hack to avoid problems with some Ada names.  */
+	  lang = (components.size () == 1) ? language_unknown : language_go;
 	}
-    }
-
-  /* Above we use a type wider than idx's for 'prev', since 0 and
-     (offset_type)-1 are both possible values.  */
-  static_assert (sizeof (prev) > sizeof (offset_type), "");
-
-  return result;
-}
-
-#if GDB_SELF_TEST
-
-namespace selftests { namespace dw2_expand_symtabs_matching {
-
-/* A mock .gdb_index/.debug_names-like name index table, enough to
-   exercise dw2_expand_symtabs_matching_symbol, which works with the
-   mapped_index_base interface.  Builds an index from the symbol list
-   passed as parameter to the constructor.  */
-class mock_mapped_index : public mapped_gdb_index
-{
-public:
-  mock_mapped_index (gdb::array_view<const char *> symbols)
-    : m_symbol_table (symbols)
-  {}
-
-  DISABLE_COPY_AND_ASSIGN (mock_mapped_index);
-
-  bool symbol_name_slot_invalid (offset_type idx) const override
-  { return false; }
-
-  /* Return the number of names in the symbol table.  */
-  size_t symbol_name_count () const override
-  {
-    return m_symbol_table.size ();
-  }
-
-  /* Get the name of the symbol at IDX in the symbol table.  */
-  const char *symbol_name_at
-    (offset_type idx, dwarf2_per_objfile *per_objfile) const override
-  {
-    return m_symbol_table[idx];
-  }
-
-  quick_symbol_functions_up make_quick_functions () const override
-  {
-    return nullptr;
-  }
-
-private:
-  gdb::array_view<const char *> m_symbol_table;
-};
-
-/* Convenience function that converts a NULL pointer to a "<null>"
-   string, to pass to print routines.  */
-
-static const char *
-string_or_null (const char *str)
-{
-  return str != NULL ? str : "<null>";
-}
-
-/* Check if a lookup_name_info built from
-   NAME/MATCH_TYPE/COMPLETION_MODE matches the symbols in the mock
-   index.  EXPECTED_LIST is the list of expected matches, in expected
-   matching order.  If no match expected, then an empty list is
-   specified.  Returns true on success.  On failure prints a warning
-   indicating the file:line that failed, and returns false.  */
-
-static bool
-check_match (const char *file, int line,
-	     mock_mapped_index &mock_index,
-	     const char *name, symbol_name_match_type match_type,
-	     bool completion_mode,
-	     std::initializer_list<const char *> expected_list,
-	     dwarf2_per_objfile *per_objfile)
-{
-  lookup_name_info lookup_name (name, match_type, completion_mode);
-
-  bool matched = true;
-
-  auto mismatch = [&] (const char *expected_str,
-		       const char *got)
-  {
-    warning (_("%s:%d: match_type=%s, looking-for=\"%s\", "
-	       "expected=\"%s\", got=\"%s\"\n"),
-	     file, line,
-	     (match_type == symbol_name_match_type::FULL
-	      ? "FULL" : "WILD"),
-	     name, string_or_null (expected_str), string_or_null (got));
-    matched = false;
-  };
-
-  auto expected_it = expected_list.begin ();
-  auto expected_end = expected_list.end ();
-
-  dw2_expand_symtabs_matching_symbol (mock_index, lookup_name,
-				      nullptr,
-				      [&] (offset_type idx)
-  {
-    const char *matched_name = mock_index.symbol_name_at (idx, per_objfile);
-    const char *expected_str
-      = expected_it == expected_end ? NULL : *expected_it++;
-
-    if (expected_str == NULL || strcmp (expected_str, matched_name) != 0)
-      mismatch (expected_str, matched_name);
-    return true;
-  }, per_objfile, nullptr);
-
-  const char *expected_str
-  = expected_it == expected_end ? NULL : *expected_it++;
-  if (expected_str != NULL)
-    mismatch (expected_str, NULL);
-
-  return matched;
-}
-
-/* The symbols added to the mock mapped_index for testing (in
-   canonical form).  */
-static const char *test_symbols[] = {
-  "function",
-  "std::bar",
-  "std::zfunction",
-  "std::zfunction2",
-  "w1::w2",
-  "ns::foo<char*>",
-  "ns::foo<int>",
-  "ns::foo<long>",
-  "ns2::tmpl<int>::foo2",
-  "(anonymous namespace)::A::B::C",
-
-  /* These are used to check that the increment-last-char in the
-     matching algorithm for completion doesn't match "t1_fund" when
-     completing "t1_func".  */
-  "t1_func",
-  "t1_func1",
-  "t1_fund",
-  "t1_fund1",
-
-  /* A UTF-8 name with multi-byte sequences to make sure that
-     cp-name-parser understands this as a single identifier ("função"
-     is "function" in PT).  */
-  (const char *)u8"u8função",
-
-  /* Test a symbol name that ends with a 0xff character, which is a
-     valid character in non-UTF-8 source character sets (e.g. Latin1
-     'ÿ'), and we can't rule out compilers allowing it in identifiers.
-     We test this because the completion algorithm finds the upper
-     bound of symbols by looking for the insertion point of
-     "func"-with-last-character-incremented, i.e. "fund", and adding 1
-     to 0xff should wraparound and carry to the previous character.
-     See comments in make_sort_after_prefix_name.  */
-  "yfunc\377",
-
-  /* Some more symbols with \377 (0xff).  See above.  */
-  "\377",
-  "\377\377123",
-
-  /* A name with all sorts of complications.  Starts with "z" to make
-     it easier for the completion tests below.  */
-#define Z_SYM_NAME \
-  "z::std::tuple<(anonymous namespace)::ui*, std::bar<(anonymous namespace)::ui> >" \
-    "::tuple<(anonymous namespace)::ui*, " \
-    "std::default_delete<(anonymous namespace)::ui>, void>"
-
-  Z_SYM_NAME
-};
-
-/* Returns true if the mapped_index_base::find_name_component_bounds
-   method finds EXPECTED_SYMS in INDEX when looking for SEARCH_NAME,
-   in completion mode.  */
-
-static bool
-check_find_bounds_finds (mapped_gdb_index &index,
-			 const char *search_name,
-			 gdb::array_view<const char *> expected_syms,
-			 dwarf2_per_objfile *per_objfile)
-{
-  lookup_name_info lookup_name (search_name,
-				symbol_name_match_type::FULL, true);
-
-  auto bounds = index.find_name_components_bounds (lookup_name,
-						   language_cplus,
-						   per_objfile);
-
-  size_t distance = std::distance (bounds.first, bounds.second);
-  if (distance != expected_syms.size ())
-    return false;
-
-  for (size_t exp_elem = 0; exp_elem < distance; exp_elem++)
-    {
-      auto nc_elem = bounds.first + exp_elem;
-      const char *qualified = index.symbol_name_at (nc_elem->idx, per_objfile);
-      if (strcmp (qualified, expected_syms[exp_elem]) != 0)
-	return false;
-    }
-
-  return true;
-}
-
-/* Test the lower-level mapped_index::find_name_component_bounds
-   method.  */
-
-static void
-test_mapped_index_find_name_component_bounds ()
-{
-  mock_mapped_index mock_index (test_symbols);
-
-  mock_index.build_name_components (NULL /* per_objfile */);
-
-  /* Test the lower-level mapped_index::find_name_component_bounds
-     method in completion mode.  */
-  {
-    static const char *expected_syms[] = {
-      "t1_func",
-      "t1_func1",
-    };
-
-    SELF_CHECK (check_find_bounds_finds
-		  (mock_index, "t1_func", expected_syms,
-		   NULL /* per_objfile */));
-  }
-
-  /* Check that the increment-last-char in the name matching algorithm
-     for completion doesn't get confused with Ansi1 'ÿ' / 0xff.  See
-     make_sort_after_prefix_name.  */
-  {
-    static const char *expected_syms1[] = {
-      "\377",
-      "\377\377123",
-    };
-    SELF_CHECK (check_find_bounds_finds
-		  (mock_index, "\377", expected_syms1, NULL /* per_objfile */));
-
-    static const char *expected_syms2[] = {
-      "\377\377123",
-    };
-    SELF_CHECK (check_find_bounds_finds
-		  (mock_index, "\377\377", expected_syms2,
-		   NULL /* per_objfile */));
-  }
-}
-
-/* Test dw2_expand_symtabs_matching_symbol.  */
-
-static void
-test_dw2_expand_symtabs_matching_symbol ()
-{
-  mock_mapped_index mock_index (test_symbols);
-
-  /* We let all tests run until the end even if some fails, for debug
-     convenience.  */
-  bool any_mismatch = false;
-
-  /* Create the expected symbols list (an initializer_list).  Needed
-     because lists have commas, and we need to pass them to CHECK,
-     which is a macro.  */
-#define EXPECT(...) { __VA_ARGS__ }
-
-  /* Wrapper for check_match that passes down the current
-     __FILE__/__LINE__.  */
-#define CHECK_MATCH(NAME, MATCH_TYPE, COMPLETION_MODE, EXPECTED_LIST)	\
-  any_mismatch |= !check_match (__FILE__, __LINE__,			\
-				mock_index,				\
-				NAME, MATCH_TYPE, COMPLETION_MODE,	\
-				EXPECTED_LIST, NULL)
-
-  /* Identity checks.  */
-  for (const char *sym : test_symbols)
-    {
-      /* Should be able to match all existing symbols.  */
-      CHECK_MATCH (sym, symbol_name_match_type::FULL, false,
-		   EXPECT (sym));
-
-      /* Should be able to match all existing symbols with
-	 parameters.  */
-      std::string with_params = std::string (sym) + "(int)";
-      CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false,
-		   EXPECT (sym));
-
-      /* Should be able to match all existing symbols with
-	 parameters and qualifiers.  */
-      with_params = std::string (sym) + " ( int ) const";
-      CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false,
-		   EXPECT (sym));
-
-      /* This should really find sym, but cp-name-parser.y doesn't
-	 know about lvalue/rvalue qualifiers yet.  */
-      with_params = std::string (sym) + " ( int ) &&";
-      CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false,
-		   {});
-    }
-
-  /* Check that the name matching algorithm for completion doesn't get
-     confused with Latin1 'ÿ' / 0xff.  See
-     make_sort_after_prefix_name.  */
-  {
-    static const char str[] = "\377";
-    CHECK_MATCH (str, symbol_name_match_type::FULL, true,
-		 EXPECT ("\377", "\377\377123"));
-  }
-
-  /* Check that the increment-last-char in the matching algorithm for
-     completion doesn't match "t1_fund" when completing "t1_func".  */
-  {
-    static const char str[] = "t1_func";
-    CHECK_MATCH (str, symbol_name_match_type::FULL, true,
-		 EXPECT ("t1_func", "t1_func1"));
-  }
-
-  /* Check that completion mode works at each prefix of the expected
-     symbol name.  */
-  {
-    static const char str[] = "function(int)";
-    size_t len = strlen (str);
-    std::string lookup;
-
-    for (size_t i = 1; i < len; i++)
-      {
-	lookup.assign (str, i);
-	CHECK_MATCH (lookup.c_str (), symbol_name_match_type::FULL, true,
-		     EXPECT ("function"));
-      }
-  }
-
-  /* While "w" is a prefix of both components, the match function
-     should still only be called once.  */
-  {
-    CHECK_MATCH ("w", symbol_name_match_type::FULL, true,
-		 EXPECT ("w1::w2"));
-    CHECK_MATCH ("w", symbol_name_match_type::WILD, true,
-		 EXPECT ("w1::w2"));
-  }
-
-  /* Same, with a "complicated" symbol.  */
-  {
-    static const char str[] = Z_SYM_NAME;
-    size_t len = strlen (str);
-    std::string lookup;
-
-    for (size_t i = 1; i < len; i++)
-      {
-	lookup.assign (str, i);
-	CHECK_MATCH (lookup.c_str (), symbol_name_match_type::FULL, true,
-		     EXPECT (Z_SYM_NAME));
-      }
-  }
-
-  /* In FULL mode, an incomplete symbol doesn't match.  */
-  {
-    CHECK_MATCH ("std::zfunction(int", symbol_name_match_type::FULL, false,
-		 {});
-  }
-
-  /* A complete symbol with parameters matches any overload, since the
-     index has no overload info.  */
-  {
-    CHECK_MATCH ("std::zfunction(int)", symbol_name_match_type::FULL, true,
-		 EXPECT ("std::zfunction", "std::zfunction2"));
-    CHECK_MATCH ("zfunction(int)", symbol_name_match_type::WILD, true,
-		 EXPECT ("std::zfunction", "std::zfunction2"));
-    CHECK_MATCH ("zfunc", symbol_name_match_type::WILD, true,
-		 EXPECT ("std::zfunction", "std::zfunction2"));
-  }
-
-  /* Check that whitespace is ignored appropriately.  A symbol with a
-     template argument list. */
-  {
-    static const char expected[] = "ns::foo<int>";
-    CHECK_MATCH ("ns :: foo < int > ", symbol_name_match_type::FULL, false,
-		 EXPECT (expected));
-    CHECK_MATCH ("foo < int > ", symbol_name_match_type::WILD, false,
-		 EXPECT (expected));
-  }
-
-  /* Check that whitespace is ignored appropriately.  A symbol with a
-     template argument list that includes a pointer.  */
-  {
-    static const char expected[] = "ns::foo<char*>";
-    /* Try both completion and non-completion modes.  */
-    static const bool completion_mode[2] = {false, true};
-    for (size_t i = 0; i < 2; i++)
-      {
-	CHECK_MATCH ("ns :: foo < char * >", symbol_name_match_type::FULL,
-		     completion_mode[i], EXPECT (expected));
-	CHECK_MATCH ("foo < char * >", symbol_name_match_type::WILD,
-		     completion_mode[i], EXPECT (expected));
-
-	CHECK_MATCH ("ns :: foo < char * > (int)", symbol_name_match_type::FULL,
-		     completion_mode[i], EXPECT (expected));
-	CHECK_MATCH ("foo < char * > (int)", symbol_name_match_type::WILD,
-		     completion_mode[i], EXPECT (expected));
-      }
-  }
-
-  {
-    /* Check method qualifiers are ignored.  */
-    static const char expected[] = "ns::foo<char*>";
-    CHECK_MATCH ("ns :: foo < char * >  ( int ) const",
-		 symbol_name_match_type::FULL, true, EXPECT (expected));
-    CHECK_MATCH ("ns :: foo < char * >  ( int ) &&",
-		 symbol_name_match_type::FULL, true, EXPECT (expected));
-    CHECK_MATCH ("foo < char * >  ( int ) const",
-		 symbol_name_match_type::WILD, true, EXPECT (expected));
-    CHECK_MATCH ("foo < char * >  ( int ) &&",
-		 symbol_name_match_type::WILD, true, EXPECT (expected));
-  }
-
-  /* Test lookup names that don't match anything.  */
-  {
-    CHECK_MATCH ("bar2", symbol_name_match_type::WILD, false,
-		 {});
-
-    CHECK_MATCH ("doesntexist", symbol_name_match_type::FULL, false,
-		 {});
-  }
-
-  /* Some wild matching tests, exercising "(anonymous namespace)",
-     which should not be confused with a parameter list.  */
-  {
-    static const char *syms[] = {
-      "A::B::C",
-      "B::C",
-      "C",
-      "A :: B :: C ( int )",
-      "B :: C ( int )",
-      "C ( int )",
-    };
-
-    for (const char *s : syms)
-      {
-	CHECK_MATCH (s, symbol_name_match_type::WILD, false,
-		     EXPECT ("(anonymous namespace)::A::B::C"));
-      }
-  }
-
-  {
-    static const char expected[] = "ns2::tmpl<int>::foo2";
-    CHECK_MATCH ("tmp", symbol_name_match_type::WILD, true,
-		 EXPECT (expected));
-    CHECK_MATCH ("tmpl<", symbol_name_match_type::WILD, true,
-		 EXPECT (expected));
-  }
-
-  SELF_CHECK (!any_mismatch);
 
-#undef EXPECT
-#undef CHECK_MATCH
-}
-
-static void
-run_test ()
-{
-  test_mapped_index_find_name_component_bounds ();
-  test_dw2_expand_symtabs_matching_symbol ();
-}
-
-}} /* namespace selftests::dw2_expand_symtabs_matching */
-
-#endif /* GDB_SELF_TEST */
-
-struct dwarf2_gdb_index : public dwarf2_base_index_functions
-{
-  /* This dumps minimal information about the index.
-     It is called via "mt print objfiles".
-     One use is to verify .gdb_index has been loaded by the
-     gdb.dwarf2/gdb-index.exp testcase.  */
-  void dump (struct objfile *objfile) override;
-
-  bool expand_symtabs_matching
-    (struct objfile *objfile,
-     expand_symtabs_file_matcher file_matcher,
-     const lookup_name_info *lookup_name,
-     expand_symtabs_symbol_matcher symbol_matcher,
-     expand_symtabs_expansion_listener expansion_notify,
-     block_search_flags search_flags,
-     domain_search_flags domain,
-     expand_symtabs_lang_matcher lang_matcher) override;
-};
-
-/* This dumps minimal information about the index.
-   It is called via "mt print objfiles".
-   One use is to verify .gdb_index has been loaded by the
-   gdb.dwarf2/gdb-index.exp testcase.  */
-
-void
-dwarf2_gdb_index::dump (struct objfile *objfile)
-{
-  dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
-
-  mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
-			     (per_objfile->per_bfd->index_table.get ()));
-  gdb_printf (".gdb_index: version %d\n", index->version);
-  gdb_printf ("\n");
-}
-
-/* Helper for dw2_expand_matching symtabs.  Called on each symbol
-   matched, to expand corresponding CUs that were marked.  IDX is the
-   index of the symbol name that matched.  */
-
-static bool
-dw2_expand_marked_cus (dwarf2_per_objfile *per_objfile, offset_type idx,
-		       auto_bool_vector &marked,
-		       expand_symtabs_file_matcher file_matcher,
-		       expand_symtabs_expansion_listener expansion_notify,
-		       block_search_flags search_flags,
-		       domain_search_flags kind,
-		       expand_symtabs_lang_matcher lang_matcher)
-{
-  offset_type vec_len, vec_idx;
-  bool global_seen = false;
-  mapped_gdb_index &index
-    = *(gdb::checked_static_cast<mapped_gdb_index *>
-	(per_objfile->per_bfd->index_table.get ()));
-
-  offset_view vec (index.constant_pool.slice (index.symbol_vec_index (idx)));
-  vec_len = vec[0];
-  for (vec_idx = 0; vec_idx < vec_len; ++vec_idx)
-    {
-      offset_type cu_index_and_attrs = vec[vec_idx + 1];
-      /* This value is only valid for index versions >= 7.  */
-      int is_static = GDB_INDEX_SYMBOL_STATIC_VALUE (cu_index_and_attrs);
-      gdb_index_symbol_kind symbol_kind =
-	GDB_INDEX_SYMBOL_KIND_VALUE (cu_index_and_attrs);
-      int cu_index = GDB_INDEX_CU_VALUE (cu_index_and_attrs);
-      /* Only check the symbol attributes if they're present.
-	 Indices prior to version 7 don't record them,
-	 and indices >= 7 may elide them for certain symbols
-	 (gold does this).  */
-      int attrs_valid =
-	(index.version >= 7
-	 && symbol_kind != GDB_INDEX_SYMBOL_KIND_NONE);
-
-      /* Work around gold/15646.  */
-      if (attrs_valid
-	  && !is_static
-	  && symbol_kind == GDB_INDEX_SYMBOL_KIND_TYPE)
+      std::vector<cooked_index_entry *> these_entries;
+      offset_view vec (constant_pool.slice (symbol_vec_index (idx)));
+      offset_type vec_len = vec[0];
+      bool global_seen = false;
+      for (offset_type vec_idx = 0; vec_idx < vec_len; ++vec_idx)
 	{
-	  if (global_seen)
+	  offset_type cu_index_and_attrs = vec[vec_idx + 1];
+	  gdb_index_symbol_kind symbol_kind
+	    = GDB_INDEX_SYMBOL_KIND_VALUE (cu_index_and_attrs);
+	  /* Only use a symbol if the attributes are present.  Indices
+	     prior to version 7 don't record them, and indices >= 7
+	     may elide them for certain symbols (gold does this).  */
+	  if (symbol_kind == GDB_INDEX_SYMBOL_KIND_NONE)
 	    continue;
 
-	  global_seen = true;
-	}
+	  int is_static = GDB_INDEX_SYMBOL_STATIC_VALUE (cu_index_and_attrs);
 
-      /* Only check the symbol's kind if it has one.  */
-      if (attrs_valid)
-	{
-	  if (is_static)
-	    {
-	      if ((search_flags & SEARCH_STATIC_BLOCK) == 0)
-		continue;
-	    }
-	  else
+	  int cu_index = GDB_INDEX_CU_VALUE (cu_index_and_attrs);
+	  /* Don't crash on bad data.  */
+	  if (cu_index >= per_objfile->per_bfd->all_units.size ())
 	    {
-	      if ((search_flags & SEARCH_GLOBAL_BLOCK) == 0)
-		continue;
+	      complaint (_(".gdb_index entry has bad CU index"
+			   " [in module %s]"),
+			 objfile_name (per_objfile->objfile));
+	      continue;
 	    }
+	  dwarf2_per_cu *per_cu = per_objfile->per_bfd->get_cu (cu_index);
 
-	  domain_search_flags mask = 0;
+	  enum language this_lang = lang;
+	  dwarf_tag tag;
 	  switch (symbol_kind)
 	    {
 	    case GDB_INDEX_SYMBOL_KIND_VARIABLE:
-	      mask = SEARCH_VAR_DOMAIN;
+	      tag = DW_TAG_variable;
 	      break;
 	    case GDB_INDEX_SYMBOL_KIND_FUNCTION:
-	      mask = SEARCH_FUNCTION_DOMAIN;
+	      tag = DW_TAG_subprogram;
 	      break;
 	    case GDB_INDEX_SYMBOL_KIND_TYPE:
-	      mask = SEARCH_TYPE_DOMAIN | SEARCH_STRUCT_DOMAIN;
+	      if (is_static)
+		tag = (dwarf_tag) DW_TAG_GDB_INDEX_TYPE;
+	      else
+		{
+		  /* Work around gold/15646.  */
+		  if (global_seen)
+		    continue;
+		  global_seen = true;
+
+		  tag = DW_TAG_structure_type;
+		  this_lang = language_cplus;
+		}
 	      break;
+	      /* The "default" should not happen, but we mention it to
+		 avoid an uninitialized warning.  */
+	    default:
 	    case GDB_INDEX_SYMBOL_KIND_OTHER:
-	      mask = SEARCH_MODULE_DOMAIN | SEARCH_TYPE_DOMAIN;
+	      tag = (dwarf_tag) DW_TAG_GDB_INDEX_OTHER;
 	      break;
 	    }
-	  if ((kind & mask) == 0)
-	    continue;
+
+	  cooked_index_flag flags = 0;
+	  if (is_static)
+	    flags |= IS_STATIC;
+	  if (main_name != nullptr
+	      && tag == DW_TAG_subprogram
+	      && strcmp (name, main_name) == 0)
+	    {
+	      flags |= IS_MAIN;
+	      this_lang = main_lang;
+	      /* Don't bother looking for another.  */
+	      main_name = nullptr;
+	    }
+
+	  /* Note that this assumes the final component ends in \0.  */
+	  cooked_index_entry *entry = result.add (per_cu->sect_off, tag,
+						  flags, this_lang,
+						  components.back ().data (),
+						  nullptr, per_cu);
+	  /* Don't bother pushing if we do not need a parent.  */
+	  if (components.size () > 1)
+	    these_entries.push_back (entry);
+
+	  /* We don't care exactly which entry ends up in this
+	     map.  */
+	  by_name[std::string_view (name)] = entry;
 	}
 
-      /* Don't crash on bad data.  */
-      if (cu_index >= per_objfile->per_bfd->all_units.size ())
+      if (components.size () > 1)
 	{
-	  complaint (_(".gdb_index entry has bad CU index"
-		       " [in module %s]"), objfile_name (per_objfile->objfile));
-	  continue;
-	}
+	  std::string_view penultimate = components[components.size () - 2];
+	  std::string_view prefix (name, &penultimate.back () + 1 - name);
 
-      dwarf2_per_cu *per_cu = per_objfile->per_bfd->get_cu (cu_index);
-      if (!dw2_expand_symtabs_matching_one (per_cu, per_objfile, marked,
-					    file_matcher, expansion_notify,
-					    lang_matcher))
-	return false;
+	  need_parents.emplace_back (prefix, std::move (these_entries));
+	}
     }
 
-  return true;
+  for (const auto &[prefix, entries] : need_parents)
+    {
+      auto iter = by_name.find (prefix);
+      /* If we can't find the parent entry, just lose.  It should
+	 always be there.  We could synthesize it from the components,
+	 if we kept those, but that seems like overkill.  */
+      if (iter != by_name.end ())
+	{
+	  for (cooked_index_entry *entry : entries)
+	    entry->set_parent (iter->second);
+	}
+    }
 }
 
-bool
-dwarf2_gdb_index::expand_symtabs_matching
-  (objfile *objfile,
-   expand_symtabs_file_matcher file_matcher,
-   const lookup_name_info *lookup_name,
-   expand_symtabs_symbol_matcher symbol_matcher,
-   expand_symtabs_expansion_listener expansion_notify,
-   block_search_flags search_flags,
-   domain_search_flags domain,
-   expand_symtabs_lang_matcher lang_matcher)
+/* The worker that reads a mapped index and fills in a
+   cooked_index_worker_result.  */
+
+class gdb_index_worker : public cooked_index_worker
 {
-  dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
+public:
 
-  auto_bool_vector marked;
-  dw_expand_symtabs_matching_file_matcher (per_objfile, marked, file_matcher);
+  gdb_index_worker (dwarf2_per_objfile *per_objfile,
+		    std::unique_ptr<mapped_gdb_index> map)
+    : cooked_index_worker (per_objfile),
+      map (std::move (map))
+  { }
 
-  /* This invariant is documented in quick-functions.h.  */
-  gdb_assert (lookup_name != nullptr || symbol_matcher == nullptr);
-  if (lookup_name == nullptr)
-    {
-      for (dwarf2_per_cu *per_cu : all_units_range (per_objfile->per_bfd))
-	{
-	  QUIT;
+  void do_reading () override;
 
-	  if (!dw2_expand_symtabs_matching_one (per_cu, per_objfile,
-						marked, file_matcher,
-						expansion_notify,
-						lang_matcher))
-	    return false;
-	}
-      return true;
-    }
+  /* The map we're reading.  */
+  std::unique_ptr<mapped_gdb_index> map;
+};
+
+void
+gdb_index_worker::do_reading ()
+{
+  complaint_interceptor complaint_handler;
+  map->build_name_components (m_per_objfile);
 
-  mapped_gdb_index &index
-    = *(gdb::checked_static_cast<mapped_gdb_index *>
-	(per_objfile->per_bfd->index_table.get ()));
+  m_results.push_back (std::move (map->result));
+  m_results[0].done_reading (complaint_handler.release ());
 
-  bool result
-    = dw2_expand_symtabs_matching_symbol (index, *lookup_name,
-					  symbol_matcher,
-					  [&] (offset_type idx)
-    {
-      if (!dw2_expand_marked_cus (per_objfile, idx, marked, file_matcher,
-				  expansion_notify, search_flags, domain,
-				  lang_matcher))
-	return false;
-      return true;
-    }, per_objfile, lang_matcher);
+  /* No longer needed.  */
+  map.reset ();
 
-  return result;
-}
+  done_reading ();
 
-quick_symbol_functions_up
-mapped_gdb_index::make_quick_functions () const
-{
-  return quick_symbol_functions_up (new dwarf2_gdb_index);
+  bfd_thread_cleanup ();
 }
 
 /* A helper function that reads the .gdb_index from BUFFER and fills
@@ -1208,11 +427,9 @@ read_gdb_index_from_buffer (const char *filename,
 
   /* Version check.  */
   offset_type version = metadata[0];
-  /* Versions earlier than 3 emitted every copy of a psymbol.  This
-     causes the index to behave very poorly for certain requests.  Version 3
-     contained incomplete addrmap.  So, it seems better to just ignore such
-     indices.  */
-  if (version < 4)
+  /* GDB now requires the symbol attributes, which were added in
+     version 7.  */
+  if (version < 7)
     {
       static int warning_printed = 0;
       if (!warning_printed)
@@ -1223,30 +440,6 @@ read_gdb_index_from_buffer (const char *filename,
 	}
       return 0;
     }
-  /* Index version 4 uses a different hash function than index version
-     5 and later.
-
-     Versions earlier than 6 did not emit psymbols for inlined
-     functions.  Using these files will cause GDB not to be able to
-     set breakpoints on inlined functions by name, so we ignore these
-     indices unless the user has done
-     "set use-deprecated-index-sections on".  */
-  if (version < 6 && !deprecated_ok)
-    {
-      static int warning_printed = 0;
-      if (!warning_printed)
-	{
-	  warning (_("\
-Skipping deprecated .gdb_index section in %s.\n\
-Do \"%ps\" before the file is read\n\
-to use the section anyway."),
-		   filename,
-		   styled_string (command_style.style (),
-				  "set use-deprecated-index-sections on"));
-	  warning_printed = 1;
-	}
-      return 0;
-    }
   /* Version 7 indices generated by gold refer to the CU for a symbol instead
      of the TU (for symbols coming from TUs),
      http://sourceware.org/bugzilla/show_bug.cgi?id=15021.
@@ -1431,23 +624,19 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
       mutable_map.set_empty (lo, hi - 1, per_bfd->get_cu (cu_index));
     }
 
-  index->index_addrmap
-    = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
+  index->result.set_addrmap (std::move (mutable_map));
 }
 
-/* Sets the name and language of the main function from the shortcut table.  */
-
-static void
-set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
-			      mapped_gdb_index *index)
+void
+mapped_gdb_index::set_main_name (dwarf2_per_objfile *per_objfile)
 {
   const auto expected_size = 2 * sizeof (offset_type);
-  if (index->shortcut_table.size () < expected_size)
+  if (this->shortcut_table.size () < expected_size)
     /* The data in the section is not present, is corrupted or is in a version
        we don't know about.  Regardless, we can't make use of it.  */
     return;
 
-  auto ptr = index->shortcut_table.data ();
+  auto ptr = this->shortcut_table.data ();
   const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
   if (dw_lang >= DW_LANG_hi_user)
     {
@@ -1463,13 +652,11 @@ set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
     }
   ptr += 4;
 
-  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  main_lang = dwarf_lang_to_enum_language (dw_lang);
   const auto name_offset = extract_unsigned_integer (ptr,
 						     sizeof (offset_type),
 						     BFD_ENDIAN_LITTLE);
-  const auto name = (const char *) (index->constant_pool.data () + name_offset);
-
-  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+  main_name = (const char *) (this->constant_pool.data () + name_offset);
 }
 
 /* See read-gdb-index.h.  */
@@ -1557,9 +744,15 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
-  set_main_name_from_gdb_index (per_objfile, map.get ());
+  map->set_main_name (per_objfile);
 
-  per_bfd->index_table = std::move (map);
+  int version = map->version;
+  auto worker = std::make_unique<gdb_index_worker> (per_objfile,
+						    std::move (map));
+  auto idx = std::make_unique<cooked_gdb_index> (std::move (worker),
+						 version);
+
+  per_bfd->start_reading (std::move (idx));
 
   return true;
 }
@@ -1580,9 +773,4 @@ Warning: This option must be enabled before gdb reads the file."),
 			   NULL,
 			   NULL,
 			   &setlist, &showlist);
-
-#if GDB_SELF_TEST
-  selftests::register_test ("dw2_expand_symtabs_matching",
-			    selftests::dw2_expand_symtabs_matching::run_test);
-#endif
 }
diff --git a/gdb/dwarf2/read-gdb-index.h b/gdb/dwarf2/read-gdb-index.h
index e38a8318db8fac78336208097bac288f8321be3f..f605985f648104d4508e934e727c608eb0928bff 100644
--- a/gdb/dwarf2/read-gdb-index.h
+++ b/gdb/dwarf2/read-gdb-index.h
@@ -27,6 +27,20 @@ struct dwarf2_per_objfile;
 struct dwz_file;
 struct objfile;
 
+/* .gdb_index doesn't distinguish between the various "other" symbols
+   -- but the symbol search machinery really wants to.  For example,
+   an imported decl is "other" but is really a namespace and thus in
+   TYPE_DOMAIN; whereas a Fortran module is also "other" but is in the
+   MODULE_DOMAIN.  We use this value internally to represent the
+   "other" case so that matching can work.  The exact value does not
+   matter, all that matters here is that it won't overlap with any
+   symbol that gdb might create.  */
+#define DW_TAG_GDB_INDEX_OTHER 0xffff
+
+/* Similarly, .gdb_index can't distinguish between the type and struct
+   domains.  This is a special tag that inhabits both.  */
+#define DW_TAG_GDB_INDEX_TYPE 0xfffe
+
 /* Callback types for dwarf2_read_gdb_index.  */
 
 typedef gdb::function_view
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index f1084530221e82f33efb35f542d6d842fe3df2ab..6eee9631623435cbb39830116b790ba4b1f0c21d 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -14570,13 +14570,26 @@ cooked_index_functions::expand_symtabs_matching
 		continue;
 	    }
 
+	  /* This is a bit of a hack to support .gdb_index.  Since
+	     .gdb_index does not record languages, and since we want
+	     to know the language to avoid excessive CU expansion due
+	     to false matches, if we see a symbol with an unknown
+	     language we find the CU's language.  Only the .gdb_index
+	     reader creates such symbols.  */
+	  enum language entry_lang = entry->lang;
+	  if (entry_lang == language_unknown)
+	    {
+	      entry->per_cu->ensure_lang (per_objfile);
+	      entry_lang = entry->per_cu->lang ();
+	    }
+
 	  /* We've found the base name of the symbol; now walk its
 	     parentage chain, ensuring that each component
 	     matches.  */
 	  bool found = true;
 
 	  const cooked_index_entry *parent = entry->get_parent ();
-	  const language_defn *lang_def = language_def (entry->lang);
+	  const language_defn *lang_def = language_def (entry_lang);
 	  for (int i = name_vec.size () - 1; i > 0; --i)
 	    {
 	      /* If we ran out of entries, or if this segment doesn't
@@ -14586,17 +14599,15 @@ cooked_index_functions::expand_symtabs_matching
 		  found = false;
 		  break;
 		}
-	      if (parent->lang != language_unknown)
+
+	      symbol_name_matcher_ftype *name_matcher
+		= (lang_def->get_symbol_name_matcher
+		   (segment_lookup_names[i-1]));
+	      if (!name_matcher (parent->canonical,
+				 segment_lookup_names[i-1], nullptr))
 		{
-		  symbol_name_matcher_ftype *name_matcher
-		    = lang_def->get_symbol_name_matcher
-		      (segment_lookup_names[i-1]);
-		  if (!name_matcher (parent->canonical,
-				     segment_lookup_names[i-1], nullptr))
-		    {
-		      found = false;
-		      break;
-		    }
+		  found = false;
+		  break;
 		}
 
 	      parent = parent->get_parent ();
@@ -14619,23 +14630,20 @@ cooked_index_functions::expand_symtabs_matching
 	     seems like the loop above could just examine every
 	     element of the name, avoiding the need to check here; but
 	     this is hard.  See PR symtab/32733.  */
-	  if (symbol_matcher != nullptr || entry->lang != language_unknown)
+	  auto_obstack temp_storage;
+	  const char *full_name = entry->full_name (&temp_storage,
+						    FOR_ADA_LINKAGE_NAME);
+	  if (symbol_matcher == nullptr)
 	    {
-	      auto_obstack temp_storage;
-	      const char *full_name = entry->full_name (&temp_storage,
-							FOR_ADA_LINKAGE_NAME);
-	      if (symbol_matcher == nullptr)
-		{
-		  symbol_name_matcher_ftype *name_matcher
-		    = (lang_def->get_symbol_name_matcher
-		       (lookup_name_without_params));
-		  if (!name_matcher (full_name, lookup_name_without_params,
-				     nullptr))
-		    continue;
-		}
-	      else if (!symbol_matcher (full_name))
+	      symbol_name_matcher_ftype *name_matcher
+		= (lang_def->get_symbol_name_matcher
+		   (lookup_name_without_params));
+	      if (!name_matcher (full_name, lookup_name_without_params,
+				 nullptr))
 		continue;
 	    }
+	  else if (!symbol_matcher (full_name))
+	    continue;
 
 	  if (!dw2_expand_symtabs_matching_one (entry->per_cu, per_objfile,
 						marked, file_matcher,
diff --git a/gdb/dwarf2/tag.h b/gdb/dwarf2/tag.h
index ef1ad217c6d34fbacdf5f6372b0d29ad3b6e7bbd..e2f056033702c303f3b3354810e09fe27126625d 100644
--- a/gdb/dwarf2/tag.h
+++ b/gdb/dwarf2/tag.h
@@ -22,6 +22,7 @@
 
 #include "dwarf2.h"
 #include "symtab.h"
+#include "read-gdb-index.h"
 
 /* Return true if TAG represents a type, false otherwise.  */
 
@@ -144,6 +145,13 @@ tag_matches_domain (dwarf_tag tag, domain_search_flags search, language lang)
       else
 	flags = SEARCH_MODULE_DOMAIN;
       break;
+
+    case DW_TAG_GDB_INDEX_OTHER:
+      flags = SEARCH_MODULE_DOMAIN | SEARCH_TYPE_DOMAIN;
+      break;
+    case DW_TAG_GDB_INDEX_TYPE:
+      flags = SEARCH_STRUCT_DOMAIN | SEARCH_TYPE_DOMAIN;
+      break;
     }
 
   return (flags & search) != 0;

-- 
2.46.1
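
The tag.h hunk above maps the two phony tags onto more than one search
domain each. A minimal standalone sketch of that idea follows. The flag
names and the functions domains_for_tag and tag_matches are hypothetical
stand-ins, not GDB's own declarations, and the mapping for ordinary tags
is simplified (the real tag_matches_domain also looks at the language).
Only the two 0xffff / 0xfffe values and their domain sets are taken from
the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for GDB's domain_search_flags bits.
enum : unsigned
{
  SEARCH_VAR = 1u << 0,
  SEARCH_FUNCTION = 1u << 1,
  SEARCH_TYPE = 1u << 2,
  SEARCH_STRUCT = 1u << 3,
  SEARCH_MODULE = 1u << 4,
};

// Standard DWARF tag encodings.
constexpr std::uint16_t TAG_structure_type = 0x13;
constexpr std::uint16_t TAG_subprogram = 0x2e;
constexpr std::uint16_t TAG_variable = 0x34;

// The phony values introduced by the patch in read-gdb-index.h.
constexpr std::uint16_t TAG_GDB_INDEX_OTHER = 0xffff;
constexpr std::uint16_t TAG_GDB_INDEX_TYPE = 0xfffe;

// Return the set of domains a tag can inhabit.  A phony tag simply
// yields a union of domains, so one index entry can satisfy several
// kinds of search.
unsigned
domains_for_tag (std::uint16_t tag)
{
  switch (tag)
    {
    case TAG_variable:
      return SEARCH_VAR;
    case TAG_subprogram:
      return SEARCH_FUNCTION;
    case TAG_structure_type:
      // Simplification: treated here as both struct and type.
      return SEARCH_STRUCT | SEARCH_TYPE;
    case TAG_GDB_INDEX_OTHER:
      return SEARCH_MODULE | SEARCH_TYPE;
    case TAG_GDB_INDEX_TYPE:
      return SEARCH_STRUCT | SEARCH_TYPE;
    default:
      return 0;
    }
}

// True if TAG can appear in any of the domains requested by SEARCH.
bool
tag_matches (std::uint16_t tag, unsigned search)
{
  return (domains_for_tag (tag) & search) != 0;
}
```

With this shape, a search restricted to SEARCH_MODULE still finds an
"other" entry, while a search restricted to SEARCH_VAR does not.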

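The new build_name_components code defers parent assignment: it records
each entry under its full name, remembers which entries need a parent,
and resolves the parents in a second pass once every name has been seen.
A rough standalone sketch of that two-pass scheme is below. Entry and
link_parents are hypothetical names, and the sketch splits names at the
last "::" for brevity, which is an assumption that does not hold for
template arguments containing "::"; the patch relies on its own
component splitting instead.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical, much-reduced stand-in for an index entry.
struct Entry
{
  std::string base_name;
  const Entry *parent = nullptr;
};

// Build one entry per qualified name in NAMES and link each entry to
// the entry whose full name is its enclosing scope.  NAMES must
// outlive the call, since the temporary maps hold views into it.
std::vector<std::unique_ptr<Entry>>
link_parents (const std::vector<std::string> &names)
{
  std::vector<std::unique_ptr<Entry>> entries;
  std::map<std::string_view, Entry *> by_name;
  std::vector<std::pair<std::string_view, Entry *>> need_parents;

  // First pass: create entries and note which ones need a parent.
  for (const std::string &name : names)
    {
      std::string_view full (name);
      std::size_t pos = full.rfind ("::");
      bool qualified = pos != std::string_view::npos;

      auto entry = std::make_unique<Entry> ();
      entry->base_name
	= std::string (qualified ? full.substr (pos + 2) : full);

      // As in the patch, it does not matter which entry wins when a
      // name appears more than once.
      by_name[full] = entry.get ();
      if (qualified)
	need_parents.emplace_back (full.substr (0, pos), entry.get ());
      entries.push_back (std::move (entry));
    }

  // Second pass: every name is known now, so look up each prefix.
  // An entry whose prefix is missing is left without a parent.
  for (const auto &[prefix, entry] : need_parents)
    {
      auto iter = by_name.find (prefix);
      if (iter != by_name.end ())
	entry->parent = iter->second;
    }

  return entries;
}
```

Deferring the lookup means the enclosing scope may appear after its
members in the input, which a single pass could not handle.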