From: Tom Tromey
Date: Wed, 02 Apr 2025 17:45:10 -0600
Subject: [PATCH v2 11/28] Rewrite the .gdb_index reader
Message-Id: <20250402-search-in-psyms-v2-11-ea91704487cb@tromey.com>
References: <20250402-search-in-psyms-v2-0-ea91704487cb@tromey.com>
In-Reply-To: <20250402-search-in-psyms-v2-0-ea91704487cb@tromey.com>
To: gdb-patches@sourceware.org
Cc: Tom Tromey

This patch rewrites the .gdb_index reader to create the same data
structures that are created by the cooked indexer and the .debug_names
reader.

This is done in support of this series; but also because, from what I
can tell, the "templates.exp" change didn't really work properly with
this reader.

In addition to fixing that problem, this patch removes a lot of code.

Implementing this required a couple of hacks, as .gdb_index does not
contain all the information that's used by the cooked index
implementation.

* The index-searching code likes to differentiate between the various
  DWARF tags when matching, but .gdb_index lumps many things into a
  single "other" category.  To handle this, we introduce a phony tag
  that's used so that the match method can match on multiple domains.

* Similarly, .gdb_index doesn't distinguish between the type and
  struct domains, so another phony tag is used for this.

* Support for older versions of .gdb_index is removed entirely.

* The reader must attempt to guess the language of various symbols.
  This is somewhat finicky.  "Plain" (unqualified) symbols are marked
  as language_unknown and then a couple of hacks are used to handle
  these -- one in expand_symtabs_matching and another when recognizing
  "main".

For what it's worth, I consider .gdb_index to be near the end of its
life.  While .debug_names is not perfect -- we found a number of bugs
in the standard while implementing it -- it is better than .gdb_index
and also better documented.

After this patch, we could conceivably remove dwarf_scanner_base.
However, I have not done this.

Finally, this patch also changes this reader to dump the content of
the index, as the other DWARF readers do.  This can be handy when
debugging gdb.
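The language-guessing step described above can be sketched in isolation.
This is only a minimal illustration under stated assumptions, not the
code from this patch: guessed_language and guess_language are
hypothetical names, and the test for '.' merely approximates the
dot-style name splitting that the reader performs.

```cpp
#include <cstring>

// Hypothetical stand-in for gdb's 'enum language'; only the values
// this sketch needs are listed.
enum class guessed_language { unknown, cplus, ada, go };

// Guess a language purely from the spelling of a name found in the
// index.  The order of the checks matters: "::" and '<' are tried
// before "__", so a C++ name that happens to contain "__" is still
// treated as C++.
guessed_language
guess_language (const char *name)
{
  // A "::" separator suggests a qualified C++ name.
  if (std::strstr (name, "::") != nullptr)
    return guessed_language::cplus;

  // A '<' suggests a template, and so also a C++ name.
  if (std::strchr (name, '<') != nullptr)
    return guessed_language::cplus;

  // "__" suggests an Ada encoded (mangled) name.
  if (std::strstr (name, "__") != nullptr)
    return guessed_language::ada;

  // Assumption: a '.' means the name has several dot-separated
  // components, which this sketch treats as Go.
  if (std::strchr (name, '.') != nullptr)
    return guessed_language::go;

  // A plain, unqualified name: the language is left unknown.
  return guessed_language::unknown;
}
```

With this sketch, a plain name such as "main" comes back as unknown,
which is the case that needs the special handling mentioned above.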
---
 gdb/dwarf2/cooked-index-shard.c  |   11 +-
 gdb/dwarf2/cooked-index-worker.h |   10 +-
 gdb/dwarf2/read-gdb-index.c      | 1270 +++++++-------------------------------
 gdb/dwarf2/read-gdb-index.h      |   14 +
 gdb/dwarf2/read.c                |   58 +-
 gdb/dwarf2/tag.h                 |    8 +
 6 files changed, 302 insertions(+), 1069 deletions(-)

diff --git a/gdb/dwarf2/cooked-index-shard.c b/gdb/dwarf2/cooked-index-shard.c
index 29a8aea513786e4c1c1ed77dee8610fc329d1c8a..888a0fa345b9124e2814aa722e71a9d2fd5adf8e 100644
--- a/gdb/dwarf2/cooked-index-shard.c
+++ b/gdb/dwarf2/cooked-index-shard.c
@@ -86,7 +86,16 @@ cooked_index_shard::add (sect_offset die_offset, enum dwarf_tag tag,
      implicit "main" discovery.  */
   if ((flags & IS_MAIN) != 0)
     m_main = result;
-  else if ((flags & IS_PARENT_DEFERRED) == 0
+  /* The language check here is subtle: it exists solely to work
+     around a bug in .gdb_index.  That index does not record
+     languages, but it might emit an entry for "main".  However,
+     recognizing this "main" as being the main program would be wrong
+     -- for example, an Ada program has a C "main" but this is not the
+     desired target of the "start" command.  Requiring the language to
+     be set here avoids over-eagerly setting the "main" when using
+     .gdb_index.  Should .gdb_index ever be removed (PR symtab/31363),
+     the language_unknown check here could also be removed.  */
+  else if (lang != language_unknown
            && parent_entry.resolved == nullptr
            && m_main == nullptr
            && language_may_use_plain_main (lang)
diff --git a/gdb/dwarf2/cooked-index-worker.h b/gdb/dwarf2/cooked-index-worker.h
index c6e005c14dea1bb891680969c17172d494a7de46..a6b4780373c54a51760e46997a24aca57c340969 100644
--- a/gdb/dwarf2/cooked-index-worker.h
+++ b/gdb/dwarf2/cooked-index-worker.h
@@ -108,6 +108,12 @@ class cooked_index_worker_result
     return &m_addrmap;
   }
 
+  /* Set the mutable addrmap.  */
+  void set_addrmap (addrmap_mutable new_map)
+  {
+    m_addrmap = std::move (new_map);
+  }
+
   /* Return the parent_map that is currently being created.  */
   parent_map *get_parent_map ()
   {
@@ -203,10 +209,10 @@ enum class cooked_state
 /* An object of this type controls the scanning of the DWARF.  It
    schedules the worker tasks and tracks the current state.  Once
    scanning is done, this object is discarded.
-   
+
    This is an abstract base class that defines the basic behavior of
    scanners.  Separate concrete implementations exist for scanning
-   .debug_names and .debug_info.  */
+   .debug_names, .gdb_index, and .debug_info.  */
 
 class cooked_index_worker
 {
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 7bcf50d347609f0b27068228cd78633dd57e1556..1d7704b8c5bb7b5487b2373ea064c11ab1630e72 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -27,16 +27,68 @@
 #include "event-top.h"
 #include "gdb/gdb-index.h"
 #include "gdbsupport/gdb-checked-static-cast.h"
-#include "mapped-index.h"
+#include "cooked-index.h"
 #include "read.h"
 #include "extract-store-integer.h"
 #include "cp-support.h"
 #include "symtab.h"
 #include "gdbsupport/selftest.h"
+#include "tag.h"
+#include "gdbsupport/gdb-safe-ctype.h"
 
 /* When true, do not reject deprecated .gdb_index sections.  */
 static bool use_deprecated_index_sections = false;
 
+struct dwarf2_gdb_index : public cooked_index_functions
+{
+  /* This dumps minimal information about the index.
+     It is called via "mt print objfiles".
+     One use is to verify .gdb_index has been loaded by the
+     gdb.dwarf2/gdb-index.exp testcase.  */
+  void dump (struct objfile *objfile) override;
+};
+
+/* This is like a cooked index, but as it has been ingested from
+   .gdb_index, it can't be used to write out an index.  */
+
+class cooked_gdb_index : public cooked_index
+{
+public:
+
+  cooked_gdb_index (std::unique_ptr<cooked_index_worker> worker,
+                    int version)
+    : cooked_index (std::move (worker)),
+      version (version)
+  { }
+
+  cooked_index *index_for_writing () override
+  { return nullptr; }
+
+  quick_symbol_functions_up make_quick_functions () const override
+  { return quick_symbol_functions_up (new dwarf2_gdb_index); }
+
+  bool version_check () const override
+  {
+    return version >= 8;
+  }
+
+  /* Index data format version.  */
+  int version;
+};
+
+/* See above.  */
+
+void
+dwarf2_gdb_index::dump (struct objfile *objfile)
+{
+  dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
+
+  cooked_gdb_index *index = (gdb::checked_static_cast<cooked_gdb_index *>
+                             (per_objfile->per_bfd->index_table.get ()));
+  gdb_printf (".gdb_index: version %d\n", index->version);
+  cooked_index_functions::dump (objfile);
+}
+
 /* This is a view into the index that converts from bytes to an
    offset_type, and allows indexing.  Unaligned bytes are specifically
    allowed here, and handled via unpacking.  */
@@ -77,43 +129,11 @@ class offset_view
   gdb::array_view<const gdb_byte> m_bytes;
 };
 
-/* An index into a (C++) symbol name component in a symbol name as
-   recorded in the mapped_index's symbol table.  For each C++ symbol
-   in the symbol table, we record one entry for the start of each
-   component in the symbol in a table of name components, and then
-   sort the table, in order to be able to binary search symbol names,
-   ignoring leading namespaces, both completion and regular look up.
-   For example, for symbol "A::B::C", we'll have an entry that points
-   to "A::B::C", another that points to "B::C", and another for "C".
-   Note that function symbols in GDB index have no parameter
-   information, just the function/method names.  You can convert a
-   name_component to a "const char *" using the
-   'mapped_index::symbol_name_at(offset_type)' method.  */
-
-struct name_component
-{
-  /* Offset in the symbol name where the component starts.  Stored as
-     a (32-bit) offset instead of a pointer to save memory and improve
-     locality on 64-bit architectures.  */
-  offset_type name_offset;
-
-  /* The symbol's index in the symbol and constant pool tables of a
-     mapped_index.  */
-  offset_type idx;
-};
-
-/* A description of .gdb_index index.  The file format is described in
-   a comment by the code that writes the index.  */
+/* A worker for reading .gdb_index.  The file format is described in
+   the manual.  */
 
-struct mapped_gdb_index : public dwarf_scanner_base
+struct mapped_gdb_index
 {
-  /* The name_component table (a sorted vector).  See name_component's
-     description above.  */
-  std::vector<name_component> name_components;
-
-  /* How NAME_COMPONENTS is sorted.  */
-  enum case_sensitivity name_components_casing;
-
   /* Index data format version.  */
   int version = 0;
 
@@ -132,6 +152,14 @@ struct mapped_gdb_index : public dwarf_scanner_base
   /* An address map that maps from PC to dwarf2_per_cu.  */
   addrmap_fixed *index_addrmap = nullptr;
 
+  /* The name of 'main', or nullptr if not known.  */
+  const char *main_name = nullptr;
+  /* The language of 'main', if known.  */
+  enum language main_lang = language_minimal;
+
+  /* The result we're constructing.  */
+  cooked_index_worker_result result;
+
   /* Return the index into the constant pool of the name of the
      IDXth symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -148,221 +176,41 @@ struct mapped_gdb_index : public dwarf_scanner_base
 
   /* Return whether the name at IDX in the symbol table should be
      ignored.  */
-  virtual bool symbol_name_slot_invalid (offset_type idx) const
+  bool symbol_name_slot_invalid (offset_type idx) const
   {
-    return (symbol_name_index (idx) == 0
-            && symbol_vec_index (idx) == 0);
+    return symbol_name_index (idx) == 0 && symbol_vec_index (idx) == 0;
   }
 
   /* Convenience method to get at the name of the symbol at IDX in the
      symbol table.
*/ - virtual const char *symbol_name_at - (offset_type idx, dwarf2_per_objfile *per_objfile) const + const char *symbol_name_at (offset_type idx, + dwarf2_per_objfile *per_objfile) const { return (const char *) (this->constant_pool.data () + symbol_name_index (idx)); } - virtual size_t symbol_name_count () const + size_t symbol_name_count () const { return this->symbol_table.size () / 2; } + /* Set the name and language of the main function from the shortcut + table. */ + void set_main_name (dwarf2_per_objfile *per_objfile); + /* Build the symbol name component sorted vector, if we haven't yet. */ void build_name_components (dwarf2_per_objfile *per_objfile); - - /* Returns the lower (inclusive) and upper (exclusive) bounds of the - possible matches for LN_NO_PARAMS in the name component - vector. */ - std::pair::const_iterator, - std::vector::const_iterator> - find_name_components_bounds (const lookup_name_info &ln_no_params, - enum language lang, - dwarf2_per_objfile *per_objfile) const; - - quick_symbol_functions_up make_quick_functions () const override; - - bool version_check () const override - { - return version >= 8; - } - - dwarf2_per_cu *lookup (unrelocated_addr addr) override - { - if (index_addrmap == nullptr) - return nullptr; - - void *obj = index_addrmap->find (static_cast (addr)); - return static_cast (obj); - } - - cooked_index *index_for_writing () override - { return nullptr; } }; - -/* Starting from a search name, return the string that finds the upper - bound of all strings that start with SEARCH_NAME in a sorted name - list. Returns the empty string to indicate that the upper bound is - the end of the list. */ - -static std::string -make_sort_after_prefix_name (const char *search_name) -{ - /* When looking to complete "func", we find the upper bound of all - symbols that start with "func" by looking for where we'd insert - the closest string that would follow "func" in lexicographical - order. 
Usually, that's "func"-with-last-character-incremented, - i.e. "fund". Mind non-ASCII characters, though. Usually those - will be UTF-8 multi-byte sequences, but we can't be certain. - Especially mind the 0xff character, which is a valid character in - non-UTF-8 source character sets (e.g. Latin1 'ÿ'), and we can't - rule out compilers allowing it in identifiers. Note that - conveniently, strcmp/strcasecmp are specified to compare - characters interpreted as unsigned char. So what we do is treat - the whole string as a base 256 number composed of a sequence of - base 256 "digits" and add 1 to it. I.e., adding 1 to 0xff wraps - to 0, and carries 1 to the following more-significant position. - If the very first character in SEARCH_NAME ends up incremented - and carries/overflows, then the upper bound is the end of the - list. The string after the empty string is also the empty - string. - - Some examples of this operation: - - SEARCH_NAME => "+1" RESULT - - "abc" => "abd" - "ab\xff" => "ac" - "\xff" "a" "\xff" => "\xff" "b" - "\xff" => "" - "\xff\xff" => "" - "" => "" - - Then, with these symbols for example: - - func - func1 - fund - - completing "func" looks for symbols between "func" and - "func"-with-last-character-incremented, i.e. "fund" (exclusive), - which finds "func" and "func1", but not "fund". - - And with: - - funcÿ (Latin1 'ÿ' [0xff]) - funcÿ1 - fund - - completing "funcÿ" looks for symbols between "funcÿ" and "fund" - (exclusive), which finds "funcÿ" and "funcÿ1", but not "fund". - - And with: - - ÿÿ (Latin1 'ÿ' [0xff]) - ÿÿ1 - - completing "ÿ" or "ÿÿ" looks for symbols between between "ÿÿ" and - the end of the list. - */ - std::string after = search_name; - while (!after.empty () && (unsigned char) after.back () == 0xff) - after.pop_back (); - if (!after.empty ()) - after.back () = (unsigned char) after.back () + 1; - return after; -} - -/* See declaration. 
*/ - -std::pair::const_iterator, - std::vector::const_iterator> -mapped_gdb_index::find_name_components_bounds - (const lookup_name_info &lookup_name_without_params, language lang, - dwarf2_per_objfile *per_objfile) const -{ - auto *name_cmp - = this->name_components_casing == case_sensitive_on ? strcmp : strcasecmp; - - const char *lang_name - = lookup_name_without_params.language_lookup_name (lang); - - /* Comparison function object for lower_bound that matches against a - given symbol name. */ - auto lookup_compare_lower = [&] (const name_component &elem, - const char *name) - { - const char *elem_qualified = this->symbol_name_at (elem.idx, per_objfile); - const char *elem_name = elem_qualified + elem.name_offset; - return name_cmp (elem_name, name) < 0; - }; - - /* Comparison function object for upper_bound that matches against a - given symbol name. */ - auto lookup_compare_upper = [&] (const char *name, - const name_component &elem) - { - const char *elem_qualified = this->symbol_name_at (elem.idx, per_objfile); - const char *elem_name = elem_qualified + elem.name_offset; - return name_cmp (name, elem_name) < 0; - }; - - auto begin = this->name_components.begin (); - auto end = this->name_components.end (); - - /* Find the lower bound. */ - auto lower = [&] () - { - if (lookup_name_without_params.completion_mode () && lang_name[0] == '\0') - return begin; - else - return std::lower_bound (begin, end, lang_name, lookup_compare_lower); - } (); - - /* Find the upper bound. */ - auto upper = [&] () - { - if (lookup_name_without_params.completion_mode ()) - { - /* In completion mode, we want UPPER to point past all - symbols names that have the same prefix. I.e., with - these symbols, and completing "func": - - function << lower bound - function1 - other_function << upper bound - - We find the upper bound by looking for the insertion - point of "func"-with-last-character-incremented, - i.e. "fund". 
*/ - std::string after = make_sort_after_prefix_name (lang_name); - if (after.empty ()) - return end; - return std::lower_bound (lower, end, after.c_str (), - lookup_compare_lower); - } - else - return std::upper_bound (lower, end, lang_name, lookup_compare_upper); - } (); - - return {lower, upper}; -} - /* See declaration. */ void mapped_gdb_index::build_name_components (dwarf2_per_objfile *per_objfile) { - if (!this->name_components.empty ()) - return; - - this->name_components_casing = case_sensitivity; - auto *name_cmp - = this->name_components_casing == case_sensitive_on ? strcmp : strcasecmp; + std::vector>> + need_parents; + gdb::unordered_map by_name; - /* The code below only knows how to break apart components of C++ - symbol names (and other languages that use '::' as - namespace/module separator) and Ada symbol names. */ auto count = this->symbol_name_count (); for (offset_type idx = 0; idx < count; idx++) { @@ -371,815 +219,186 @@ mapped_gdb_index::build_name_components (dwarf2_per_objfile *per_objfile) const char *name = this->symbol_name_at (idx, per_objfile); - /* Add each name component to the name component table. */ - unsigned int previous_len = 0; + /* This code only knows how to break apart components of C++ + symbol names (and other languages that use '::' as + namespace/module separator) and Ada symbol names. + It's unfortunate that we need the language, but since it is + really only used to rebuild full names, pairing it with the + split method is fine. */ + enum language lang; + std::vector components; if (strstr (name, "::") != nullptr) { - for (unsigned int current_len = cp_find_first_component (name); - name[current_len] != '\0'; - current_len += cp_find_first_component (name + current_len)) - { - gdb_assert (name[current_len] == ':'); - this->name_components.push_back ({previous_len, idx}); - /* Skip the '::'. 
*/ - current_len += 2; - previous_len = current_len; - } + components = split_name (name, split_style::CXX); + lang = language_cplus; } - else + else if (strchr (name, '<') != nullptr) { - /* Handle the Ada encoded (aka mangled) form here. */ - for (const char *iter = strstr (name, "__"); - iter != nullptr; - iter = strstr (iter, "__")) - { - this->name_components.push_back ({previous_len, idx}); - iter += 2; - previous_len = iter - name; - } + /* Guess that this is a template and so a C++ name. */ + components.emplace_back (name); + lang = language_cplus; } - - this->name_components.push_back ({previous_len, idx}); - } - - /* Sort name_components elements by name. */ - auto name_comp_compare = [&] (const name_component &left, - const name_component &right) - { - const char *left_qualified - = this->symbol_name_at (left.idx, per_objfile); - const char *right_qualified - = this->symbol_name_at (right.idx, per_objfile); - - const char *left_name = left_qualified + left.name_offset; - const char *right_name = right_qualified + right.name_offset; - - return name_cmp (left_name, right_name) < 0; - }; - - std::sort (this->name_components.begin (), - this->name_components.end (), - name_comp_compare); -} - -/* Helper for dw2_expand_symtabs_matching that works with a - mapped_index_base instead of the containing objfile. This is split - to a separate function in order to be able to unit test the - name_components matching using a mock mapped_index_base. For each - symbol name that matches, calls MATCH_CALLBACK, passing it the - symbol's index in the mapped_index_base symbol table. 
*/ - -static bool -dw2_expand_symtabs_matching_symbol - (mapped_gdb_index &index, - const lookup_name_info &lookup_name_in, - expand_symtabs_symbol_matcher symbol_matcher, - gdb::function_view match_callback, - dwarf2_per_objfile *per_objfile, - expand_symtabs_lang_matcher lang_matcher) -{ - lookup_name_info lookup_name_without_params - = lookup_name_in.make_ignore_params (); - - /* Build the symbol name component sorted vector, if we haven't - yet. */ - index.build_name_components (per_objfile); - - /* The same symbol may appear more than once in the range though. - E.g., if we're looking for symbols that complete "w", and we have - a symbol named "w1::w2", we'll find the two name components for - that same symbol in the range. To be sure we only call the - callback once per symbol, we first collect the symbol name - indexes that matched in a temporary vector and ignore - duplicates. */ - std::vector matches; - - struct name_and_matcher - { - symbol_name_matcher_ftype *matcher; - const char *name; - - bool operator== (const name_and_matcher &other) const - { - return matcher == other.matcher && strcmp (name, other.name) == 0; - } - }; - - /* A vector holding all the different symbol name matchers, for all - languages. */ - std::vector matchers; - - for (int i = 0; i < nr_languages; i++) - { - enum language lang_e = (enum language) i; - if (lang_matcher != nullptr && !lang_matcher (lang_e)) - continue; - - const language_defn *lang = language_def (lang_e); - symbol_name_matcher_ftype *name_matcher - = lang->get_symbol_name_matcher (lookup_name_without_params); - - name_and_matcher key { - name_matcher, - lookup_name_without_params.language_lookup_name (lang_e) - }; - - /* Don't insert the same comparison routine more than once. - Note that we do this linear walk. This is not a problem in - practice because the number of supported languages is - low. 
*/ - if (std::find (matchers.begin (), matchers.end (), key) - != matchers.end ()) - continue; - matchers.push_back (std::move (key)); - - auto bounds - = index.find_name_components_bounds (lookup_name_without_params, - lang_e, per_objfile); - - /* Now for each symbol name in range, check to see if we have a name - match, and if so, call the MATCH_CALLBACK callback. */ - - for (; bounds.first != bounds.second; ++bounds.first) + else if (strstr (name, "__") != nullptr) { - const char *qualified - = index.symbol_name_at (bounds.first->idx, per_objfile); - - if (!name_matcher (qualified, lookup_name_without_params, NULL) - || (symbol_matcher != NULL && !symbol_matcher (qualified))) - continue; - - matches.push_back (bounds.first->idx); + /* The Ada case is handled during finalization, because gdb + does not write the synthesized package names into the + index. */ + components.emplace_back (name); + lang = language_ada; } - } - - std::sort (matches.begin (), matches.end ()); - - /* Finally call the callback, once per match. */ - ULONGEST prev = -1; - bool result = true; - for (offset_type idx : matches) - { - if (prev != idx) + else { - if (!match_callback (idx)) - { - result = false; - break; - } - prev = idx; + components = split_name (name, split_style::DOT_STYLE); + /* Mark ordinary names as having an unknown language. This + is a hack to avoid problems with some Ada names. */ + lang = (components.size () == 1) ? language_unknown : language_go; } - } - - /* Above we use a type wider than idx's for 'prev', since 0 and - (offset_type)-1 are both possible values. */ - static_assert (sizeof (prev) > sizeof (offset_type), ""); - - return result; -} - -#if GDB_SELF_TEST - -namespace selftests { namespace dw2_expand_symtabs_matching { - -/* A mock .gdb_index/.debug_names-like name index table, enough to - exercise dw2_expand_symtabs_matching_symbol, which works with the - mapped_index_base interface. 
Builds an index from the symbol list - passed as parameter to the constructor. */ -class mock_mapped_index : public mapped_gdb_index -{ -public: - mock_mapped_index (gdb::array_view symbols) - : m_symbol_table (symbols) - {} - - DISABLE_COPY_AND_ASSIGN (mock_mapped_index); - - bool symbol_name_slot_invalid (offset_type idx) const override - { return false; } - - /* Return the number of names in the symbol table. */ - size_t symbol_name_count () const override - { - return m_symbol_table.size (); - } - - /* Get the name of the symbol at IDX in the symbol table. */ - const char *symbol_name_at - (offset_type idx, dwarf2_per_objfile *per_objfile) const override - { - return m_symbol_table[idx]; - } - - quick_symbol_functions_up make_quick_functions () const override - { - return nullptr; - } - -private: - gdb::array_view m_symbol_table; -}; - -/* Convenience function that converts a NULL pointer to a "" - string, to pass to print routines. */ - -static const char * -string_or_null (const char *str) -{ - return str != NULL ? str : ""; -} - -/* Check if a lookup_name_info built from - NAME/MATCH_TYPE/COMPLETION_MODE matches the symbols in the mock - index. EXPECTED_LIST is the list of expected matches, in expected - matching order. If no match expected, then an empty list is - specified. Returns true on success. On failure prints a warning - indicating the file:line that failed, and returns false. */ - -static bool -check_match (const char *file, int line, - mock_mapped_index &mock_index, - const char *name, symbol_name_match_type match_type, - bool completion_mode, - std::initializer_list expected_list, - dwarf2_per_objfile *per_objfile) -{ - lookup_name_info lookup_name (name, match_type, completion_mode); - - bool matched = true; - - auto mismatch = [&] (const char *expected_str, - const char *got) - { - warning (_("%s:%d: match_type=%s, looking-for=\"%s\", " - "expected=\"%s\", got=\"%s\"\n"), - file, line, - (match_type == symbol_name_match_type::FULL - ? 
"FULL" : "WILD"), - name, string_or_null (expected_str), string_or_null (got)); - matched = false; - }; - - auto expected_it = expected_list.begin (); - auto expected_end = expected_list.end (); - - dw2_expand_symtabs_matching_symbol (mock_index, lookup_name, - nullptr, - [&] (offset_type idx) - { - const char *matched_name = mock_index.symbol_name_at (idx, per_objfile); - const char *expected_str - = expected_it == expected_end ? NULL : *expected_it++; - - if (expected_str == NULL || strcmp (expected_str, matched_name) != 0) - mismatch (expected_str, matched_name); - return true; - }, per_objfile, nullptr); - - const char *expected_str - = expected_it == expected_end ? NULL : *expected_it++; - if (expected_str != NULL) - mismatch (expected_str, NULL); - - return matched; -} - -/* The symbols added to the mock mapped_index for testing (in - canonical form). */ -static const char *test_symbols[] = { - "function", - "std::bar", - "std::zfunction", - "std::zfunction2", - "w1::w2", - "ns::foo", - "ns::foo", - "ns::foo", - "ns2::tmpl::foo2", - "(anonymous namespace)::A::B::C", - - /* These are used to check that the increment-last-char in the - matching algorithm for completion doesn't match "t1_fund" when - completing "t1_func". */ - "t1_func", - "t1_func1", - "t1_fund", - "t1_fund1", - - /* A UTF-8 name with multi-byte sequences to make sure that - cp-name-parser understands this as a single identifier ("função" - is "function" in PT). */ - (const char *)u8"u8função", - - /* Test a symbol name that ends with a 0xff character, which is a - valid character in non-UTF-8 source character sets (e.g. Latin1 - 'ÿ'), and we can't rule out compilers allowing it in identifiers. - We test this because the completion algorithm finds the upper - bound of symbols by looking for the insertion point of - "func"-with-last-character-incremented, i.e. "fund", and adding 1 - to 0xff should wraparound and carry to the previous character. - See comments in make_sort_after_prefix_name. 
*/ - "yfunc\377", - - /* Some more symbols with \377 (0xff). See above. */ - "\377", - "\377\377123", - - /* A name with all sorts of complications. Starts with "z" to make - it easier for the completion tests below. */ -#define Z_SYM_NAME \ - "z::std::tuple<(anonymous namespace)::ui*, std::bar<(anonymous namespace)::ui> >" \ - "::tuple<(anonymous namespace)::ui*, " \ - "std::default_delete<(anonymous namespace)::ui>, void>" - - Z_SYM_NAME -}; - -/* Returns true if the mapped_index_base::find_name_component_bounds - method finds EXPECTED_SYMS in INDEX when looking for SEARCH_NAME, - in completion mode. */ - -static bool -check_find_bounds_finds (mapped_gdb_index &index, - const char *search_name, - gdb::array_view expected_syms, - dwarf2_per_objfile *per_objfile) -{ - lookup_name_info lookup_name (search_name, - symbol_name_match_type::FULL, true); - - auto bounds = index.find_name_components_bounds (lookup_name, - language_cplus, - per_objfile); - - size_t distance = std::distance (bounds.first, bounds.second); - if (distance != expected_syms.size ()) - return false; - - for (size_t exp_elem = 0; exp_elem < distance; exp_elem++) - { - auto nc_elem = bounds.first + exp_elem; - const char *qualified = index.symbol_name_at (nc_elem->idx, per_objfile); - if (strcmp (qualified, expected_syms[exp_elem]) != 0) - return false; - } - - return true; -} - -/* Test the lower-level mapped_index::find_name_component_bounds - method. */ - -static void -test_mapped_index_find_name_component_bounds () -{ - mock_mapped_index mock_index (test_symbols); - - mock_index.build_name_components (NULL /* per_objfile */); - - /* Test the lower-level mapped_index::find_name_component_bounds - method in completion mode. 
*/ - { - static const char *expected_syms[] = { - "t1_func", - "t1_func1", - }; - - SELF_CHECK (check_find_bounds_finds - (mock_index, "t1_func", expected_syms, - NULL /* per_objfile */)); - } - - /* Check that the increment-last-char in the name matching algorithm - for completion doesn't get confused with Ansi1 'ÿ' / 0xff. See - make_sort_after_prefix_name. */ - { - static const char *expected_syms1[] = { - "\377", - "\377\377123", - }; - SELF_CHECK (check_find_bounds_finds - (mock_index, "\377", expected_syms1, NULL /* per_objfile */)); - - static const char *expected_syms2[] = { - "\377\377123", - }; - SELF_CHECK (check_find_bounds_finds - (mock_index, "\377\377", expected_syms2, - NULL /* per_objfile */)); - } -} - -/* Test dw2_expand_symtabs_matching_symbol. */ - -static void -test_dw2_expand_symtabs_matching_symbol () -{ - mock_mapped_index mock_index (test_symbols); - - /* We let all tests run until the end even if some fails, for debug - convenience. */ - bool any_mismatch = false; - - /* Create the expected symbols list (an initializer_list). Needed - because lists have commas, and we need to pass them to CHECK, - which is a macro. */ -#define EXPECT(...) { __VA_ARGS__ } - - /* Wrapper for check_match that passes down the current - __FILE__/__LINE__. */ -#define CHECK_MATCH(NAME, MATCH_TYPE, COMPLETION_MODE, EXPECTED_LIST) \ - any_mismatch |= !check_match (__FILE__, __LINE__, \ - mock_index, \ - NAME, MATCH_TYPE, COMPLETION_MODE, \ - EXPECTED_LIST, NULL) - - /* Identity checks. */ - for (const char *sym : test_symbols) - { - /* Should be able to match all existing symbols. */ - CHECK_MATCH (sym, symbol_name_match_type::FULL, false, - EXPECT (sym)); - - /* Should be able to match all existing symbols with - parameters. */ - std::string with_params = std::string (sym) + "(int)"; - CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false, - EXPECT (sym)); - - /* Should be able to match all existing symbols with - parameters and qualifiers. 
*/ - with_params = std::string (sym) + " ( int ) const"; - CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false, - EXPECT (sym)); - - /* This should really find sym, but cp-name-parser.y doesn't - know about lvalue/rvalue qualifiers yet. */ - with_params = std::string (sym) + " ( int ) &&"; - CHECK_MATCH (with_params.c_str (), symbol_name_match_type::FULL, false, - {}); - } - - /* Check that the name matching algorithm for completion doesn't get - confused with Latin1 'ÿ' / 0xff. See - make_sort_after_prefix_name. */ - { - static const char str[] = "\377"; - CHECK_MATCH (str, symbol_name_match_type::FULL, true, - EXPECT ("\377", "\377\377123")); - } - - /* Check that the increment-last-char in the matching algorithm for - completion doesn't match "t1_fund" when completing "t1_func". */ - { - static const char str[] = "t1_func"; - CHECK_MATCH (str, symbol_name_match_type::FULL, true, - EXPECT ("t1_func", "t1_func1")); - } - - /* Check that completion mode works at each prefix of the expected - symbol name. */ - { - static const char str[] = "function(int)"; - size_t len = strlen (str); - std::string lookup; - - for (size_t i = 1; i < len; i++) - { - lookup.assign (str, i); - CHECK_MATCH (lookup.c_str (), symbol_name_match_type::FULL, true, - EXPECT ("function")); - } - } - - /* While "w" is a prefix of both components, the match function - should still only be called once. */ - { - CHECK_MATCH ("w", symbol_name_match_type::FULL, true, - EXPECT ("w1::w2")); - CHECK_MATCH ("w", symbol_name_match_type::WILD, true, - EXPECT ("w1::w2")); - } - - /* Same, with a "complicated" symbol. */ - { - static const char str[] = Z_SYM_NAME; - size_t len = strlen (str); - std::string lookup; - - for (size_t i = 1; i < len; i++) - { - lookup.assign (str, i); - CHECK_MATCH (lookup.c_str (), symbol_name_match_type::FULL, true, - EXPECT (Z_SYM_NAME)); - } - } - - /* In FULL mode, an incomplete symbol doesn't match. 
*/ - { - CHECK_MATCH ("std::zfunction(int", symbol_name_match_type::FULL, false, - {}); - } - - /* A complete symbol with parameters matches any overload, since the - index has no overload info. */ - { - CHECK_MATCH ("std::zfunction(int)", symbol_name_match_type::FULL, true, - EXPECT ("std::zfunction", "std::zfunction2")); - CHECK_MATCH ("zfunction(int)", symbol_name_match_type::WILD, true, - EXPECT ("std::zfunction", "std::zfunction2")); - CHECK_MATCH ("zfunc", symbol_name_match_type::WILD, true, - EXPECT ("std::zfunction", "std::zfunction2")); - } - - /* Check that whitespace is ignored appropriately. A symbol with a - template argument list. */ - { - static const char expected[] = "ns::foo"; - CHECK_MATCH ("ns :: foo < int > ", symbol_name_match_type::FULL, false, - EXPECT (expected)); - CHECK_MATCH ("foo < int > ", symbol_name_match_type::WILD, false, - EXPECT (expected)); - } - - /* Check that whitespace is ignored appropriately. A symbol with a - template argument list that includes a pointer. */ - { - static const char expected[] = "ns::foo"; - /* Try both completion and non-completion modes. */ - static const bool completion_mode[2] = {false, true}; - for (size_t i = 0; i < 2; i++) - { - CHECK_MATCH ("ns :: foo < char * >", symbol_name_match_type::FULL, - completion_mode[i], EXPECT (expected)); - CHECK_MATCH ("foo < char * >", symbol_name_match_type::WILD, - completion_mode[i], EXPECT (expected)); - - CHECK_MATCH ("ns :: foo < char * > (int)", symbol_name_match_type::FULL, - completion_mode[i], EXPECT (expected)); - CHECK_MATCH ("foo < char * > (int)", symbol_name_match_type::WILD, - completion_mode[i], EXPECT (expected)); - } - } - - { - /* Check method qualifiers are ignored. 
*/ - static const char expected[] = "ns::foo"; - CHECK_MATCH ("ns :: foo < char * > ( int ) const", - symbol_name_match_type::FULL, true, EXPECT (expected)); - CHECK_MATCH ("ns :: foo < char * > ( int ) &&", - symbol_name_match_type::FULL, true, EXPECT (expected)); - CHECK_MATCH ("foo < char * > ( int ) const", - symbol_name_match_type::WILD, true, EXPECT (expected)); - CHECK_MATCH ("foo < char * > ( int ) &&", - symbol_name_match_type::WILD, true, EXPECT (expected)); - } - - /* Test lookup names that don't match anything. */ - { - CHECK_MATCH ("bar2", symbol_name_match_type::WILD, false, - {}); - - CHECK_MATCH ("doesntexist", symbol_name_match_type::FULL, false, - {}); - } - - /* Some wild matching tests, exercising "(anonymous namespace)", - which should not be confused with a parameter list. */ - { - static const char *syms[] = { - "A::B::C", - "B::C", - "C", - "A :: B :: C ( int )", - "B :: C ( int )", - "C ( int )", - }; - - for (const char *s : syms) - { - CHECK_MATCH (s, symbol_name_match_type::WILD, false, - EXPECT ("(anonymous namespace)::A::B::C")); - } - } - - { - static const char expected[] = "ns2::tmpl::foo2"; - CHECK_MATCH ("tmp", symbol_name_match_type::WILD, true, - EXPECT (expected)); - CHECK_MATCH ("tmpl<", symbol_name_match_type::WILD, true, - EXPECT (expected)); - } - - SELF_CHECK (!any_mismatch); -#undef EXPECT -#undef CHECK_MATCH -} - -static void -run_test () -{ - test_mapped_index_find_name_component_bounds (); - test_dw2_expand_symtabs_matching_symbol (); -} - -}} /* namespace selftests::dw2_expand_symtabs_matching */ - -#endif /* GDB_SELF_TEST */ - -struct dwarf2_gdb_index : public dwarf2_base_index_functions -{ - /* This dumps minimal information about the index. - It is called via "mt print objfiles". - One use is to verify .gdb_index has been loaded by the - gdb.dwarf2/gdb-index.exp testcase. 
*/ - void dump (struct objfile *objfile) override; - - bool expand_symtabs_matching - (struct objfile *objfile, - expand_symtabs_file_matcher file_matcher, - const lookup_name_info *lookup_name, - expand_symtabs_symbol_matcher symbol_matcher, - expand_symtabs_expansion_listener expansion_notify, - block_search_flags search_flags, - domain_search_flags domain, - expand_symtabs_lang_matcher lang_matcher) override; -}; - -/* This dumps minimal information about the index. - It is called via "mt print objfiles". - One use is to verify .gdb_index has been loaded by the - gdb.dwarf2/gdb-index.exp testcase. */ - -void -dwarf2_gdb_index::dump (struct objfile *objfile) -{ - dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile); - - mapped_gdb_index *index = (gdb::checked_static_cast - (per_objfile->per_bfd->index_table.get ())); - gdb_printf (".gdb_index: version %d\n", index->version); - gdb_printf ("\n"); -} - -/* Helper for dw2_expand_matching symtabs. Called on each symbol - matched, to expand corresponding CUs that were marked. IDX is the - index of the symbol name that matched. */ - -static bool -dw2_expand_marked_cus (dwarf2_per_objfile *per_objfile, offset_type idx, - auto_bool_vector &marked, - expand_symtabs_file_matcher file_matcher, - expand_symtabs_expansion_listener expansion_notify, - block_search_flags search_flags, - domain_search_flags kind, - expand_symtabs_lang_matcher lang_matcher) -{ - offset_type vec_len, vec_idx; - bool global_seen = false; - mapped_gdb_index &index - = *(gdb::checked_static_cast - (per_objfile->per_bfd->index_table.get ())); - - offset_view vec (index.constant_pool.slice (index.symbol_vec_index (idx))); - vec_len = vec[0]; - for (vec_idx = 0; vec_idx < vec_len; ++vec_idx) - { - offset_type cu_index_and_attrs = vec[vec_idx + 1]; - /* This value is only valid for index versions >= 7. 
*/ - int is_static = GDB_INDEX_SYMBOL_STATIC_VALUE (cu_index_and_attrs); - gdb_index_symbol_kind symbol_kind = - GDB_INDEX_SYMBOL_KIND_VALUE (cu_index_and_attrs); - int cu_index = GDB_INDEX_CU_VALUE (cu_index_and_attrs); - /* Only check the symbol attributes if they're present. - Indices prior to version 7 don't record them, - and indices >= 7 may elide them for certain symbols - (gold does this). */ - int attrs_valid = - (index.version >= 7 - && symbol_kind != GDB_INDEX_SYMBOL_KIND_NONE); - - /* Work around gold/15646. */ - if (attrs_valid - && !is_static - && symbol_kind == GDB_INDEX_SYMBOL_KIND_TYPE) + std::vector these_entries; + offset_view vec (constant_pool.slice (symbol_vec_index (idx))); + offset_type vec_len = vec[0]; + bool global_seen = false; + for (offset_type vec_idx = 0; vec_idx < vec_len; ++vec_idx) { - if (global_seen) + offset_type cu_index_and_attrs = vec[vec_idx + 1]; + gdb_index_symbol_kind symbol_kind + = GDB_INDEX_SYMBOL_KIND_VALUE (cu_index_and_attrs); + /* Only use a symbol if the attributes are present. Indices + prior to version 7 don't record them, and indices >= 7 + may elide them for certain symbols (gold does this). */ + if (symbol_kind == GDB_INDEX_SYMBOL_KIND_NONE) continue; - global_seen = true; - } + int is_static = GDB_INDEX_SYMBOL_STATIC_VALUE (cu_index_and_attrs); - /* Only check the symbol's kind if it has one. */ - if (attrs_valid) - { - if (is_static) - { - if ((search_flags & SEARCH_STATIC_BLOCK) == 0) - continue; - } - else + int cu_index = GDB_INDEX_CU_VALUE (cu_index_and_attrs); + /* Don't crash on bad data. 
*/ + if (cu_index >= per_objfile->per_bfd->all_units.size ()) { - if ((search_flags & SEARCH_GLOBAL_BLOCK) == 0) - continue; + complaint (_(".gdb_index entry has bad CU index" + " [in module %s]"), + objfile_name (per_objfile->objfile)); + continue; } + dwarf2_per_cu *per_cu = per_objfile->per_bfd->get_cu (cu_index); - domain_search_flags mask = 0; + enum language this_lang = lang; + dwarf_tag tag; switch (symbol_kind) { case GDB_INDEX_SYMBOL_KIND_VARIABLE: - mask = SEARCH_VAR_DOMAIN; + tag = DW_TAG_variable; break; case GDB_INDEX_SYMBOL_KIND_FUNCTION: - mask = SEARCH_FUNCTION_DOMAIN; + tag = DW_TAG_subprogram; break; case GDB_INDEX_SYMBOL_KIND_TYPE: - mask = SEARCH_TYPE_DOMAIN | SEARCH_STRUCT_DOMAIN; + if (is_static) + tag = (dwarf_tag) DW_TAG_GDB_INDEX_TYPE; + else + { + /* Work around gold/15646. */ + if (global_seen) + continue; + global_seen = true; + + tag = DW_TAG_structure_type; + this_lang = language_cplus; + } break; + /* The "default" should not happen, but we mention it to + avoid an uninitialized warning. */ + default: case GDB_INDEX_SYMBOL_KIND_OTHER: - mask = SEARCH_MODULE_DOMAIN | SEARCH_TYPE_DOMAIN; + tag = (dwarf_tag) DW_TAG_GDB_INDEX_OTHER; break; } - if ((kind & mask) == 0) - continue; + + cooked_index_flag flags = 0; + if (is_static) + flags |= IS_STATIC; + if (main_name != nullptr + && tag == DW_TAG_subprogram + && strcmp (name, main_name) == 0) + { + flags |= IS_MAIN; + this_lang = main_lang; + /* Don't bother looking for another. */ + main_name = nullptr; + } + + /* Note that this assumes the final component ends in \0. */ + cooked_index_entry *entry = result.add (per_cu->sect_off, tag, + flags, this_lang, + components.back ().data (), + nullptr, per_cu); + /* Don't bother pushing if we do not need a parent. */ + if (components.size () > 1) + these_entries.push_back (entry); + + /* We don't care exactly which entry ends up in this + map. */ + by_name[std::string_view (name)] = entry; } - /* Don't crash on bad data.
*/ - if (cu_index >= per_objfile->per_bfd->all_units.size ()) + if (components.size () > 1) { - complaint (_(".gdb_index entry has bad CU index" - " [in module %s]"), objfile_name (per_objfile->objfile)); - continue; - } + std::string_view penultimate = components[components.size () - 2]; + std::string_view prefix (name, &penultimate.back () + 1 - name); - dwarf2_per_cu *per_cu = per_objfile->per_bfd->get_cu (cu_index); - if (!dw2_expand_symtabs_matching_one (per_cu, per_objfile, marked, - file_matcher, expansion_notify, - lang_matcher)) - return false; + need_parents.emplace_back (prefix, std::move (these_entries)); + } } - return true; + for (const auto &[prefix, entries] : need_parents) + { + auto iter = by_name.find (prefix); + /* If we can't find the parent entry, just lose. It should + always be there. We could synthesize it from the components, + if we kept those, but that seems like overkill. */ + if (iter != by_name.end ()) + { + for (cooked_index_entry *entry : entries) + entry->set_parent (iter->second); + } + } } -bool -dwarf2_gdb_index::expand_symtabs_matching - (objfile *objfile, - expand_symtabs_file_matcher file_matcher, - const lookup_name_info *lookup_name, - expand_symtabs_symbol_matcher symbol_matcher, - expand_symtabs_expansion_listener expansion_notify, - block_search_flags search_flags, - domain_search_flags domain, - expand_symtabs_lang_matcher lang_matcher) +/* The worker that reads a mapped index and fills in a + cooked_index_worker_result. */ + +class gdb_index_worker : public cooked_index_worker { - dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile); +public: - auto_bool_vector marked; - dw_expand_symtabs_matching_file_matcher (per_objfile, marked, file_matcher); + gdb_index_worker (dwarf2_per_objfile *per_objfile, + std::unique_ptr map) + : cooked_index_worker (per_objfile), + map (std::move (map)) + { } - /* This invariant is documented in quick-functions.h. 
*/ - gdb_assert (lookup_name != nullptr || symbol_matcher == nullptr); - if (lookup_name == nullptr) - { - for (dwarf2_per_cu *per_cu : all_units_range (per_objfile->per_bfd)) - { - QUIT; + void do_reading () override; - if (!dw2_expand_symtabs_matching_one (per_cu, per_objfile, - marked, file_matcher, - expansion_notify, - lang_matcher)) - return false; - } - return true; - } + /* The map we're reading. */ + std::unique_ptr map; +}; + +void +gdb_index_worker::do_reading () +{ + complaint_interceptor complaint_handler; + map->build_name_components (m_per_objfile); - mapped_gdb_index &index - = *(gdb::checked_static_cast - (per_objfile->per_bfd->index_table.get ())); + m_results.push_back (std::move (map->result)); + m_results[0].done_reading (complaint_handler.release ()); - bool result - = dw2_expand_symtabs_matching_symbol (index, *lookup_name, - symbol_matcher, - [&] (offset_type idx) - { - if (!dw2_expand_marked_cus (per_objfile, idx, marked, file_matcher, - expansion_notify, search_flags, domain, - lang_matcher)) - return false; - return true; - }, per_objfile, lang_matcher); + /* No longer needed. */ + map.reset (); - return result; -} + done_reading (); -quick_symbol_functions_up -mapped_gdb_index::make_quick_functions () const -{ - return quick_symbol_functions_up (new dwarf2_gdb_index); + bfd_thread_cleanup (); } /* A helper function that reads the .gdb_index from BUFFER and fills @@ -1208,11 +427,9 @@ read_gdb_index_from_buffer (const char *filename, /* Version check. */ offset_type version = metadata[0]; - /* Versions earlier than 3 emitted every copy of a psymbol. This - causes the index to behave very poorly for certain requests. Version 3 - contained incomplete addrmap. So, it seems better to just ignore such - indices. */ - if (version < 4) + /* GDB now requires the symbol attributes, which were added in + version 7. 
*/ + if (version < 7) { static int warning_printed = 0; if (!warning_printed) @@ -1223,30 +440,6 @@ read_gdb_index_from_buffer (const char *filename, } return 0; } - /* Index version 4 uses a different hash function than index version - 5 and later. - - Versions earlier than 6 did not emit psymbols for inlined - functions. Using these files will cause GDB not to be able to - set breakpoints on inlined functions by name, so we ignore these - indices unless the user has done - "set use-deprecated-index-sections on". */ - if (version < 6 && !deprecated_ok) - { - static int warning_printed = 0; - if (!warning_printed) - { - warning (_("\ -Skipping deprecated .gdb_index section in %s.\n\ -Do \"%ps\" before the file is read\n\ -to use the section anyway."), - filename, - styled_string (command_style.style (), - "set use-deprecated-index-sections on")); - warning_printed = 1; - } - return 0; - } /* Version 7 indices generated by gold refer to the CU for a symbol instead of the TU (for symbols coming from TUs), http://sourceware.org/bugzilla/show_bug.cgi?id=15021. @@ -1431,23 +624,19 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile, mutable_map.set_empty (lo, hi - 1, per_bfd->get_cu (cu_index)); } - index->index_addrmap - = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map); + index->result.set_addrmap (std::move (mutable_map)); } -/* Sets the name and language of the main function from the shortcut table. */ - -static void -set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, - mapped_gdb_index *index) +void +mapped_gdb_index::set_main_name (dwarf2_per_objfile *per_objfile) { const auto expected_size = 2 * sizeof (offset_type); - if (index->shortcut_table.size () < expected_size) + if (this->shortcut_table.size () < expected_size) /* The data in the section is not present, is corrupted or is in a version we don't know about. Regardless, we can't make use of it. 
*/ return; - auto ptr = index->shortcut_table.data (); + auto ptr = this->shortcut_table.data (); const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE); if (dw_lang >= DW_LANG_hi_user) { @@ -1463,13 +652,11 @@ set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, } ptr += 4; - const auto lang = dwarf_lang_to_enum_language (dw_lang); + main_lang = dwarf_lang_to_enum_language (dw_lang); const auto name_offset = extract_unsigned_integer (ptr, sizeof (offset_type), BFD_ENDIAN_LITTLE); - const auto name = (const char *) (index->constant_pool.data () + name_offset); - - set_objfile_main_name (per_objfile->objfile, name, (enum language) lang); + main_name = (const char *) (this->constant_pool.data () + name_offset); } /* See read-gdb-index.h. */ @@ -1557,9 +744,15 @@ dwarf2_read_gdb_index create_addrmap_from_gdb_index (per_objfile, map.get ()); - set_main_name_from_gdb_index (per_objfile, map.get ()); + map->set_main_name (per_objfile); - per_bfd->index_table = std::move (map); + int version = map->version; + auto worker = std::make_unique (per_objfile, + std::move (map)); + auto idx = std::make_unique (std::move (worker), + version); + + per_bfd->start_reading (std::move (idx)); return true; } @@ -1580,9 +773,4 @@ Warning: This option must be enabled before gdb reads the file."), NULL, NULL, &setlist, &showlist); - -#if GDB_SELF_TEST - selftests::register_test ("dw2_expand_symtabs_matching", - selftests::dw2_expand_symtabs_matching::run_test); -#endif } diff --git a/gdb/dwarf2/read-gdb-index.h b/gdb/dwarf2/read-gdb-index.h index e38a8318db8fac78336208097bac288f8321be3f..f605985f648104d4508e934e727c608eb0928bff 100644 --- a/gdb/dwarf2/read-gdb-index.h +++ b/gdb/dwarf2/read-gdb-index.h @@ -27,6 +27,20 @@ struct dwarf2_per_objfile; struct dwz_file; struct objfile; +/* .gdb_index doesn't distinguish between the various "other" symbols + -- but the symbol search machinery really wants to. 
For example, + an imported decl is "other" but is really a namespace and thus in + TYPE_DOMAIN; whereas a Fortran module is also "other" but is in the + MODULE_DOMAIN. We use this value internally to represent the + "other" case so that matching can work. The exact value does not + matter, all that matters here is that it won't overlap with any + symbol that gdb might create. */ +#define DW_TAG_GDB_INDEX_OTHER 0xffff + +/* Similarly, .gdb_index can't distinguish between the type and struct + domains. This is a special tag that inhabits both. */ +#define DW_TAG_GDB_INDEX_TYPE 0xfffe + /* Callback types for dwarf2_read_gdb_index. */ typedef gdb::function_view diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c index f1084530221e82f33efb35f542d6d842fe3df2ab..6eee9631623435cbb39830116b790ba4b1f0c21d 100644 --- a/gdb/dwarf2/read.c +++ b/gdb/dwarf2/read.c @@ -14570,13 +14570,26 @@ cooked_index_functions::expand_symtabs_matching continue; } + /* This is a bit of a hack to support .gdb_index. Since + .gdb_index does not record languages, and since we want + to know the language to avoid excessive CU expansion due + to false matches, if we see a symbol with an unknown + language we find the CU's language. Only the .gdb_index + reader creates such symbols. */ + enum language entry_lang = entry->lang; + if (entry_lang == language_unknown) + { + entry->per_cu->ensure_lang (per_objfile); + entry_lang = entry->per_cu->lang (); + } + /* We've found the base name of the symbol; now walk its parentage chain, ensuring that each component matches. 
*/ bool found = true; const cooked_index_entry *parent = entry->get_parent (); - const language_defn *lang_def = language_def (entry->lang); + const language_defn *lang_def = language_def (entry_lang); for (int i = name_vec.size () - 1; i > 0; --i) { /* If we ran out of entries, or if this segment doesn't @@ -14586,17 +14599,15 @@ cooked_index_functions::expand_symtabs_matching found = false; break; } - if (parent->lang != language_unknown) + + symbol_name_matcher_ftype *name_matcher + = (lang_def->get_symbol_name_matcher + (segment_lookup_names[i-1])); + if (!name_matcher (parent->canonical, + segment_lookup_names[i-1], nullptr)) { - symbol_name_matcher_ftype *name_matcher - = lang_def->get_symbol_name_matcher - (segment_lookup_names[i-1]); - if (!name_matcher (parent->canonical, - segment_lookup_names[i-1], nullptr)) - { - found = false; - break; - } + found = false; + break; } parent = parent->get_parent (); @@ -14619,23 +14630,20 @@ cooked_index_functions::expand_symtabs_matching seems like the loop above could just examine every element of the name, avoiding the need to check here; but this is hard. See PR symtab/32733. 
*/ - if (symbol_matcher != nullptr || entry->lang != language_unknown) + auto_obstack temp_storage; + const char *full_name = entry->full_name (&temp_storage, + FOR_ADA_LINKAGE_NAME); + if (symbol_matcher == nullptr) { - auto_obstack temp_storage; - const char *full_name = entry->full_name (&temp_storage, - FOR_ADA_LINKAGE_NAME); - if (symbol_matcher == nullptr) - { - symbol_name_matcher_ftype *name_matcher - = (lang_def->get_symbol_name_matcher - (lookup_name_without_params)); - if (!name_matcher (full_name, lookup_name_without_params, - nullptr)) - continue; - } - else if (!symbol_matcher (full_name)) + symbol_name_matcher_ftype *name_matcher + = (lang_def->get_symbol_name_matcher + (lookup_name_without_params)); + if (!name_matcher (full_name, lookup_name_without_params, + nullptr)) continue; } + else if (!symbol_matcher (full_name)) + continue; if (!dw2_expand_symtabs_matching_one (entry->per_cu, per_objfile, marked, file_matcher, diff --git a/gdb/dwarf2/tag.h b/gdb/dwarf2/tag.h index ef1ad217c6d34fbacdf5f6372b0d29ad3b6e7bbd..e2f056033702c303f3b3354810e09fe27126625d 100644 --- a/gdb/dwarf2/tag.h +++ b/gdb/dwarf2/tag.h @@ -22,6 +22,7 @@ #include "dwarf2.h" #include "symtab.h" +#include "read-gdb-index.h" /* Return true if TAG represents a type, false otherwise. */ @@ -144,6 +145,13 @@ tag_matches_domain (dwarf_tag tag, domain_search_flags search, language lang) else flags = SEARCH_MODULE_DOMAIN; break; + + case DW_TAG_GDB_INDEX_OTHER: + flags = SEARCH_MODULE_DOMAIN | SEARCH_TYPE_DOMAIN; + break; + case DW_TAG_GDB_INDEX_TYPE: + flags = SEARCH_STRUCT_DOMAIN | SEARCH_TYPE_DOMAIN; + break; } return (flags & search) != 0; -- 2.46.1
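Illustration only, not part of the patch: the read-gdb-index.h and tag.h hunks above introduce two pseudo-tags, each of which stands for more than one search domain. The sketch below shows that mapping in isolation. The tag values 0xffff and 0xfffe come from the patch; every SKETCH_* name and the bit values chosen for the search flags are hypothetical stand-ins, not gdb's real domain_search_flags definitions.

```cpp
#include <cstdint>

// Hypothetical stand-ins for gdb's search-domain bits.  The real
// definitions are in gdb's sources and are not reproduced here.
constexpr std::uint32_t SKETCH_SEARCH_TYPE = 1u << 0;
constexpr std::uint32_t SKETCH_SEARCH_STRUCT = 1u << 1;
constexpr std::uint32_t SKETCH_SEARCH_MODULE = 1u << 2;
constexpr std::uint32_t SKETCH_SEARCH_FUNCTION = 1u << 3;

// Pseudo-tag values as given in the read-gdb-index.h hunk.
constexpr std::uint32_t SKETCH_TAG_GDB_INDEX_OTHER = 0xffff;
constexpr std::uint32_t SKETCH_TAG_GDB_INDEX_TYPE = 0xfffe;

// Return the set of domains a pseudo-tag may belong to.  Any other
// tag yields 0 here; the real tag_matches_domain handles many more.
constexpr std::uint32_t
sketch_domains_for_tag (std::uint32_t tag)
{
  return (tag == SKETCH_TAG_GDB_INDEX_OTHER
	  ? (SKETCH_SEARCH_MODULE | SKETCH_SEARCH_TYPE)
	  : tag == SKETCH_TAG_GDB_INDEX_TYPE
	  ? (SKETCH_SEARCH_STRUCT | SKETCH_SEARCH_TYPE)
	  : 0u);
}

// Return true if TAG could satisfy any domain requested in SEARCH.
constexpr bool
sketch_tag_matches_domain (std::uint32_t tag, std::uint32_t search)
{
  return (sketch_domains_for_tag (tag) & search) != 0;
}

// An "other" entry may be a namespace-like type or a Fortran module,
// so it has to match both domains.
static_assert (sketch_tag_matches_domain (SKETCH_TAG_GDB_INDEX_OTHER,
					  SKETCH_SEARCH_MODULE), "");
static_assert (sketch_tag_matches_domain (SKETCH_TAG_GDB_INDEX_OTHER,
					  SKETCH_SEARCH_TYPE), "");
```

Because .gdb_index cannot tell these cases apart, the pseudo-tags over-match on purpose. As the hunk comments describe it, this lets the domain check run on .gdb_index entries without the index having to say which "other" or type-like symbol it is.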