From: Tom Tromey
To: Tom de Vries
Subject: Re: [RFC][gdb/symtab] Lazy expansion of full symbol table
Date: Mon, 14 Jun 2021 14:54:40 -0600
Message-ID: <87pmwoxj3j.fsf@tromey.com>
In-Reply-To: <20210614093908.GA22709@delia> (Tom de Vries's message of "Mon, 14 Jun 2021 11:39:10 +0200")
References: <20210614093908.GA22709@delia>
Cc: Tom Tromey, gdb-patches@sourceware.org

>>>>> "Tom" == Tom de Vries writes:

Tom> [ I'm not posting the experimental patch series as such for now.  Available
Tom> here ( https://github.com/vries/gdb/commits/lazy-full-symtab ). ]

Tom> In PR23710, the stated problem is that gdb is slow and memory hungry when
Tom> consuming debug information generated by GCC with LTO.

Tom> | Minimal symbols | 0.18 |
Tom> | Partial symbols | 2.34 |
Tom> | Full symbols    | 3.34 |

I don't have this executable but FWIW my scanner rewrite is ~10x faster
than the current psymtab reader.

Tom> A way to fix this is to do processing of the full symbols in a lazy fashion.

Tom> This patch series implements a first attempt at this, for now intended not to
Tom> be functionally correct, but to assess the kind of performance improvement we
Tom> get from this.

Tom> With current trunk (commit 987610f2d68), we get 3.44, instead of the 6.44
Tom> without this patch series.

That's about in line with the preliminary results I saw as well.  You
can see that branch at tromey/lazily-read-function-bodies, but it's
probably unusable since I last rebased it in 2017.

Tom> The problem is that we need a way to do this gradually instead:
Tom> - expand a few symbols
Tom> - get the corresponding symbol table
Tom> - expand a few more symbols
Tom> - get the updated symbol table containing all expanded symbols

Tom> I'm not sure what is the smartest way to do that.

Tom> My current idea is to try to keep the builder around rather than destroy it,
Tom> and have it generate an updated symbol table each time.

Tom> Is this a good idea?

Tom> Any other comments?

I think it's a good idea to do something.  Historically I didn't look
into speeding up CU expansion on the theory that it didn't really matter
-- most CUs are small -- but LTO upends that.
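
For concreteness, the "keep the builder around" scheme quoted above might look roughly like the sketch below.  This is only an illustration under assumptions: Symbol, SymbolTable and IncrementalBuilder are hypothetical names, not GDB's buildsym types, and a real version would have to deal with blocks and the blockvector rather than a flat map.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are GDB types.
struct Symbol {
  std::string name;
  unsigned long address;
};

// A snapshot of everything expanded so far, used for lookups.
struct SymbolTable {
  std::map<std::string, Symbol> by_name;

  const Symbol *lookup(const std::string &name) const {
    auto it = by_name.find(name);
    return it == by_name.end() ? nullptr : &it->second;
  }
};

// The builder outlives a single expansion pass, so each new batch of
// symbols is added to what was already expanded.
class IncrementalBuilder {
public:
  // Expand a few more symbols.
  void expand(const std::vector<Symbol> &batch) {
    for (const Symbol &s : batch) {
      assert(!s.name.empty());
      pending_[s.name] = s;
    }
  }

  // Generate an updated table containing all symbols expanded so far.
  // Snapshots handed out earlier are left untouched.
  std::shared_ptr<const SymbolTable> snapshot() const {
    auto table = std::make_shared<SymbolTable>();
    table->by_name = pending_;
    return table;
  }

private:
  std::map<std::string, Symbol> pending_;
};
```

Copying the whole map for every snapshot is the simplest thing that works for a sketch; whether the real symbol table could be updated in place instead is exactly the open question.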
I never finished the earlier patch that I referenced, but the idea there
was to skip function bodies when reading a CU.  Then, the functions
could be fully read when needed.  I got this to work for by-name
lookups, but I never finished the part where a by-address lookup would
resize the blockvector, etc.

This approach could be extended to also not read type bodies until
needed.  This could work by inflating the type during check_typedef.
This can be a little dangerous because we do miss these calls from time
to time; but maybe nearly as good would be to only lazily read
"complicated" types, and use the accessor methods on those types to
ensure they are inflated properly.

Another deeper approach would be to abstract away symtabs (and I guess
the blockvector) the way that we did for psymtabs: hide the details
behind the "quick" methods and let each debug reader decide for itself
how to store and look up symbols.  This is probably more work, but would
allow even more laziness.  For example with the new indexer the symbols
could simply hang off the entries in the index.

Tom
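
A minimal sketch of the "inflate through an accessor" idea above, applied to function bodies.  Everything here is an assumption made for illustration: LazyFunction, FunctionBody and the loader callback are hypothetical names and do not correspond to code in GDB or in the branch mentioned earlier.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for what a full read of a function produces
// (locals, nested blocks and so on).
struct FunctionBody {
  std::vector<std::string> locals;
};

// A function symbol created as a skeleton while reading the CU.  The
// body is read only the first time somebody asks for it.
class LazyFunction {
public:
  using Loader = std::function<FunctionBody()>;

  LazyFunction(std::string name, Loader loader)
    : name_(std::move(name)), loader_(std::move(loader)) {
    assert(loader_);  // a skeleton must know how to inflate itself
  }

  const std::string &name() const { return name_; }
  bool inflated() const { return body_ != nullptr; }

  // The only way to reach the body, so a caller cannot forget to
  // inflate it first.
  const FunctionBody &body() {
    if (!body_)
      body_.reset(new FunctionBody(loader_()));
    return *body_;
  }

private:
  std::string name_;
  Loader loader_;
  std::unique_ptr<FunctionBody> body_;
};
```

A by-name lookup only needs name(), so it never pays for the body.  The by-address case is not covered by this sketch, since there the address ranges have to be known before anything is inflated.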