From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id tFp8IM4jx2CFPQAAWB0awg (envelope-from ) for ; Mon, 14 Jun 2021 05:39:26 -0400 Received: by simark.ca (Postfix, from userid 112) id 763421F163; Mon, 14 Jun 2021 05:39:26 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=DKIM_SIGNED, MAILING_LIST_MULTI,RDNS_DYNAMIC,T_DKIM_INVALID,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id ECE491E939 for ; Mon, 14 Jun 2021 05:39:24 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2FE0F3951840 for ; Mon, 14 Jun 2021 09:39:24 +0000 (GMT) Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 68AD33851C22 for ; Mon, 14 Jun 2021 09:39:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 68AD33851C22 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from imap.suse.de (imap-alt.suse-dmz.suse.de [192.168.254.47]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 7E8C221976; Mon, 14 Jun 2021 09:39:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623663551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lo/GV+NwuL0CyCdPpTTCR3VB71eRU5m3pRTlgbVUIaQ=; 
b=h//Z1CzDRYwXaAb9eoMLEkKe/BVShb3lpeyBy1rkCVUG9gdQzjOmEkJx4nR6aRXXMfwwXf L2O+XVrIyDdrvm5d4nyYrKN17dHoZ8U7Dj3fMg6g6R5OIEgN1a/H/JRAdaUU2jOYEWIYui XOGJ/YnNULr5nr1a2/jZnbPNlu1wJMY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623663551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lo/GV+NwuL0CyCdPpTTCR3VB71eRU5m3pRTlgbVUIaQ=; b=fyyE8+KaJjWEWAqXYSV3hBpQgZPZIB6vCGtLi9c4i+0vyEcNamDeu8VcHN/MqZ4eiDrWk4 GwOzgqeAukggnbBQ== Received: from imap3-int (imap-alt.suse-dmz.suse.de [192.168.254.47]) by imap.suse.de (Postfix) with ESMTP id 59A06118DD; Mon, 14 Jun 2021 09:39:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623663551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lo/GV+NwuL0CyCdPpTTCR3VB71eRU5m3pRTlgbVUIaQ=; b=h//Z1CzDRYwXaAb9eoMLEkKe/BVShb3lpeyBy1rkCVUG9gdQzjOmEkJx4nR6aRXXMfwwXf L2O+XVrIyDdrvm5d4nyYrKN17dHoZ8U7Dj3fMg6g6R5OIEgN1a/H/JRAdaUU2jOYEWIYui XOGJ/YnNULr5nr1a2/jZnbPNlu1wJMY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623663551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lo/GV+NwuL0CyCdPpTTCR3VB71eRU5m3pRTlgbVUIaQ=; b=fyyE8+KaJjWEWAqXYSV3hBpQgZPZIB6vCGtLi9c4i+0vyEcNamDeu8VcHN/MqZ4eiDrWk4 GwOzgqeAukggnbBQ== Received: from director2.suse.de ([192.168.254.72]) by imap3-int with ESMTPSA id 8CP1FL8jx2C/UAAALh3uQQ (envelope-from ); Mon, 14 Jun 2021 09:39:11 +0000 Date: Mon, 14 Jun 2021 11:39:10 +0200 From: Tom de Vries To: gdb-patches@sourceware.org Subject: [RFC][gdb/symtab] Lazy expansion of full symbol table Message-ID: <20210614093908.GA22709@delia> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.10.1 (2018-07-13) X-BeenThere: 
gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Tom Tromey
Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org
Sender: "Gdb-patches"

Hi,

[ I'm not posting the experimental patch series as such for now. Available
here ( https://github.com/vries/gdb/commits/lazy-full-symtab ). ]

In PR23710, the stated problem is that gdb is slow and memory hungry when
consuming debug information generated by GCC with LTO.

I.

Taking the range of final releases 8.1.1 to 10.2, as well as a recent trunk
commit (3633d4fb446), and an experiment using cc1:
...
$ gdb -q -batch cc1 -ex "b do_rpo_vn"
...
we get:
...
+---------+----------------+
| Version | real (seconds) |
+---------+----------------+
| 8.1.1   | 9.42           |
| 8.2.1   | - (PR23712)    |
| 8.3.1   | 9.31           |
| 9.2     | 8.50           |
| 10.2    | 5.86           |
| trunk   | 6.36           |
+---------+----------------+
...
which is nice progress in the releases. The regression on trunk since 10.2
has been filed as PR27937.

[ The 10.2 score can be further improved to 5.23, by setting dwarf
max-cache-age to 1000. Defaults to 5, see PR25703. ]

However, the best score is still more than a factor of 3 slower than lldb:
...
+-------------+----------------+
| Version     | real (seconds) |
+-------------+----------------+
| gdb 10.2    | 5.86           |
| lldb 10.0.1 | 1.74           |
+-------------+----------------+
...

II.

Breaking down the 10.2 time of 5.86, we have:
...
+-----------------+----------------+
|                 | real (seconds) |
+-----------------+----------------+
| Minimal symbols | 0.18           |
| Partial symbols | 2.34           |
| Full symbols    | 3.34           |
+-----------------+----------------+
...

So:
- the minimal symbols and partial symbols are processed for all CUs, while
  the full symbols are processed only for the necessary CUs
- still, the majority of the time is spent on the full symbols

This is due to the combination of:
- the one-CU-at-a-time strategy of gdb, and
- the code generation for LTO, which combines several CUs into an
  artificial CU.

In other words, LTO increases the scope of processing from individual CUs to
larger artificial CUs, and consequently things take much longer.

III.

A way to fix this is to process the full symbols in a lazy fashion.

This patch series implements a first attempt at this, for now intended not to
be functionally correct, but to assess the kind of performance improvement we
get from this.

With current trunk (commit 987610f2d68), we get 3.44, instead of the 6.44
without this patch series.

IV.

The patch series consists of:

[gdb/symtab] Cover letter -- Lazy expansion of full symbol table

  This.

[gdb/symtab] Add lazy_expand_symtab_p, default to false

  Add variable to enable lazy expansion. Keep false for now to enable
  gradual introduction of the implementation.

[gdb/symtab] Add sect_off field to partial_symbol

  Keep track of section offset in partial symbol.

[gdb/symtab] Keep track of interesting_symbols in partial_symtab

  When searching for a symbol in a partial symbol table:
  - keep going after finding a match
  - store the matching partial symbols in a vector interesting_symbols

[gdb/symtab] Add interesting_symbols to dwarf2_per_cu_data

  Make the vector interesting_symbols available during full symbols
  expansion.

[gdb/symtab] Handle interesting_symbols in read_file_scope

  Use the interesting_symbols vector to filter the DIEs that we process
  when doing read_file_scope.

[gdb/symtab] Add reset_compunit_symtab

  Add a new function reset_compunit_symtab.

[gdb/symtab] Reset compunit_symtab in psymtab_to_symtab

  Use the new function to reset the full symbol table before expanding.
  [ Without this patch, after finding symbols for the first time, we are not
  able to find any others. With this patch, we are able to find other
  symbols, but only after forgetting the first ones. This obviously needs
  proper fixing. ]

[gdb/symtab] Set lazy_expand_symtab_p to true

  Enable.

V.

The current state of trunk is that expanding full symbols is a two-part
process:
- a builder is created during expansion
- after expansion, the builder is destroyed after delivering the end result:
  a symbol table

The problem is that we need a way to do this gradually instead:
- expand a few symbols
- get the corresponding symbol table
- expand a few more symbols
- get the updated symbol table containing all expanded symbols

I'm not sure what the smartest way to do that is.

My current idea is to try to keep the builder around rather than destroy it,
and have it generate an updated symbol table each time.

Is this a good idea? Any other comments?

Thanks,
- Tom

[gdb/symtab] Cover letter -- Lazy expansion of full symbol table

---
 COVER-LETTER | 1 +
 1 file changed, 1 insertion(+)

diff --git a/COVER-LETTER b/COVER-LETTER
new file mode 100644
index 00000000000..d273939703b
--- /dev/null
+++ b/COVER-LETTER
@@ -0,0 +1 @@
+Lazy expansion of full symbol table
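
For reference, the "keep the builder around" idea from section V could look
roughly like the sketch below. This is a toy model under stated assumptions,
not the patch series and not gdb's buildsym API: toy_symbol, toy_symtab and
incremental_builder are hypothetical names, and a std::map stands in for the
real symbol table. It only shows the shape of the interface: each expansion
adds to a table that persists, so earlier symbols are not forgotten.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT gdb's symbol or symtab types.
struct toy_symbol
{
  std::string name;
  unsigned long sect_off;  // section offset of the defining DIE
};

// What lookups would see: a name-indexed view of everything expanded so far.
using toy_symtab = std::map<std::string, toy_symbol>;

class incremental_builder
{
public:
  // Expand a batch of symbols.  Symbols that were already expanded are
  // skipped, so asking for the same symbol twice does no extra work.
  // Returns the number of newly added symbols.
  std::size_t expand (const std::vector<toy_symbol> &batch)
  {
    std::size_t added = 0;
    for (const toy_symbol &sym : batch)
      if (m_table.emplace (sym.name, sym).second)
        ++added;
    return added;
  }

  // The builder stays alive between expansions, so the table it hands out
  // keeps growing instead of being reset and rebuilt.
  const toy_symtab &table () const { return m_table; }

private:
  toy_symtab m_table;
};
```

In this model, a first breakpoint lookup would call expand () with the symbols
it is interested in, and a later lookup would call it again with its own set;
table () then contains the union of both.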