Date: Mon, 22 Nov 2010 19:44:00 -0000
From: Jan Kratochvil
To: Joel Brobecker
Cc: Tom Tromey, gdb-patches@sourceware.org
Subject: Re: [patch 2/2] iFort compat.: case insensitive symbols (PR 11313)
Message-ID: <20101122194336.GA21855@host0.dyn.jankratochvil.net>
References: <20101108183133.GE2933@adacore.com>
 <20101122035334.GA9229@host0.dyn.jankratochvil.net>
 <20101122185432.GT2634@adacore.com>
 <20101122191905.GA20976@host0.dyn.jankratochvil.net>
 <20101122193041.GU2634@adacore.com>
In-Reply-To: <20101122193041.GU2634@adacore.com>

On Mon, 22 Nov 2010 20:30:41 +0100, Joel Brobecker wrote:
> I was actually wondering about the change in the hash algorithm more
> than the cost of calling tolower. For instance, "tmp" and "Tmp" would
> have had different hash values, but not anymore. So, presumably, when
> you start looking up for "tmp", the associated hash bucket will also
> contain "Tmp" whereas it wouldn't before. I need to look at the actual
> hashing parameters to see if we can figure out whether this should have
> any real effect in practice... If the number of elements in each bucket
> is reasonable, a few more iterations shouldn't be an issue.

This is a more general issue. I think (though I did not measure it) that
most symbols still differ even after tolower. Pairs like tmp<->Tmp exist,
but rarely. I agree the hashing function will get worse, but I did not even
measure it, considering the change negligible.

A bigger issue is that MINIMAL_SYMBOL_HASH_SIZE is constant:
#define MINIMAL_SYMBOL_HASH_SIZE 2039

Some objfiles have many symbols:
libwebkit.so.debug: 54980 symbols / MINIMAL_SYMBOL_HASH_SIZE = 27   log2(54980) = 16
gdb symtab:         36452 symbols / MINIMAL_SYMBOL_HASH_SIZE = 18   log2(36452) = 15

In such a case the whole hash table in fact makes no sense, and it is even
cheaper to just do a binary search on objfile->msymbols, which is already
qsort-ed, and be done with it. A properly sized hash table should still be
faster than a binary search, but for that the hash table size would need to
be adaptable.

But such optimizations reduce only the CPU load, which in my measurements
was 2% during GDB startup (because GDB is mostly waiting on disk).
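
For reference, the bucket arithmetic above can be checked with a small
sketch. It is not GDB code: only MINIMAL_SYMBOL_HASH_SIZE and the symbol
counts come from this mail, the helper names avg_chain_length and
binary_search_steps are hypothetical, and it assumes the symbols spread
evenly over the buckets. The binary search figure is the worst-case number
of comparisons, floor(log2(n)) + 1.

```c
/* Value quoted in this mail. */
#define MINIMAL_SYMBOL_HASH_SIZE 2039

/* Hypothetical helper: average number of entries per hash bucket,
   assuming the hash spreads nsyms symbols evenly over the table.  */
static double
avg_chain_length (unsigned long nsyms)
{
  return (double) nsyms / MINIMAL_SYMBOL_HASH_SIZE;
}

/* Hypothetical helper: worst-case number of comparisons for a binary
   search over nsyms sorted entries, i.e. floor(log2(nsyms)) + 1,
   computed by counting the significant bits of nsyms.  */
static unsigned
binary_search_steps (unsigned long nsyms)
{
  unsigned steps = 0;

  while (nsyms > 0)
    {
      steps++;
      nsyms >>= 1;
    }
  return steps;
}
```

With 54980 symbols a bucket holds about 27 entries on average, while a
binary search over the sorted array needs at most 16 comparisons. This is
only a rough comparison, as a lookup does not always walk a whole chain.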
We could instead, for example, delay searching and loading the symbols of
any objfiles we do not need, which would give another major GDB startup
time reduction, as .gdb_index did. This is why I did not intend to spend
any time on debatable CPU optimizations; IMO they do not make sense given
the current state of GDB performance.


Thanks,
Jan