From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 31D5A382EBF2 for ; Thu, 30 Jul 2020 16:26:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 31D5A382EBF2 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 7B8551E792; Thu, 30 Jul 2020 12:26:25 -0400 (EDT) Subject: Re: [PATCH 4/4] gdb: change regcache list to be a map To: John Baldwin , Pedro Alves , Simon Marchi , gdb-patches@sourceware.org Cc: Laurent References: <20200720204101.2849535-1-simon.marchi@efficios.com> <20200720204101.2849535-5-simon.marchi@efficios.com> <5b521483-2f09-3424-943d-40f83b3080af@palves.net> <0357b2ae-cf42-3743-2c9d-ad116546e747@FreeBSD.org> From: Simon Marchi Message-ID: <149057a0-1a50-a8e5-b788-323d798e57ae@simark.ca> Date: Thu, 30 Jul 2020 12:26:18 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <0357b2ae-cf42-3743-2c9d-ad116546e747@FreeBSD.org> Content-Type: text/plain; charset=utf-8 Content-Language: fr Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2020 16:26:27 -0000 On 2020-07-24 12:59 p.m., John Baldwin wrote: > On 7/23/20 6:53 PM, Pedro Alves wrote: >> On 7/20/20 9:41 PM, Simon Marchi via Gdb-patches wrote: >>> The function regcache_thread_ptid_changed is called when a thread >>> changes ptid. It is implemented efficiently using the map, although >>> that's not very important: it is not called often, mostly when creating >>> an inferior, on some specific platforms. >>> >>> Note: In hash_target_ptid, I am combining hash values from std::hash by >>> summing them. I don't think it's ideal, since std::hash is just the >>> identity function for base types. But I don't know what would be better >>> to reduce the change of collisions. If anybody has a better idea, I'd >>> be interested. >> >> I'd maybe look in some kernel sources, e.g., Linux or BSD, what they >> use as hash function for pids. > > FreeBSD just does a very simple and with a power of 2 based on the > maximum number of processes (set via a boot-time tunable). > > However, it has separate hash tables for pids and thread ids (tids). > It doesn't have a single hash table of threads indexed by (pid, tid) > because the tid namespace is independent and a tid is a fully unique > name to a kernel thread (lwp). > >>> +/* Functor to hash a ptid. */ >>> + >>> +struct hash_ptid >>> +{ >>> + size_t operator() (const ptid_t &ptid) const >>> + { >>> + std::hash long_hash; >>> + >>> + return (long_hash (ptid.pid ()) >>> + + long_hash (ptid.lwp ()) >>> + + long_hash (ptid.tid ())); >>> + } >>> +}; > > I think summing the three components might be the best option. > Presumably the low bits of the hash are what actually get used and > so the goal would be to have more entropy there. This means you > probably would _not_ want to do something like: > > pid << 32 | lwp << 16 | tid > > since many native backends will have tid of all zeroes. It would > also not be ideal for backends that only do processes (so only > pid is non-zero). The sum approach degrades to the right thing > for those targets without needing extra complication or conditionals. Ok, thanks for the tips. So I'll probably keep it like this for now then. I'll still take a look at the kernel implementations like Pedro suggested. Simon