From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 358423850434 for ; Thu, 30 Jul 2020 16:58:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 358423850434 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id AE2971E792; Thu, 30 Jul 2020 12:58:20 -0400 (EDT) Subject: Re: [PATCH 4/4] gdb: change regcache list to be a map To: Pedro Alves , Simon Marchi , gdb-patches@sourceware.org Cc: Laurent References: <20200720204101.2849535-1-simon.marchi@efficios.com> <20200720204101.2849535-5-simon.marchi@efficios.com> <5b521483-2f09-3424-943d-40f83b3080af@palves.net> From: Simon Marchi Message-ID: <82d8b01a-2b37-0b79-f657-1081375270ee@simark.ca> Date: Thu, 30 Jul 2020 12:58:13 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <5b521483-2f09-3424-943d-40f83b3080af@palves.net> Content-Type: text/plain; charset=utf-8 Content-Language: fr Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jul 2020 16:58:23 -0000 On 2020-07-23 9:53 p.m., Pedro Alves wrote: > On 7/20/20 9:41 PM, Simon Marchi via Gdb-patches wrote: >> One regcache object is created for each stopped thread and is stored in >> the regcache::regcaches linked list. Looking up a regcache for a given >> thread is therefore in O(number of threads). Stopping all threads then >> becomes O((number of threads) ^ 2). It becomes noticeable when >> debugging thousands of threads, which is typical with GPU targets. This >> patch replaces the linked list with an std::unordered_multimap, indexed >> by (target, ptid). >> >> I originally designed it using an std::unordered_map with (target, ptid, >> arch) as the key, because that's how lookups are done (in >> get_thread_arch_aspace_regcache). However, the registers_changed_ptid >> function, also somewhat on the hot path (it is used when resuming >> threads), needs to delete all regcaches associated to a given (target, >> ptid) tuple. Using (target, ptid) as a key allows to do this more >> efficiently (see exception below). If the key of the map was (target, >> ptid, arch), we'd have to walk all items of the map. >> >> The lookup (in get_thread_arch_aspace_regcache), walks over all existing >> regcaches belonging to this (target, ptid), looking to find the one with >> the right arch. This is ok, as there will be very few regcaches for a >> given key (typically one). Lookups become faster when the number of >> threads grows, compared to the linked list. With a small number of >> threads, it will probably be a bit slower to do a map lookup than to >> walk a few linked list nodes, but I don't think it will be noticeable in >> practice. >> >> The function registers_changed_ptid deletes all regcaches related to a >> given (target, ptid). We must now handle the different cases >> separately: >> >> - NULL target and minus_one_ptid: we delete all the entries >> - NULL target and non-minus_one_ptid: invalid (checked by assert) >> - non-NULL target and non-minus_one_ptid: we delete all the entries >> associated to that tuple, this is done efficiently >> - a non-NULL target and minus_one_ptid: we delete all the entries >> associated to that target, whatever the ptid. This is the slightly >> annoying case, as we can't easily look up all items having this target >> in their key. I implemented it by walking the list, which is not >> ideal. > > Given the last case, did you consider using a two-level map instead? > > using ptid_regcache_map > = std::unordered_multimap; > > using target_ptid_regcache_map > = std::unordered_map; > > static target_ptid_regcache_map regcaches; > > The result for registers_changed_ptid would be: > > - NULL target and minus_one_ptid: we delete all the REGCACHES entries, > > - NULL target and non-minus_one_ptid: invalid (checked by assert) > > - non-NULL target and non-minus_one_ptid: we lookup the target > in REGCACHES, and then delete all the entries in that map > associated to that ptid. > > - a non-NULL target and minus_one_ptid: we delete the map entry > in REGCACHES associated to that target. I thought about it early on, but dismissed it thinking it would be too heavy for nothing, having a map of maps. But I didn't think it would help that last case, that might be a good reason to go with that. > Looking up a regcache does two map lookups instead of one, but > that is 2x amortized O(1), so should be a constant hit. > > I gave this a try, to see how feasible it would be. See attached > patches on top of yours (the first two address comments I'll make below). > I've also put these in the users/palves/regcache-map branch on > sourceware.org. Thanks, I'll run with that. > I did not run performance tests, though. It may be that in > registers_changed_ptid, it would be more efficient to > not clear the first level map entry for a given target (i.e., > destroy the second level map object completely), but instead > clear the second level map entry. I.e., avoid destroying the > unordered_multimap for a given target, only to recreate it soon > enough. But I kept it simple in case that isn't noticeable. Will try. > A similar two-level data structure would be to put an instance of: > > using ptid_regcache_map > = std::unordered_multimap; > > in process_stratum_target directly. > > IOW, target-connection.c:process_targets has a list of open targets, > which could be used to walk over all targets in the > "NULL target and minus_one_ptid" scenario. Ok, that would be an extra step, I suggest we keep it in mind for a subsequent refactor. >> The function regcache_thread_ptid_changed is called when a thread >> changes ptid. It is implemented efficiently using the map, although >> that's not very important: it is not called often, mostly when creating >> an inferior, on some specific platforms. >> >> Note: In hash_target_ptid, I am combining hash values from std::hash by >> summing them. I don't think it's ideal, since std::hash is just the >> identity function for base types. But I don't know what would be better >> to reduce the change of collisions. If anybody has a better idea, I'd >> be interested. > > I'd maybe look in some kernel sources, e.g., Linux or BSD, what they > use as hash function for pids. Ok. >> +/* Hash function for target_ptid. */ >> + >> +struct hash_target_ptid >> +{ >> + size_t operator() (const target_ptid &val) const >> + { >> + std::hash h_long; >> + hash_ptid h_ptid; >> + >> + return h_long ((long) val.target) + h_ptid (val.ptid); > > Note that on 64-bit Windows, long is 32-bit, so pointers don't > fit. It just means you're only hashing half the bits. > I wonder whether using libiberty's htab_hash_pointer would be > a good idea here. Huh, I did that because there's no std::hash overload for uintptr_t. But I just saw there is: template< class T > struct hash; which seems to be for hashing pointers. It is implemented as: return reinterpret_cast(__p); So I'll just use that. Otherwise, htab_hash_pointer would be good too. >> @@ -417,10 +458,25 @@ static void >> regcache_thread_ptid_changed (process_stratum_target *target, >> ptid_t old_ptid, ptid_t new_ptid) >> { >> - for (auto ®cache : regcaches) >> + /* Find all the regcaches to updates. */ > > typo: to update. Fixed. >> + std::vector regcaches_to_update; >> + target_ptid old_key (target, old_ptid); >> + auto range = regcaches.equal_range (old_key); >> + for (auto it = range.first; it != range.second; it++) >> + regcaches_to_update.push_back (it->second); >> + >> + /* Remove all entries associated to OLD_KEY. We could have erased the items >> + in the previous `for`, but it is only safe to erase items while iterating >> + starting with C++14. */ >> + regcaches.erase (old_key); > > I don't think that's really true in practice. The idiom used pretty > much by everyone is: > > for (auto it = range.first; it != range.second;) > { > regcaches_to_update.push_back (it->second); > it = regcaches.erase (it); > } > > Note that even the resolution of the C++ issue at: > > https://wg21.cmeerw.net/lwg/issue2356 > > says: > > "not that any actual implementation does that, anyway" > > There's no need to be super pedantic wrt to an issue > that no compiler is going to miscompile. Ok, didn't know about that. > See here as well: > https://stackoverflow.com/questions/25047241/c11-is-it-safe-to-remove-individual-elements-from-stdunordered-map-while-it > > And then, given we know that there are no entries for > new_ptid in the map, I think we could instead do it all in > one iteration, like: > > target_ptid old_key (target, old_ptid); > target_ptid new_key (target, new_ptid); > > auto range = regcaches.equal_range (old_key); > for (auto it = range.first; it != range.second;) > { > regcache *rc = it->second; > rc->set_ptid (new_ptid); > > /* Remove old before inserting new, to avoid rehashing, which > would invalidate iterators. */ > it = regcaches.erase (it); > regcaches.insert (std::make_pair (new_key, rc)); > } Thanks, I used this. >> + >> + /* Update the regcaches' ptid, insert them back in the map with an updated >> + key. */ >> + target_ptid new_key (target, new_ptid); >> + for (regcache *rc : regcaches_to_update) >> { >> - if (regcache->ptid () == old_ptid && regcache->target () == target) >> - regcache->set_ptid (new_ptid); >> + rc->set_ptid (new_ptid); >> + regcaches.insert (std::make_pair (new_key, rc)); >> } >> } >> >> @@ -438,19 +494,53 @@ regcache_thread_ptid_changed (process_stratum_target *target, >> void >> registers_changed_ptid (process_stratum_target *target, ptid_t ptid) >> { >> - for (auto oit = regcaches.before_begin (), it = std::next (oit); >> - it != regcaches.end (); ) >> + if (target == nullptr) >> { >> - struct regcache *regcache = *it; >> - if ((target == nullptr || regcache->target () == target) >> - && regcache->ptid ().matches (ptid)) >> - { >> - delete regcache; >> - it = regcaches.erase_after (oit); >> - } >> - else >> - oit = it++; >> + /* Since there can be ptid clashes between targets, it's not valid to >> + pass a ptid without saying to which target it belongs. */ >> + gdb_assert (ptid == minus_one_ptid); >> + >> + /* Delete all the regcaches. */ >> + for (auto pair : regcaches) >> + delete pair.second; > > We could store a std::unique_ptr in the map instead > to avoid all these manual deletes. Indeed, I'll do that. >> + >> + regcaches.clear (); >> } >> + else if (ptid != minus_one_ptid) >> + { >> + /* Non-NULL target and non-minus_one_ptid, delete all regcaches belonging >> + to this (TARGET, PTID). */ >> + target_ptid key (target, ptid); >> + auto range = regcaches.equal_range (key); >> + for (auto it = range.first; it != range.second; it++) >> + delete it->second; > > Like this one. Done. >> + >> + regcaches.erase (key); > > Here as well could do: > > for (auto it = range.first; it != range.second;) > { > delete it->second; > it = regcaches.erase (it); > } With the unique_ptr, it's now just: regcaches.erase (target_ptid (target, ptid)); >> + } >> + else >> + { >> + /* Non-NULL target and minus_one_ptid, delete all regcaches associated >> + to this target. >> + >> + We unfortunately don't have an efficient way to do this. Fall back >> + to iterating all items to find all those belonging to TARGET. >> + >> + Note that in C++11, it's not safe to erase map entries while >> + iterating, so we keep track of them and delete them at the end. */ > > Here as well. It's down to: for (auto it = regcaches.begin (); it != regcaches.end ();) { if (it->second->target () == target) it = regcaches.erase (it); else it++; } Thanks for the comments. I'll try the multi-level map thing and come back with another version. Simon