Improve resolver speed again
What does this PR try to resolve?
Follow-up of #14663 which has been splitted.
How should we test and review this PR?
Commit 1 enables the clippy::derive_ord_xor_partial_ord lint to be sure that Ord and PartialOrd are implemented correctly everywhere in the repository.
Commit 2 moves the ActivationKey type in core.
Commit 3 adds interning for the resolver activation keys so they can be used with nohash-hasher, decreasing resolving time by another 30%. This is possible since the interning is implemented with a static HashSet, so each interned element can be reduced to its address.
Performance comparison with the test added in the previous PR by setting LAST_CRATE_VERSION_COUNT = 100:
| commit | duration |
|---|---|
| master | 16s |
| with rustc-hash | 6.7s |
| with nohash | 4.6s |
Firefox profiles, can be read with https://profiler.firefox.com: perf.tar.gz
r? Eh2406
I benchmarked against an earlier version of this code. Running across all of crates.io (including solana) went from 235.88min to 218.03min that is a solid 7.5% improvement. solana-transaction-status v1.3.10 improve the most going from 49.2s to 35.1s a 14s improvement! On the other hand solana-core v1.0.5 went from 99s to 114s that is 14s worse. Here is the disaggregated data for crates that took >=0.02 to build.
sort_join.csv
I tested solana-core v1.0.5 on my machine, it resolved in exactly the same time before and after this PR.
Good to know! I will have to rerun when I get a chance. The most likely explanation is a random hiccup on my computer, that's pretty likely when carefully measuring time of a process that runs for an hour.
I tested
solana-core v1.0.5on my machine, it resolved in exactly the same time before and after this PR.
I ran solana-core v1.0.5 on another machine where I can do a perf, and I observe a 30% improvement with this PR going from 6m25 to 4m40. I may have some throttling problems on the first machine because the timings are not stable.
For solana-core v1.0.5, cargo spent more than 30s dropping the ResolverContext:
flamegraph.svg.tar.gz
A more targeted run, that only looked at crates containing the word solana went from 180.61min from master to 162.47min on this PR. With this run solana-core v1.0.5 went from 139s to 112s! So that one may have been a fluke on my part.
out_join.csv
I wanted to spend some time teasing apart exactly what part of the old hash was slow. The old hash but modified to hash the pointer of the InternedString instead of its contents got me 25% of the savings. It would be hard to believe that the SemverCompatibility is slow, it's basically two numbers. Hashing for SourceId is surprisingly complicated. (note to self, using SourceId::full_hash is fast but gets the wrong answer.) But I kept getting distracted. Not hashing the SourceId (with the InternedString fix) got about 80% of the savings. If keys only differ by their SourceId a hash map becomes O(n), but n is the number of packages with the same name and major version that only differ by source. Makes me wonder if hashing only by the pointer of the InternedString would also be effective.
With the holiday I will probably not work on this till Monday. Sorry for the delay.
I think we should also define a custom Hash for Dependency, as it is used in the conflict cache and hashing it amounts to 5-10% of the total resolving time.
So I was able to get about 50% of benefit was much less churn. I change the order so that we only do the Eq test for SourceId after the others are determined to be equal, and I overrode Hash to look at the InternedString address and not look at SourceId.
#[derive(Clone, PartialEq, Eq, Debug)]
pub struct ActivationsKey(InternedString, SemverCompatibility, SourceId);
impl std::hash::Hash for ActivationsKey {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
std::ptr::NonNull::from(self.0.as_str()).hash(state);
self.1.hash(state);
// self.2.hash(state); // Packages that only differ by SourceId are rare enough to not be worth hashing
}
}
So the other half the performance is because despite being interned SourceId::Eq is not a ptr::eq. Which in turn is because SourceIds concept of identity is a complete mess. The code relies on having to IDs that are Eq but do not have the same values for their fields.
So that leads to two questions about maintainability. Pinging @epage for strong and useful opinions about cleaning up our code base:
- Any ideas about how to clean up the definition of "what is a
SourceId" so that the structs we've defined actually match the semantics were using? The only thing I can think of is splitting it up into aCanonicalSourcethat just has the kind and canonical URL, and aSourceInfothat has the other fields. Thenstruct SourceId {canonical: &'static CanonicalSource, info: &'static SourceInfo}. Then EQ andfast_hash(a new method for us to use inActivationsKey) can look at the ptr to canonical. But I'm open to other suggestions. - Is the remaining performance benefit worth merging this PR as is, specifically involving adding
core::activation_key::ActivationKeywith the bike shedding and documentation work? I'm leaning toward no for now. @x-hgg-x has come across a number of good leads for making resolution faster and we should reevaluate this after we followed up on those. Ed had some initial feedback in https://github.com/rust-lang/cargo/pull/14663, so I wanted his opinion here.
:umbrella: The latest upstream changes (presumably #14692) made this pull request unmergeable. Please resolve the merge conflicts.
If we can get the performance benefit without storing the activation key inside of a PackageId, we shouldn't. The activation key is specific to the resolver and should be scoped to it.
If there are cheap things we can do, like re-order fields, that seems like an easy win.
Cleaning up SourceId would be nice and if it means we can then hash / eq on a cheaper value, then great!
:umbrella: The latest upstream changes (possibly 081d7bac633bbc72883fb74fb4993bb1b1a2df4a) made this pull request unmergeable. Please resolve the merge conflicts.
With #14800 and #14915 being merged and this being stale, I'm going to go ahead and close. If someone wants to resume similar work, I recommend checking out https://github.com/rust-lang/cargo/pull/14665#issuecomment-2465717311 and #15649