
Use lockfile to prefill resolver index

Open ibraheemdev opened this issue 1 year ago • 4 comments

Summary

Use the lockfile to prefill the InMemoryIndex used by the resolver. This enables us to resolve completely from the lockfile without making any network requests/builds if the requirements are unchanged. It also means that if new requirements are added we can still avoid most I/O during resolution, partially addressing https://github.com/astral-sh/uv/issues/3925.

The main limitation of this PR is that resolution from the lockfile can fail if new versions are requested that are not present in the lockfile, in which case we have to perform a fresh resolution. Fixing this would likely require lazy version/metadata requests by VersionMap. (This is different from the lazy parsing we already do; the list of versions in a VersionMap is currently immutable.)
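As a rough illustration of the idea described above, the sketch below prefills a toy index from lockfile entries, answers requests from it without any I/O, and signals that a fresh resolution is needed when a requested version is missing. All types and functions here (LockedPackage, the simplified InMemoryIndex, resolve_offline, Outcome) are hypothetical stand-ins written for this example, not uv's actual implementation, and real resolution involves version ranges rather than exact pins.

```rust
use std::collections::HashMap;

/// One locked package entry; a hypothetical stand-in for a lockfile record.
#[derive(Debug, Clone, PartialEq)]
struct LockedPackage {
    name: String,
    version: String,
    dependencies: Vec<String>,
}

/// Simplified in-memory index: package name -> version -> dependency names.
/// This only borrows the name of uv's type; the layout is an assumption.
#[derive(Debug, Default)]
struct InMemoryIndex {
    packages: HashMap<String, HashMap<String, Vec<String>>>,
}

impl InMemoryIndex {
    /// Prefill the index from lockfile entries so lookups need no network I/O.
    fn prefill(locked: &[LockedPackage]) -> Self {
        let mut index = Self::default();
        for pkg in locked {
            index
                .packages
                .entry(pkg.name.clone())
                .or_default()
                .insert(pkg.version.clone(), pkg.dependencies.clone());
        }
        index
    }

    /// Look up the dependencies recorded for an exact name/version pair.
    fn dependencies(&self, name: &str, version: &str) -> Option<&Vec<String>> {
        self.packages.get(name)?.get(version)
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Every request was answered from the prefilled index.
    FromLockfile(Vec<String>),
    /// At least one request was not in the lockfile; fall back to a clean resolve.
    NeedsFreshResolution,
}

/// Try to satisfy every (name, version) request from the prefilled index alone.
fn resolve_offline(index: &InMemoryIndex, requests: &[(&str, &str)]) -> Outcome {
    let mut resolved = Vec::new();
    for (name, version) in requests {
        match index.dependencies(name, version) {
            Some(_) => resolved.push(format!("{name}=={version}")),
            None => return Outcome::NeedsFreshResolution,
        }
    }
    Outcome::FromLockfile(resolved)
}
```

In this toy model, requesting a version that the lockfile never recorded yields NeedsFreshResolution, which corresponds to the limitation described above.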

Resolves https://github.com/astral-sh/uv/issues/3892.

Test Plan

Added a deterministic! macro that ensures that a resolve from the lockfile and a clean resolve result in the same lockfile output for all our current tests.
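The shape of such a check could look roughly like the following. This is a minimal sketch, not the macro added in the PR: the macro body, the toy_lock function, and its Option<&str> "existing lockfile" parameter are all assumptions made up for illustration.

```rust
/// Hypothetical check: lock once from scratch, lock again seeded with the
/// first result, and require both outputs to be identical.
macro_rules! deterministic {
    ($lock:expr) => {{
        let lock = $lock;
        // Clean resolve: no existing lockfile.
        let clean: String = lock(None);
        // Resolve again, this time starting from the lockfile we just produced.
        let relocked: String = lock(Some(clean.as_str()));
        assert_eq!(clean, relocked, "locking from the lockfile changed the output");
        clean
    }};
}

/// Toy "lock" function used only to exercise the macro: it reuses an existing
/// lockfile when given one, and otherwise emits a sorted list of pins.
fn toy_lock(existing: Option<&str>) -> String {
    match existing {
        Some(lockfile) => lockfile.to_string(),
        None => {
            let mut pins = vec!["idna==3.7", "anyio==4.4.0"];
            pins.sort();
            pins.join("\n")
        }
    }
}
```

Calling deterministic!(toy_lock) returns the lockfile text if both runs agree and panics otherwise.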

ibraheemdev avatar Jun 24 '24 22:06 ibraheemdev

One thing I'm wondering is whether any of the metadata stored in the lockfile is potentially mutable, meaning resolution would fail in the presence of a lockfile. For example, we probably can't rely on the locked metadata of path dependencies, and maybe also direct URLs?

ibraheemdev avatar Jun 24 '24 22:06 ibraheemdev

One thing I'm wondering is whether any of the metadata stored in the lockfile is potentially mutable, meaning resolution would fail in the presence of a lockfile. For example, we probably can't rely on the locked metadata of path dependencies, and maybe also direct URLs?

Yeah, this is the right instinct. We need to check if they're up-to-date... Take a look at RequirementSatisfaction, which performs these checks when (e.g.) determining whether we can use a value from the cache.

In general, we assume that HTTP direct URL requirements are immutable and thus require --upgrade or --refresh or similar. For file-based direct URL requirements, the rules are a little more nuanced.
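One way to picture the rule of thumb above is a small decision function. This is a sketch under stated assumptions: the Source and Freshness enums and can_reuse_locked_metadata are hypothetical names invented here, not uv's RequirementSatisfaction logic. Treating registry packages as reusable is an assumption of this example, and local paths are handled conservatively because the thread only says their rules are more nuanced.

```rust
/// Hypothetical classification of where a locked package came from.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Source {
    /// A package from an index such as PyPI (assumed immutable in this sketch).
    Registry,
    /// A direct URL fetched over HTTP, assumed immutable per the comment above.
    HttpUrl,
    /// A path or file-based dependency, whose contents may change on disk.
    LocalPath,
}

/// Whether the user asked to bypass locked data (for example via --upgrade or --refresh).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Freshness {
    UseLocked,
    Refresh,
}

/// Decide whether metadata stored in the lockfile may be reused as-is.
fn can_reuse_locked_metadata(source: Source, freshness: Freshness) -> bool {
    match (source, freshness) {
        // An explicit refresh request always invalidates locked metadata.
        (_, Freshness::Refresh) => false,
        // Treated as immutable unless the user asks otherwise.
        (Source::Registry, _) | (Source::HttpUrl, _) => true,
        // Conservative choice for this sketch: always re-check local sources.
        (Source::LocalPath, _) => false,
    }
}
```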

charliermarsh avatar Jun 25 '24 22:06 charliermarsh

On the airflow benchmark, which is a very large lockfile, this speeds up uv lock by 2x:

$ hyperfine "../uv/target/profiling/baseline lock" "../uv/target/profiling/uv lock"
Benchmark 1: ../uv/target/profiling/baseline lock
  Time (mean ± σ):     183.6 ms ±   4.6 ms    [User: 199.8 ms, System: 124.6 ms]
  Range (min … max):   176.1 ms … 190.4 ms    15 runs
 
Benchmark 2: ../uv/target/profiling/uv lock
  Time (mean ± σ):      93.6 ms ±   2.0 ms    [User: 78.9 ms, System: 16.2 ms]
  Range (min … max):    90.5 ms …  99.1 ms    31 runs
 
Summary
  ../uv/target/profiling/uv lock ran
    1.96 ± 0.06 times faster than ../uv/target/profiling/baseline lock

From a user perspective, 100ms seems relatively instant while I can feel some lag at 200ms, so this is a noticeable improvement. There is probably still room for some improvement here.

The big win here is that with a cold cache, uv lock can still run in 100ms due to the presence of a lockfile, where it would previously take over 30 seconds.

ibraheemdev avatar Jun 26 '24 17:06 ibraheemdev

The fact that you can resolve so quickly with a cold cache is crucial.

charliermarsh avatar Jun 28 '24 01:06 charliermarsh

What if we intentionally run the resolver in Offline mode for this?

charliermarsh avatar Jul 07 '24 19:07 charliermarsh

Okay, I think this is ready now.

  • The upgrade strategy is respected. Any packages included in the upgrade strategy (all of them if --upgrade) are excluded from the prefilled index. See the passing lock_preference test.
  • Missing extras/development groups are detected and require a fresh resolution. This unfortunately means that projects with extras that don't have optional dependencies, such as requests[security], always require a fresh resolve. The solution to this is either to add extra information to the lockfile or eventually add per-dependency fallbacks in the resolver to fetch up-to-date metadata. See the passing lock_new_extras test.
  • The duplicated "resolved in Xms" message is now avoided. We may still print checkout/build progress during the initial resolve from the lockfile even if it fails, but the progress is seamless because the new progress bar overwrites the previous one in place. You can see how this looks locally by adding a cold git dependency and running uv lock with a new extra added to an existing dependency.
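The first two points in the list above can be sketched as follows. This is only an illustration: Upgrade, LockedEntry, packages_to_prefill, and has_missing_extra are hypothetical names chosen for this example and do not reflect uv's internal types.

```rust
use std::collections::BTreeSet;

/// Hypothetical upgrade strategy: nothing, everything (as with --upgrade),
/// or a named set of packages.
enum Upgrade {
    None,
    All,
    Packages(BTreeSet<String>),
}

impl Upgrade {
    /// Whether the given package should be re-resolved rather than reused.
    fn includes(&self, name: &str) -> bool {
        match self {
            Upgrade::None => false,
            Upgrade::All => true,
            Upgrade::Packages(names) => names.contains(name),
        }
    }
}

/// A locked package together with the extras the lockfile recorded for it.
struct LockedEntry {
    name: String,
    extras: BTreeSet<String>,
}

/// Names of locked packages that may be used to prefill the index:
/// anything covered by the upgrade strategy is left out.
fn packages_to_prefill<'a>(locked: &'a [LockedEntry], upgrade: &Upgrade) -> Vec<&'a str> {
    locked
        .iter()
        .map(|pkg| pkg.name.as_str())
        .filter(|name| !upgrade.includes(name))
        .collect()
}

/// True when a requested extra (or the package itself) is absent from the
/// lockfile, meaning a fresh resolution is required.
fn has_missing_extra(locked: &[LockedEntry], name: &str, requested: &[&str]) -> bool {
    match locked.iter().find(|pkg| pkg.name == name) {
        Some(pkg) => requested.iter().any(|extra| !pkg.extras.contains(*extra)),
        None => true,
    }
}
```

In this model, an extra that never appears in the lockfile, such as the requests[security] case mentioned above, always reports as missing and therefore always forces a fresh resolve.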

ibraheemdev avatar Jul 11 '24 20:07 ibraheemdev

One remaining issue is that we don't check for yanked releases (see https://github.com/astral-sh/uv/issues/3892#issuecomment-2212528949). I'm not sure how we should tackle this. Does PyPI have a fast path to check whether a specific version was yanked? Even if it does, the check will likely still add significant overhead.

ibraheemdev avatar Jul 11 '24 20:07 ibraheemdev

Missing extras/development groups are detected and require a fresh resolution. This unfortunately means that projects with extras that don't have optional dependencies, such as requests[security], always require a fresh resolve. The solution to this is either to add extra information to the lockfile or eventually add per-dependency fallbacks in the resolver to fetch up-to-date metadata. See the passing lock_new_extras test.

Can you explain this piece in a little more detail? What are "projects with extras that don't have optional dependencies"? Does security not exist as a valid extra for requests?

charliermarsh avatar Jul 12 '24 01:07 charliermarsh

This is looking good.

charliermarsh avatar Jul 12 '24 01:07 charliermarsh

Can you explain this piece in a little more detail? What are "projects with extras that don't have optional dependencies"? Does security not exist as a valid extra for requests?

Ah I think I was confused about this. security is an extra provided by requests, but it doesn't enable any optional dependencies so it doesn't show up in the lockfile. Turns out it is deprecated so I don't think this is a problem. I was thinking that some packages might provide extras that enable features dynamically without introducing optional dependencies (like cargo features) but I don't think that's expected (or even possible)?

ibraheemdev avatar Jul 12 '24 01:07 ibraheemdev

Ohhh ok, I see. Yes, this makes sense.

charliermarsh avatar Jul 12 '24 01:07 charliermarsh

I was thinking that some packages might provide extras that enable features dynamically without introducing optional dependencies (like cargo features) but I don't think that's expected (or even possible)?

Yeah I believe this is not possible. AIUI, extras in Python disappear after dependencies are resolved. They aren't themselves directly visible from the package that defines them, unlike Cargo features, which can be used in arbitrary ways.

BurntSushi avatar Jul 12 '24 12:07 BurntSushi