Speed up pyc compilation

Open hauntsaninja opened this issue 4 months ago • 4 comments

Thanks for implementing https://github.com/astral-sh/uv/issues/1788 , it's been great!

Since pyc compilation is currently implemented as a pass over the entire env, it can be quite costly:

λ uv pip install pypyp --compile 
Resolved 1 package in 123ms
Downloaded 1 package in 68ms
Installed 1 package in 28ms
Bytecode compiled 58064 files in 53.81s
 + pypyp==1.2.0

In this venv, it's about 1000x longer than it takes to install without pyc compilation.

Note this very slow time is on macOS; it's much better on the Linux machines I have access to (more like 10s). See https://github.com/astral-sh/uv/issues/2326#issuecomment-1987366097 for my laptop specs.

To be clear, this is not a particularly pressing issue. The need to bytecode compile deltas is much lower than when building things from scratch. Nevertheless, ideally uv should be significantly faster than pip in all usage scenarios.

With that in mind, some possible suggestions:

  1. It looks like uv currently forces recompilation: https://github.com/astral-sh/uv/blob/1181aa9be40b7f99334f7efd15d5102653d8b38b/crates/uv-installer/src/pip_compileall.py#L50 I'm not sure why. Maybe it has something to do with checked-hash validation that compileall doesn't handle correctly? The script predates #2086, so maybe there's something else going on.

  2. We could only bytecode compile the newly installed packages

  3. If uv no longer forces recompilation, you could move the invalidation / mtime logic into Rust, though I'm not sure how much that would help. But you could switch to whatever syscall os.scandir uses (I think readdir?), which means you won't have to stat each file individually like you would when shelling out to compileall: https://github.com/python/cpython/blob/3726cb0f146cb229a5e9db8d41c713b023dcd474/Lib/compileall.py#L229-L236

  4. Something something copy on write for pyc
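
For context on suggestion 3, the staleness check that compileall runs when force is off boils down to comparing the first 12 bytes of the pyc against the source mtime. A rough Python sketch of that check, which a Rust port would presumably need to mirror, is below. `pyc_is_fresh` is a made-up helper name, not anything in uv, and the sketch only covers timestamp-based pycs: hash-based ones set a nonzero flags field, so they would always look stale here.

```python
import importlib.util
import os
import struct


def pyc_is_fresh(source_path: str) -> bool:
    """Return True if a timestamp-based .pyc for source_path looks up to date.

    Hypothetical helper for illustration; it mirrors the header comparison
    compileall performs when force=False.
    """
    pyc_path = importlib.util.cache_from_source(source_path)
    try:
        mtime = int(os.stat(source_path).st_mtime)
        with open(pyc_path, "rb") as f:
            header = f.read(12)
    except OSError:
        # Missing source or missing pyc: treat as stale.
        return False
    # Header layout: 4-byte magic, 4-byte flags (0 for timestamp pycs),
    # 4-byte source mtime truncated to 32 bits.
    expected = struct.pack(
        "<4sLL", importlib.util.MAGIC_NUMBER, 0, mtime & 0xFFFF_FFFF
    )
    return header == expected
```

Note that this still needs one stat per source file for the mtime, plus a short read of each pyc.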

hauntsaninja avatar Mar 24 '24 03:03 hauntsaninja

It looks like pip forces recompilation too, and this goes back to pip v1.5 https://github.com/pypa/pip/commit/7ec49dc2fb0a9dbf522e79d87e2bc13d29a2556e#diff-9b28941fa609b0277432cba3444d4595da4ef80ca7c04e9024c75bb0a15ae830R163

Couldn't discern a reason for that, but since it's only doing it for the newly installed package I guess it doesn't hurt much

hauntsaninja avatar Mar 24 '24 04:03 hauntsaninja

Makes sense! I assume the highest-impact thing here would be to only recompile newly-installed packages.
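
As a minimal sketch of what that could look like on the Python side, assuming the installer hands over the relative paths it just wrote (for example, the entries of each wheel's RECORD): `compile_new_files`, `site_packages`, and `record_paths` are hypothetical names for illustration, not uv's actual interface.

```python
import compileall
import os
import py_compile


def compile_new_files(site_packages: str, record_paths: list[str]) -> list[str]:
    """Compile only the .py files named in record_paths.

    record_paths holds paths relative to site_packages. Returns the
    absolute paths that compiled successfully.
    """
    compiled = []
    for rel in record_paths:
        if not rel.endswith(".py"):
            continue  # skip data files, metadata, and so on
        path = os.path.join(site_packages, rel)
        ok = compileall.compile_file(
            path,
            quiet=2,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        if ok:
            compiled.append(path)
    return compiled
```

The rest of the environment is never touched, so the cost scales with the size of the install rather than the size of the venv.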

charliermarsh avatar Mar 24 '24 22:03 charliermarsh

Yeah, probably. I just tested my first suggestion since it's easy:

diff --git a/crates/uv-installer/src/pip_compileall.py b/crates/uv-installer/src/pip_compileall.py
index 47e0242f..20bb3c6e 100644
--- a/crates/uv-installer/src/pip_compileall.py
+++ b/crates/uv-installer/src/pip_compileall.py
@@ -47,7 +47,7 @@ with warnings.catch_warnings():
         # We'd like to show those errors, but given that pip thinks that's totally fine,
         # we can't really change that.
         success = compileall.compile_file(
-            path, invalidation_mode=invalidation_mode, force=True, quiet=2
+            path, invalidation_mode=invalidation_mode, force=False, quiet=2
         )
         # We're ready for the next file.
         print(path)

And it speeds things up 2-3x, in the above case from 55s to 20s. It would probably stack well with the third suggestion too, in case we like compiling the whole venv (if it's fast, there's no reason not to).
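
A small way to see what the flag changes, as a sketch rather than a benchmark: backdate an existing pyc, call `compileall.compile_file`, and check whether the pyc was rewritten. `recompiled` is a hypothetical helper name, and the timestamp invalidation mode is pinned explicitly as an assumption so the result does not depend on the environment.

```python
import compileall
import importlib.util
import os
import py_compile

BACKDATED = 1_000_000_000  # arbitrary old mtime used as a marker


def recompiled(source_path: str, force: bool) -> bool:
    """Return True if compile_file rewrote the existing .pyc for source_path."""
    pyc_path = importlib.util.cache_from_source(source_path)
    # Backdate the pyc so that any rewrite shows up as a changed mtime.
    os.utime(pyc_path, (BACKDATED, BACKDATED))
    compileall.compile_file(
        source_path,
        force=force,
        quiet=2,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    return os.stat(pyc_path).st_mtime != BACKDATED
```

With `force=False` an up-to-date pyc is left alone; with `force=True` it is rewritten every time, which is where the extra time goes on a full-venv pass.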

hauntsaninja avatar Mar 24 '24 23:03 hauntsaninja

I'm cool with shipping that.

charliermarsh avatar Mar 24 '24 23:03 charliermarsh