[Ray component: Core] Runtime Envs should support package installation with uv

Open colinjc opened this issue 1 year ago • 2 comments

Description

uv is vastly faster at installing Python packages than Conda and pip, so Ray should allow using it when defining runtime envs.

uv has two interfaces, which would be nice to support separately:

The uv pip interface is nearly drop-in compatible with pip, with the exception of some outlier packages like jax[tpu], and would be nice to have as an option.

The uv sync interface is nicer, but would only really work by passing a pyproject.toml in as the argument, much like the runtime env's conda argument accepts an environment yaml path.

Use case

Runtime env creation is currently quite slow. Having uv support would really improve the UX of iterating on changes to a runtime env.

colinjc avatar Sep 25 '24 18:09 colinjc

We will pick it up if there is enough demand from OSS users.

jjyao avatar Oct 07 '24 21:10 jjyao

Sounds like this would be really really great to have!

uv installations are 10-100x faster in my experience (especially from cache), so this should really improve UX for heavy runtime environments like torch.

danielgafni avatar Oct 18 '24 22:10 danielgafni

I agree that this would be great. We have moved from poetry to uv recently and the difference is staggering.

dbuades avatar Oct 24 '24 17:10 dbuades

cc @dentiny

jjyao avatar Oct 29 '24 21:10 jjyao

We've seen a huge increase in speed with uv and would love to see it as an option in Ray

chainlink avatar Oct 30 '24 17:10 chainlink

Hi all, does this interface look reasonable to you?

runtime_env = {"pip": {"uv_version": "==0.1.1", "packages":["tensorflow", "requests"], "pip_check": False}}

Note the uv_version field in pip. If it's set, we use uv instead of pip. uv_version and pip_version can not both exist.
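A minimal sketch of the selection rule described above, assuming the proposed field names. The helper name select_installer is hypothetical and is not part of Ray:

```python
def select_installer(pip_config: dict) -> str:
    """Return "uv" or "pip" for a proposed {"pip": {...}} runtime env entry.

    Hypothetical helper: it only illustrates the rule that uv_version
    and pip_version cannot both be present.
    """
    has_uv = "uv_version" in pip_config
    has_pip = "pip_version" in pip_config
    if has_uv and has_pip:
        raise ValueError("uv_version and pip_version cannot both be set")
    # uv is chosen only when uv_version is given explicitly.
    return "uv" if has_uv else "pip"


runtime_env = {
    "pip": {
        "uv_version": "==0.1.1",
        "packages": ["tensorflow", "requests"],
        "pip_check": False,
    }
}
print(select_installer(runtime_env["pip"]))
```

With this rule, an entry that sets neither version field keeps the existing pip behavior.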

rynewang avatar Oct 30 '24 23:10 rynewang

What about we completely separate pip plugin and uv plugin:

runtime_env = {"uv": {"uv_version": "==0.1.1", "packages":["tensorflow", "requests"], "pip_check": False}}

pip_version is invalid in uv runtime env plugin.

jjyao avatar Oct 30 '24 23:10 jjyao

What about we completely separate pip plugin and uv plugin:

@jjyao I think it makes sense interface-wise, but implementation-wise we would have quite a lot of duplicated code between pip and uv. My PR (https://github.com/ray-project/ray/pull/48457) attempts to make the smallest code change possible.

uv could be treated as a subset of pip; one justification is that the command for uv to install packages is uv pip install.

dentiny avatar Oct 30 '24 23:10 dentiny

We can have an implementation that shares code between these two plugins.

uv doesn't use pip, it just provides a pip like interface. I feel it's cleaner to explicitly separate them and this also gives us flexibility to support uv specific options in the future.
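One way the two plugins could share code while staying separate at the interface level is a common command builder. This is only a sketch: build_install_command is a hypothetical name, and how Ray actually invokes either tool is not stated in this thread.

```python
import sys
from typing import List


def build_install_command(
    tool: str, packages: List[str], python: str = sys.executable
) -> List[str]:
    """Build the argv for installing packages with pip or with uv's pip interface.

    Hypothetical helper shared by a "pip" plugin and a "uv" plugin.
    """
    if tool == "pip":
        return [python, "-m", "pip", "install", *packages]
    if tool == "uv":
        # uv provides a pip-like interface; --python selects the target interpreter.
        return ["uv", "pip", "install", "--python", python, *packages]
    raise ValueError(f"unsupported tool: {tool!r}")


print(build_install_command("uv", ["tensorflow", "requests"]))
```

Each plugin would then only differ in how it validates its own options before calling the shared builder.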

cc @pcmoritz for API discussion.

jjyao avatar Oct 30 '24 23:10 jjyao

I think it would be best if Ray did both:

  1. Use uv instead of pip with the current {"runtime_env": {"pip": ...}} interface. This should be enabled by default. We can make a flag to disable uv in this interface just in case something goes wrong. This interface won't take any uv-specific parameters. This would immediately improve installation times for all Ray users.
  2. Add a new {"runtime_env": {"uv": ...}} interface which would mostly share the same code but also will be able to take some new uv-specific parameters.

danielgafni avatar Oct 31 '24 12:10 danielgafni

Hi, I think we have multiple possible solutions. A brief summary:

  1. uv takes priority over pip, with no change to the user interface.
     1.1 Option 1: detect whether uv exists in the environment and prefer it if it is already installed, otherwise fall back to pip; no user-side change is needed.
     1.2 Option 2: install and use uv in all cases.

     If pip_version is specified, the user clearly intends to use pip, so we respect that and do not use uv in that case.

  2. Use uv as a sub-option of pip, and install uv when it is specified. Within the runtime env we would expect a new field, either (1) uv_version, which pins a specific uv version to use, or (2) use_uv, which means the user has no version requirement, so the latest version is installed.

     The interface would look very similar to what we have for pip today:

     runtime_env = {"pip": {"uv_version": "==0.1.1", "packages":["tensorflow", "requests"], "pip_check": False}}

     uv is only selected when explicitly specified.

  3. Make uv a new field as a counterpart of pip, similar to how we use pip/conda today:

     runtime_env = {"uv": {"uv_version": "==0.1.1", "packages":["tensorflow", "requests"]}}

In the rollout phase, one thing we could do to reduce risk is allow falling back to pip when package installation via uv fails.

dentiny avatar Oct 31 '24 18:10 dentiny

I feel silently changing the implementation behind pip is too risky. It would be better to make uv opt-in, via either a "uv" plugin or a pip.uv_version key.

rynewang avatar Oct 31 '24 19:10 rynewang

And no, we don't want the behavior to depend on "if uv exists on this machine". If we want uv and it does not exist we need to fail instead of falling back to pip.
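A minimal sketch of that opt-in, fail-fast behavior, assuming a top-level "uv" key as in the proposals above. resolve_installer is a hypothetical name, and the which parameter exists only so the lookup can be replaced in tests:

```python
import shutil
from typing import Callable, Optional


def resolve_installer(
    runtime_env: dict,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick the installer from the runtime env alone, never from machine state.

    Hypothetical helper: uv is used only when the user asked for it, and a
    missing uv binary is an error rather than a silent fallback to pip.
    """
    if "uv" in runtime_env:
        if which("uv") is None:
            raise RuntimeError(
                "uv was requested but is not available; refusing to fall back to pip"
            )
        return "uv"
    # No "uv" key: keep the existing pip behavior even if uv happens to be installed.
    return "pip"
```

The same runtime env therefore resolves to the same installer on every machine, or fails loudly.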

rynewang avatar Oct 31 '24 19:10 rynewang

@colinjc @dbuades @chainlink any feedback on the above API proposals?

jjyao avatar Oct 31 '24 20:10 jjyao

I agree that swapping pip for uv blindly is very risky. They're not as 1:1 compatible as you would think; a lot of code would break due to the stricter constraint checking.

In the Ray docs it's mentioned that pip installs into the existing environment, while conda creates an isolated one. Which approach will the uv env implement?

runtime_env = {"uv": {"uv_version": "==0.1.1", "packages":["tensorflow", "requests"]}}

I would make it more explicit that this is the uv pip interface, in case there's interest in adding uv sync support later.

Or support both in the same interface?

{"uv": 
  {"version": "==0.1.1", "pip_packages": ["tensorflow"], "pyproject": "/path/to/pyproject.toml"}
}
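A rough sketch of how such a combined entry could be parsed, assuming the three keys from the example above. UvSpec, parse_uv_spec, and planned_interfaces are hypothetical names, and the sketch only decides which uv interface would apply without running anything:

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_KEYS = {"version", "pip_packages", "pyproject"}


@dataclass
class UvSpec:
    version: Optional[str] = None
    pip_packages: List[str] = field(default_factory=list)
    pyproject: Optional[str] = None


def parse_uv_spec(runtime_env: dict) -> UvSpec:
    """Validate the proposed {"uv": {...}} entry and return a typed spec."""
    raw = runtime_env.get("uv", {})
    unknown = set(raw) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown uv options: {sorted(unknown)}")
    return UvSpec(
        version=raw.get("version"),
        pip_packages=list(raw.get("pip_packages", [])),
        pyproject=raw.get("pyproject"),
    )


def planned_interfaces(spec: UvSpec) -> List[str]:
    """List which uv interfaces the spec would exercise, in order."""
    steps = []
    if spec.pyproject is not None:
        steps.append("uv sync")  # project-based install from pyproject.toml
    if spec.pip_packages:
        steps.append("uv pip install")  # pip-compatible install of listed packages
    return steps


spec = parse_uv_spec(
    {"uv": {"version": "==0.1.1", "pip_packages": ["tensorflow"], "pyproject": "/path/to/pyproject.toml"}}
)
print(planned_interfaces(spec))
```

Naming the keys after the interface they drive keeps room for uv sync support without changing the meaning of the pip-style package list.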

colinjc avatar Oct 31 '24 21:10 colinjc

Support for installing packages with uv sync with the uv.lock file would be great, since we wouldn't need to export a requirements.txt anymore and we would be sure that the exact project dependencies are installed.

dbuades avatar Nov 05 '24 05:11 dbuades

I would make it more explicit that this is the uv pip interface, in case there's interest in adding uv sync support later.

Sounds good.

dentiny avatar Nov 05 '24 08:11 dentiny

Support for installing packages with uv sync with the uv.lock file would be great, since we wouldn't need to export a requirements.txt anymore and we would be sure that the exact project dependencies are installed.

This really sounds like it should be a build step (in CI) instead. Curious to know why would you need to swap lock files at runtime?

danielgafni avatar Nov 05 '24 12:11 danielgafni

This would be a HUGE improvement for us. Runtime env installation is slow for our use cases, and uv is significantly faster than standard pip installs. We would greatly appreciate it if this feature were implemented soon.

sfriedowitz avatar Nov 08 '24 17:11 sfriedowitz

Hi community, I'm working on runtime env setup with uv. As the first milestone, I'm targeting parity with the existing pip features. Let me know if you have any other feature requests specific to uv.

dentiny avatar Nov 10 '24 06:11 dentiny

Support for installing packages with uv sync with the uv.lock file would be great, since we wouldn't need to export a requirements.txt anymore and we would be sure that the exact project dependencies are installed.

This really sounds like it should be a build step (in CI) instead. Curious to know why would you need to swap lock files at runtime?

Sorry for the delayed response, I was playing with Ray and uv today and came back to this thread.

We already export the lock to a requirements.txt file in the CI. However, uv export doesn't add external indexes and, although there are manual workarounds to add them afterwards, that doesn't guarantee perfect compatibility. This comment in the uv repo explains the reasoning well.

I think the support for uv pip currently implemented in Ray works perfectly fine. Adding support for uv sync would just be a nice-to-have feature for the future.

dbuades avatar Dec 23 '24 14:12 dbuades

Please enable support for this!

vladjohnson avatar Feb 17 '25 19:02 vladjohnson

Please enable support for this!

Hi @vladjohnson, I think this feature has been enabled in the latest release.

dentiny avatar Feb 18 '25 00:02 dentiny

@dentiny, is the pyproject.toml a supported source for dependency installation? Thanks!

vladjohnson avatar Feb 18 '25 16:02 vladjohnson