modin
modin copied to clipboard
FEAT-#7139: use ray-core+grpcio instead of ray-default
What do these changes do?
These changes are aimed at reducing the number of dependencies and thereby simplifying the installation of Modin with the Ray engine in the user environment. For example, we might not install dependencies for the Ray dashboard. To do this, we could use either package ray[client], but it appears only from version Ray>=2.6.0 (there is no desire to increase the minimally supported version of Ray so much for now), or switch to using the basic package itself (ray), but this will not work for fairly new versions of Ray, so how they don't include grpcio dependency.
As a transitional option, we can use the basic package with an explicit indication of grpcio package . This option will work, in this pull request, I tried it with different versions of Ray. However, there is a chance that grpcio package of some version will be installed, with which Ray has problems. Ray can explicitly exclude any versions in such cases (note: the exceptions are not made for the basic package, but for ray[client] package, so they cannot be reused on our side), but using the proposed approach, we most likely will not know about this until the problem arises for users.
If we decide to switch to such a transitional option, then I believe we should also consider switching to ray[client] package for Modin 1.0 release.
For example, dependencies for ray[default]==2.9.3:
"default": [
# If adding dependencies necessary to launch the dashboard api server,
# please add it to dashboard/optional_deps.py as well.
"aiohttp >= 3.7",
"aiohttp_cors",
"colorful",
"py-spy >= 0.2.0",
"requests",
"gpustat >= 1.0.0", # for windows
"grpcio >= 1.32.0; python_version < '3.10'", # noqa:E501
"grpcio >= 1.42.0; python_version >= '3.10'", # noqa:E501
"opencensus",
"pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3",
"prometheus_client >= 0.7.1",
"smart_open",
"virtualenv >=20.0.24, !=20.21.1", # For pip runtime env.
],
- [x] first commit message and PR title follow format outlined here
NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.
- [ ] passes
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py - [ ] passes
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py - [ ] signed commit with
git commit -s - [ ] Resolves #7139
- [ ] tests added and passing
- [ ] module layout described at
docs/development/architecture.rstis up-to-date
@YarShev @sfc-gh-dpetersohn @dchigarev thoughts?
switch to using the basic package itself (ray), but this will not work for fairly new versions of Ray, so how they don't include grpcio dependency.
Why did they exclude grpcio? If we install ray-core, it won't work out of the box, right? Doesn't this look like an issue for Ray?
As a transitional option, we can use the basic package with an explicit indication of grpcio package . This option will work, in this pull request, I tried it with different versions of Ray. However, there is a chance that grpcio package of some version will be installed, with which Ray has problems. Ray can explicitly exclude any versions in such cases (note: the exceptions are not made for the basic package, but for ray[client] package, so they cannot be reused on our side), but using the proposed approach, we most likely will not know about this until the problem arises for users.
If Ray limits grpcio to install specific versions, it seems there shouln't be a problem when installing pip install grpcio ray-core. The pip resolver should handle a case with incompatible versions of grpcio for Ray.
If we decide to switch to such a transitional option, then I believe we should also consider switching to ray[client] package for Modin 1.0 release.
Why?
Why did they exclude grpcio? If we install ray-core, it won't work out of the box, right? Doesn't this look like an issue for Ray?
I don’t know for sure, but this is a conscious decision to transfer this package to another target: ray[client].
If Ray limits grpcio to install specific versions, it seems there shouldn't be a problem when installing pip install grpcio ray-core. The pip resolver should handle a case with incompatible versions of grpcio for Ray.
The default target does not contain version pins for grpcio dependency (only ray[client] or ray[default]).
If we decide to switch to such a transitional option, then I believe we should also consider switching to ray[client] package for Modin 1.0 release.
Why?
I consider this a more reliable solution, since this is not our direct dependency, but a transitive one from Ray.
Hi @rkooo567! I would like to be more confident that we are going in the right direction towards reducing Ray dependencies installed by default. Do I understand correctly that in a situation where core functionality and a cluster launcher are needed, it will be enough to install ray[client] instead of ray[default]? (since the dashboard from 'default' target is not needed)
I think there's no package that only includes cluster launcher. ray[client] is not for the cluster launcher (it is for ray client mode https://docs.ray.io/en/master/cluster/running-applications/job-submission/ray-client.html, which is not recommended to use)
I think there's no package that only includes cluster launcher. ray[client] is not for the cluster launcher (it is for ray client mode https://docs.ray.io/en/master/cluster/running-applications/job-submission/ray-client.html, which is not recommended to use)
@rkooo567 thanks! Now it’s more clear how to deal with this.
It looks like we can safely remove ClientObjectRef from Modin sources. It was introduced in https://github.com/modin-project/modin/pull/2851. Ray fixed the problem a long time ago and defined the type like this: class ClientObjectRef(raylet.ObjectRef):