[Serve] Ray 2.9.0 does not support service deployment using docker images via ray job

Open psydok opened this issue 2 years ago • 0 comments

What happened + What you expected to happen

The container field of runtime_env is fixed in #40419 and will be included in Ray 2.9, which should be released today or tomorrow. Or you can try it today on the Ray nightly image. Let us know if you run into any issues!

In version 2.8.1, I was able to debug the service startup with an image and everything started with this command:

RAY_ADDRESS='http://localhost:8265' ray job submit \
    --working-dir ./examples/mytest/ \
    --runtime-env-json \
    '{"container": {"image": "mytest:latest", "run_options": ["--tty", "--privileged", "--cap-drop ALL", "--log-level=debug", "--device nvidia.com/gpu=all", "--security-opt=label=disable",  "--restart unless-stopped"]}, "config": {"eager_install": false}, "env_vars":{"NVIDIA_VISIBLE_DEVICES": "all"}}'  \
    -- python service.py

After updating Ray to 2.9.0, the same command fails with this error: ValueError: The 'container' field currently cannot be used together with other fields of runtime_env. Specified fields: dict_keys(['working_dir', 'container', 'env_vars', 'config'])
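To make the error above easier to read, here is a minimal sketch that lists which runtime_env fields end up next to 'container' for this command. It is not Ray's validation code: the helper name fields_conflicting_with_container is hypothetical, and it assumes, based only on the wording of the error message, that every field other than 'container' counts as a conflict. The 'working_dir' key is added by hand here to mirror what the --working-dir flag contributes, as seen in the dict_keys output.

```python
import json


def fields_conflicting_with_container(runtime_env: dict) -> list[str]:
    """Return the runtime_env keys specified alongside 'container'.

    Assumption: any key other than 'container' is treated as a conflict,
    mirroring the wording of the ValueError. This is not Ray's own check.
    """
    if "container" not in runtime_env:
        return []
    return sorted(key for key in runtime_env if key != "container")


# Shortened version of the JSON passed to --runtime-env-json in the command.
runtime_env = json.loads(
    '{"container": {"image": "mytest:latest"},'
    ' "config": {"eager_install": false},'
    ' "env_vars": {"NVIDIA_VISIBLE_DEVICES": "all"}}'
)

# --working-dir shows up as 'working_dir' in the reported dict_keys.
runtime_env["working_dir"] = "./examples/mytest/"

print(fields_conflicting_with_container(runtime_env))
# prints ['config', 'env_vars', 'working_dir']
```

Under that assumption, three of the four fields in the failing command would have to be dropped or moved into the image itself for the 'container' field to be accepted.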

I specifically want to deploy services with the ray job submit command, because serve deploy overwrites/deletes existing services. However, the documentation does not describe any alternative method (https://docs.ray.io/en/latest/serve/advanced-guides/multi-app-container.html).

The Ray cluster is deployed without Kubernetes or any cloud provider.

Versions / Dependencies

python==3.11.5
ray[serve]==2.9.0
grpcio-tools==1.59.3

Reproduction script

RAY_ADDRESS='http://localhost:8265' ray job submit \
    --working-dir ./examples/mytest/ \
    --runtime-env-json \
    '{"container": {"image": "mytest:latest", "run_options": ["--tty", "--privileged", "--cap-drop ALL", "--log-level=debug", "--device nvidia.com/gpu=all", "--security-opt=label=disable",  "--restart unless-stopped"]}, "config": {"eager_install": false}, "env_vars":{"NVIDIA_VISIBLE_DEVICES": "all"}}'  \
    -- python service.py

Issue Severity

High: It blocks me from completing my task.

Posted by psydok, Dec 27 '23 09:12