[VM launcher] Ran `ray status` after SSHing into the head node and it printed "No cluster status"
What happened + What you expected to happen
The head node is started; the launcher output follows:
Local node IP: 172.31.62.187
--------------------
Ray runtime started.
--------------------
Next steps
To add another node to this Ray cluster, run
ray start --address='172.31.62.187:6379'
To connect to this Ray cluster:
import ray
ray.init()
To submit a Ray job using the Ray Jobs CLI:
RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html
for more information on submitting Ray jobs to the Ray cluster.
To terminate the Ray runtime, run
ray stop
To view the status of the cluster, use
ray status
To monitor and debug Ray, view the dashboard at
127.0.0.1:8265
If connection to the dashboard fails, check your firewall settings and network configuration.
Shared connection to 34.223.114.236 closed.
New status: up-to-date
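Side note: the `ray job submit` step in the output above can also be done from Python via the Ray Jobs SDK; a minimal sketch, assuming the dashboard at http://127.0.0.1:8265 is reachable and my_script.py is in the current directory:

```python
from ray.job_submission import JobSubmissionClient

# Point the client at the head node's dashboard (the address printed above).
client = JobSubmissionClient("http://127.0.0.1:8265")

# Rough equivalent of:
#   RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
job_id = client.submit_job(
    entrypoint="python my_script.py",
    runtime_env={"working_dir": "."},
)
print(client.get_job_status(job_id))
```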
I ran `ray status` after SSHing into the head node and it printed "No cluster status":
Last login: Wed May 3 21:47:54 2023 from {my laptop ip}
ubuntu@ip-172-31-62-187:~$ ray status
No cluster status.
ubuntu@ip-172-31-62-187:~$ exit
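To help narrow this down, a quick check on the head node could distinguish "the Ray runtime/GCS is unreachable" from "the autoscaler just hasn't published a status report yet"; a minimal sketch, assuming it is run on the head node against the session started by `ray start --head`:

```python
import ray

# Attach to the cluster already running on this node; address="auto" resolves
# the local GCS started by `ray start --head`.
ray.init(address="auto")

# If these print node and resource information, the Ray runtime itself is up,
# which would suggest "No cluster status." concerns only the autoscaler's
# status report rather than the cluster being down.
print(ray.nodes())
print(ray.cluster_resources())
```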
The YAML file is attached below.
cluster_name: 0503-3
max_workers: 2
provider:
    type: aws
    region: us-west-2
    cache_stopped_nodes: True
auth:
    ssh_user: ubuntu
available_node_types:
    ray.head.default:
        node_config:
            InstanceType: m5.2xlarge
    ray.worker.default:
        min_workers: 2
        max_workers: 2
        node_config:
            InstanceType: m5.2xlarge
head_node_type: ray.head.default
head_start_ray_commands:
    - ray stop
    - ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml --temp-dir=~/ray_temp_logs/
worker_start_ray_commands:
    - ray stop
    - ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076
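Since the config asks for min_workers: 2, one way to confirm from the head node that the autoscaler actually launched the workers, independently of `ray status`, is to poll the node list; a rough sketch, assuming head plus two workers means three alive nodes:

```python
import time

import ray

ray.init(address="auto")  # attach to the running cluster on the head node

EXPECTED_NODES = 3  # head node + min_workers (2) from the YAML above

deadline = time.time() + 600  # give the autoscaler up to ten minutes
while time.time() < deadline:
    alive = [n for n in ray.nodes() if n["Alive"]]
    print(f"{len(alive)}/{EXPECTED_NODES} nodes alive")
    if len(alive) >= EXPECTED_NODES:
        break
    time.sleep(15)
```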
Versions / Dependencies
See the YAML file above.
Reproduction script
See the YAML file above.
Issue Severity
High: It blocks me from completing my task.
cc: @gvspraveen @wuisawesome
In the absence of error messages, I'm assuming this is a race condition where `ray status` is being run before the autoscaler is fully up.
I assume this should get fixed in the autoscaler refactor? @scv119
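If that is the cause, a user-side workaround would be to retry until the autoscaler publishes its first status report; a rough sketch, assuming `ray status` is on the PATH of the head node:

```python
import subprocess
import time

# Poll `ray status` until the autoscaler has published its first status
# report, or give up after two minutes.
deadline = time.time() + 120
while True:
    result = subprocess.run(["ray", "status"], capture_output=True, text=True)
    if "No cluster status" not in result.stdout + result.stderr:
        print(result.stdout)
        break
    if time.time() > deadline:
        raise TimeoutError("The autoscaler never reported a cluster status.")
    time.sleep(5)
```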
This P2 issue has seen no activity in the past 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.
Please comment and remove the pending-cleanup label if you believe this issue should remain open.
Thanks for contributing to Ray!