The `ray rsync-up` CLI reports no issue, but the file is actually absent on the remote side (Ray AWS cluster)
What happened + What you expected to happen

The `ray rsync-up` CLI reports no issue, but the file is actually absent on the remote side.
Expected: `requirements.txt` present at `/home/ray` on the remote side, or an error message.
Ready-to-use cluster examples are here: https://github.com/yell0w4x/ray-serve-boilerplate/
```
(ray.venv) q@bora-bora:~/work/playground/ray-web-app$ ray rsync-up -v cluster.yaml requirements.txt /home/ray
2023-05-04 21:59:08,800 INFO util.py:376 -- setting max workers for head node type to 0
Loaded cached provider configuration from /tmp/ray-config-4baa24b02d729e74d31d38959a8a266c2f868f62
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
Creating AWS resource `ec2` in `us-west-2`
Creating AWS resource `ec2` in `us-west-2`
Fetched IP: 34.219.##.##
Running `mkdir -p /tmp/ray_tmp_mount/default/home && chown -R ubuntu /tmp/ray_tmp_mount/default/home`
Shared connection to 34.219.##.## closed.
Running `rsync --rsh ssh -i /home/q/.ssh/ray-autoscaler_us-west-2.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_7694f4a663/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s -avz --exclude **/.git --exclude **/.git/** --filter dir-merge,- .gitignore requirements.txt ubuntu@34.219.##.##:/tmp/ray_tmp_mount/default/home/ray`
sending incremental file list
requirements.txt
sent 129 bytes received 35 bytes 65,60 bytes/sec
total size is 16 speedup is 0,10
Running `docker inspect -f '{{.State.Running}}' ray_container || true`
Shared connection to 34.219.##.## closed.
Running `docker exec -it ray_container /bin/bash -c 'mkdir -p /home' && rsync -e 'docker exec -i' -avz /tmp/ray_tmp_mount/default/home/ray ray_container:/home/ray`
sending incremental file list
ray
sent 118 bytes received 35 bytes 102.00 bytes/sec
total size is 16 speedup is 0.10
Shared connection to 34.219.##.## closed.
`rsync`ed requirements.txt (local) to /home/ray (remote)
```
The `working_dir` setting makes no difference — the same happens with `ray.init(address='ray://localhost:10001', runtime_env=dict(working_dir=os.getcwd()))`.
The example is here: https://github.com/yell0w4x/ray-serve-boilerplate/blob/master/src/rsync_up.py
```
(ray.venv) q@bora-bora:~/work/playground/ray-web-app$ python src/rsync_up.py
2023-05-04 22:42:02,242 INFO util.py:376 -- setting max workers for head node type to 0
2023-05-04 22:42:02,349 VWARN commands.py:324 -- Loaded cached provider configuration from /tmp/ray-config-4baa24b02d729e74d31d38959a8a266c2f868f62
2023-05-04 22:42:02,349 WARN commands.py:332 -- If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
2023-05-04 22:42:02,350 VINFO utils.py:150 -- Creating AWS resource `ec2` in `us-west-2`
2023-05-04 22:42:02,590 VINFO utils.py:150 -- Creating AWS resource `ec2` in `us-west-2`
2023-05-04 22:42:03,490 INFO command_runner.py:204 -- Fetched IP: 34.219.##.##
2023-05-04 22:42:03,490 INFO log_timer.py:30 -- NodeUpdater: i-005872b9b0516587d: Got IP [LogTimer=0ms]
2023-05-04 22:42:03,490 VINFO command_runner.py:371 -- Running `mkdir -p /tmp/ray_tmp_mount/default/home && chown -R ubuntu /tmp/ray_tmp_mount/default/home`
2023-05-04 22:42:03,490 VVINFO command_runner.py:374 -- Full command is `ssh -tt -i /home/q/.ssh/ray-autoscaler_us-west-2.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_7694f4a663/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s ubuntu@34.219.##.## bash --login -c -i 'source ~/.bashrc; export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (mkdir -p /tmp/ray_tmp_mount/default/home && chown -R ubuntu /tmp/ray_tmp_mount/default/home)'`
Shared connection to 34.219.##.## closed.
2023-05-04 22:42:03,972 VINFO command_runner.py:414 -- Running `rsync --rsh ssh -i /home/q/.ssh/ray-autoscaler_us-west-2.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_7694f4a663/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s -avz --exclude **/.git --exclude **/.git/** --filter dir-merge,- .gitignore requirements.txt ubuntu@34.219.##.##:/tmp/ray_tmp_mount/default/home/ray`
sending incremental file list
sent 62 bytes received 12 bytes 29,60 bytes/sec
total size is 16 speedup is 0,22
2023-05-04 22:42:05,458 VINFO command_runner.py:371 -- Running `docker inspect -f '{{.State.Running}}' ray_container || true`
2023-05-04 22:42:05,463 VVINFO command_runner.py:374 -- Full command is `ssh -tt -i /home/q/.ssh/ray-autoscaler_us-west-2.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_7694f4a663/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s ubuntu@34.219.##.## bash --login -c -i 'source ~/.bashrc; export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (docker inspect -f '"'"'{{.State.Running}}'"'"' ray_container || true)'`
Shared connection to 34.219.##.## closed.
2023-05-04 22:42:06,009 VINFO command_runner.py:371 -- Running `docker exec -it ray_container /bin/bash -c 'mkdir -p /home' && rsync -e 'docker exec -i' -avz /tmp/ray_tmp_mount/default/home/ray ray_container:/home/ray`
2023-05-04 22:42:06,019 VVINFO command_runner.py:374 -- Full command is `ssh -tt -i /home/q/.ssh/ray-autoscaler_us-west-2.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_7694f4a663/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s ubuntu@34.219.##.## bash --login -c -i 'source ~/.bashrc; export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (docker exec -it ray_container /bin/bash -c '"'"'mkdir -p /home'"'"' && rsync -e '"'"'docker exec -i'"'"' -avz /tmp/ray_tmp_mount/default/home/ray ray_container:/home/ray)'`
sending incremental file list
sent 58 bytes received 12 bytes 140.00 bytes/sec
total size is 16 speedup is 0.23
Shared connection to 34.219.##.## closed.
2023-05-04 22:42:06,767 VINFO updater.py:537 -- `rsync`ed requirements.txt (local) to /home/ray (remote)
```
Versions / Dependencies
```
(ray.venv) q@bora-bora:~/work/playground/ray-web-app$ ray --version; conda --version; python --version; uname -a
ray, version 2.4.0
conda 23.3.1
Python 3.7.16
Linux bora-bora 5.10.174-1-MANJARO #1 SMP PREEMPT Mon Mar 13 11:15:28 UTC 2023 x86_64 GNU/Linux
```
Reproduction script
```
ray rsync-up -v cluster.yaml requirements.txt /home/ray
```
https://github.com/yell0w4x/ray-serve-boilerplate/blob/master/src/rsync_up.py
```python
import os

import ray
from ray.autoscaler._private.commands import rsync


def main():
    ray.init(address='ray://localhost:10001', runtime_env=dict(working_dir=os.getcwd()))

    local_path = "requirements.txt"
    remote_path = "/home/ray"
    rsync('cluster.yaml', source=local_path, target=remote_path,
          override_cluster_name='default', down=False)


if __name__ == '__main__':
    main()
```
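After the upload, a check like the following could be run on the remote node to turn a silently missing file into an explicit error (this helper is hypothetical, not part of Ray's API):

```python
import os


def assert_uploaded(remote_dir, filename):
    """Raise if `filename` is not a regular file under `remote_dir`.

    Intended to be run on the remote side after `rsync-up`; on failure it
    also lists what *is* there, to spot a file that arrived under the
    wrong name.
    """
    target = os.path.join(remote_dir, filename)
    if not os.path.isfile(target):
        present = sorted(os.listdir(remote_dir)) if os.path.isdir(remote_dir) else []
        raise FileNotFoundError(f"{target} missing; directory contains: {present}")
    return target
```

For example, `assert_uploaded('/home/ray', 'requirements.txt')` could be invoked via `ray exec` or inside a remote task after the `rsync()` call above.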
Note: the Ray client port is forwarded to the head node beforehand:

```
ssh -i ~/.ssh/ray-autoscaler_us-west-2.pem -L 10001:localhost:10001 -nNT -v ubuntu@34.219.##.##
```
Issue Severity
Low: It annoys or frustrates me.
This P2 issue has seen no activity in the past 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.
Please comment and remove the pending-cleanup label if you believe this issue should remain open.
Thanks for contributing to Ray!