Zhanghao Wu

Results 315 comments of Zhanghao Wu

Yes, it is on the master branch right now and will be in our next release. Please feel free to switch to the latest master branch with the following commands: ```...

We are planning to make `sky start` automatically remount the cloud storage that was previously mounted. Please stay tuned for the update. ;)

Yes, I think keeping the order the same as mentioned above is a good idea. If the user requests a subset of the nodes, we can still keep the...

Another way to get internal IPs with our current `get_node_ips` is to create a temporary cluster YAML from the existing one by adding `use_internal_ips: true` under the `provider` section. ([reference](https://github.com/ray-project/ray/blob/94f4548c373ef370f693d19945793d764c39f034/python/ray/autoscaler/_private/commands.py#L1330-L1333))...
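As a rough illustration of the idea in the comment above, the sketch below sets `use_internal_ips` under `provider` on a copy of an already-parsed cluster config. The helper name `with_internal_ips` and the sample config are hypothetical, and the sketch assumes the YAML has already been loaded into a plain dict; reading the file, writing the temporary YAML, and calling `get_node_ips` are left out.

```python
import copy


def with_internal_ips(cluster_config: dict) -> dict:
    """Return a copy of a parsed cluster config with use_internal_ips enabled.

    Hypothetical helper: the original config is left untouched so the
    existing cluster YAML does not change.
    """
    new_config = copy.deepcopy(cluster_config)
    # Create the provider section if it is missing, then set the flag.
    new_config.setdefault("provider", {})["use_internal_ips"] = True
    return new_config


# Made-up sample config standing in for the parsed cluster YAML.
original = {"cluster_name": "demo", "provider": {"type": "gcp"}}
patched = with_internal_ips(original)
```

Working on a deep copy keeps the change confined to the temporary config, which matches the intent of not modifying the existing cluster YAML.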

Great findings @ewzeng! Thank you for investigating the problem. Do you want to add a network check in the following function to reduce the friction caused by the hanging? https://github.com/skypilot-org/skypilot/blob/f49ed5aed4d180c18d7feddcb2733c61d9170441/sky/backends/backend_utils.py#L1342-L1343

I think this could lead to deeply recursive detection that finds all the clusters launched on the clusters in the local cluster table. Probably, a better solution is to share the...

I think it would be useful if SkyPilot could be smarter about this. For example, if the disk_size is not specified and the image id is specified, we can set...

Yes, that would be good to have. I am wondering whether we can have a `conda` style machine sharing as well. 1. `sky share -c cluster [new_username] [-i other_pubkey_file] >...

Hey @Akshat977, since we have not heard back from you for a long time, we will reassign the issue soon. Please let us know your progress if you are working on...

> Just saw another instance of this problem. The instance was preempted about right after the first line of this log:

Great observation! Is the instance preempted in the GCP...