Wei-Lin Chiang
@pschafhalter we have an ongoing PR https://github.com/skypilot-org/skypilot/pull/1532 for this issue. To unblock your use case, you may modify a line in `sky/templates/aws-ray.yml.j2` as below for now. https://github.com/skypilot-org/skypilot/pull/1532/files#diff-bea9d257f5081c10250926eda9b6a4b996c764f4359647b5020afe41066a6a8e We'll soon review...
From the discussion, it seems like one option is to show only the `RUNNING` jobs by default. We may also want to add a hint message like ``` Note: Only RUNNING...
Ah yes, this is a problem. I didn't realize `source` can be a list. Should we also correct the type of `self.source` to be either a list or a str? https://github.com/skypilot-org/skypilot/blob/742c1dc82e62322d4e6d61c68b021b816f6110ad/sky/data/storage.py#L128
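A rough sketch of what the corrected type could look like, in case it helps the discussion. The `SourceType` alias and the `normalize_source` helper are hypothetical names for illustration, not what `sky/data/storage.py` actually defines:

```python
from typing import List, Optional, Union

# Hypothetical alias: the real annotation in sky/data/storage.py may differ.
SourceType = Union[str, List[str]]


def normalize_source(source: Optional[SourceType]) -> List[str]:
    """Return `source` as a list of paths, whether it is None, a str, or a list."""
    if source is None:
        return []
    if isinstance(source, str):
        # A single path or bucket URL becomes a one-element list.
        return [source]
    # Copy so callers cannot mutate the original list by accident.
    return list(source)
```

With something like this, downstream code could iterate over the result without checking the type each time.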
Yes, @Michaelvll's PR includes the fix. Romil just approved it, but I'm happy to do additional reviews if needed.
I also agree we shouldn't upload users' cloud credentials at all. For 1, I'm not sure about the use case where users would prefer to `sky launch` on remote VMs...
For users who care about security, uploading their credentials without asking them would give a pretty negative impression.
I'm wondering whether it would be better to sync all the states to an S3 bucket. That way, even if the controller is stopped, we can still get the up-to-date job status table....
If `-m` is the issue, it looks like we can try using `parallel_thread_count` or `parallel_process_count`, documented [here](https://cloud.google.com/storage/docs/gsutil/addlhelp/GlobalCommandLineOptions), to limit the number of concurrent connections.
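A minimal sketch of how those two settings could be passed, assuming we keep `-m` and override the values with gsutil's top-level `-o section:name=value` option. The helper name `build_gsutil_rsync_cmd` and the default counts are made up for illustration and are not tuned values:

```python
from typing import List


def build_gsutil_rsync_cmd(src: str,
                           dst: str,
                           process_count: int = 1,
                           thread_count: int = 4) -> List[str]:
    """Build a `gsutil -m rsync` argv with capped parallelism.

    Hypothetical helper: the defaults here are placeholders, not
    recommendations.
    """
    return [
        'gsutil',
        # Override the boto config values for this invocation only.
        '-o', f'GSUtil:parallel_process_count={process_count}',
        '-o', f'GSUtil:parallel_thread_count={thread_count}',
        '-m', 'rsync', '-r', src, dst,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; whether this actually resolves the connection issue would still need to be verified.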
For TPU users, the `TPU Admin` role needs to be added.
According to our user, the bug still exists. Reason: During preemption, we expect GCP to change the VM state from `READY` to `PREEMPTED`, as shown in the [documentation](https://cloud.google.com/tpu/docs/preemptible#detecting_if_a_tpu_has_been_preempted). However,...