Kai-Hsun Chen
[Possible Solution]

1. Update [ray-operator/Dockerfile](https://github.com/ray-project/kuberay/blob/master/ray-operator/Dockerfile#L18) to `RUN CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -a -o manager main.go`
2. Build a multi-architecture image for KubeRay with [docker/buildx](https://github.com/docker/buildx).
3. Build a multi-architecture...
See #557 for more details.
TODO: Host stable charts in a separate repo.
cc @DmitriGekhtman
@Jeffwan is this PR ready to merge? The merge is blocked by your change requests. Thank you!
The following links may be useful.

* https://support.hashicorp.com/hc/en-us/articles/4404634420755-Why-am-I-seeing-context-deadline-exceeded-errors
* https://stackoverflow.com/questions/75148975/leaderelections-failing-lease-unable-to-be-renewed-automatically
* https://discuss.kubernetes.io/t/kubeadm-init-fails-kube-scheduler-fails-with-error-retrieving-resource-lock-kube-system-kube-scheduler-context-deadline-exceeded-client-timeout-exceeded-while-awaiting-headers/24389/2

Would you mind conducting two experiments:

* Experiment 1: Increase the memory limit/request for the KubeRay operator Pod....
This may be related to #715. KubeRay sends a request to the K8s API server to delete a Pod. In the next reconciliation, the informer cache still hasn't received the...
Reopen this issue. I will check whether we should make the GKE CSI Fuse work with the default KubeRay config, or if updating the documentation is sufficient.
> Disabling auto init container injection will be a quite big breaking change on our platform.

cc @daikeshi Would you mind sharing more details about this? cc @andrewsykim is there...
The `suspend` feature in RayJob will issue a request to the Ray head Pod to halt the job before the RayCluster is deleted. For RayCluster, I prefer to avoid doing...