Kai-Hsun Chen

Results 327 comments of Kai-Hsun Chen

> since some users will expect the head pod to exit on failure.

@DmitriGekhtman: currently, if the head Pod's restartPolicy is `Never` and the Pod becomes `Failed`, KubeRay will delete...

To summarize the current discussion:

* Currently, if a Pod's `restartPolicy` is `Always`, KubeRay will not delete the Pod. Instead, KubeRay waits for the Pod to restart by itself.
*...
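The two behaviors stated so far in this thread can be sketched as a small decision function. This is only an illustration, not KubeRay's actual reconciler code: the function name `should_delete_pod` is hypothetical, the `Never`/`Failed` branch assumes the truncated phrase "KubeRay will delete..." refers to the Pod itself, and any combination the discussion does not cover returns `None` rather than a guess.

```python
from typing import Optional


def should_delete_pod(restart_policy: str, phase: str) -> Optional[bool]:
    """Hypothetical sketch of the behavior described in this thread.

    Returns True if KubeRay would delete the Pod, False if it would leave
    the Pod alone, and None if the thread does not state the behavior.
    """
    if restart_policy == "Always":
        # Stated in the summary: KubeRay does not delete the Pod and
        # waits for it to restart by itself.
        return False
    if restart_policy == "Never" and phase == "Failed":
        # Assumption: the truncated "KubeRay will delete..." means the Pod.
        return True
    # Not covered by the visible part of the discussion.
    return None
```

Returning `None` for uncovered cases keeps the sketch from implying behavior that the visible comments do not describe.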

> If we ignore the container restart strategy, I feel that there will be a scenario where the container is already restarting or even successful

If a Pod is Succeeded...

* Case 1: `pb.serialized_py_logging_config` is 428 bytes.

  ```python
  log_dict_config = {
      "version": 1,
      "disable_existing_loggers": False,
      "formatters": {
          "text": {
              "()": "ray._private.structured_logging.formatters.TextFormatter",
          },
      },
      "filters": {
          "core_context": {
              "()": "ray._private.structured_logging.filters.CoreContextFilter",
          },
  ...
  ```

cc @c21 would you mind reviewing this PR? Thanks!

> until the job is manually deleted.

* Do you mean the RayJob CRD?
* Which version of KubeRay are you using?

Which Ray images are you using? You should use images that include `aarch64` in the image tag.
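A quick way to check this before deploying is sketched below. The helper names `is_arm_machine` and `image_tag_matches_arch` are hypothetical, the image names in the usage lines are only illustrative, and the check assumes nothing beyond the rule above: on an ARM machine, the image tag should contain `aarch64`.

```python
import platform
from typing import Optional

# Machine names commonly reported on ARM hosts
# (macOS on Apple silicon reports "arm64", Linux on ARM reports "aarch64").
ARM_MACHINES = {"arm64", "aarch64"}


def is_arm_machine(machine: Optional[str] = None) -> bool:
    """Return True if the given (or current) machine type is 64-bit ARM."""
    return (machine or platform.machine()).lower() in ARM_MACHINES


def image_tag_matches_arch(image: str, machine: Optional[str] = None) -> bool:
    """Return False only when running on ARM with a tag lacking 'aarch64'."""
    # Look for the tag after the last "/" so a registry port is not
    # mistaken for a tag (e.g. "localhost:5000/ray").
    name = image.rsplit("/", 1)[-1]
    tag = name.split(":", 1)[1] if ":" in name else ""
    if is_arm_machine(machine):
        return "aarch64" in tag
    return True


# Illustrative usage with example image names:
print(image_tag_matches_arch("rayproject/ray:2.9.0-aarch64", machine="arm64"))
print(image_tag_matches_arch("rayproject/ray:2.9.0", machine="arm64"))
```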

I tried the following on my Mac M1, and my RayCluster is healthy; no pods have been killed.

```sh
kind create cluster
helm install kuberay-operator kuberay/kuberay-operator --version 1.1.1
helm install...
```