Andrew Sy Kim
The training code is referenced in https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/pytorch-resnet-image-classifier/fine-tune-pytorch-resnet-image-classifier.py and there's also a link under the "Deploy the RayJob" section.
Looks like we would need to use self-hosted runners for this: https://github.com/actions/virtual-environments/issues/45
Seeing this on GitHub Actions:
```
### ERRORED 04:45:14Z - Please verify your email address to run GitHub Actions workflows. https://github.com/settings/emails
```
Maybe it's been fixed now, can you rebase...
> If the CAPI-DNS controller in the management cluster is unable to resolve the IPs of a particular gateway, that error should be visible in a Status on the GatewayDNSRecord....
Would it be crazy to add a condition to Cluster?
> I suppose Cluster condition could work, but that somehow feels indirect.... Because the user-intent to provide DNS resolution is elsewhere (CR or ConfigMap for this CAPI-DNS controller). If everything...
Is this problem specific to using Argo CD?
Requiring recreation of the RayService is pretty bad. Is there a patch fix we can do for v1.1.2 to avoid needing to do that?
> cc @andrewsykim does this make sense for you?

Sounds good, thanks.
Relevant KEP in Kubernetes: https://github.com/kubernetes/enhancements/pull/2611. There were some previous discussions around supporting port ranges as well, but I can't seem to find that discussion.