pytorch-operator
Implement "earlier" resource validation
Hi
Currently, when a new PyTorchJob is created, the Kubernetes API only checks whether it matches the schema defined in the CRD; the "real" validation is done later in pkg/apis/pytorch/validation/validation.go.
This is too late in the process, as the resource will already have been created by the API.
A use case for catching errors early would be deployment using a CI/CD tool, where you want your pipeline to fail if it can't deploy.
For example, cert-manager is doing resource validation using a webhook: https://docs.cert-manager.io/en/latest/getting-started/webhook.html.
Yes. There is an issue tracking this in tf-operator. https://github.com/kubeflow/tf-operator/issues/1016
This can be taken up in 0.7
/area 0.7.0
/kind feature /priority p2