kubeflow-manifests
kubectl 1.25 describe has no `--timeout` flag, breaking kubeflow install
Describe the bug
Installation of Kubeflow v1.7.0 (using the manifest install pattern, not Terraform) breaks when installing the aws-secrets-sync deployment because of a mismatch in the kubectl syntax. `kubectl describe` has no `--timeout` flag, but one is passed to a describe operation in utils/utils.py at line 273, which fails the installation task(s).
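A minimal sketch of the kind of change that would avoid the unknown-flag error, assuming the helper runs `kubectl wait` and then falls back to `kubectl describe` for diagnostics. The function names `build_wait_command`, `build_describe_command`, and `wait_for_pods` are hypothetical and are not taken from utils/utils.py; the only point is that `--timeout` is passed to `wait` and not to `describe`.

```python
import subprocess
from typing import Callable, List


def build_wait_command(label_selector: str, namespace: str, timeout_s: int = 240) -> List[str]:
    """Build the `kubectl wait` call; `wait` accepts --timeout."""
    return [
        "kubectl", "wait", "--for=condition=ready", "pod",
        "-l", label_selector,
        f"--timeout={timeout_s}s",
        "-n", namespace,
    ]


def build_describe_command(label_selector: str, namespace: str) -> List[str]:
    """Build the `kubectl describe` call; no --timeout is passed here."""
    return ["kubectl", "describe", "pod", "-l", label_selector, "-n", namespace]


def wait_for_pods(
    label_selector: str,
    namespace: str,
    timeout_s: int = 240,
    run: Callable = subprocess.run,  # injectable so the logic can be exercised without a cluster
) -> None:
    """Wait for pods to become ready; on failure, print diagnostics and raise."""
    result = run(
        build_wait_command(label_selector, namespace, timeout_s),
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return
    # Diagnostics only: describe the pods without the unsupported flag.
    described = run(
        build_describe_command(label_selector, namespace),
        capture_output=True,
        text=True,
    )
    print(described.stdout)
    raise Exception("Timeout/error waiting for pod condition")
```

This only removes the `unknown flag: --timeout` error from the describe step. It does not address the `no matching resources found` error reported by `kubectl wait` in the logs.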
Steps To Reproduce
- Install all prerequisites for cognito-rds-s3
- Run `make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=irsa`
- See error (listed below)
Expected behavior Kubeflow is installed and all commands run without error.
Environment
- Kubernetes version v1.25.16-eks-8cb36c9
- Using EKS (yes/no), if so version? Yes - v1.25.16-eks-8cb36c9
- Kubeflow version v1.7.0
- kubectl version Client Version: v1.25.0 Kustomize Version: v4.5.7 Server Version: v1.25.16-eks-8cb36c9
- AWS build number AWS_RELEASE_VERSION="v1.7.0-aws-b1.0.3"
- AWS service targeted (S3, RDS, etc.) S3, Cognito, RDS, EKS
Screenshots No screenshots; here are the logs:
...
==========Installing aws-secrets-manager==========
# Warning: 'bases' is deprecated. Please use 'resources' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
# Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
deployment.apps/aws-secrets-sync unchanged
Warning: secrets-store.csi.x-k8s.io/v1alpha1 is deprecated. Use secrets-store.csi.x-k8s.io/v1 instead.
secretproviderclass.secrets-store.csi.x-k8s.io/rds-secret unchanged
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Traceback (most recent call last):
File "utils/kubeflow_installation.py", line 324, in <module>
install_kubeflow(
File "utils/kubeflow_installation.py", line 101, in install_kubeflow
install_component(
File "utils/kubeflow_installation.py", line 180, in install_component
validate_component_installation(installation_config, component_name)
File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 56, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 266, in call
raise attempt.get()
File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 301, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
raise value
File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 251, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "utils/kubeflow_installation.py", line 192, in validate_component_installation
kubectl_wait_pods(value, namespace, key)
File "/kube/tests/e2e/utils/utils.py", line 275, in kubectl_wait_pods
raise Exception("Timeout/error waiting for pod condition")
Exception: Timeout/error waiting for pod condition
...
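The log above shows the wait step attempted three times before the traceback, and the traceback passes through the `retrying` package. A rough standard-library sketch of that retry-then-reraise pattern follows. The `retry` decorator and the limit of three attempts are assumptions made for illustration and do not reproduce the actual configuration in utils/kubeflow_installation.py.

```python
import functools

attempts = []  # records each attempt so the retry count can be inspected


def retry(max_attempts: int = 3):
    """Hypothetical stand-in for a retry decorator: re-run on any exception."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as error:
                    last_error = error  # keep the latest failure
            # All attempts failed: surface the final exception to the caller.
            raise last_error
        return wrapped
    return decorator


@retry(max_attempts=3)
def validate_component_installation():
    """Simulates a validation step whose pod wait always fails."""
    attempts.append("kubectl wait")
    raise Exception("Timeout/error waiting for pod condition")
```

Under these assumptions, calling `validate_component_installation()` runs the failing step three times and then raises the generic exception, which is why the final traceback does not mention the `--timeout` flag itself.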
Additional context AWS, EKS, Kubeflow 1.7.0
I can confirm this is the case!
I have also run into this issue. I've just put in PR #821 to resolve this.