terraform-provider-kubernetes
terraform-provider-kubernetes copied to clipboard
Fix kubernetes_daemon_set not waiting for rollout. Fixes #2092
Description
Fixes kubernetes_daemon_set_v1 resource to wait for rollout. Current check for the rollout when wait_for_rollout = true is ineffective and is not waiting for rollout at all as it checks a wrong status field of a DaemonSet.
:exclamation: What is worse, there is currently no way how to correctly manage a DaemonSet using kubernetes provider as also kubernetes_manifest is affected by a nasty bug so every DaemonSet change ends up with a failure..
Impacted by the issue, I prepared a reproducer TF code here. Troubleshooting the issue by applying a new resource and changing existing one I noticed, the current code is checking a wrong field, so wait_for_rollout = true has no effect as incorrect status fiels is check.
Additionally, majority of the daemonset resource tests must have been fixed as after fixing the rollout issues, these have been failing as the the original manifests failed to be rolled out.
:warning: this fix might break many existing DaemonSets as by default wait_for_rollout is set to true in the resource. Until now, waiting for rollout was ineffective, though after it's fixed, apply is going to wait until all the daemonset pods are in Ready state. I expect this might cause some issues for users using this resource as their existing TF code might start to time out on apply. This is also visible on the number of test case changes, that must have been updated.
Troubleshooting - apply of a new resource
- reproducer TF code
- run
terraform apply
Plan: 1 to add, 0 to change, 0 to destroy.
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Creating...
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Creation complete after 2s [id=default/i-am-not-waiting-for-rollout]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
The apply is almost instant. While the apply is running,, this is how the daemonset status looks like: `
$ while true; do kubectl get ds/i-am-not-waiting-for-rollout --output="jsonpath={.status}" ;echo; done` printing the DS Error from server (NotFound): daemonsets.apps "i-am-not-waiting-for-rollout" not found
...
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberMisscheduled":0,"numberReady":0,"numberUnavailable":2,"observedGeneration":1,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberMisscheduled":0,"numberReady":0,"numberUnavailable":2,"observedGeneration":1,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":1,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":2,"numberMisscheduled":0,"numberReady":2,"observedGeneration":1,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":2,"numberMisscheduled":0,"numberReady":2,"observedGeneration":1,"updatedNumberScheduled":2}
As you can see, checking CurrentNumberScheduled is not the right property to check if when we need to wait for the rollout as this number does not represent the DaemonSet rollout status. We must check for numberReady here.
Troubleshooting - change of an existing resource
Apply of a change:
Plan: 0 to add, 1 to change, 0 to destroy.
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Modifying... [id=default/i-am-not-waiting-for-rollout]
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Modifications complete after 2s [id=default/i-am-not-waiting-for-rollout]
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
DS status output:
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":2,"numberMisscheduled":0,"numberReady":2,"observedGeneration":5,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":6,"updatedNumberScheduled":1}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":6,"updatedNumberScheduled":1}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":6,"updatedNumberScheduled":1}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":6,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":1,"numberMisscheduled":0,"numberReady":1,"numberUnavailable":1,"observedGeneration":6,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":2,"numberMisscheduled":0,"numberReady":2,"observedGeneration":6,"updatedNumberScheduled":2}
{"currentNumberScheduled":2,"desiredNumberScheduled":2,"numberAvailable":2,"numberMisscheduled":0,"numberReady":2,"observedGeneration":6,"updatedNumberScheduled":2}
Acceptance tests
- [ ] Have you added an acceptance test for the functionality being added?
- [x] Have you run the acceptance tests on this branch?
Output from acceptance testing:
$ make testacc TESTARGS='-run=TestAccKubernetesDaemonSetV1* -v'
==> Checking that code complies with gofmt requirements...
go vet ./...
TF_ACC=1 go test "/home/stano/workspace/projects/tf/terraform-provider-kubernetes/kubernetes" -v -vet=off -run=TestAccKubernetesDaemonSetV1* -v -parallel 8 -timeout 3h
=== RUN TestAccKubernetesDaemonSetV1_minimal
=== PAUSE TestAccKubernetesDaemonSetV1_minimal
=== RUN TestAccKubernetesDaemonSetV1_basic
=== PAUSE TestAccKubernetesDaemonSetV1_basic
=== RUN TestAccKubernetesDaemonSetV1_with_template_metadata
=== PAUSE TestAccKubernetesDaemonSetV1_with_template_metadata
=== RUN TestAccKubernetesDaemonSetV1_initContainer
=== PAUSE TestAccKubernetesDaemonSetV1_initContainer
=== RUN TestAccKubernetesDaemonSetV1_noTopLevelLabels
=== PAUSE TestAccKubernetesDaemonSetV1_noTopLevelLabels
=== RUN TestAccKubernetesDaemonSetV1_with_tolerations
=== PAUSE TestAccKubernetesDaemonSetV1_with_tolerations
=== RUN TestAccKubernetesDaemonSetV1_with_tolerations_unset_toleration_seconds
=== PAUSE TestAccKubernetesDaemonSetV1_with_tolerations_unset_toleration_seconds
=== RUN TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_profile
=== PAUSE TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_profile
=== RUN TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_localhost_profile
=== PAUSE TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_localhost_profile
=== RUN TestAccKubernetesDaemonSetV1_with_resource_requirements
=== PAUSE TestAccKubernetesDaemonSetV1_with_resource_requirements
=== RUN TestAccKubernetesDaemonSetV1_minimalWithTemplateNamespace
=== PAUSE TestAccKubernetesDaemonSetV1_minimalWithTemplateNamespace
=== CONT TestAccKubernetesDaemonSetV1_minimal
=== CONT TestAccKubernetesDaemonSetV1_with_tolerations_unset_toleration_seconds
=== CONT TestAccKubernetesDaemonSetV1_noTopLevelLabels
=== CONT TestAccKubernetesDaemonSetV1_with_template_metadata
=== CONT TestAccKubernetesDaemonSetV1_with_resource_requirements
=== CONT TestAccKubernetesDaemonSetV1_minimalWithTemplateNamespace
=== CONT TestAccKubernetesDaemonSetV1_initContainer
=== CONT TestAccKubernetesDaemonSetV1_with_tolerations
--- PASS: TestAccKubernetesDaemonSetV1_with_tolerations_unset_toleration_seconds (18.46s)
=== CONT TestAccKubernetesDaemonSetV1_basic
--- PASS: TestAccKubernetesDaemonSetV1_minimal (18.49s)
=== CONT TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_localhost_profile
--- PASS: TestAccKubernetesDaemonSetV1_noTopLevelLabels (18.54s)
=== CONT TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_profile
--- PASS: TestAccKubernetesDaemonSetV1_with_tolerations (19.22s)
--- PASS: TestAccKubernetesDaemonSetV1_initContainer (22.81s)
--- PASS: TestAccKubernetesDaemonSetV1_with_template_metadata (31.53s)
--- PASS: TestAccKubernetesDaemonSetV1_minimalWithTemplateNamespace (32.90s)
--- PASS: TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_localhost_profile (14.42s)
--- PASS: TestAccKubernetesDaemonSetV1_with_container_security_context_seccomp_profile (23.03s)
--- PASS: TestAccKubernetesDaemonSetV1_with_resource_requirements (46.52s)
--- PASS: TestAccKubernetesDaemonSetV1_basic (28.36s)
PASS
ok github.com/hashicorp/terraform-provider-kubernetes/kubernetes 46.870s
Release Note
Release note for CHANGELOG:
`resource/kubernetes_daemon_set_v1`: fix an issue with the provider not waiting for rollout even `wait_for_rollout` is set to `true`
References
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment