terraform-provider-helm
Failed deployment creation does not fail Terraform apply
Terraform, Provider, Kubernetes and Helm Versions
- Terraform version: v0.12.29
- Provider version: 2.1.0
- Kubernetes version: v1.18.9-eks-d1db3c
- Helm version: v3.3.0
Affected Resource(s)
- `helm_release`
Terraform Configuration Files
provider "helm" {
kubernetes {
config_path = "~/.kube/config"
}
}
resource "helm_release" "should_fail" {
name = "should-fail"
chart = "./failed-deploy"
}
`./failed-deploy/templates/service.yaml`:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: test
  namespace: doesnt-exist
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
```
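For anyone reproducing this: the chart directory also needs minimal metadata next to the template. A hypothetical `./failed-deploy/Chart.yaml` (name and version are placeholders):

```yaml
# Minimal chart metadata; the failure itself comes from the Service
# manifest's nonexistent namespace, not from anything here.
apiVersion: v2
name: failed-deploy
version: 0.1.0
```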
Debug Output
https://gist.github.com/citizenken/9c4ee6d13a0e1f1edd7c73ee5cb53afe
Steps to Reproduce
- Copy the module locally
- Create a new chart containing the manifest above
- Run `terraform plan` and confirm it plans to create a new release
- Run `terraform apply` and confirm it completes successfully
- Run `helm status should-fail` and confirm that Helm reports a failure
- Confirm the Service was not created
- Run `terraform plan` again and confirm that the only change is `~ status = "failed" -> "deployed"`
Expected Behavior
I would expect that if Helm reports a failed deployment, Terraform would also fail and clean up the partial, failed deployment.
Actual Behavior
Terraform reports "Complete," leaving the release in a broken state.
I think the root cause of this problem is that, compared to other providers (e.g. AWS), Helm allows partial resource creation. If a release contains multiple resources and only one of them fails, the others are still deployed, but the release status is "failed". That doesn't fit the model other providers follow, where a resource is either created or it is not; there are no partial states. In this case, I think it would be better, and would match expectations, to destroy the partial deployment and attempt to re-create it on a subsequent Terraform run.
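A possible workaround, which I haven't verified against this chart, is the provider's `atomic` argument: it asks Helm to purge the release if the install fails, which approximates the destroy-and-recreate behavior described above. A minimal sketch:

```hcl
resource "helm_release" "should_fail" {
  name  = "should-fail"
  chart = "./failed-deploy"

  # Purge the release if the install fails, instead of leaving a
  # partially deployed release behind in the "failed" state. Helm
  # enables --wait automatically when --atomic is set.
  atomic = true
}
```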
References
Similar issues:
- #672
- #620
PR I've opened to fix the bug:
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
@citizenken have you experimented with the `wait_for_jobs` or `cleanup_on_fail` resource arguments? Those might be useful in this case.
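For reference, both are documented `helm_release` arguments; a sketch of what trying them might look like on the resource above:

```hcl
resource "helm_release" "should_fail" {
  name  = "should-fail"
  chart = "./failed-deploy"

  # Allow Helm to delete new resources created by a failed upgrade.
  cleanup_on_fail = true

  # Wait for any Jobs to complete before treating the release as ready.
  wait_for_jobs = true
}
```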
> @citizenken have you experimented with the `wait_for_jobs` or `cleanup_on_fail` resource arguments? Those might be useful in this case.

`wait_for_jobs` doesn't change anything.
Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!
This is still relevant.
Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!