Skaffold Delete fails if helm release can't be found
v1.36.1, macOS
Skaffold delete will fail if the config includes a helm chart that isn't found on the cluster. This can easily happen if a deployment is only partially successful, and then the user wants to run skaffold delete to remove everything. In my case, I had to get to a state where skaffold run ran fully before I could run skaffold delete, which sort of defeated the purpose of using delete to clean up after a failed deployment.
Repro:
- Move to examples/getting-start
- minikube start
- skaffold run
- Modify the skaffold.yaml and add a basic helm chart like this:
apiVersion: skaffold/v2beta27
kind: Config
build:
  artifacts:
    - image: skaffold-example
deploy:
  kubectl:
    manifests:
      - k8s-*
  helm:
    releases:
      - name: redis
        repo: https://charts.bitnami.com/bitnami
        remoteChart: redis
        setValues:
          auth.enabled: false
- Run skaffold delete
Error:
Cleaning up...
Error: uninstall: Release not loaded: redis: release: not found
exit status 1
To successfully delete, first you need to execute skaffold run, then skaffold delete.
Looking at the example now. The expectation is that the k8s manifests should also be deleted, right?
Correct.
The main use case where I have used skaffold delete is failed deployments and looking to start afresh. Even if there is a slightly different version of the code, different SHA, etc., it generally has worked well unless names of deployments are changing, in which case the main thing that could go wrong is you have an orphaned deployment.
It looks like if the Cleanup run on any of the deployers fails, we return the error and exit the loop:
https://github.com/GoogleContainerTools/skaffold/blob/0d7d66946fcac997f5d0fcb0fd96afb42d89d821/pkg/skaffold/deploy/deploy_mux.go#L161-L173
So in this case, if the helm deployer can't find the installed release, it fails and the kubectl deployer never gets a chance to clean up.
It seems to me that in most cases, if a cleanup step fails you would want to print the error but still try to clean up everything else possible, rather than exiting immediately. That is, cleanup is a best-effort process, in order to cope with evolving skaffold configs. I can see the counter-argument that you want the cleanup process to fail so it is clear to the user that something is wrong, but I still think you should try to clean up everything.
This would solve my use case where I have a somewhat corrupted deployment (the user manually deleted the helm release) and I want skaffold to delete everything it can so I can start over.
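To illustrate, here is a minimal sketch of what a best-effort Cleanup in the mux could look like, assuming a simplified Deployer interface with a single Cleanup(ctx) error method (the real Skaffold interface and DeployerMux take more parameters, so this is only an illustration of the idea, not the actual code): every deployer's Cleanup runs regardless of earlier failures, and the failures are aggregated into one error at the end so the user still sees that something went wrong.

package deploy

import (
    "context"
    "fmt"
    "strings"
)

// Deployer is a simplified stand-in for Skaffold's deployer interface
// (hypothetical; the real interface takes more parameters).
type Deployer interface {
    Cleanup(ctx context.Context) error
}

// DeployerMux fans an operation out to a list of deployers.
type DeployerMux struct {
    deployers []Deployer
}

// Cleanup runs every deployer's Cleanup even if some fail, and reports
// all failures at the end instead of stopping at the first one.
func (m DeployerMux) Cleanup(ctx context.Context) error {
    var failures []string
    for _, d := range m.deployers {
        if err := d.Cleanup(ctx); err != nil {
            // Record the error and keep going so the remaining deployers
            // (e.g. the kubectl deployer) still get a chance to clean up.
            failures = append(failures, err.Error())
        }
    }
    if len(failures) > 0 {
        return fmt.Errorf("cleanup failed for %d deployer(s): %s",
            len(failures), strings.Join(failures, "; "))
    }
    return nil
}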
This is also an issue for a collection of microservices my team and I are working on.
Each of the services has a skaffold file because we want to be able to start and deploy them individually. But since several services rely on RabbitMQ, RabbitMQ ends up being listed in multiple skaffold files. The result is that some services are not destroyed properly: after one service destroys the RabbitMQ instance, the rest, which attempt to do the same, fail.
Please let me know if there's a better way of dealing with the above.
@dhodun I think we should allow all the deployers to run Cleanup even if some of them fail.
@tonsV2 is it possible to extract out the RabbitMQ deploy manifests into one skaffold.yaml file? Then add a requires clause to the other skaffold.yaml files rather than redefining it everywhere? Then skaffold should attempt the delete operation only once.
https://skaffold.dev/docs/design/config/#configuration-dependencies
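For illustration, a sketch of that layout (the file paths and config names here are hypothetical): the RabbitMQ release lives in its own named config, and each service's skaffold.yaml pulls it in through requires instead of redefining it, so skaffold resolves it once across the whole dependency graph.

# rabbitmq/skaffold.yaml: a standalone config for the shared dependency
apiVersion: skaffold/v2beta27
kind: Config
metadata:
  name: rabbitmq
deploy:
  helm:
    releases:
      - name: rabbitmq
        repo: https://charts.bitnami.com/bitnami
        remoteChart: rabbitmq

# service-a/skaffold.yaml: depends on the shared config instead of redefining it
apiVersion: skaffold/v2beta27
kind: Config
metadata:
  name: service-a
requires:
  - path: ../rabbitmq/skaffold.yaml
    configs: [rabbitmq]
build:
  artifacts:
    - image: service-a
deploy:
  kubectl:
    manifests:
      - k8s-*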
@gsquared94 Yes, I guess that would be an option. Although then both (or more) projects would depend on another project, so my initial thought is that I'd rather just define it in multiple places.
+1 for a --force-cleanup switch or some similar measure
I am also running into this issue