
Skaffold Delete fails if helm release can't be found

Open dhodun opened this issue 3 years ago • 8 comments

v1.36.1, macOS

skaffold delete will fail if the config includes a helm release that isn't found on the cluster. This can easily happen if a deployment is only partially successful and the user then runs skaffold delete to remove everything. In my case, I had to get to a state where skaffold run completed fully before I could run skaffold delete, which sort of defeats the purpose of using delete to clean up after a failed deployment.

Repro:

  • Move to examples/getting-started
  • minikube start
  • skaffold run
  • Modify the skaffold.yaml and add a basic helm chart like this:
apiVersion: skaffold/v2beta27
kind: Config
build:
  artifacts:
  - image: skaffold-example
deploy:
  kubectl:
    manifests:
      - k8s-*
  helm:
    releases:
    - name: redis
      repo: https://charts.bitnami.com/bitnami
      remoteChart: redis
      setValues:
        auth.enabled: false
  • skaffold delete

Error:

Cleaning up...
Error: uninstall: Release not loaded: redis: release: not found
exit status 1

To delete successfully, you first need to get skaffold run to complete, and only then run skaffold delete.

dhodun avatar Mar 16 '22 19:03 dhodun

Looking at the example now. The expectation is that the k8s manifests should also be deleted, right?

tejal29 avatar Mar 16 '22 19:03 tejal29

Correct.

The main use case where I have used skaffold delete is failed deployments, when I'm looking to start afresh. Even with a slightly different version of the code, a different SHA, etc., it has generally worked well unless the names of deployments are changing, in which case the main thing that can go wrong is being left with an orphaned deployment.

dhodun avatar Mar 16 '22 19:03 dhodun

It looks like if any of the Cleanup runs on any of the deployers fails, we return the error and exit the loop immediately: https://github.com/GoogleContainerTools/skaffold/blob/0d7d66946fcac997f5d0fcb0fd96afb42d89d821/pkg/skaffold/deploy/deploy_mux.go#L161-L173

So in this case, if the helm deployer can't find the installed release, it fails and the kubectl deployer never gets to clean up.

It seems to me that in most cases, if a cleanup fails, you would want to print the error but still try to clean up everything else rather than exiting immediately. That is, cleanup should be a best-effort process, so that it can cope with evolving skaffold configs. I can see the counter-argument that failing the cleanup makes it clear to the user that something is wrong, but I still think you should try to clean everything (a rough sketch is below).
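
A minimal sketch of what a best-effort mux cleanup could look like; the Deployer interface and DeployerMux type here are illustrative stand-ins, loosely modeled on deploy_mux.go rather than copied from it:

package deploy

import (
	"context"
	"fmt"
	"io"
)

// Deployer is an illustrative stand-in for skaffold's deployer interface.
type Deployer interface {
	Cleanup(ctx context.Context, out io.Writer) error
}

// DeployerMux fans a call out to several deployers (e.g. helm + kubectl).
type DeployerMux []Deployer

// Cleanup runs every deployer's Cleanup, printing failures as they happen
// and reporting an aggregate error at the end, instead of returning on the
// first failure and skipping the remaining deployers.
func (m DeployerMux) Cleanup(ctx context.Context, out io.Writer) error {
	var failed int
	for _, d := range m {
		if err := d.Cleanup(ctx, out); err != nil {
			// Log and keep going so later deployers still get a
			// chance to clean up their resources.
			fmt.Fprintf(out, "cleanup failed: %v\n", err)
			failed++
		}
	}
	if failed > 0 {
		return fmt.Errorf("%d of %d cleanups failed", failed, len(m))
	}
	return nil
}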

This would solve my use case, where I have a somewhat corrupted deployment (the user manually deleted the helm release) and I want skaffold to delete everything it can so I can start over.

dhodun avatar May 10 '22 18:05 dhodun

This is also an issue for a collection of microservices my team and I are working on.

Each of the services has a skaffold file because we want to be able to start and deploy them individually. But since several services rely on RabbitMQ, RabbitMQ ends up listed in multiple skaffold files. The result is that some services are not destroyed properly: after the first service destroys the RabbitMQ instance, the remaining ones that attempt to do the same fail.

Please let me know if there's a better way of dealing with the above.

tonsV2 avatar May 12 '22 07:05 tonsV2

@dhodun I think we should allow all the deployers to run Cleanup even if some of them fail.

@tonsV2 is it possible to extract the RabbitMQ deploy manifests into one skaffold.yaml file, and then add a requires clause to the other skaffold.yaml files rather than redefining it everywhere? That way skaffold should attempt the delete operation only once (see the sketch below). https://skaffold.dev/docs/design/config/#configuration-dependencies
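
For example (the paths and metadata names here are hypothetical), a service's skaffold.yaml could pull in a shared RabbitMQ config instead of redefining the release:

apiVersion: skaffold/v2beta27
kind: Config
metadata:
  name: my-service
requires:
  # Points at the directory containing the one skaffold.yaml that owns
  # the RabbitMQ release, so only that config ever deploys or deletes it.
  - path: ../rabbitmq
deploy:
  kubectl:
    manifests:
      - k8s-*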

gsquared94 avatar May 13 '22 03:05 gsquared94

@gsquared94 Yes, I guess that would be an option. Although then both (or more) projects would depend on another project, so my initial thought is that I'd rather just define it in multiple places.

tonsV2 avatar May 20 '22 11:05 tonsV2

+1 for a --force-cleanup switch or some measure of that sort

gilbahat avatar Aug 01 '22 04:08 gilbahat

I am also running into this issue

thall-kythera avatar Aug 03 '22 14:08 thall-kythera