Proposal: `helmfile sync --prune` removes any charts not declared

Open lswith opened this issue 7 years ago • 60 comments

This idea comes from kubectl apply -f --prune, where kubectl deletes any resources that aren't referenced in the applied manifests. Helmfile should have an option for the sync command that checks which helm charts are deployed and compares that to the helmfile. If there are charts that aren't referenced by the helmfile, they are deleted (and possibly purged) from helm.

This would be extremely useful for keeping a specific state in the cluster.
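
For reference, this is roughly how the kubectl equivalent is invoked; the directory and the label below are only placeholders:

    # delete previously-applied resources that match the selector but are no longer in the manifests
    kubectl apply -f manifests/ --prune -l app=myapp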

lswith avatar Jul 19 '18 08:07 lswith

I think that's a great idea, but with the implementation of helmfile.d, where we have separate charts, and the possibility that users will have separate helmfiles for the same cluster, it could also be dangerous.

I think the largest issue is how helmfile.d is implemented.

sstarcher avatar Jul 19 '18 13:07 sstarcher

Yeah, sounds like a great idea! How would you expect it to work when we run:

  1. helmfile -f myhelmfile.yaml sync
  2. helmfile -f myhelmfile.yaml -l foo=bar sync
  3. helmfile -f myhelmfile.d/ sync
  4. helmfile -f myhelmfile.d -l foo=bar sync

My initial thoughts:

  1. Remove all the releases stored in the cluster but not present in the helmfile.
  2. Remove releases with the label foo=bar that are stored in the cluster but not present in the helmfile.
  3. Delete all the releases stored in the cluster that have no corresponding helmfile(s) in the directory, then do 1. for each helmfile in the directory.
  4. Do 2. for each helmfile in the directory.

Detailed specs would include:

  • unnecessary releases are deleted by running helm delete
  • renaming of helmfile(s) in a helmfile.d will result in recreation of all the releases stored in the helmfile(s)
  • the name of helmfile.d and the name of each helmfile are stored in the cluster as a custom resource, so that we can see which releases have been removed from helmfile.d and helmfile(s) (see the sketch after this list)
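
A purely hypothetical sketch of what such a tracking object could look like, shown here as a plain ConfigMap for simplicity; the name helmfile-state, the namespace, and every field are made up for illustration only:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: helmfile-state                      # hypothetical tracking object
      namespace: kube-system
    data:
      helmfile: helmfile.d/databases.yaml       # which helmfile declared the releases below
      releases: |
        - postgresql
        - redis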

mumoshu avatar Jul 19 '18 15:07 mumoshu

Or should we rethink how the helmfile.d feature works, by any chance?

mumoshu avatar Jul 19 '18 15:07 mumoshu

I'm not following this statement

renaming of helmfile(s) in a helmfile.d will result in recreation of all the releases stored in the helmfile(s)

Why would renaming a helmfile in helmfile.d affect the release names?

And if this is added, it should certainly be a flag and not the default behavior.

sstarcher avatar Jul 19 '18 16:07 sstarcher

I’d say, take the same flow that kubectl apply -f --prune takes.

Selectors wouldn’t work for this feature.

Any charts not referenced in helmfile.d would be deleted.

lswith avatar Jul 19 '18 21:07 lswith

In that case it would likely make sense for this to be a separate command like helmfile prune vs a flag on helmfile sync/charts.

sstarcher avatar Jul 19 '18 21:07 sstarcher

The only trouble with it being “prune” is that it implies that it won’t create new releases. This command should just “sync” the exact helm charts you have to a “specific” tiller.

lswith avatar Jul 19 '18 22:07 lswith

While also cleaning up unused releases.

lswith avatar Jul 19 '18 22:07 lswith

The issue with leveraging the current sync is you now have a conflict with the selectors.

sstarcher avatar Jul 19 '18 23:07 sstarcher

I'm not following this statement

renaming of helmfile(s) in a helmfile.d will result in recreation of all the releases stored in the helmfile(s)

Why would renaming a helmfile in helmfile.d affect the release names?

I cannot be sure yet, but I believe that it is because you need something like a helm release stored in the k8s cluster for each helmfile in a helmfile.d. Otherwise we can't take a diff between the current state and the desired state (in a helmfile yaml).

helm uses the name of a release to identify the current state and compares it with the desired state in order to remove unnecessary k8s resources on helm upgrade (afaik). I thought that we'd need to do the same for helmfile too. Then, my question was: how do we "identify the current state" that corresponds to the desired state? My quick guess was to use the name of each helmfile contained in the helmfile.d, hence renaming results in recreation of all the releases in the helmfile.
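
For context, helm (v2) keeps that per-release state as ConfigMaps in the tiller namespace, keyed by release name; a quick way to poke at it (the namespace and release name below are just examples):

    # the state tiller stored for a release, identified purely by its name
    kubectl --namespace kube-system get configmap -l OWNER=TILLER,NAME=myrelease
    # the manifest helm last applied for that release
    helm get manifest myrelease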

mumoshu avatar Jul 20 '18 01:07 mumoshu

@sstarcher I'm not completely sold on implementing this in sync. But anyway, I tried to address the conflict between the proposed feature and selectors in numbers 2 and 4 of my proposed design. In short, -l foo=bar should restrict which releases are installed/upgraded/deleted.

Suppose the "current" state, stored somehow inside your cluster, looks like:

- releases
  - name: foo
    labels:
      kind: foo
  - name: bar
    labels:
      kind: bar

Whereas your "desired" state in helmfile.yml looks like:

releases:
- name: foo
  labels:
    kind: foo

helmfile -f helmfile.yml sync or helmfile -f helmfile.yml -l kind=bar sync should delete the release named bar only. On the other hand, helmfile -f helmfile.yml -l kind=foo sync should do nothing. This is an example of number 2 of my proposed design above.
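
Spelled out as annotated commands (the prune behavior here is the proposed one, not what sync does today):

    helmfile -f helmfile.yml sync                 # deletes bar: stored in the cluster but gone from the desired state
    helmfile -f helmfile.yml -l kind=bar sync     # deletes bar: matches the selector and is not declared
    helmfile -f helmfile.yml -l kind=foo sync     # does nothing: foo is still declared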

mumoshu avatar Jul 20 '18 01:07 mumoshu

I’m actually happy for you guys to hash out this feature. My only use case is that I’d like CD for my helm configurations, and helmfile is easily the best tool for doing that. Adding the ability to sync a desired state to the cluster will give me that.

lswith avatar Jul 20 '18 07:07 lswith

Deletes, updates and installs all in one

lswith avatar Jul 20 '18 07:07 lswith

Yeah! I believe that I understand it, because I have the same use-case as yours. So I'm trying, but I still need your help to verify that my proposed solution works 😄

mumoshu avatar Jul 20 '18 08:07 mumoshu

I see. What I imagined the design would be is not to only remove what was in the helmfile previously, but to remove any resource that was not in the helmfile. In that case you would not need to store anything in k8s, as you would just remove everything that is not currently in the file.

If you are storing things in k8s you also run into race conditions and require locking. I believe that might be getting more complicated than we really want. Unless we want to develop a server-side helmfile component.

sstarcher avatar Jul 20 '18 13:07 sstarcher

Yep, I 100% see value in this; we just need to find a clean way to implement it.

sstarcher avatar Jul 20 '18 14:07 sstarcher

@sstarcher Thanks for the comment! I'm still trying to figure it out as well.

you would just remove everything that was not currently in the file.

Everything in the cluster? Or perhaps "everything in the namespace"?

For the former, I'm wondering if it implies that we can have at most one helmfile per cluster. Otherwise helmfile sync --delete against two separate helmfiles would try to delete releases from each other?

mumoshu avatar Jul 24 '18 00:07 mumoshu

I think it would be sweet in principle to have the helmfile sync --delete feature, but for all the aforementioned arguments I struggle to see how it would work sanely with selectors or without shared state. It feels very dangerous. One possibility is to make --delete mutually exclusive with --selectors; that is, you cannot use it with selectors.

We have dozens of services in our helmfiles. It's become more of a "linux distribution" of cloud native apps for kubernetes. We use helmfile --selector chart=grafana sync just the way you would do something like apt-get install grafana on a classic linux distro.

Other ideas: (not well thought out)

  • introduce a new verb called upsert which works like the current sync command but only does upgrades and installs. Then repurpose sync to work the way others have suggested.
  • treat --delete like a --selector, but for charts that should be deleted.
    helmfile --delete install=false sync 
    
  • combination of the above two suggestions...
    helmfile --upsert install=true --delete install=false sync
    

osterman avatar Jul 25 '18 23:07 osterman

One of the things I really like about Helmfile is that all state is in helmfile.yaml, so it's easy to have multiple users maintaining a cluster, and it's also easy to switch between using Helmfile and Helm.

For example

  • I find Terraform very nice to use as an individual, but to support a team you need to synchronise the tfstate file between all members. This adds another level of infrastructure and is one of the reasons we're not using it.
  • Being able to use Helm directly is useful when rolling out Helmfile to an existing cluster, especially during an evaluation period where people are unsure of adding yet another tool to the deployment chain.

So, my vote would be to do everything using helmfile.yaml/helmfile.d, and to avoid storing additional state, whether locally or in the cluster.

manics avatar Aug 29 '18 09:08 manics

ya, the additional state concerns me and I don't believe it's necessary. Yes, currently helmfile does not merge multiple files in helmfile.d together, but it could easily be implemented to keep track of the releases that are managed by helmfile.d and to accumulate them as it goes through. Once it processes all of the files it could delete any releases that are not in any of the files in helmfile.d.

This is a fairly trivial feature to implement and would cover the case to assure everything in helm is also in helmfile.

sstarcher avatar Aug 29 '18 12:08 sstarcher

@sstarcher Interesting! You mean not to merge helmfile.yaml files into a single yaml file, but rather to load releases in the context of each helmfile.yaml, and then merge the lists of resulting releases, right?

mumoshu avatar Aug 29 '18 13:08 mumoshu

ya, essentially we loop over all of the helmfiles. The function that does that could return the list of releases in the helmfile. And the overarching loop could accumulate that list. Once all helmfiles were processed you would have a full list of everything that is being managed.

At that point you could purge.
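
A rough shell sketch of that accumulate-then-prune flow, assuming the helm 2 CLI; the YAML scraping below is deliberately naive and only for illustration:

    # collect every release name declared across helmfile.d (naive parsing, illustration only)
    desired=$(awk '/^- name:/ {print $3}' helmfile.d/*.yaml | sort -u)
    # collect everything helm currently has installed
    installed=$(helm list --short | sort -u)
    # anything installed but not declared anywhere is a prune candidate
    for release in $(comm -23 <(echo "$installed") <(echo "$desired")); do
      echo "would delete: $release"
      # helm delete --purge "$release"
    done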

sstarcher avatar Aug 29 '18 13:08 sstarcher

And we calculate a releases list per tiller namespace?

mumoshu avatar Aug 29 '18 13:08 mumoshu

ahh namespaced tillers do make that more difficult. That would certainly have to go into the calculation for the rollup.
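
Extending the earlier sketch, the rollup would presumably have to be computed once per tiller namespace, e.g.:

    # hypothetical: repeat the accumulate-and-compare step per tiller namespace (example namespaces only)
    for ns in tiller-system kube-system; do
      installed=$(helm list --short --tiller-namespace "$ns" | sort -u)
      echo "[$ns] installed: $installed"
      # compare against the releases declared for this tiller namespace,
      # then remove leftovers with: helm delete --purge --tiller-namespace "$ns" <release>
    done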

sstarcher avatar Aug 29 '18 15:08 sstarcher

Got it, thanks for the suggestion and the confirmation! That should indeed work.

Mind if I changed the flag to --prune, after kubectl apply --prune --all? To me, being a little familiar with kubectl apply --prune, --delete and --purge sounded more like deleting/purging all the releases, rather than only the ones that have disappeared from helmfiles, which conflicts with the meaning of sync.

mumoshu avatar Aug 30 '18 01:08 mumoshu

Sounds great

sstarcher avatar Aug 30 '18 18:08 sstarcher

So for consistency with kubectl apply --prune [-l foo=bar] [--all], helmfile apply --prune --all should remove all releases that reside in the targeted tiller namespaces but are no longer declared in the helmfile. On the other hand, helmfile apply --prune -l foo=bar should remove only such releases whose labels contain foo=bar.

What we can implement as of today is the former, helmfile apply --prune --all only. That's because we have no way to label releases.

mumoshu avatar Sep 13 '18 08:09 mumoshu

helmfile apply --prune -l K=V would benefit from https://github.com/helm/helm/issues/4639. It isn't required feature-wise, but it would be good to have to make the command complete faster.

mumoshu avatar Sep 13 '18 13:09 mumoshu

So are we to assume that tiller only has access to the namespace it's in? Tiller can easily control all namespaces, just the one it's in, or x specific namespaces depending on RBAC.

Also I don't understand helmfile apply --prune -l foo=bar. So you are saying it won't do what the normal --prune does, but it will look at the helmfile, find the labels of foo=bar, and remove those? That seems more like a helmfile delete -l foo=bar. prune is for syncing helmfile and kubernetes; using a label in this context is very different and seems highly confusing.

sstarcher avatar Sep 13 '18 13:09 sstarcher

So are we to assume that tiller only has access to the namespace it's in?

Nope. I assume tiller has access to all the releases that are allowed by the serviceaccount associated with the tiller instance.

Assume we have 2 releases mybackend and myfrontend managed by the tiller at tiller-system:

$ kubectl --namespace tiller-system get configmap
NAME            DATA      AGE
mybackend.v1    1         4d
myfrontend.v1   1         4d

And also assume mybackend.v1 is labeled tier: backend and myfrontend.v1 is labeled tier: frontend.

Let's say mybackend.v1 has corresponding k8s resources in the ns backend, as it was installed by helm install --tiller-namespace tiller-system --namespace backend charts/backend --name mybackend. And myfrontend's resources are in frontend, set up similarly to the backend.

If we removed both releases from helmfile.yaml, helmfile sync --prune (or helmfile sync --prune --all if we keep consistency with kubectl) would remove both. On the other hand, helmfile sync --prune -l tier=backend would remove only mybackend. This allows you to prune unneeded releases even when you use selectors all over the place.

Suppose you have every release labeled appropriately; then helmfile sync --prune --all is equivalent to helmfile sync --prune -l tier=backend + helmfile sync --prune -l tier=frontend.
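
If releases could be labeled (the thread notes there is no way to do that today, hence the reference to helm/helm#4639), resolving the selector at prune time could be as simple as a label query against the tiller-owned release ConfigMaps; a purely hypothetical illustration:

    # hypothetical: tier=backend stored as a label on the release configmap
    kubectl --namespace tiller-system get configmap -l OWNER=TILLER,tier=backend
    # any match that is no longer declared in helmfile.yaml would then be deleted:
    # helm delete --purge --tiller-namespace tiller-system mybackend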

mumoshu avatar Sep 13 '18 13:09 mumoshu