helm-operator icon indicating copy to clipboard operation
helm-operator copied to clipboard

Always dry run compare before upgrade

Open seaneagan opened this issue 4 years ago • 2 comments

Upgrades should only occur if there are non-trivial changes since the previously applied (installed/upgraded) state, or if this is the same state that was previously upgraded to and rolled back from, and the HelmRelease config allows for further upgrade attempts. Thus, a dry run compare should always occur first to compare the current desired state to the one previously applied. Previously, there were several cases where superfluous upgrades could occur. There were also cases where release diffs were not logged, since the dry run compare never occurred.

As the included TODO indicates, this also enables a future enhancement to reset the rollback count when non-trivial changes are detected, so that the new state is alotted the configured number of retries, regardless of any failed attempts on a previous state.

seaneagan avatar Jun 03 '20 20:06 seaneagan

#447 addresses the rollback count reset TODO

seaneagan avatar Jun 04 '20 20:06 seaneagan

@seaneagan you have been doing God's work lately, for which I am very grateful. :rose:

Below you will find some non-conclusive comments which I wanted to get to you today, as our working hours do not seem to match very well :-).


I think this PR is a step in the right direction, but I am also a bit anxious about this change, as the dry-run has a history of missing changes due to the comparison limitations. One example I can illustrate from the top of my head is that the metadata of chart dependencies is rich for a release object (as presented by Helm) that is the result of a dry-run, but this same data is not present once this release would be retrieved from storage. Another one is that it will likely worsen the behaviour of the operator for people making use of local chart dependencies, as described in #16.

Finding a balance in this is hard; if you would take the easy route and just compare the generated manifests string from the dry-run with the latest in storage, there is a probability people are for example putting some timestamp in an annotation that is generated during the dry-run (or release), and you will get spurious upgrades although technically nothing has changed (except for this timestamp).

People also loose the ability to force the operator to re-release something (due to the removal of HasSynced). There have been (internal) discussions to make this annotation based for awhile, an example of how we imagine this to work can be found here, which is much closer to how some core Kubernetes components work (the Helm Operator isn't anywhere near how we imagine operator design in the present, but rather the result of "natural growth in software").

hiddeco avatar Jun 04 '20 21:06 hiddeco

Closing, stale.

Thanks for contributing here! Flux v1 and Helm Operator are in maintenance mode 🎉

kingdonb avatar Sep 02 '22 18:09 kingdonb