bugfix: Override subcharts with null values
This PR closes #12469 and closes #12488
Helm should allow users to not only override default values, but also completely remove any default values by setting a config to null.
This works fine for regular charts, but default values within sub-charts cannot be null-ed. The linked issue has a good example of this created by user "naemono."
The reason this issue is happening is because the coalesce function goes over sub-chart values that are defined in a values file or with a --set flag twice due to the logic here. merge is always set to false in this context, and the first time coalesce gets called, the null value gets removed during the coalesceTablesFullKey function here. This is fine for regular chart values, but for sub-chart values, the coalesceDeps function runs the logic again for every value in the subChart, so we end right back to the spot in the previous link. But the config option that was explicitly set to null was already removed, so we end up at this point two lines down, which overwrites the user defined null config the sub-chart default value.
This problem only occurs because when the coalesce function is run while merge is false.
To prevent this bug from removing sub-chart config values that are explicitly set to null and therefore getting overwritten when coalesceDeps runs through the sub-chart values a second time, I added logic to check if the key was the name of a sub-chart, and if so setting merge to false so the null config option is not deleted.
Special notes for your reviewer: Follow-up if this PR is accepted may be to add a unit-test to confirm sub-chart nulls aren't overwritten.
When this could be merged?
@joejulian @mattfarina
Please merge/release ASAP. All builds from 3.13 onwards are "effectively" broken for many of my clients because of this.
I hope to see this in the next Helm release, there are many workarounds used to overcome this bug.
I think this is also related https://github.com/helm/helm/issues/12991
Beep boop we need this, boop beep
Ahh, alas, another week goes by, and yet more DevOps engineers are fraught with this unnecessary pain in having to continue to use such an old version of Helm without this bug. Beep boop, someone merge and release this, boop beep.
Another week nudge on this. Sorry to be the annoying jerk, but this is important.
Sorry to go against etiquette here with comment spam but we're also affected by this bug, and with this PR waiting for review for almost 3 months now I think some of us might be losing hope :-)
Just collecting all the duplicate bugs about this over the years, for fun I guess...
#12991 #12741 #12730 #12637 #12594 #12522 #12511 #12490 #12488 #12469 #12441 #12417 #11567 #9806 #9804 #9696 #9136 #9027 #6277 #5184
I'm so happy that the helm devs keep this bug alive so that I can charge my clients hours in working around it. Thanks! Tonight I eat steak thanks to you!
On a more serious note, who can we ping about this? I see @mattfarina worked on #12480 , sorry for the rude ping but maybe you were at least aware of the issue at some point in time.
I'd like to join the ping party, even though I know it won't help and normally I'd consider it rude as well, but this is getting ridiculous. Even if this is merged it will be some time until tools like argo-cd picks it up so that things can finally go back to the way they were.
Anyone with merge rights on this repo please take a look, lots of people would appreciate it ❤️
Hmm, can we just start aggressively @-ing owners and/or higher tier contributors on this repo? I feel like maybe this thread has been muted, and maybe we have no choice. I don't mention brute force lightly, but this is a serious issue which has gone unchecked for too long.
This PR fixes a really inconvenient bug and is highly expected to be merge. @joejulian do you know who we should ping to have a look at it or if this PR isn't ready can we have some feedback please ? Thank you ❤️
I really want @mattfarina to look at this first. He recently dove deep into this and has the most context.
Is there any update on this? Like others we're blocked on Argo version updates until this defect is resolved
There are Helm developer meetings every Thursday I believe: https://docs.google.com/document/d/1d-6xJEx0C78csIYSPKJzRPeWaHG_8W1Hjl72OJggwdc/edit?pli=1#heading=h.cm5b1kdljh4m Maybe an interested contributor could join to discuss the next steps for this PR (be nice!)
https://kubernetes.slack.com/archives/C51E88VDG might also be worth a try.
We would really like this change as well.
Also waiting for this. I want to override some default cpu limits of a subchart in ArgoCD, but can't..
Is this still broken? Whats the reason this is not fixed for such a long time?
Please?
This issue going completely ignored so long makes me want to write a cronjob which just auto-reposts in here every day for infinity until this gets merged. There's no reason for this, this is just causing a bunch of us needless pain and suffering. Arguably, work, but, painful work.
Please someone review/merge/fix this. I'm going to create a new issue right now to reference this issue in this repo, in case this issue is on a ignore/unsubscribe list.
The OpenTelemetry Helm Charts would also love this feature. I could've sworn this was fixed in 3.13, but it appears not.
Alright I'll bite and join the weekly meeting, if I am not busy. I am no longer affected by this bug (all charts I manage are custom), nor I am confident with the PR code itself, so I'd highly appreciate the presence of @ryanhockstad . Otherwise, I'll just point to this PR and explain the situation.
I have a reproducible testcase btw https://github.com/pisto/helm-subchart-null-bug
I encountered the same problem when attempting to unset seccompProfile in the subchart kong of kubernetes-dashboard.
Sorry for anothing weekly ping. But this fix is really needed. Pelase. @mattfarina
just a reminder that we can celebrate 5 months since this bug has been fixed in the PR
@mattfarina can we get a feedback on this from a maintainer, please?
This is such an annoying issue for us. We would be extremely glad if the maintainer could fix it.
Let's see if anything moves... 💣
@joejulian @jdolitsky @marckhouzam @mattfarina @sabre1041 @scottrigby @technosophos
Please take a look.
Sorry for the bad manners. 🙏🏻
Bumping as I've also encountered this issue many times over the years.
@mattfarina, @technosophos as discussed in the SIG meeting call today, I spent some time working on tests for this PR. I first started investigating why the existing test passes. That test has several null merging scenarios with global and subcharts so I wanted to understand why they are properly deleted/merged and how the test differs from an actual install or template scenario.
Debugging via a template command and test scenario, I found that in an actual install scenario the call to chartutil.ProcessDependenciesWithMerge here ends up modifying the chart's Values (I believe on purpose, as part of Imports) to include any default values from subcharts.
The subchart's default values being present in the parent's values ends up causing problems during the second call to CoalesceValues in the Render function. When the subchart's default values are present in the parent chart's values.yaml, they end up being used, as described in the PR description, during the coalesceDeps step of coalesce, after the key had already been deleted.
The current unit tests does not have this problem because it does not call ProcessDependenciesWithMerge and the setup does not include any subchart values in the parent's values. As a result, when coalesceDeps is called there is no key to add back.
I've updated the test to show this failure scenario here by manually adding a subchart's default value to a parent chart's values. I have also (not rigorously) confirmed here that the proposed changes in this PR fixes this scenario.
I am going to work on more test cases/scenarios and try to determine if the fix should be made in coalesce, as proposed by this PR, or somewhere in ProcessDependenciesWithMerge. My goal will be to create enough test scenarios that you feel confident that this PR does not create a Whack-a-Mole scenario and the change and its new tests can be merged (or some equivalent change in ProcessDependenciesWithMerge if that ends up being more appropriate).