gardener icon indicating copy to clipboard operation
gardener copied to clipboard

☂️-Issue for elimination of Helm charts in favor of Golang implementation

Open rfranzke opened this issue 5 years ago • 12 comments

How to categorize this issue?

/area dev-productivity robustness quality testing /kind enhancement /priority 3

What would you like to be added: This is an umbrella issue for improving our existing code, mainly in the operation and botanist packages. We lack unit and/or integration tests at too many places and should improve this. One important pain point is that the data flow is too distributed, many values are duplicated in Helm charts (e.g., labels, ports, names, ...) which makes it very hard to introduce a change as this might break unexpectedly.

One proposal is to leverage the recently introduced component interfaces (#1980) and eliminate the Helm charts in favor of Golang structs (see #2731). Another proposals is (at the point of time writing this issue) currently evaluated by @timebertt and discussed in two days.

Once agreed upon the preferred approach how to proceed, we can replicate it for the majority of the task functions in the shoot reconciliation and deletion flows.

Agreements:

  • With the node-local-dns refactoring, we experienced that it can be cumbersome to compare for equality the serialized obj in from the ManagedResource secrets. The order of annotations in the serialized obj was changing based on the ConfigMap data. While other components like vpnshoot, coredns rather have static ConfigMaps/Secrets and the annotations keys and the order of annotations is not changing that often. That's why for node-local-dns we decided to compare the obj (the DaemonSet) using golang struct. If a component hits similar issue to the node-local-dns, it can use compare the obj using golang struct (first deserialize the obj, then compare it to the expected one).

Tasks

The following Helm charts still need to be refactored:

  • shoot-core/components:

    • [ ] apiserver-proxy (create botanist/component/apiserverproxy package)
    • [ ] network-policies (ideally put into existing botanist/component/networkpolicies package)
    • [x] https://github.com/gardener/gardener/pull/6134
    • ℹ️ podsecuritypolicies should not be refactored since PSPs are deprecated, we should rather work on https://github.com/gardener/gardener/issues/5250
  • shoot-addons: (here, we should first decide whether we want to keep them since they are already only allowed for shoots < 1.22 or evaluation shoots >= 1.22 (see https://github.com/gardener/gardener/pull/4213))

    • [ ] kubernetes-dashboard (create botanist/component/kubernetesdashboard package)
    • [ ] nginx-ingress (ideally merge with existing botanist/component/nginxingress package)
  • seed-bootstrap{-crds}:

    • [x] https://github.com/gardener/gardener/pull/6210
    • [ ] grafana (@wyb1 @istvanballok @rickardsjp)
    • [x] ~~Other Monitoring-related charts (Keep them but move them into existing botanist/components/monitoring package and deploy via ManagedResource)~~ (dropped in favor of GEP-19)
    • [ ] Logging-related charts (Keep them but move them into existing botanist/components/logging package and deploy via ManagedResource)
    • [x] https://github.com/gardener/gardener/pull/6133
    • [x] https://github.com/gardener/gardener/pull/6135
    • [x] https://github.com/gardener/gardener/pull/6186
  • istio:

    • [x] https://github.com/gardener/gardener/pull/5946
    • [x] https://github.com/gardener/gardener/pull/6165
  • seed-monitoring (here, we should ideally move it out to an extension, but since this is unclear we should at least clean up the structure for now (see also https://github.com/gardener/gardener/issues/3553)):

    • [x] https://github.com/gardener/gardener/pull/6210
    • [ ] grafana (@wyb1 @istvanballok @rickardsjp)
    • [x] ~~Keep other Helm charts but move them into existing botanist/components/monitoring package and deploy via ManagedResource~~ (partially dropped for now in favor GEP-19)

Especially the deployment via ManagedResources for the seed system components will yield to better observability and seed conditions now that we have https://github.com/gardener/gardener/pull/5274.

rfranzke avatar Aug 19 '20 12:08 rfranzke

One proposal is to leverage the recently introduced component interfaces (#1980) and eliminate the Helm charts in favor of Golang structs (see #2731). Another proposals is (at the point of time writing this issue) currently evaluated by @timebertt and discussed in two days.

I did a rough sketch of how it could look like, if we want to keep the helm charts, but still implement a proper data flow of chart values and tests for those charts. I started with #2731 and applied my ideas on top of it, while bringing back the charts and mostly keeping the tests proposed in @rfranzke's PR. You can sneak a peek here in #2766. We will discuss both approaches in the afternoon and decide on which approach we like to proceed with.

timebertt avatar Aug 21 '20 06:08 timebertt

Together with @timebertt @timuthy @danielfoehrKn @BeckerMax @ialidzhikov @vpnachev we have had a very productive discussion on the subject matter and compared the two approaches presented in #2731 (proposal 1) and #2766 (proposal 2).

A short summary is:

  • Proposal 1 has more overhead/boilerplate because of the Golang syntax when defining the structs while with proposal 2 you can define them directly via YAML.
  • OTOH, proposal 2 has the disadvantage that we might have to duplicate business logic in both the Helm chart (if-then-else expressions) and in the Golang code to compute the required chart values for the respective Kubernetes version.
    • This is exactly one problem of today's code base that we want to eliminate (reduce duplication).
    • The only way to get rid of this problem is to move this logic out of the Helm chart to the Golang code, and then enrich the chart with the proper values. The consequence of this is that the chart templates are no longer "readable" in the way they are now, and that the Values Golang structure would again need duplication. Both aspects question the remaining benefit of such a Helm chart.
  • In proposal 2 there is still the double-maintenance of the values-test.yaml file (which could be eliminated, but then you can't helm template . the chart locally anymore without additional effort).
  • A concern with proposal 1 was that we are bound to the vendored Kubernetes libraries when defining types while we were able to also construct manifests with older API versions/kinds even if our vendored Kubernetes libraries don't support this type anymore (e.g., when a resource is deprecated or moved to another API group).
    • We probably don't have too many of those issues, but if it happens we can fallback to unstructured.Unstructured.

In the end, we agreed on proceeding with proposal 1 (#2731) with the following arguments:

  • Less duplication of business logic (everything is in Golang, no charts anymore)
  • No scattered artefacts but a single place that contains everything related to a component
  • Improving the code quality with the current approach OR proposal 2 would also be possible, but as part of this story the main argument for keeping the Helm charts (readability) would be devalued and potentially vanish totally, eventually increasing complexity and potential for errors and mistakes

Next steps are applying this approach to a more complex scenario (e.g., deployment of kube-apiserver), and to retrospect on the development experience. We might have discussions on this in the future again and continuously re-evaluate what's the most efficient style that avoids today's problems.

Please note that we don't plan to throw out the chart applier as a whole, and that we might still use it when applicable (e.g., for applying large CRDs). We will learn and improve here on the way.

rfranzke avatar Aug 21 '20 13:08 rfranzke

/unassign @timebertt - thanks for investigating different approaches! /assign @danielfoehrKn for a refactoring for the DeployKubeControllerManager task (similar to #2731).

rfranzke avatar Aug 24 '20 07:08 rfranzke

/unassign for now as #2783 is merged

rfranzke avatar Sep 08 '20 11:09 rfranzke

@danielfoehrKn, @schrodit, @timuthy, @rfranzke, @mvladev, @dguendisch, @vpnachev, @ialidzhikov, @beckermax, @danielfoehrkn This issue was referenced by @timebertt in duplicate issue gardener/gardener#1633.

gardener-robot avatar Jan 12 '21 18:01 gardener-robot

/assign @BeckerMax for gardener-resource-manager deployment refactoring /assign @rfranzke for OperatingSystemConfig refactoring

rfranzke avatar Jan 20 '21 08:01 rfranzke

/unassign as #3445 was merged

BeckerMax avatar Apr 19 '21 13:04 BeckerMax

(In progress) Refactor nginx-ingress Helm chart into component

shafeeqes avatar Nov 26 '21 09:11 shafeeqes

The next task of this umbrella issue is refactoring the kube-proxy Helm chart in order to optimize the rollouts.

rfranzke avatar Feb 04 '22 11:02 rfranzke

The Gardener project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed You can:
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close

/lifecycle rotten

gardener-ci-robot avatar Mar 30 '22 14:03 gardener-ci-robot

Tasks

This got copied into the issue description itself, see https://github.com/gardener/gardener/issues/2754#issue-681818569.

rfranzke avatar May 21 '22 12:05 rfranzke

/assign @wyb1 @istvanballok @rickardsjp for the grafana component refactoring

rfranzke avatar Jul 26 '22 09:07 rfranzke

/assign

oliver-goetz avatar Feb 06 '23 09:02 oliver-goetz

/unassign @oliver-goetz @rfranzke

rfranzke avatar Apr 27 '23 12:04 rfranzke

/assign @acumino @shafeeqes for blackbox-exporter and node-exporter.

acumino avatar May 15 '23 10:05 acumino

/unassign

acumino avatar May 29 '23 05:05 acumino

/unassign

shafeeqes avatar May 29 '23 12:05 shafeeqes

The plutono charts are the last that we want to refactor as part of this issue. I think the prerequisites are completed and this task can be picked up.

rfranzke avatar Jun 02 '23 08:06 rfranzke

/assign

acumino avatar Jul 03 '23 07:07 acumino

All tasks have been completed. /close

Thanks to everybody who contributed to this one, this was a long journey :)

rfranzke avatar Jul 14 '23 08:07 rfranzke

@rfranzke: Closing this issue.

In response to this:

All tasks have been completed. /close

Thanks to everybody who contributed to this one, this was a long journey :)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

gardener-prow[bot] avatar Jul 14 '23 08:07 gardener-prow[bot]