operator-sdk
Delivering a Go-based operator in a Helm chart - best practice
Type of question
Best practices
Question
What did you do?
We're building a Go-based operator, and we want to deliver it to clients using a Helm chart that deploys it. As far as I understand, I need to take the emitted kustomize build artifacts and wrap the required k8s objects in a Helm chart that matches my needs (templating the values a user will need to set: namespace, etc.); a rough sketch of what I mean follows the list below.
In the generated kustomize artifacts there are a bunch of k8s objects that I could not find in other operator Helm charts out there, mainly:
- the proxy role - after some digging I commented it out.
- leader-election-role (+ binding) - I couldn't find it in other operator Helm charts, and as far as I understand, if I'm not planning to support operator upgrades in a way that requires the new and old versions to run together, it is redundant
- controller_manager_config ConfigMap - I couldn't find any reference to it in the generated kustomize build artifacts.
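For illustration, this is roughly the kind of templating I mean - the controller-manager Deployment lifted from the kustomize build config/default output and rewritten as a Helm template. The value names (image.repository, watchNamespace, leaderElect, etc.) are placeholders I made up, not something the SDK generates, and the WATCH_NAMESPACE env var assumes the manager's main.go is wired to read it (which is how I understood the namespaced-operator docs):

```yaml
# templates/manager-deployment.yaml (sketch; value names are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-controller-manager
  namespace: {{ .Release.Namespace }}
  labels:
    control-plane: controller-manager
spec:
  replicas: {{ .Values.replicaCount | default 1 }}
  selector:
    matchLabels:
      control-plane: controller-manager
  template:
    metadata:
      labels:
        control-plane: controller-manager
    spec:
      serviceAccountName: {{ .Release.Name }}-controller-manager
      containers:
        - name: manager
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          args:
            - --leader-elect={{ .Values.leaderElect | default false }}
          env:
            # namespace(s) the operator watches; assumes the manager code reads this var
            - name: WATCH_NAMESPACE
              value: {{ .Values.watchNamespace | default .Release.Namespace | quote }}
          {{- with .Values.resources }}
          resources:
            {{- toYaml . | nindent 12 }}
          {{- end }}
```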
What did you expect to see?
I searched the documentation for a guide on how to package an operator inside a Helm chart, and for the minimum set of k8s objects required to wrap the operator itself. I can do trial and error, but I'd prefer to know that my chart is correct.
What did you see instead? Under which circumstances?
I could not find any documentation on the above, and the operator Helm charts on GitHub are not consistent - some contain objects I don't even see in the generated kustomize artifacts.
Environment
Operator type:
/language go
Kubernetes cluster type:
vanilla
$ operator-sdk version
operator-sdk version: "v1.15.0", commit: "f6326e832a8a5e5453d0ad25e86714a0de2c0fc8", kubernetes version: "1.21", go version: "go1.16.10", GOOS: "linux", GOARCH: "amd64"
$ go version (if language is Go)
go version go1.16 linux/amd64
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.5-eks-bc4871b", GitCommit:"5236faf39f1b7a7dabea8df12726f25608131aa9", GitTreeState:"clean", BuildDate:"2021-10-29T23:32:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Additional context
Some context for the use case (my 2 cents): delivering an Operator as a Helm chart is common in vanilla k8s and in any k8s-based platform that doesn't have direct integration with OperatorHub to deploy with OLM.
An Operator Helm chart contains cluster-level resources (CRDs, ClusterRoles/Bindings) and namespace-level resources (controller-manager Deployment, ConfigMaps, etc.). So for an "own namespace" type Operator there are at least two ways to deploy it via Helm chart - a single chart run twice (the second time with --skip-crds) or two separate charts. REF: https://helm.sh/docs/chart_best_practices/custom_resource_definitions/
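To make that concrete, here is a hedged sketch of how the two passes could look when a single chart is used; the values key (installClusterResources) and the chart/namespace names are placeholders of mine, and each cluster-scoped template (ClusterRoles/Bindings) would need to be wrapped in a matching if block:

```yaml
# values.yaml (sketch; key names are illustrative, not SDK output)

# Cluster-level pieces: CRDs sit in the chart's crds/ directory and are
# applied automatically unless the install passes --skip-crds;
# ClusterRoles/Bindings can be gated behind a flag so a namespace-only
# install can leave them out.
installClusterResources: true

# Namespace-level pieces: controller-manager Deployment, ConfigMaps, etc.
image:
  repository: example.com/my-operator
  tag: "0.1.0"

# Pass 1 (cluster admin) - everything, including crds/ and cluster RBAC:
#   helm install my-operator-cluster ./chart
# Pass 2 (non-admin, per namespace) - namespace-scoped resources only:
#   helm install my-operator ./chart -n my-namespace --skip-crds \
#     --set installClusterResources=false
```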
I haven't tried adding Kustomize into the mix like Noam, but I imagine it adds complexity to the Helm use case.
Suggested solution would be either:
- a guideline doc on the best approach for a user packaging their Operator as a Helm chart(s), or
- an enhancement adding helm targets to the generated Makefile in a future SDK version (alongside the bundle targets).
@itroyano thank you for adding this information!
Just to be clear, it's not that I chose to use the kustomize artifacts - they are the only k8s object artifacts the SDK generates, so I had to use them; without them I couldn't possibly know how to wrap the operator binary in a way that makes it work predictably.
A Helm chart output would be amazing, as it would solve even more problems (for example, using the kustomize artifacts means I need to template the k8s objects myself, which is cumbersome).
Regarding what you said about the operator Helm chart - that is basically what I'm trying to achieve now:
- I modified the operator to be namespaced as explained in the docs
- created a chart that installs the cluster-wide assets (PSP, ClusterRoles/Bindings) and the namespaced operator itself
You mentioned running the Helm chart twice - can you clarify? I read the docs and it seems that I can put the CRDs under crds/ and Helm will make sure their installation precedes the other assets; that seems to be the case with the VictoriaMetrics operator - no mention of running the chart twice.
what am I missing?
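For reference, this is roughly what I mean by putting the CRDs under the chart's crds/ directory - a CRD copied verbatim from the generated manifests into that folder (the group/kind below is just a placeholder). My understanding from the Helm docs is that files in crds/ are applied before the templates, are never run through the template engine, and are skipped on upgrades or when --skip-crds is passed:

```yaml
# crds/widgets.example.my.domain.yaml (placeholder CRD; in a real chart this
# would be the CRD generated by `make manifests`, copied here unchanged -
# Helm does not template anything under crds/)
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.my.domain
spec:
  group: example.my.domain
  names:
    kind: Widget
    plural: widgets
    singular: widget
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
```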
@noamApps In Helm it is permission-related, i.e. it depends on the "user" running the chart installation/upgrade.
Back in Helm v2 the permissions were those of the Tiller service account in a specific namespace; in Helm v3 they come from the current context of your kubeconfig. That may or may not be a cluster admin, and if it isn't, the chart can get blocked by RBAC when trying to create CRDs and ClusterRoles/Bindings (if any). The Helm team's solution in the article above, for the non-admin case, is to either:
- have the non-admin run the chart with --skip-crds, so anything in the crds/ folder is ignored and only the namespace-level resources get applied (the controller-manager Deployment and any ConfigMaps/Secrets/Services, etc.), or
- have the non-admin run a separate chart containing only the namespace-level resources, while the admin runs a separate chart with the cluster-level CRDs, etc.
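Note that --skip-crds only covers the crds/ directory, so for the "run the chart twice" flow the cluster-level RBAC also has to be kept out of the non-admin pass somehow - for example by gating it behind a values flag. A rough sketch (the installClusterResources flag name is illustrative, matching the values sketch above):

```yaml
# templates/manager-clusterrole.yaml (sketch; flag name is illustrative)
{{- if .Values.installClusterResources }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}-manager-role
rules:
  # the rules copied from the kustomize-generated role.yaml go here; this is a stub
  - apiGroups: ["example.my.domain"]
    resources: ["widgets", "widgets/status"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
{{- end }}
```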
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
@noamApps I don't know whether you have resolved this problem, but I built https://github.com/yeahdongcn/kustohelmize for the same reason.