spec.Manifest != release.Manifest when helm "lookup" func used in template causing repeat reconcilliation
Hey, I have been building a hybrid operator (https://gitlab.com/GeorgeRaven/authentik-manager/-/tree/staging/operator) with the helm controller which attempts to reconcile one of my helm charts (https://gitlab.com/GeorgeRaven/authentik-manager/-/tree/staging/operator/helm-charts/ak).
This works overall however on closer inspection I note that the reconciler is repeatedly trying to reconcile a resource that it later recognizes as being correct or requiring no change. This leads to extremely high resource usage.
I believe this is related to helm lookup functions, and attempting to impute values from already existing resources. The reason I say this is because the affected resource in particular appear to be one of my secrets:
{{- if .Values.secret.generate }}
{{- $existingSecret := (lookup "v1" "Secret" (printf "%s" .Release.Namespace) (printf "%s" .Values.secret.name) ) -}}
apiVersion: v1
kind: Secret
metadata:
name: {{ .Values.secret.name }}
{{/* If the secret already exists then keep the data else gen new */}}
{{- if $existingSecret }}
annotations: {{ toJson $existingSecret.metadata.annotations }}
# from existing secret
data: {{ toJson $existingSecret.data }}
{{- else }}
data:
authJwtToken: {{ default 30 .Values.secret.randLength | int | randAlphaNum | b64enc }}
authStorageEncryptionKey: {{ default 30 .Values.secret.randLength | int | randAlphaNum | b64enc }}
************** several more secrets ************
{{- end }}
{{- end }}
Which I believe may be affecting this check which is being met causing repeat reconciles:
https://github.com/operator-framework/helm-operator-plugins/blob/ebfbea851192b115324c92eb1e38b3d26bc9191d/pkg/reconciler/reconciler.go#L727-L730
In contrast the actual actionclient reconciler is likely properly generating an empty patch:
https://github.com/operator-framework/helm-operator-plugins/blob/ebfbea851192b115324c92eb1e38b3d26bc9191d/pkg/client/actionclient.go#L276-L313
This also affects my sub-charts (Bitnami redis, Bitnami postgres) which also use the lookup function and fail although are disabled here to focus only on the clearer example.
Here is a full log of a run where I apply my watched for helm resource:
> make generate manifests install run
test -s /home/archer/git/authentik-manager/operator/bin/controller-gen || GOBIN=/home/archer/git/authentik-manager/operator/bin go install sigs.k8s.io/controller-tools/cmd/[email protected]
/home/archer/git/authentik-manager/operator/bin/controller-gen object:headerFile="" paths="./..."
/home/archer/git/authentik-manager/operator/bin/controller-gen rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/home/archer/git/authentik-manager/operator/bin/kustomize build config/crd | kubectl apply -f -
customresourcedefinition.apiextensions.k8s.io/akblueprints.akm.goauthentik.io configured
customresourcedefinition.apiextensions.k8s.io/aks.akm.goauthentik.io unchanged
go fmt ./...
go vet ./...
# helm-operator run
go run ./main.go
1.6766436991798725e+09 INFO controller-runtime.metrics Metrics server is starting to listen {"addr": ":8080"}
1.6766436991832378e+09 INFO controllers.Helm Watching resource {"group": "akm.goauthentik.io", "version": "v1alpha1", "kind": "Ak"}
1.676643699183254e+09 INFO setup configured watch {"gvk": "akm.goauthentik.io/v1alpha1, Kind=Ak", "chartPath": "helm-charts/ak", "maxConcurrentReconciles": 24, "reconcilePeriod": "1m0s"}
1.6766436991832697e+09 INFO setup starting manager
1.6766436991834452e+09 INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
1.6766436991834517e+09 INFO Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6766436991835036e+09 INFO Starting EventSource {"controller": "akblueprint", "controllerGroup": "akm.goauthentik.io", "controllerKind": "AkBlueprint", "source": "kind source: *v1alpha1.AkBlueprint"}
1.6766436991835206e+09 INFO Starting Controller {"controller": "akblueprint", "controllerGroup": "akm.goauthentik.io", "controllerKind": "AkBlueprint"}
1.6766436991835217e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766436991835382e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *v1.Secret"}
1.6766436991835427e+09 INFO Starting Controller {"controller": "ak-controller"}
1.6766436992841232e+09 INFO Starting workers {"controller": "akblueprint", "controllerGroup": "akm.goauthentik.io", "controllerKind": "AkBlueprint", "worker count": 1}
1.6766436992841737e+09 INFO Starting workers {"controller": "ak-controller", "worker count": 24}
1.6766437054854615e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.6766437054908583e+09 DEBUG controllers.Helm Starting install
1.6766437056020834e+09 DEBUG controllers.Helm creating 11 resource(s)
1.676643705631722e+09 INFO controllers.Helm Release installed {"ak": "default/ak-sample", "name": "ak-sample", "version": 1}
1.6766437056327908e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766437056328113e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "v1", "dependentKind": "Secret"}
1.6766437056330194e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766437056330316e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "v1", "dependentKind": "ConfigMap"}
1.6766437056331952e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766437056332052e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "v1", "dependentKind": "Service"}
1.6766437056338887e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766437056339014e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "apps/v1", "dependentKind": "Deployment"}
1.6766437056340444e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.6766437056340532e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "networking.k8s.io/v1", "dependentKind": "Ingress"}
1.6766437056342168e+09 INFO Starting EventSource {"controller": "ak-controller", "source": "kind source: *unstructured.Unstructured"}
1.676643705634227e+09 DEBUG controllers.Helm Watching dependent resource {"ak": "default/ak-sample", "dependentAPIVersion": "networking.k8s.io/v1", "dependentKind": "NetworkPolicy"}
1.676643705649836e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.6766437056602938e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.676643705733423e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "bootstrap-token-js20q4", "namespace": "kube-system", "apiVersion": "v1", "kind": "Secret"}
1.6766437057334795e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "sh.helm.release.v1.ak-sample.v1", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.6766437057334864e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "auth", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.676643705733534e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-server", "namespace": "default", "apiVersion": "v1", "kind": "Service"}
1.6766437057335584e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kubernetes", "namespace": "default", "apiVersion": "v1", "kind": "Service"}
1.6766437057335649e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-dns", "namespace": "kube-system", "apiVersion": "v1", "kind": "Service"}
1.676643705733571e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-worker", "namespace": "default", "apiVersion": "v1", "kind": "Service"}
1.676643705733582e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kubelet-config", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057336082e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-root-ca.crt", "namespace": "default", "apiVersion": "v1", "kind": "ConfigMap"}
1.676643705733655e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-root-ca.crt", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057336686e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-proxy", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057336805e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "extension-apiserver-authentication", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.676643705733691e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "coredns", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057337024e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-root-ca.crt", "namespace": "kube-public", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057337132e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kube-root-ca.crt", "namespace": "kube-node-lease", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057337236e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-config", "namespace": "default", "apiVersion": "v1", "kind": "ConfigMap"}
1.6766437057337363e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "kubeadm-config", "namespace": "kube-system", "apiVersion": "v1", "kind": "ConfigMap"}
1.676643705733748e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "cluster-info", "namespace": "kube-public", "apiVersion": "v1", "kind": "ConfigMap"}
1.676643705734698e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "auth-ldap", "namespace": "default", "apiVersion": "networking.k8s.io/v1", "kind": "Ingress"}
1.6766437057347128e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "auth", "namespace": "default", "apiVersion": "networking.k8s.io/v1", "kind": "Ingress"}
1.6766437057346978e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "general-netpol", "namespace": "default", "apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy"}
1.6766437057347288e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "ldap-netpol", "namespace": "default", "apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy"}
1.6766437057347398e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-worker", "namespace": "default", "apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy"}
1.6766437057347512e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766437057347138e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-server", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766437057347898e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "authentik-worker", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766437057348058e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "coredns", "namespace": "kube-system", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766437057447062e+09 DEBUG controllers.Helm dry run for ak-sample
1.6766437057501326e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.676643705789205e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766437057983747e+09 DEBUG controllers.Helm creating upgraded release for ak-sample
1.6766437058037639e+09 DEBUG controllers.Helm checking 11 resources for changes
1.676643705805758e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "sh.helm.release.v1.ak-sample.v2", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.6766437058108108e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "authentik-worker"
1.676643705814753e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "ldap-netpol"
1.6766437058186688e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "general-netpol"
1.676643705822401e+09 DEBUG controllers.Helm Looks like there are no changes for Secret "auth"
1.6766437058262606e+09 DEBUG controllers.Helm Looks like there are no changes for ConfigMap "authentik-config"
1.676643705831045e+09 DEBUG controllers.Helm Looks like there are no changes for Service "authentik-server"
1.6766437058344057e+09 DEBUG controllers.Helm Looks like there are no changes for Service "authentik-worker"
1.6766437058460007e+09 DEBUG controllers.Helm Looks like there are no changes for Deployment "authentik-server"
1.6766437058591201e+09 DEBUG controllers.Helm Looks like there are no changes for Deployment "authentik-worker"
1.6766437058633142e+09 DEBUG controllers.Helm Looks like there are no changes for Ingress "auth"
1.6766437058660803e+09 DEBUG controllers.Helm Looks like there are no changes for Ingress "auth-ldap"
1.6766437058724103e+09 DEBUG controllers.Helm updating status for upgraded release for ak-sample
1.6766437058739226e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "sh.helm.release.v1.ak-sample.v1", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.6766437058783867e+09 INFO controllers.Helm Release upgraded {"ak": "default/ak-sample", "name": "ak-sample", "version": 2}
********************************************************** skipping many repeats ***************************************************************
1.6766437080106707e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "sh.helm.release.v1.ak-sample.v9", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.676643708014716e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.676643708067962e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.6766437082209888e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766437082409153e+09 DEBUG controllers.Helm dry run for ak-sample
1.6766437082749157e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.676643708349455e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766437083658214e+09 DEBUG controllers.Helm creating upgraded release for ak-sample
1.6766437083739903e+09 DEBUG controllers.Helm checking 11 resources for changes
1.676643708375554e+09 DEBUG predicate Skipping reconciliation for dependent resource creation {"name": "sh.helm.release.v1.ak-sample.v10", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.6766437083802536e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "authentik-worker"
1.6766437083839474e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "ldap-netpol"
1.6766437083875918e+09 DEBUG controllers.Helm Looks like there are no changes for NetworkPolicy "general-netpol"
1.6766437083913655e+09 DEBUG controllers.Helm Looks like there are no changes for Secret "auth"
1.6766437083955123e+09 DEBUG controllers.Helm Looks like there are no changes for ConfigMap "authentik-config"
1.6766437083996217e+09 DEBUG controllers.Helm Looks like there are no changes for Service "authentik-server"
1.6766437084061203e+09 DEBUG controllers.Helm Looks like there are no changes for Service "authentik-worker"
1.6766437084129076e+09 DEBUG controllers.Helm Looks like there are no changes for Deployment "authentik-server"
1.6766437084197428e+09 DEBUG controllers.Helm Looks like there are no changes for Deployment "authentik-worker"
1.6766437084236712e+09 DEBUG controllers.Helm Looks like there are no changes for Ingress "auth"
1.6766437084272254e+09 DEBUG controllers.Helm Looks like there are no changes for Ingress "auth-ldap"
1.6766437084342504e+09 DEBUG controllers.Helm updating status for upgraded release for ak-sample
1.6766437084358995e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "sh.helm.release.v1.ak-sample.v9", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
^C1.6766437084366193e+09 INFO Stopping and waiting for non leader election runnables
1.6766437084366376e+09 INFO Stopping and waiting for leader election runnables
1.676643708436671e+09 INFO Shutdown signal received, waiting for all workers to finish {"controller": "akblueprint", "controllerGroup": "akm.goauthentik.io", "controllerKind": "AkBlueprint"}
1.6766437084366868e+09 INFO Shutdown signal received, waiting for all workers to finish {"controller": "ak-controller"}
1.6766437084366996e+09 INFO All workers finished {"controller": "akblueprint", "controllerGroup": "akm.goauthentik.io", "controllerKind": "AkBlueprint"}
1.676643708444819e+09 INFO controllers.Helm Release upgraded {"ak": "default/ak-sample", "name": "ak-sample", "version": 10}
1.6766437084463599e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "sh.helm.release.v1.ak-sample.v10", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.6766437084504726e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.6766437084505365e+09 ERROR Reconciler error {"controller": "ak-controller", "object": {"name":"ak-sample","namespace":"default"}, "namespace": "default", "name": "ak-sample", "reconcileID": "95d10411-5865-4450-b6f4-e718640001ac", "error": "client rate limiter Wait returned an error: context canceled"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/archer/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/archer/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/archer/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
1.6766437084505928e+09 INFO All workers finished {"controller": "ak-controller"}
1.6766437084506066e+09 INFO Stopping and waiting for caches
1.6766437084507222e+09 INFO Stopping and waiting for webhooks
1.6766437084507341e+09 INFO Wait completed, proceeding to shutdown the manager
make: *** [Makefile:113: run] Error 1
I can confirm it is related to the secret as when it is disabled I have no such reconcilliation problems:
1.6766454866735418e+09 INFO controllers.Helm Release installed {"ak": "default/ak-sample", "name": "ak-sample", "version": 1}
1.6766454866736445e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-worker", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766454866741118e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-server", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.676645486674792e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "sh.helm.release.v1.ak-sample.v1", "namespace": "default", "apiVersion": "v1", "kind": "Secret"}
1.67664548668227e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-worker", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.67664548668497e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-server", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766454866865325e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.6766454866900659e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-worker", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766454866957946e+09 DEBUG predicate Reconciling due to dependent resource update {"name": "authentik-server", "namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment"}
1.6766454866978717e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.6766454867697005e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766454867792912e+09 DEBUG controllers.Helm dry run for ak-sample
1.6766454867954018e+09 INFO controllers.Helm Release reconciled {"ak": "default/ak-sample", "name": "ak-sample", "version": 1}
1.6766454867980201e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.6766454868060386e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.676645486920131e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766454869291651e+09 DEBUG controllers.Helm dry run for ak-sample
1.6766454869506211e+09 INFO controllers.Helm Release reconciled {"ak": "default/ak-sample", "name": "ak-sample", "version": 1}
1.6766454987888744e+09 DEBUG controllers.Helm Reconciliation triggered {"ak": "default/ak-sample"}
1.676645498802959e+09 DEBUG controllers.Helm preparing upgrade for ak-sample
1.6766454989271529e+09 DEBUG controllers.Helm performing update for ak-sample
1.6766454989450462e+09 DEBUG controllers.Helm dry run for ak-sample
1.6766454989689188e+09 INFO controllers.Helm Release reconciled {"ak": "default/ak-sample", "name": "ak-sample", "version": 1}
If I am doing anything particularly wrong please let me know. I will try to see if there is a specific in-code fix possible and potentially PR it if I find one.
First, the approach you have for secret generation (generate new if not exist, otherwise use existing) is one I have not seen before. I didn't know that was possible with helm, and it's a very elegant solution to the "you can't use random values in the helm charts managed by the helm operator" problem. If we can figure out this other problem, I would really like to see this in our docs!
1.6766437058783867e+09 INFO controllers.Helm Release upgraded
^ That's a release upgrade which wouldn't follow the patch code path. The patch code path is followed when the manifests match.
On the release upgrade path, the specRelease.Manifest != currentRelease.Manifest check is most likely turning up a difference.
I wonder if there's some inconsequential difference (different spacing or base64 vs not base64)? Can you turn up log verbosity to get these diff log lines in the output: https://github.com/operator-framework/helm-operator-plugins/blob/6e30bde2868879811b3bedb200e911ae440413c8/pkg/reconciler/reconciler.go#L795-L797
Hey @joelanford, thanks I cant take any credit for it, it was an idea I exported from some stackoverflow post at some point a while ago.
I have actually moved away from the helm-operator-plugin a few days ago in favor of calling helm directly: API spec:
type AkSpec struct {
// +kubebuilder:pruning:PreserveUnknownFields
// +kubebuilder:validation:Schemaless
// Values is the helm chart values map to override chart defaults. This is often further adapted by the controller
// to add additional resources like declarative blueprints into the deployments. Values is a loose, and unstructured
// datatype. It will not complain if the values do not override anything, or do anything at all.
Values json.RawMessage `json:"values,omitempty"`
}
Reconciler Logic (apologies for the length I included it here for completion):
func (r *AkReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
l := klog.FromContext(ctx)
// Parsing options to make them available TODO: pass them in rather than read continuously
o := utils.Opts{}
arg.MustParse(&o)
actionConfig, err := r.GetActionConfig(req.NamespacedName.Namespace, l)
if err != nil {
return ctrl.Result{}, err
}
// GET CRD
crd := &akmv1a1.Ak{}
err = r.Get(ctx, req.NamespacedName, crd)
if err != nil {
if errors.IsNotFound(err) {
// Request object not found, could have been deleted after reconcile request.
// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
// Return and don't requeue
l.Info("Ak resource reconciliation triggered but disappeared. Checking for residual chart for uninstall then ignoring since object must have been deleted.")
_, err := r.UninstallChart(req.NamespacedName, actionConfig)
if err != nil {
return ctrl.Result{}, err
}
return ctrl.Result{}, nil
}
// Error reading the object - requeue the request.
l.Error(err, "Failed to get Ak resource. Likely fetch error. Retrying.")
return ctrl.Result{}, err
}
l.Info(fmt.Sprintf("Found Ak resource `%v` in `%v`.", crd.Name, crd.Namespace))
// Helm Chart Identification
u, err := url.Parse(fmt.Sprintf("file://workspace/helm-charts/ak-%v.tgz", o.SrcVersion))
if err != nil {
return ctrl.Result{}, err
}
// Helm Install or Upgrade Chart
var vals map[string]interface{}
err = json.Unmarshal(crd.Spec.Values, &vals)
if err != nil {
return ctrl.Result{}, err
}
_, err = r.UpgradeOrInstallChart(req.NamespacedName, u, actionConfig, vals)
if err != nil {
return ctrl.Result{}, err
}
return ctrl.Result{}, nil
}
// UpgradeOrInstallChart upgrades a chart in cluster or installs it new if it does not already exist
// ulr format is [scheme:][//[userinfo@]host][/]path[?query][#fragment] e.g file://workspace/helm-charts/ak-0.1.0.tgz"
func (r *AkReconciler) UpgradeOrInstallChart(nn types.NamespacedName, u *url.URL, a *action.Configuration, o map[string]interface{}) (*release.Release, error) {
// Helm List Action
listAction := action.NewList(a)
releases, err := listAction.Run()
if err != nil {
return nil, err
}
toUpgrade := false
for _, release := range releases {
// fmt.Println("Release: " + release.Name + " Status: " + release.Info.Status.String())
if release.Name == nn.Name {
toUpgrade = true
}
}
c, err := r.LoadHelmChart(u)
if err != nil {
return nil, err
}
fmt.Println(o)
var rel *release.Release
if toUpgrade {
// Helm Upgrade
updateAction := action.NewUpgrade(a)
rel, err = updateAction.Run(nn.Name, c, o)
if err != nil {
return nil, err
}
} else {
// Helm Install
installAction := action.NewInstall(a)
installAction.Namespace = nn.Namespace
installAction.ReleaseName = nn.Name
rel, err = installAction.Run(c, o)
if err != nil {
return nil, err
}
}
return rel, nil
}
func (r *AkReconciler) UninstallChart(nn types.NamespacedName, a *action.Configuration) (*release.UninstallReleaseResponse, error) {
uninstallAction := action.NewUninstall(a)
releaseResponse, err := uninstallAction.Run(nn.Name)
if err != nil {
return nil, err
}
return releaseResponse, nil
}
// GetActionConfig Get the Helm action config from in cluster service account
func (r *AkReconciler) GetActionConfig(namespace string, l logr.Logger) (*action.Configuration, error) {
actionConfig := new(action.Configuration)
var kubeConfig *genericclioptions.ConfigFlags
config, err := rest.InClusterConfig()
if err != nil {
return nil, err
}
// Set properties manually from official rest config
kubeConfig = genericclioptions.NewConfigFlags(false)
kubeConfig.APIServer = &config.Host
kubeConfig.BearerToken = &config.BearerToken
kubeConfig.CAFile = &config.CAFile
kubeConfig.Namespace = &namespace
if err := actionConfig.Init(kubeConfig, namespace, os.Getenv("HELM_DRIVER"), log.Printf); err != nil {
}
return actionConfig, nil
}
// Get Connection Client to Kubernetes
func (r *AkReconciler) GetKubeClient() (*kubernetes.Clientset, error) {
config, err := rest.InClusterConfig()
if err != nil {
return nil, err
}
// creates the clientset
clientset, err := kubernetes.NewForConfig(config)
if err != nil {
return nil, err
}
return clientset, nil
}
// GetHelmChart loads a helm chart from a given file as URL
func (r *AkReconciler) LoadHelmChart(u *url.URL) (*chart.Chart, error) {
// GET HELM CHART
if u.Scheme != "file" {
err := errors.NewInvalid(
schema.GroupKind{
Group: "akm.goauthentik.io",
Kind: "Ak",
},
fmt.Sprintf("Url scheme `%v` != `file`, unsupported scheme.", u.Scheme),
field.ErrorList{})
return nil, err
}
// load chart from filepath (which is part of host in url)
path, err := filepath.Abs(u.Host + u.Path)
if err != nil {
return nil, err
}
_, err = utils.Exists(path)
if err != nil {
return nil, err
}
return chartLoader.Load(path)
}
Which has much the same effect. I will checkout to an old commit and get you a more in depth debug log regardless if you are still interested in this though.
This just bit us pretty badly, spent a few hours on this with an infinite reconcile loop stemming from this code in the helm chart
template:
metadata:
labels:
app: {{ .Values.metadata.name }}
backstage.io/kubernetes-id: {{ .Values.metadata.annotations.backstageName }}
kuma.io/region: {{ include "webapp.cluster.region" . }} # <--- this line which calls this helper
{{- define "webapp.cluster.region" -}}
{{- $region:= "noregion" }}
{{- if $config := (lookup "v1" "ConfigMap" "runway-operator" "kubernetes-info") -}}
{{- $region = $config.data.region_name -}}
{{- end }}
{{- printf "%v" $region -}}
{{- end -}}
when the release spec is gotten it doesn't account for lookup values and therefore sees the two manifests as different
Is there considered a bug or expected behavior?
Lookup values should be accounted for. The release secret of the current release should have the value populated from the lookup function. The next time reconcile happens, I would expect that the dry-run upgrade that happens would also populate the value from the lookup. If the value of the lookup changes, then that would trigger a diff in the manifest and a real upgrade would proceed.
Can you turn logging up to level 4 to get the release diff in the logs and see what the difference is (if there is a difference at least): https://github.com/operator-framework/helm-operator-plugins/blob/8c7e1e25baf817338fc05aa61f006891c2df8f8c/pkg/reconciler/reconciler.go#L755-L757
Update: its because of this line
https://github.com/operator-framework/helm-operator-plugins/blob/8c7e1e25baf817338fc05aa61f006891c2df8f8c/pkg/reconciler/reconciler.go#L720
When DryRun is set to true lookup values are not respected and the upgrade check for sameness will always fail because the template has default values while the actual release has the lookedup values
@joelanford Apologies for the delay I didn't see your comment.
Here is a short 2 second log dump run with go run cmd/main.go --zap-log-level 4
https://pastebin.com/WKCRk8Ns The diff that it logs every time is identical.
The actual upgrade check, done here https://github.com/operator-framework/helm-operator-plugins/blob/ebfbea851192b115324c92eb1e38b3d26bc9191d/pkg/reconciler/reconciler.go#L727-L730
is done with a dry run flag which does not actually lookup values in a cluster so the string comparison fails because it fell back to the default value. As documented here
https://helm.sh/docs/chart_template_guide/functions_and_pipelines/#using-the-lookup-function
Oh, it is unfortunate that the lookup function is non-functional during dry runs.
The dry run and manifest comparison are fundamental aspects of the reconciliation, and we have not been able to find an alternative.
Some context is documented here: https://github.com/operator-framework/operator-sdk/pull/5059#issuecomment-884336514
@joelanford: Would it be possible to use the new --dry-run=server introduced in v3.13.0?
We can certainly try that out! My initial instinct is that we can just change the existing client-side dry-run used to calculate the manifest diff to be a server-side dry-run.
Do we know of any examples where switching from client dry-run to server dry-run would break existing usage?
The goal of the dry-run manifest generation is to simulate as closely as possible a real upgrade. So my personal feeling is that since a real upgrade obviously contacts the server and supports templates with the lookup function, it follows that using server-side dry run is a strict improvement and aligns with the purpose of the dry-run in the first place.
Can you try using the code from #344 to see if it would resolve this issue?
@miquella If you have a normal helm operator this feature already in operator-sdk, and can be enabled by updating the watches.yaml please take a look at the below PR.
- Relates: operator-framework/operator-sdk#6691
@joelanford: I apparently crossed some wires in my research 😝 It was pointed out to me that our current solution isn't currently using the hybrid operator, so I don't think I have a way to test with your change.
But it sounds like we can take advantage of the solution that @acornett21 suggested!
Thanks, guys!