customizeComponents feature

kvaps opened this issue on Aug 15 '22 · 7 comments

@WanzenBug following the discussion in https://github.com/piraeusdatastore/piraeus-operator/issues/180#issuecomment-1050698412, I think I found an alternative way to add customizations to the generated manifests.

The KubeVirt project has a customizeComponents option, which works the following way: https://kubevirtlegacy.gitbook.io/user-guide/docs/operations/customize_components

From my point of view it can be better for a few reasons:

  • Unlike the previously proposed solution, this might work with the existing operator logic
  • Less impact on the CRD structure, which otherwise has to implement a lot of new fields (example: https://github.com/piraeusdatastore/piraeus-operator/pull/280)
  • Full customization is possible: every resource can be patched. Not only strategic merge patches are supported, but also json and merge patches, which can be useful to remove fields and customize resources in a smarter way (a sketch follows below).
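
For illustration, a hedged sketch of what json and merge patches could look like in this scheme, following the KubeVirt layout (the resource names and patch contents below are only placeholders, not part of any existing spec):

  customizeComponents:
    patches:
    # RFC 6902 json patch: addresses individual fields, e.g. pin the replica count
    - resourceType: Deployment
      resourceName: linstor-controller
      patch: '[{"op":"replace","path":"/spec/replicas","value":1}]'
      type: json
    # RFC 7386 merge patch: merges objects but replaces lists wholesale
    - resourceType: Deployment
      resourceName: linstor-controller
      patch: '{"metadata":{"labels":{"example":"true"}}}'
      type: merge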

kvaps · Aug 15 '22

Also, the current Helm chart could be updated to generate these patches locally, e.g. with the following template change:

diff --git a/charts/piraeus/templates/operator-controller.yaml b/charts/piraeus/templates/operator-controller.yaml
index 25ac9fe..3ce0a8f 100644
--- a/charts/piraeus/templates/operator-controller.yaml
+++ b/charts/piraeus/templates/operator-controller.yaml
@@ -55,10 +55,18 @@ spec:
   {{- if .Values.operator.controller.httpsBindAddress }}
   httpsBindAddress: {{ .Values.operator.controller.httpsBindAddress | quote }}
   {{- end }}
-  {{- if .Values.operator.controller.sidecars }}
-  sidecars: {{ .Values.operator.controller.sidecars | toJson }}
-  {{- end }}
-  {{- if .Values.operator.controller.extraVolumes }}
-  extraVolumes: {{ .Values.operator.controller.extraVolumes | toJson }}
-  {{- end }}
+  customizeComponents:
+    patches:
+    {{- with .Values.operator.controller.sidecars }}
+    - resourceType: Deployment
+      resourceName: linstor-controller
+      patch: '{{ printf "{\"spec\":{\"template\":{\"spec\":{\"containers\":%s}}}}" (toJson .) }}'
+      type: strategic
+    {{- end }}
+    {{- with .Values.operator.controller.extraVolumes }}
+    - resourceType: Deployment
+      resourceName: linstor-controller
+      patch: '{{ printf "{\"spec\":{\"template\":{\"spec\":{\"volumes\":%s}}}}" (toJson .) }}'
+      type: strategic
+    {{- end }}
 {{- end }}

example output:

 # Source: piraeus/templates/operator-controller.yaml
 apiVersion: piraeus.linbit.com/v1
 kind: LinstorController
 metadata:
   name: linstor-piraeus-cs
   namespace: linstor
   labels:
     app.kubernetes.io/name: linstor-piraeus-cs
 spec:
   priorityClassName: ""
   # TODO: switch to k8s db by default
   dbConnectionURL:  etcd://linstor-etcd:2379
   luksSecret: linstor-piraeus-passphrase
   sslSecret: "linstor-piraeus-control-secret"
   dbCertSecret:
   dbUseClientCert: false
   drbdRepoCred: ""
   controllerImage: quay.io/piraeusdatastore/piraeus-server:v1.18.2
   imagePullPolicy: "IfNotPresent"
   linstorHttpsControllerSecret: "linstor-piraeus-controller-secret"
   linstorHttpsClientSecret: "linstor-piraeus-client-secret"
   tolerations: [{"effect":"NoSchedule","key":"node-role.kubernetes.io/master","operator":"Exists"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane","operator":"Exists"}]
   resources: {}
   replicas: 1
   httpBindAddress: "127.0.0.1"
-  sidecars: [{"args":["--upstream=http://127.0.0.1:3370","--config-file=/etc/kube-rbac-proxy/linstor-controller.yaml","--secure-listen-address=$(KUBE_RBAC_PROXY_LISTEN_ADDRESS):3370","--client-ca-file=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt","--v=2","--logtostderr=true"],"env":[{"name":"KUBE_RBAC_PROXY_LISTEN_ADDRESS","valueFrom":{"fieldRef":{"fieldPath":"status.podIP"}}}],"image":"quay.io/brancz/kube-rbac-proxy:v0.11.0","name":"kube-rbac-proxy","volumeMounts":[{"mountPath":"/etc/kube-rbac-proxy","name":"rbac-proxy-config"}]}]
-  extraVolumes: [{"configMap":{"name":"piraeus-rbac-proxy"},"name":"rbac-proxy-config"}]
+  customizeComponents:
+    patches:
+    - resourceType: Deployment
+      resourceName: linstor-controller
+      patch: '{"spec":{"template":{"spec":{"containers":[{"args":["--upstream=http://127.0.0.1:3370","--config-file=/etc/kube-rbac-proxy/linstor-controller.yaml","--secure-listen-address=$(KUBE_RBAC_PROXY_LISTEN_ADDRESS):3370","--client-ca-file=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt","--v=2","--logtostderr=true"],"env":[{"name":"KUBE_RBAC_PROXY_LISTEN_ADDRESS","valueFrom":{"fieldRef":{"fieldPath":"status.podIP"}}}],"image":"quay.io/brancz/kube-rbac-proxy:v0.11.0","name":"kube-rbac-proxy","volumeMounts":[{"mountPath":"/etc/kube-rbac-proxy","name":"rbac-proxy-config"}]}]}}}}'
+      type: strategic
+    - resourceType: Deployment
+      resourceName: linstor-controller
+      patch: '{"spec":{"template":{"spec":{"volumes":[{"configMap":{"name":"piraeus-rbac-proxy"},"name":"rbac-proxy-config"}]}}}}'
+      type: strategic

kvaps · Aug 15 '22

I do like that approach; it seems much more manageable. I'm a bit hesitant about porting existing features to this new patch strategy.

Since I'm working on v2 anyway, I'll give it a try there first. To keep development effort low, I'd like to move the v1 operator into "maintenance mode", i.e. no new features, only bug fixes, unless there is a pressing issue that needs a new feature while v2 is still being developed, of course.

WanzenBug · Aug 29 '22

@kvaps perhaps you have some insight here, too:

Let's assume the operator v2 is working with some slightly different CRDs:

  • LinstorCluster, which is used to manage all "controller" related resources
  • LinstorNode, which manages a satellite and node agents, so there is one LinstorNode resource per k8s node.

A user might want to apply some customization to only a subset of nodes. The question is: how should that work? In my prototype, if you create a LinstorCluster, it will automatically create a LinstorNode for every k8s node. But what it creates is just a plain resource, so how would a user specify patches?

Some ideas I had:

  • The LinstorCluster also has a list of patches with some kind of selector expression matching the nodes (a hypothetical sketch of this follows the list)
  • A user can add special annotations to k8s nodes, which would be recognized and automatically turned into patches on the LinstorNode resource
  • Nothing, the user has to manually edit the LinstorNode resources
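
To make the first idea concrete, here is a purely hypothetical sketch of node-scoped patches on the LinstorCluster; every field name, label, and image below is invented for illustration and is not an existing API:

  apiVersion: piraeus.linbit.com/v1   # hypothetical API group/version
  kind: LinstorCluster
  metadata:
    name: linstorcluster
  spec:
    # hypothetical field: patches applied only to LinstorNode resources whose
    # underlying k8s node matches the selector
    nodePatches:
    - nodeSelector:
        matchLabels:
          piraeus.example/loader: jammy
      patches:
      - resourceType: Pod
        resourceName: linstor-satellite
        type: strategic
        patch: '{"spec":{"containers":[{"name":"drbd-module-loader","image":"example.com/drbd-loader:jammy"}]}}'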

WanzenBug · Aug 29 '22

Hey, I think the best idea is to try to align with well-known design patterns.

  • For example, CSI's CSINode object is managed by the csi-node-registrar from the node side.

  • Another example is Cilium: it has a CiliumNode object which is managed by the cilium-agent from the node side.

  • Even more: the kubelet creates and manages its own Node object by itself.

Thus I think the LinstorNode objects should also be created and managed by a DaemonSet from the node side.

As for the configuration of specific nodes, I would prefer a separate CRD, e.g. LinstorNodeConfiguration, using the standard nodeSelectorTerms structure. A few example selectors:

  # select by architecture / OS
  nodeSelectorTerms:
  - matchExpressions:
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
  - matchExpressions:
    - key: "kubernetes.io/os"
      operator: In
      values: ["linux"]

  # select a specific node by label
  nodeSelectorTerms:
  - matchExpressions:
    - key: name
      operator: In
      values:
      - app-worker-node

  # select by zone
  nodeSelectorTerms:
  - matchExpressions:
    - key: "failure-domain.kubernetes.io/zone"
      operator: In
      values: ["us-central1-a"]

  # exclude control-plane nodes
  nodeSelectorTerms:
  - matchExpressions:
    - key: node-role.kubernetes.io/master
      operator: DoesNotExist
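
Putting it together, a purely hypothetical LinstorNodeConfiguration object could combine such a selector with a set of patches; all names and fields below are invented for illustration, none of it is an existing API:

  apiVersion: piraeus.linbit.com/v1   # hypothetical
  kind: LinstorNodeConfiguration
  metadata:
    name: amd64-workers
  spec:
    # which k8s nodes this configuration applies to
    nodeSelectorTerms:
    - matchExpressions:
      - key: "kubernetes.io/arch"
        operator: In
        values: ["amd64"]
      - key: node-role.kubernetes.io/master
        operator: DoesNotExist
    # patches applied to the node-level resources of every matching node
    patches:
    - resourceType: Pod
      resourceName: linstor-satellite
      type: strategic
      patch: '{"metadata":{"labels":{"example":"true"}}}'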

Similar logic is used by the istio-operator to match namespaces: https://istio.io/latest/blog/2021/discovery-selectors/#discovery-selectors-in-action

kvaps · Aug 30 '22

I was also thinking about a LinstorNodeConfiguration object. My problem is that one of the things I want to have control over is which DRBD injector image is used. That is not possible with a DaemonSet, I believe.

WanzenBug · Aug 30 '22

I'd really like to not have to reimplement a DaemonSet, but there are a few edge cases that I believe can't be fixed with a normal DaemonSet. Off the top of my head:

  • Different loader images per node distribution
  • When using the pod network, we want stable pod names; otherwise LINSTOR gets confused by the unames of the pods after restarts.

But that does not mean we actually need a LinstorNode object. Using your example, I guess the better solution is to have the operator aggregate the configuration fragments for every node.

WanzenBug · Aug 30 '22

Well, the driver loader pod can be created by the daemonset controller as well.

Since you want to use a separate driver loader image for every OS, this is a little bit tricky. E.g. you could use the EphemeralContainers feature or just spawn a new pod with an ownerReference to the existing one.
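
As a rough sketch of the ownerReference idea (all names, the uid, and the image below are placeholders; the uid has to be the live satellite pod's uid, and owner and dependent must live in the same namespace):

  apiVersion: v1
  kind: Pod
  metadata:
    name: drbd-loader-node-1                      # placeholder
    ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: linstor-satellite-node-1              # placeholder: the existing satellite pod
      uid: 00000000-0000-0000-0000-000000000000   # must be the live pod's uid
      controller: true
  spec:
    nodeName: node-1                              # pin to the owner's node
    containers:
    - name: drbd-module-loader
      image: example.com/drbd-loader:jammy        # placeholder per-OS loader image
      securityContext:
        privileged: true

With such an ownerReference, the loader pod is garbage-collected together with its satellite pod.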

But to be honest I don't see the problem with using a common driver loader image. A Debian image with the additional libelf-dev dependency can build and inject the module on any rpm-based OS and its kernels.

Having multiple kernel-loader images, one for every OS, will make them more difficult to manage and maintain.

ref https://github.com/deckhouse/deckhouse/pull/2269 and https://github.com/piraeusdatastore/piraeus/pull/120

kvaps · Aug 30 '22

This is implemented with Operator v2.

WanzenBug · Jul 19 '23