piraeus-operator

Following the tutorial results in satellites being created on control-plane nodes

Open MetalPinguinInc opened this issue 11 months ago • 2 comments

I am following the tutorial and installed the prerequisites (Linux kernel headers) on all my worker nodes, but not on the control-plane nodes. As expected, the satellites on the control-plane nodes are offline, but it is unclear to me whether Piraeus is supposed to start satellite/CSI node pods on control-plane nodes at all.

What I expected to happen: 4 online satellites, 1 on each worker node and none on the control-plane nodes.

What happened instead: I have 4 online satellites and 3 offline satellites (1 on each control-plane node).

What I have tried: I changed the satellite configuration so that it does not "autoplace" on control-plane nodes. I found this option in the documentation, but not in the tutorial.

This is what the configuration looks like:

apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec: {}
---
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: storage-pool
spec:
  nodeAffinity:
    nodeSelectorTerms:
      - matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
  properties:
    - name: AutoplaceTarget
      value: "no"
  storagePools:
    - name: test-pool
      fileThinPool:
        directory: /var/lib/piraeus-datastore/test-pool

I have also verified that my nodes are correctly marked as control-plane nodes.

Unfortunately, the satellite/CSI node pods are still scheduled on the control-plane nodes.

What can be done to fix this: Make it clear in the tutorial whether Piraeus is expected to start satellite/CSI pods on control-plane nodes, and what that means for disk space (i.e. do all control-plane nodes need disk space available for Piraeus?). If it is not expected, please include an example configuration that excludes control-plane nodes from the Piraeus satellite/CSI deployment.

P.S. I did not see a submission template, so I hope this includes all the needed information.

MetalPinguinInc, Dec 17 '24 17:12

I have since solved this issue by digging deeper into the documentation. I added

nodeAffinity:
  nodeSelectorTerms:
    - matchExpressions:
        - key: node-role.kubernetes.io/control-plane
          operator: DoesNotExist

to the spec part of both the LinstorCluster and the LinstorSatelliteConfiguration. So now my configuration looks like this:

apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec:
  nodeAffinity:
    nodeSelectorTerms:
      - matchExpressions:          
          - key: node-role.kubernetes.io/control-plane
            operator: DoesNotExist
---
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: storage-pool
spec:
  nodeAffinity:
    nodeSelectorTerms:
      - matchExpressions:          
          - key: node-role.kubernetes.io/control-plane
            operator: DoesNotExist
  storagePools:
    - name: test-pool
      fileThinPool:
        directory: /var/lib/piraeus-datastore/test-pool

As expected, Piraeus no longer deploys resources to my control-plane nodes.

As far as I am aware, it is unusual to run workloads on control-plane nodes, so I think it would be a good idea to make this the default, or at least to mention it in the tutorial.

MetalPinguinInc, Dec 17 '24 17:12
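
The difference between the first attempt (operator: Exists) and the working configuration (operator: DoesNotExist) comes down to which nodes each selector matches. The Python sketch below illustrates this, assuming Piraeus evaluates nodeSelectorTerms with standard Kubernetes node-affinity semantics (terms are ORed, expressions within a term are ANDed). The node names and labels are hypothetical, and the helper functions are illustrative only; they are not part of the operator.

```python
from typing import Dict, List

Labels = Dict[str, str]

CONTROL_PLANE = "node-role.kubernetes.io/control-plane"

# Hypothetical nodes and labels, for illustration only.
NODES: Dict[str, Labels] = {
    "cp-1": {CONTROL_PLANE: ""},
    "worker-1": {"kubernetes.io/hostname": "worker-1"},
}


def expression_matches(labels: Labels, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    raise ValueError(f"unsupported operator: {op}")


def node_selected(labels: Labels, node_selector_terms: List[dict]) -> bool:
    """Terms are ORed; expressions inside one term are ANDed."""
    for term in node_selector_terms:
        exprs = term.get("matchExpressions", [])
        # A term without expressions is treated as matching nothing.
        if exprs and all(expression_matches(labels, e) for e in exprs):
            return True
    return False


def selected_nodes(operator: str) -> List[str]:
    """List the hypothetical nodes matched by a control-plane selector."""
    terms = [{"matchExpressions": [{"key": CONTROL_PLANE, "operator": operator}]}]
    return sorted(n for n, labels in NODES.items() if node_selected(labels, terms))


if __name__ == "__main__":
    print("Exists:", selected_nodes("Exists"))
    print("DoesNotExist:", selected_nodes("DoesNotExist"))
```

With these hypothetical nodes, Exists matches only cp-1 and DoesNotExist matches only worker-1. That is consistent with the first attempt applying its settings to the control-plane nodes instead of excluding them.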

Yeah, you are right, it probably should not create them on control-plane nodes. Right now there is no proper tracking of taints when generating the satellites. This is something we should look into.

WanzenBug, Dec 18 '24 07:12
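
To make the taint tracking mentioned above more concrete, here is a minimal Python sketch of the standard Kubernetes rule for matching tolerations against node taints. It is not how the operator generates satellites, and no such logic is confirmed in this thread; the function names are hypothetical, and the control-plane taint shown is the one kubeadm-style clusters commonly apply, which is an assumption about the reporter's cluster.

```python
from typing import Dict, List

Taint = Dict[str, str]
Toleration = Dict[str, str]


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if one toleration matches one taint."""
    # An empty effect on the toleration matches every effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return key == "" or key == taint.get("key")
    # Equal: both key and value must match.
    return key == taint.get("key") and toleration.get("value", "") == taint.get("value", "")


def schedulable(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """A node is usable if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


# Assumed taint, as commonly found on control-plane nodes.
CP_TAINT = {"key": "node-role.kubernetes.io/control-plane", "effect": "NoSchedule"}

if __name__ == "__main__":
    print(schedulable([CP_TAINT], []))  # pod without tolerations
    print(schedulable([], []))          # untainted worker node
```

A check along these lines would skip tainted control-plane nodes unless the satellite pods explicitly tolerate the taint, without requiring users to add a nodeAffinity by hand.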