
Issue with anti affinity rules

Open · vigodeltoro opened this issue 3 years ago · 7 comments

Hi there,

I have a problem with the anti-affinity rules... maybe there is somebody out there who can help me out. I have a three-node Kubernetes setup running a 3-shard cluster with one replica each.

So there are 6 pods in the cluster. I am trying to use anti-affinity rules to distribute the pods across the 3 nodes. My goal is to have 2 pods per node, but never the same shard or the same replica twice on one node. Something like the example below:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node3
chi-protobuf-example-dev-1-1-0 node1
chi-protobuf-example-dev-2-0-0 node2
chi-protobuf-example-dev-2-1-0 node3

The anti-affinity rules I'm using follow the example in the /docs/chi-examples dir (https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/99-clickhouseinstallation-max.yaml):

    podTemplates:
      - name: pod-template-with-init-container
        podDistribution:
          - type: ShardAntiAffinity
          - type: MaxNumberPerNode
            number: 2
            topologyKey: "kubernetes.io/hostname"
          - type: ReplicaAntiAffinity
          - type: MaxNumberPerNode
            number: 2
            topologyKey: "kubernetes.io/hostname"

But what's happening every time I deploy is:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node1
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node3
chi-protobuf-example-dev-2-1-0 Pending, because no free node is available

That's really problematic, because I can't use my resources properly.

Does anybody have an idea?

Thanks a lot and best regards

vigodeltoro · Oct 31 '22 15:10

You only need ReplicaAntiAffinity and that's it.

    - name: pod-template-with-init-container
      podDistribution:
      - scope: ClickHouseInstallation
        type: ReplicaAntiAffinity
        topologyKey: "kubernetes.io/hostname"
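
For context, a minimal sketch of where this snippet sits in a full CHI manifest. The installation and cluster names here are assumptions chosen to match the pod names above (i.e. a CHI named "protobuf-example" with a cluster "dev", giving pods chi-protobuf-example-dev-{shard}-{replica}-0):

    apiVersion: "clickhouse.altinity.com/v1"
    kind: "ClickHouseInstallation"
    metadata:
      name: "protobuf-example"   # assumed name, matching the pod names in this thread
    spec:
      defaults:
        templates:
          podTemplate: pod-template-with-init-container
      configuration:
        clusters:
          - name: "dev"
            layout:
              shardsCount: 3
              replicasCount: 2
      templates:
        podTemplates:
          - name: pod-template-with-init-container
            podDistribution:
              - scope: ClickHouseInstallation
                type: ReplicaAntiAffinity
                topologyKey: "kubernetes.io/hostname"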

alex-zaitsev · Nov 01 '22 15:11

Hi Alex,

Okay, thanks a lot for that hint. I tried it out and got the following distribution:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node1
chi-protobuf-example-dev-1-0-0 node2
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node3
chi-protobuf-example-dev-2-1-0 node3

With that I get a distribution over all three nodes, but a shard and its replica land on the same node. That means if I lose one node, I lose both copies of a shard (a third of my database), so redundancy is gone.

If I try it with:

    - name: pod-template-with-init-container
      podDistribution:
      - scope: ClickHouseInstallation
        type: ShardAntiAffinity
        topologyKey: "kubernetes.io/hostname"

I get only a two-node distribution:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node1
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node1
chi-protobuf-example-dev-2-1-0 node2

Do you have any other suggestions?

Thanks a lot

vigodeltoro · Nov 03 '22 09:11

@alex-zaitsev There seems to be a lack of proper docs on podDistribution, with a list of all possible values for each key and their significance.

prashant-shahi · Dec 05 '22 05:12

@prashant-shahi Indeed. And in my eyes there is a bug in circular replication.

I was able to work around that problem with "hardcoded" pod templates:


    podTemplates:
      - name: sh0-rep0-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-1"

      - name: sh0-rep1-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-2"

      - name: sh1-rep0-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-2"

      - name: sh1-rep1-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-3"

      - name: sh2-rep0-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-3"

      - name: sh2-rep1-template
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - "zone-1"

But with that I'm facing issues with the podDisruptionBudgets (https://github.com/Altinity/clickhouse-operator/issues/1081).

So a fix would be really helpful.

vigodeltoro · Jan 26 '23 08:01