[BUG] Endless affinity list when using new schedulingPolicy field

Open cjc7373 opened this issue 5 months ago • 0 comments

Describe the bug The affinity list (and the tolerations list) in the cluster CR grows endlessly when the new schedulingPolicy field is used together with a DATA_PLANE_AFFINITY entry in the config.

To Reproduce Steps to reproduce the behavior:

  1. Have a DATA_PLANE_AFFINITY: '{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"kb-data","operator":"In","values":["true"]}]},"weight":100}]}}' field in the config
  2. Create a cluster CR with a non-null schedulingPolicy field, e.g.
    spec:
      schedulingPolicy:
        schedulerName: custom-scheduler
    
  3. The cluster never finishes reconciling.

The cluster CR ends up looking like this:

apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
spec:
  ...
  schedulingPolicy:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        [repeat hundreds of times]
        ...
    schedulerName: custom-scheduler
    tolerations:
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    [repeat hundreds of times]
    ...

Additional context The root cause is that the generated scheduling policy is patched into the cluster CR itself, although it should only be set on the component CR.
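To illustrate why writing the generated policy back to the cluster CR never converges, here is a minimal Go sketch. It is not KubeBlocks code: `Term`, `SchedulingPolicy`, `mergeDefaults`, `reconcileWithWriteBack`, and `simulate` are hypothetical, simplified stand-ins, and the merge logic is an assumption about how the defaults from DATA_PLANE_AFFINITY get combined with the user's policy.

```go
package main

import "fmt"

// Term is a simplified, hypothetical stand-in for one preferred
// node-affinity term (key plus weight only).
type Term struct {
	Key    string
	Weight int32
}

// SchedulingPolicy is a simplified, hypothetical stand-in for the
// cluster's schedulingPolicy field.
type SchedulingPolicy struct {
	SchedulerName string
	Preferred     []Term
}

// mergeDefaults returns a new policy holding the user's terms followed by
// the configured default terms. It does not modify its input.
func mergeDefaults(user SchedulingPolicy, defaults []Term) SchedulingPolicy {
	merged := SchedulingPolicy{SchedulerName: user.SchedulerName}
	merged.Preferred = append(merged.Preferred, user.Preferred...)
	merged.Preferred = append(merged.Preferred, defaults...)
	return merged
}

// reconcileWithWriteBack models the suspected flow: the merged policy is
// stored back into the cluster spec, so the next pass treats the previously
// added defaults as user input and appends the defaults again.
func reconcileWithWriteBack(clusterSpec *SchedulingPolicy, defaults []Term) {
	*clusterSpec = mergeDefaults(*clusterSpec, defaults)
}

// simulate runs the given number of reconcile passes and reports how many
// preferred terms the cluster spec ends up with.
func simulate(passes int) int {
	spec := SchedulingPolicy{SchedulerName: "custom-scheduler"}
	defaults := []Term{{Key: "kb-data", Weight: 100}}
	for i := 0; i < passes; i++ {
		reconcileWithWriteBack(&spec, defaults)
	}
	return len(spec.Preferred)
}

func main() {
	fmt.Println(simulate(3)) // prints 3: one extra term per pass
}
```

Because each write to the spec would itself trigger another reconcile, a loop of this shape has no fixed point, which matches the ever-growing list shown above.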

Looking deeper, this bug happened because the cluster controller modifies the cluster CR's spec. Which use case requires this behavior? After all, the general principle is that a controller does not modify the spec of the CR it reconciles.
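As a rough sketch of that principle, the Go example below treats the cluster spec as read-only and writes the merged policy only to a derived component spec. All names (`Term`, `ClusterSpec`, `ComponentSpec`, `buildComponentSpec`, `reconcileN`) are hypothetical and heavily simplified; this is one possible shape under those assumptions, not the actual KubeBlocks fix.

```go
package main

import "fmt"

// Term is a simplified, hypothetical stand-in for one preferred
// node-affinity term.
type Term struct {
	Key    string
	Weight int32
}

// ClusterSpec is the user-owned input. The controller only reads it.
type ClusterSpec struct {
	SchedulerName string
	Preferred     []Term
}

// ComponentSpec is the controller-owned output derived from the cluster.
type ComponentSpec struct {
	SchedulerName string
	Preferred     []Term
}

// buildComponentSpec derives the component's policy from the cluster spec
// plus the configured defaults. It allocates a fresh slice, so the cluster
// spec's backing array is never shared or modified.
func buildComponentSpec(cluster ClusterSpec, defaults []Term) ComponentSpec {
	preferred := make([]Term, 0, len(cluster.Preferred)+len(defaults))
	preferred = append(preferred, cluster.Preferred...)
	preferred = append(preferred, defaults...)
	return ComponentSpec{SchedulerName: cluster.SchedulerName, Preferred: preferred}
}

// reconcileN runs several reconcile passes and reports the number of
// preferred terms on the cluster and on the derived component.
func reconcileN(passes int) (clusterTerms, componentTerms int) {
	cluster := ClusterSpec{SchedulerName: "custom-scheduler"}
	defaults := []Term{{Key: "kb-data", Weight: 100}}
	var comp ComponentSpec
	for i := 0; i < passes; i++ {
		comp = buildComponentSpec(cluster, defaults)
	}
	return len(cluster.Preferred), len(comp.Preferred)
}

func main() {
	c, comp := reconcileN(5)
	fmt.Println(c, comp) // prints "0 1": same result on every pass
}
```

Since the output depends only on the unchanged input, repeating the reconcile yields the same component spec each time.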

cjc7373 Sep 06 '24 16:09