[BUG] Pods are deleted when vscale CPU/memory limits exceed the namespace quota

Open ahjing99 opened this issue 1 year ago • 2 comments

➜ ~ kbcli version
Kubernetes: v1.27.3-gke.100
KubeBlocks: 0.6.3-beta.3
kbcli: 0.6.3-beta.3

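When a vertical scaling (vscale) OpsRequest sets CPU/memory limits that exceed the namespace quota, the existing pods are deleted and cannot be recreated. The ops should be blocked at the beginning when the requested resources exceed the quota.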

  1. Create ns with quota
kubectl apply -f -<<EOF
apiVersion: v1
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: quota-ns-ukltji
    namespace: ns-ukltji
  spec:
    hard:
      limits.cpu: "2"
      limits.ephemeral-storage: 10Gi
      limits.memory: 2Gi
      requests.storage: 10Gi
  status:
    hard:
      limits.cpu: "2"
      limits.ephemeral-storage: 10Gi
      limits.memory: 2Gi
      requests.storage: 10Gi
    used:
      limits.cpu: "0"
      limits.ephemeral-storage: "0"
      limits.memory: "0"
      requests.storage: "0"
kind: List
metadata:
  resourceVersion: ""
---
apiVersion: v1
kind: LimitRange
metadata:
  name: range-ns-ukltji
  namespace: ns-ukltji
spec:
  limits:
  - default:
      cpu: 100m
      memory: 100Mi
    type: Container
EOF
  2. Create role
kubectl apply -f -<<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app.kubernetes.io/instance: dbname
    app.kubernetes.io/managed-by: kbcli
  name: dbname
  namespace: ns-ukltji
rules:
  - apiGroups:
      - ''
    resources:
      - events
    verbs:
      - create
  - apiGroups:
      - ''
    resources:
      - configmaps
    verbs:
      - create
      - get
      - list
      - patch
      - update
      - watch
      - delete
  - apiGroups:
      - ''
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - patch
      - update
      - watch
      - delete
  - apiGroups:
      - ''
    resources:
      - pods
    verbs:
      - get
      - list
      - patch
      - update
      - watch
EOF
  3. Create ServiceAccount and RoleBinding
kubectl apply -f -<<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/instance: dbname
    app.kubernetes.io/managed-by: kbcli
  name: dbname
  namespace: ns-ukltji
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/instance: dbname
    app.kubernetes.io/managed-by: kbcli
  name: dbname
  namespace: ns-ukltji
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dbname
subjects:
  - kind: ServiceAccount
    name: dbname
    namespace: ns-ukltji
EOF
  4. Create cluster
kubectl create -f -<<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  labels:
    clusterdefinition.kubeblocks.io/name: mongodb
    clusterversion.kubeblocks.io/name: mongodb-5.0
  generateName: mongo-
  namespace: ns-ukltji
spec:
  affinity:
    nodeLabels: {}
    podAntiAffinity: Preferred
    tenancy: SharedNode
    topologyKeys: []
  clusterDefinitionRef: mongodb
  clusterVersionRef: mongodb-5.0
  componentSpecs:
  - componentDefRef: mongodb
    monitor: true
    name: mongodb
    replicas: 1
    resources:
      limits:
        cpu: 1000m
        memory: 1024Mi
      requests:
        cpu: 100m
        memory: 102Mi
    serviceAccountName: dbname
    volumeClaimTemplates:
    - name: data
      spec:
        storageClassName: standard-rwo
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
  terminationPolicy: WipeOut
  tolerations: []
EOF


➜  ~ kbcli cluster describe -n ns-ukltji mongo-ggbgx
Name: mongo-ggbgx         Created Time: Oct 10,2023 11:25 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION       STATUS    TERMINATION-POLICY
ns-ukltji   mongodb              mongodb-5.0   Running   WipeOut

Endpoints:
COMPONENT   MODE        INTERNAL                                                EXTERNAL
mongodb     ReadWrite   mongo-ggbgx-mongodb.ns-ukltji.svc.cluster.local:27017   <none>

Topology:
COMPONENT   INSTANCE                ROLE      STATUS    AZ              NODE                                                CREATED-TIME
mongodb     mongo-ggbgx-mongodb-0   primary   Running   us-central1-c   gke-yjtest-default-pool-c51609d3-ss98/10.128.0.46   Oct 10,2023 11:25 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
mongodb     false       100m / 1             102Mi / 1Gi             data:5Gi       standard-rwo

Images:
COMPONENT   TYPE      IMAGE
mongodb     mongodb   registry.cn-hangzhou.aliyuncs.com/apecloud/mongo:5.0.14

Data Protection:
AUTO-BACKUP   BACKUP-SCHEDULE   TYPE     BACKUP-TTL   LAST-SCHEDULE   RECOVERABLE-TIME
Disabled      <none>            <none>   7d           <none>          <none>

Show cluster events: kbcli cluster list-events -n ns-ukltji mongo-ggbgx

  5. Vscale
kubectl create -f -<<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  generateName: ops-verticalscaling-2c4g-
  namespace: ns-ukltji
spec:
  clusterRef: mongo-ggbgx
  type: VerticalScaling
  verticalScaling:
  - componentName: mongodb
    requests:
      cpu: "4"
      memory: "4Gi"
    limits:
      cpu: "4"
      memory: "4Gi"
EOF
  6. Pods are deleted and cannot be recreated
➜  ~ k describe cluster mongo-ggbgx  -n ns-ukltji
Name:         mongo-ggbgx
Namespace:    ns-ukltji
Labels:       clusterdefinition.kubeblocks.io/name=mongodb
              clusterversion.kubeblocks.io/name=mongodb-5.0
Annotations:  kubeblocks.io/ops-request: [{"name":"ops-verticalscaling-2c4g-v77p7","type":"VerticalScaling"}]
              kubeblocks.io/reconcile: 2023-10-10T03:43:56.541431103Z
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2023-10-10T03:25:43Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generate Name:  mongo-
  Generation:     3
  Managed Fields:
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName:
        f:labels:
          .:
          f:clusterdefinition.kubeblocks.io/name:
          f:clusterversion.kubeblocks.io/name:
      f:spec:
        .:
        f:affinity:
          .:
          f:podAntiAffinity:
          f:tenancy:
        f:clusterDefinitionRef:
        f:clusterVersionRef:
        f:componentSpecs:
          .:
          k:{"name":"mongodb"}:
            .:
            f:componentDefRef:
            f:monitor:
            f:name:
            f:noCreatePDB:
            f:replicas:
            f:resources:
              .:
              f:limits:
              f:requests:
            f:serviceAccountName:
            f:volumeClaimTemplates:
        f:terminationPolicy:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2023-10-10T03:25:43Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:clusterDefGeneration:
        f:components:
          .:
          f:mongodb:
            .:
            f:consensusSetStatus:
              .:
              f:leader:
                .:
                f:accessMode:
                f:name:
                f:pod:
            f:phase:
            f:podsReady:
        f:conditions:
        f:observedGeneration:
        f:phase:
    Manager:      manager
    Operation:    Update
    Subresource:  status
    Time:         2023-10-10T03:41:11Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubeblocks.io/ops-request:
          f:kubeblocks.io/reconcile:
        f:finalizers:
          .:
          v:"cluster.kubeblocks.io/finalizer":
      f:spec:
        f:componentSpecs:
          k:{"name":"mongodb"}:
            f:classDefRef:
              .:
              f:class:
            f:resources:
              f:limits:
                f:cpu:
                f:memory:
              f:requests:
                f:cpu:
                f:memory:
        f:monitor:
        f:resources:
          .:
          f:cpu:
          f:memory:
        f:storage:
          .:
          f:size:
    Manager:         manager
    Operation:       Update
    Time:            2023-10-10T03:43:56Z
  Resource Version:  1151548
  UID:               c9b3441e-793f-4848-8568-4619bc6620a9
Spec:
  Affinity:
    Pod Anti Affinity:     Preferred
    Tenancy:               SharedNode
  Cluster Definition Ref:  mongodb
  Cluster Version Ref:     mongodb-5.0
  Component Specs:
    Class Def Ref:
      Class:
    Component Def Ref:  mongodb
    Monitor:            true
    Name:               mongodb
    No Create PDB:      false
    Replicas:           1
    Resources:
      Limits:
        Cpu:     4
        Memory:  4Gi
      Requests:
        Cpu:               4
        Memory:            4Gi
    Service Account Name:  dbname
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:         5Gi
        Storage Class Name:  standard-rwo
  Monitor:
  Resources:
    Cpu:     0
    Memory:  0
  Storage:
    Size:              0
  Termination Policy:  WipeOut
Status:
  Cluster Def Generation:  2
  Components:
    Mongodb:
      Consensus Set Status:
        Leader:
          Access Mode:  None
          Name:
          Pod:          Unknown
      Phase:            Updating
      Pods Ready:       false
  Conditions:
    Last Transition Time:  2023-10-10T03:41:06Z
    Message:               VerticalScaling opsRequest: ops-verticalscaling-2c4g-v77p7 is processing
    Reason:                VerticalScaling
    Status:                False
    Type:                  LatestOpsRequestProcessed
    Last Transition Time:  2023-10-10T03:25:43Z
    Message:               The operator has started the provisioning of Cluster: mongo-ggbgx
    Observed Generation:   3
    Reason:                PreCheckSucceed
    Status:                True
    Type:                  ProvisioningStarted
    Last Transition Time:  2023-10-10T03:25:44Z
    Message:               Successfully applied for resources
    Observed Generation:   3
    Reason:                ApplyResourcesSucceed
    Status:                True
    Type:                  ApplyResources
    Last Transition Time:  2023-10-10T03:41:11Z
    Message:               pods are not ready in Components: [mongodb], refer to related component message in Cluster.status.components
    Reason:                ReplicasNotReady
    Status:                False
    Type:                  ReplicasReady
    Last Transition Time:  2023-10-10T03:41:11Z
    Message:               pods are unavailable in Components: [mongodb], refer to related component message in Cluster.status.components
    Reason:                ComponentsNotReady
    Status:                False
    Type:                  Ready
  Observed Generation:     3
  Phase:                   Updating
Events:
  Type     Reason                    Age                    From                    Message
  ----     ------                    ----                   ----                    -------
  Normal   ComponentPhaseTransition  18m                    cluster-controller      Create a new component
  Normal   AllReplicasReady          18m                    cluster-controller      all pods of components are ready, waiting for the probe detection successful
  Normal   ClusterReady              18m                    cluster-controller      Cluster: mongo-ggbgx is ready, current phase is Running
  Normal   ComponentPhaseTransition  18m                    cluster-controller      Running: true, PodsReady: true, PodsTimedout: false
  Normal   Running                   18m                    cluster-controller      Cluster: mongo-ggbgx is ready, current phase is Running
  Normal   ApplyResourcesSucceed     3m23s (x2 over 18m)    cluster-controller      Successfully applied for resources
  Normal   PreCheckSucceed           3m23s (x2 over 18m)    cluster-controller      The operator has started the provisioning of Cluster: mongo-ggbgx
  Normal   VerticalScaling           3m23s                  ops-request-controller  Start to process the VerticalScaling opsRequest "ops-verticalscaling-2c4g-v77p7" in Cluster: mongo-ggbgx
  Normal   ComponentPhaseTransition  3m23s                  cluster-controller      Component workload updated
  Normal   WaitingForProbeSuccess    3m23s (x3 over 3m23s)  cluster-controller      Waiting for probe success
  Warning  ReplicasNotReady          3m18s                  cluster-controller      pods are not ready in Components: [mongodb], refer to related component message in Cluster.status.components
  Warning  ComponentsNotReady        3m18s                  cluster-controller      pods are unavailable in Components: [mongodb], refer to related component message in Cluster.status.components
  Warning  FailedCreate              33s (x3 over 2m36s)    event-controller        create Pod mongo-ggbgx-mongodb-0 in StatefulSet mongo-ggbgx-mongodb failed error: pods "mongo-ggbgx-mongodb-0" is forbidden: exceeded quota: quota-ns-ukltji, requested: limits.cpu=4,limits.memory=4Gi, used: limits.cpu=0,limits.memory=0, limited: limits.cpu=2,limits.memory=2Gi
➜  ~ k get pod -n ns-ukltji
No resources found in ns-ukltji namespace.
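
The pre-check requested above could look roughly like the following. This is a minimal sketch, not KubeBlocks code: `parse_quantity` and `quota_violations` are hypothetical helpers, the quantity parser handles only plain numbers and the common suffixes, and the `used` and `current` figures assume the namespace holds a single container with the cluster's original limits (sidecar containers are ignored). The idea is to project the quota usage after scaling, replacing the current per-pod limits with the requested ones, and to reject the OpsRequest before any pod is deleted if the projection exceeds a hard limit.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes (assumption:
# exponent notation such as "1e3" is not handled in this sketch).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
}


def parse_quantity(text):
    """Convert a quantity such as "4", "1000m" or "2Gi" to base units."""
    # Check two-character suffixes first so "Mi" is not read as "i"-less "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def quota_violations(hard, used, current, requested, replicas=1):
    """Return {resource: (projected, hard)} for each quota the scaling would exceed.

    projected = used + replicas * (requested - current)
    """
    violations = {}
    for name, limit in hard.items():
        if name not in requested:
            continue  # this quota entry is not affected by the scaling
        delta = parse_quantity(requested[name]) - parse_quantity(current.get(name, "0"))
        projected = parse_quantity(used.get(name, "0")) + replicas * delta
        if projected > parse_quantity(limit):
            violations[name] = (projected, parse_quantity(limit))
    return violations


# Hard limits from quota-ns-ukltji in step 1.
hard = {"limits.cpu": "2", "limits.ephemeral-storage": "10Gi",
        "limits.memory": "2Gi", "requests.storage": "10Gi"}
# Assumption: only the mongodb container (1000m / 1024Mi) counts toward usage.
current = {"limits.cpu": "1000m", "limits.memory": "1024Mi"}
used = dict(current)
# Limits from the VerticalScaling OpsRequest in step 5.
requested = {"limits.cpu": "4", "limits.memory": "4Gi"}

violations = quota_violations(hard, used, current, requested)
print(sorted(violations))  # ['limits.cpu', 'limits.memory']
```

With these inputs both `limits.cpu` (4 projected against 2 allowed) and `limits.memory` (4Gi against 2Gi) are reported, which matches the figures in the `FailedCreate` event. A real implementation would read `status.hard` and `status.used` from the ResourceQuota objects and sum the limits of every container in the pod template.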

ahjing99 • Oct 10 '23 03:10