[BUG]ob primary/secondary switchover failed

Open ahjing99 opened this issue 10 months ago • 1 comments

➜ ~ kbcli version
Kubernetes: v1.27.8-gke.1067004
KubeBlocks: 0.9.0-beta.5
kbcli: 0.9.0-beta.1

# Add Helm repo 
helm repo add kubeblocks-addons https://apecloud.github.io/helm-charts
# If GitHub is not accessible or is very slow for you, use the following repo instead
helm repo add kubeblocks-addons https://jihulab.com/api/v4/projects/150246/packages/helm/stable
# Update helm repo
helm repo update

# Enable oceanbase 
helm upgrade -i oceanbase-ce kubeblocks-addons/oceanbase-ce --version 0.9.0 -n kb-system  
  1. Create the cluster: k apply -f cluster.yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: oceanbase-cluster1
  namespace: default
  annotations:
    # Specifies how many clusters to create within one ob cluster; set to 2 when creating a primary and a secondary cluster
    "kubeblocks.io/extra-env": "{\"TENANT_NAME\":\"obtenant\",\"ZONE_COUNT\":\"1\",\"OB_CLUSTERS_COUNT\":\"2\",\"TENANT_CPU\":\"2\",\"TENANT_MEMORY\":\"2G\",\"TENANT_DISK\":\"5G\"}"
spec:
  # Specifies the cluster termination policy.
  # - DoNotTerminate will block delete operation.
  # - Halt will delete workload resources such as statefulset, deployment workloads but keep PVCs.
  # - Delete is based on Halt and deletes PVCs.
  # - WipeOut is based on Delete and wipe out all volume snapshots and snapshot data from backup storage location.
  terminationPolicy: Delete
  # The cluster-level configuration is used as the default configuration of all components;
  # if the affinity and tolerations exists in a component, the component-level configuration
  # will take effect and cover the default cluster-level configuration
  affinity:
    # Specifies the anti-affinity level of pods within a component.
    # - Preferred
    # - Required
    podAntiAffinity: Preferred
    # Represents the key of node labels.
    topologyKeys:
      - kubernetes.io/hostname
    # Defines how pods are distributed across nodes.
    # - SharedNode
    # - DedicatedNode
    tenancy: SharedNode
  # Attached to tolerate any taint that matches the triple `key,value,effect` using the matching operator `operator`.
  tolerations:
    - key: kb-data
      operator: Equal
      value: "true"
      effect: NoSchedule
  # List of componentSpec used to define the components that make up a cluster.
  # ComponentSpecs and ShardingSpecs cannot both be empty at the same time.
  # ClusterComponentSpec defines the specifications for a cluster component.
  componentSpecs:
      # Specifies the name of the cluster's component.
      # This name is also part of the Service DNS name and must comply with the IANA Service Naming rule.
    - name: ob-ce-0
      # References the componentDef defined in the ClusterDefinition spec. Must comply with the IANA Service Naming rule.
      # - ob-ce-repl, will use container network
      # - ob-ce-repl-host, will use host network
      componentDef: ob-ce-repl
      # Specifies the number of component replicas.
      replicas: 1
      # Specifies the resources requests and limits of the workload.
      resources:
        limits:
          cpu: "3"
          memory: "8Gi"
        requests:
          cpu: "3"
          memory: "8Gi"
      # Provides information for statefulset.spec.volumeClaimTemplates.
      volumeClaimTemplates:
        # Refers to `clusterDefinition.spec.componentDefs.containers.volumeMounts.name`.
        - name: data-file
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 50Gi
        - name: data-log
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 50Gi
        - name: log
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 10Gi
        - name: workdir
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 20Gi
    - name: ob-ce-1
      # References the componentDef defined in the ClusterDefinition spec. Must comply with the IANA Service Naming rule.
      # - ob-ce-repl, will use container network
      # - ob-ce-repl-host, will use host network
      componentDef: ob-ce-repl
      # Specifies the number of component replicas.
      replicas: 1
      # Specifies the resources requests and limits of the workload.
      resources:
        limits:
          cpu: "3"
          memory: "8Gi"
        requests:
          cpu: "3"
          memory: "8Gi"
      # Provides information for statefulset.spec.volumeClaimTemplates.
      volumeClaimTemplates:
        # Refers to `clusterDefinition.spec.componentDefs.containers.volumeMounts.name`.
        - name: data-file
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 50Gi
        - name: data-log
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 50Gi
        - name: log
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 10Gi
        - name: workdir
          spec:
            # Contains the desired access modes the volume should have.
            accessModes:
              - ReadWriteOnce
            # Represents the minimum resources the volume should have.
            resources:
              requests:
                storage: 20Gi
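
The `kubeblocks.io/extra-env` annotation in the manifest above is a JSON object serialized into a YAML string, which is why every inner quote is escaped. As a minimal sketch, such a value could be generated instead of escaped by hand. The helper name `extra_env_annotation` is hypothetical and not part of KubeBlocks; the keys are simply the ones used in this issue.

```python
import json


def extra_env_annotation(env: dict) -> str:
    """Return a double-quoted scalar usable as the annotation value.

    Hypothetical helper, for illustration only.
    """
    # Compact JSON (no spaces) to match the value shown in cluster.yaml.
    compact = json.dumps(env, separators=(",", ":"))
    # For plain ASCII content like this, a JSON string literal is also a
    # valid YAML double-quoted scalar, so encoding a second time adds the
    # surrounding quotes and the \" escapes.
    return json.dumps(compact)


env = {
    "TENANT_NAME": "obtenant",
    "ZONE_COUNT": "1",
    "OB_CLUSTERS_COUNT": "2",  # 2 = one primary and one secondary cluster
    "TENANT_CPU": "2",
    "TENANT_MEMORY": "2G",
    "TENANT_DISK": "5G",
}

if __name__ == "__main__":
    print(extra_env_annotation(env))
```

Decoding the value twice with `json.loads` gives back the original dictionary, which is a quick way to check that an annotation edited by hand is still well-formed.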
  2. Before switchover
➜  ~ kbcli cluster describe oceanbase-cluster1
Name: oceanbase-cluster1	 Created Time: Apr 11,2024 18:27 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION   STATUS    TERMINATION-POLICY
default                                    Running   Delete

Endpoints:
COMPONENT   MODE        INTERNAL                                                              EXTERNAL
ob-ce-0     ReadWrite   oceanbase-cluster1-ob-ce-0-oceanbase.default.svc.cluster.local:2881   <none>
                        oceanbase-cluster1-ob-ce-0-oceanbase.default.svc.cluster.local:2882
ob-ce-1     ReadWrite   oceanbase-cluster1-ob-ce-1-oceanbase.default.svc.cluster.local:2881   <none>
                        oceanbase-cluster1-ob-ce-1-oceanbase.default.svc.cluster.local:2882

Topology:
COMPONENT   INSTANCE                       ROLE      STATUS    AZ              NODE                                                  CREATED-TIME
ob-ce-0     oceanbase-cluster1-ob-ce-0-0   primary   Running   us-central1-c   gke-yijing-default-pool-ea930834-23wf/10.128.0.6      Apr 11,2024 18:27 UTC+0800
ob-ce-1     oceanbase-cluster1-ob-ce-1-0   standby   Running   us-central1-c   gke-yijing-default-pool-ea930834-hq9p/10.128.15.238   Apr 11,2024 18:27 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE     STORAGE-CLASS
ob-ce-0     false       3 / 3                8Gi / 8Gi               data-file:50Gi   kb-default-sc
                                                                     data-log:50Gi    kb-default-sc
                                                                     log:10Gi         kb-default-sc
                                                                     workdir:20Gi     kb-default-sc
ob-ce-1     false       3 / 3                8Gi / 8Gi               data-file:50Gi   kb-default-sc
                                                                     data-log:50Gi    kb-default-sc
                                                                     log:10Gi         kb-default-sc
                                                                     workdir:20Gi     kb-default-sc

Images:
COMPONENT   TYPE   IMAGE
ob-ce-0            docker.io/apecloud/oceanbase:4.2.0.0-100010032023083021
ob-ce-1            docker.io/apecloud/oceanbase:4.2.0.0-100010032023083021

Show cluster events: kbcli cluster list-events -n default oceanbase-cluster1
  3. Switchover: k apply -f switchover.yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: oceanbase-switchover
spec:
  # References the cluster object.
  clusterRef: oceanbase-cluster1
  # Defines the operation type.
  type: Switchover
  # Switches over the specified components.
  switchover:
    # Specifies the name of the cluster component.
  - componentName: ob-ce-1
    # If assigned "*", it signifies that no specific primary or leader is designated for the switchover
    instanceName: '*'
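
When retesting against several components or KubeBlocks builds, the same OpsRequest can be generated rather than edited by hand. The sketch below only mirrors the fields of the switchover.yaml above; the function name `switchover_ops_request` is hypothetical, and nothing here is checked against the KubeBlocks API schema.

```python
import json


def switchover_ops_request(name: str, cluster: str, component: str,
                           instance: str = "*") -> dict:
    """Build a Switchover OpsRequest manifest as a plain dict.

    Hypothetical helper; field names are copied from switchover.yaml.
    """
    return {
        "apiVersion": "apps.kubeblocks.io/v1alpha1",
        "kind": "OpsRequest",
        "metadata": {"name": name},
        "spec": {
            "clusterRef": cluster,
            "type": "Switchover",
            # "*" means no specific instance is designated as the new primary.
            "switchover": [
                {"componentName": component, "instanceName": instance}
            ],
        },
    }


if __name__ == "__main__":
    manifest = switchover_ops_request(
        "oceanbase-switchover", "oceanbase-cluster1", "ob-ce-1"
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, or piped to `kubectl apply -f -`.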
  4. The ops failed
➜  ~ k describe ops oceanbase-switchover
Name:         oceanbase-switchover
Namespace:    default
Labels:       app.kubernetes.io/instance=oceanbase-cluster1
              ops.kubeblocks.io/ops-type=Switchover
Annotations:  <none>
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         OpsRequest
Metadata:
  Creation Timestamp:  2024-04-11T10:32:35Z
  Finalizers:
    opsrequest.kubeblocks.io/finalizer
  Generation:  1
  Managed Fields:
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:clusterRef:
        f:switchover:
          .:
          k:{"componentName":"ob-ce-1"}:
            .:
            f:componentName:
            f:instanceName:
        f:ttlSecondsBeforeAbort:
        f:type:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2024-04-11T10:32:35Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"opsrequest.kubeblocks.io/finalizer":
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:ops.kubeblocks.io/ops-type:
        f:ownerReferences:
          .:
          k:{"uid":"981e35db-5cd0-4828-bae3-3b9efb075aac"}:
    Manager:      manager
    Operation:    Update
    Time:         2024-04-11T10:32:35Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:completionTimestamp:
        f:conditions:
          .:
          k:{"type":"Validated"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
          k:{"type":"WaitForProgressing"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
        f:phase:
        f:progress:
    Manager:      manager
    Operation:    Update
    Subresource:  status
    Time:         2024-04-11T10:32:35Z
  Owner References:
    API Version:     apps.kubeblocks.io/v1alpha1
    Kind:            Cluster
    Name:            oceanbase-cluster1
    UID:             981e35db-5cd0-4828-bae3-3b9efb075aac
  Resource Version:  2807965
  UID:               357aaab0-5f6f-4d9b-9165-00fa7786f503
Spec:
  Cluster Ref:  oceanbase-cluster1
  Switchover:
    Component Name:          ob-ce-1
    Instance Name:           *
  Ttl Seconds Before Abort:  0
  Type:                      Switchover
Status:
  Completion Timestamp:  2024-04-11T10:32:35Z
  Conditions:
    Last Transition Time:  2024-04-11T10:32:35Z
    Message:               wait for the controller to process the OpsRequest: oceanbase-switchover in Cluster: oceanbase-cluster1
    Reason:                WaitForProgressing
    Status:                True
    Type:                  WaitForProgressing
    Last Transition Time:  2024-04-11T10:32:35Z
    Message:               this cluster component ob-ce-1 does not support switchover
    Reason:                ValidateFailed
    Status:                False
    Type:                  Validated
  Phase:                   Failed
  Progress:                -/-
Events:
  Type     Reason              Age    From                    Message
  ----     ------              ----   ----                    -------
  Normal   WaitForProgressing  4m52s  ops-request-controller  wait for the controller to process the OpsRequest: oceanbase-switchover in Cluster: oceanbase-cluster1
  Warning  ValidateFailed      4m52s  ops-request-controller  this cluster component ob-ce-1 does not support switchover
➜  ~
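
The relevant part of the describe output above is the condition whose status is False. As a rough sketch, the same information could be pulled out of the JSON form of the object (for example from `kubectl get ops oceanbase-switchover -o json`, assuming the `ops` short name used above). The function name `failed_conditions` is hypothetical, and the sample data is abbreviated from the output in this issue.

```python
def failed_conditions(ops: dict) -> list:
    """Return (type, reason, message) for every condition with status "False".

    Hypothetical helper; expects the dict form of an OpsRequest object.
    """
    conditions = ops.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"), c.get("message"))
        for c in conditions
        # Kubernetes condition status is a string, not a boolean.
        if c.get("status") == "False"
    ]


# Abbreviated from the `k describe ops` output in this issue.
sample = {
    "status": {
        "phase": "Failed",
        "conditions": [
            {
                "type": "WaitForProgressing",
                "status": "True",
                "reason": "WaitForProgressing",
                "message": "wait for the controller to process the OpsRequest: "
                           "oceanbase-switchover in Cluster: oceanbase-cluster1",
            },
            {
                "type": "Validated",
                "status": "False",
                "reason": "ValidateFailed",
                "message": "this cluster component ob-ce-1 does not support switchover",
            },
        ],
    }
}

if __name__ == "__main__":
    for ctype, reason, message in failed_conditions(sample):
        print(f"{ctype}: {reason}: {message}")
```

For this issue the only failing condition is `Validated`, so the request was rejected at validation and no switchover was attempted.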

ahjing99, Apr 11 '24 10:04

Still fails on Kubernetes: v1.28.7-gke.1026000, KubeBlocks: 0.9.0-beta.17, kbcli: 0.9.0-beta.4

➜ ~ helm list -Aa | grep oceanbase
oceanbase-ce   kb-system   1   2024-04-30 10:50:15.621908 +0800 CST   deployed   oceanbase-ce-0.9.0   4.2.0.0-100010032023083021

ahjing99, Apr 30 '24 03:04

This issue has been marked as stale because it has been open for 30 days with no activity

github-actions[bot], Jun 03 '24 00:06

Cannot reproduce with 0.9.0-beta.36; closing.

ahjing99, Jun 24 '24 10:06