
[BUG] MySQL pod fault recovery time is long

Open ahjing99 opened this issue 1 year ago • 3 comments

➜ ~ kbcli version
Kubernetes: v1.26.5-gke.1200
KubeBlocks: 0.6.0-alpha.34
kbcli: 0.6.0-alpha.34

When an I/O fault is injected into both the leader pod and a follower pod, the cluster cannot be connected to for a while. Once 2 pods are running again and the cluster has recovered, it takes another 250s for the third pod to fully recover.

  1. Before injecting the fault
➜  ~ kbcli cluster describe mycluster
Name: mycluster	 Created Time: Jul 11,2023 14:37 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION           STATUS    TERMINATION-POLICY
default     apecloud-mysql       ac-mysql-8.0.30   Running   WipeOut

Endpoints:
COMPONENT   MODE        INTERNAL                                         EXTERNAL
mysql       ReadWrite   mycluster-mysql.default.svc.cluster.local:3306   <none>

Topology:
COMPONENT   INSTANCE            ROLE       STATUS    AZ              NODE                                                CREATED-TIME
mysql       mycluster-mysql-0   follower   Running   us-central1-c   gke-yjtest-default-pool-50025fd7-t1wz/10.128.0.36   Jul 12,2023 18:12 UTC+0800
mysql       mycluster-mysql-1   follower   Running   us-central1-c   gke-yjtest-default-pool-50025fd7-t1wz/10.128.0.36   Jul 12,2023 18:12 UTC+0800
mysql       mycluster-mysql-2   leader     Running   us-central1-c   gke-yjtest-default-pool-50025fd7-t1wz/10.128.0.36   Jul 12,2023 18:12 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
mysql       false       1 / 10               1Gi / 20Gi              data:20Gi      standard-rwo

Images:
COMPONENT   TYPE    IMAGE
mysql       mysql   registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-server:8.0.30-5.alpha9.20230606.gf80d546.9

Data Protection:
AUTO-BACKUP   BACKUP-SCHEDULE   TYPE   BACKUP-TTL   LAST-SCHEDULE   RECOVERABLE-TIME

Show cluster events: kbcli cluster list-events -n default mycluster
  2. Inject an I/O fault into 1 leader pod and 1 follower pod
➜  ~ kbcli fault io errno mycluster-mysql-2 mycluster-mysql-0 --ns-fault=default --volume-path=/data/mysql  --errno=6 --duration=2m
IOChaos io-chaos-c5wq2 created

➜  ~  k get pod -o wide| grep mycluster
mycluster-mysql-0   4/5     Error              8 (10m ago)   37m     10.104.3.25   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>
mycluster-mysql-1   5/5     Running            6 (10m ago)   37m     10.104.3.24   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>
mycluster-mysql-2   4/5     CrashLoopBackOff   4 (9s ago)    37m     10.104.3.26   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>
  3. After 23 seconds, mycluster-mysql-2 recovered and the cluster could be connected to again
➜  ~  k get pod -o wide| grep mycluster
mycluster-mysql-0   4/5     CrashLoopBackOff   8 (22s ago)   37m     10.104.3.25   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>
mycluster-mysql-1   5/5     Running            6 (11m ago)   37m     10.104.3.24   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>
mycluster-mysql-2   5/5     Running            5 (24s ago)   37m     10.104.3.26   gke-yjtest-default-pool-50025fd7-t1wz   <none>           <none>

Connect cluster 2023-07-12 18:49:47
Fail to connect cluster 2023-07-12 18:49:50
runningToStopTime - 2023-07-12 18:49:50
Fail to connect cluster 2023-07-12 18:49:53
Fail to connect cluster 2023-07-12 18:49:56
Fail to connect cluster 2023-07-12 18:49:59
Fail to connect cluster 2023-07-12 18:50:01
6---6
Connect cluster 2023-07-12 18:50:13
runningToStopTime - 2023-07-12 18:49:50
stopToRunningTime - 2023-07-12 18:50:13
Time interval since MySQL started: 23 seconds
  4. It takes about another 250s (273 - 23) for the mycluster-mysql-0 pod and the cluster to finally return to Running
cluster_status:Abnormal
check cluster status done
cluster_status:Running
check pod status
check pod status done
fullRecoveryTime is: 273
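
The two figures above (23s until the cluster accepts connections, 273s until full recovery) can be cross-checked from the probe log itself. The following is a minimal sketch, not the script that produced the output: the function name `outage_seconds` and the parsing rules are assumptions made for illustration, and the log lines are copied from step 3.

```python
from datetime import datetime

# Probe log lines copied from step 3 of this report.
LOG = """\
Connect cluster 2023-07-12 18:49:47
Fail to connect cluster 2023-07-12 18:49:50
Fail to connect cluster 2023-07-12 18:49:53
Fail to connect cluster 2023-07-12 18:49:56
Fail to connect cluster 2023-07-12 18:49:59
Fail to connect cluster 2023-07-12 18:50:01
Connect cluster 2023-07-12 18:50:13
"""

FMT = "%Y-%m-%d %H:%M:%S"
FAIL = "Fail to connect cluster "
OK = "Connect cluster "


def outage_seconds(log: str) -> int:
    """Seconds from the first failed probe to the next successful probe."""
    first_fail = None
    for line in log.splitlines():
        line = line.strip()
        if line.startswith(FAIL):
            if first_fail is None:
                # Corresponds to runningToStopTime in the output above.
                first_fail = datetime.strptime(line[len(FAIL):], FMT)
        elif line.startswith(OK) and first_fail is not None:
            # Corresponds to stopToRunningTime in the output above.
            recovered = datetime.strptime(line[len(OK):], FMT)
            return int((recovered - first_fail).total_seconds())
    raise ValueError("no complete outage found in log")


outage = outage_seconds(LOG)
full_recovery = 273  # fullRecoveryTime reported above
extra = full_recovery - outage  # time spent waiting for the third pod
print(outage, extra)
```

With the log above this gives an outage of 23 seconds and 250 additional seconds before the last pod is back, matching the numbers in steps 3 and 4.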

➜  ~ k describe pod mycluster-mysql-0
Name:         mycluster-mysql-0
Namespace:    default
Priority:     0
Node:         gke-yjtest-default-pool-50025fd7-t1wz/10.128.0.36
Start Time:   Wed, 12 Jul 2023 18:17:13 +0800
Labels:       app.kubernetes.io/component=mysql
              app.kubernetes.io/instance=mycluster
              app.kubernetes.io/managed-by=kubeblocks
              app.kubernetes.io/name=apecloud-mysql
              app.kubernetes.io/version=ac-mysql-8.0.30
              apps.kubeblocks.io/component-name=mysql
              apps.kubeblocks.io/workload-type=Consensus
              controller-revision-hash=mycluster-mysql-bd48d986b
              cs.apps.kubeblocks.io/access-mode=Readonly
              kubeblocks.io/role=follower
              statefulset.kubernetes.io/pod-name=mycluster-mysql-0
Annotations:  apps.kubeblocks.io/component-replicas: 3
              cloud.google.com/cluster_autoscaler_unhelpable_since: 2023-07-12T10:12:36+0000
              cloud.google.com/cluster_autoscaler_unhelpable_until: Inf
              cs.apps.kubeblocks.io/leader: mycluster-mysql-1
Status:       Running
IP:           10.104.3.25
IPs:
  IP:           10.104.3.25
Controlled By:  StatefulSet/mycluster-mysql
Containers:
  mysql:
    Container ID:  containerd://b580f05a93059f4f858d9361f98bcce0fce0e4bea2b288ce4ca8268bc1aec906
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-server:8.0.30-5.alpha9.20230606.gf80d546.9
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-server@sha256:e0aab7adb30f883f9b38d1d47d0456e50b9eee688f0d70624cc804151775e18c
    Ports:         3306/TCP, 13306/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /scripts/setup.sh
    State:          Running
      Started:      Wed, 12 Jul 2023 18:54:17 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Wed, 12 Jul 2023 18:44:11 +0800
      Finished:     Wed, 12 Jul 2023 18:49:50 +0800
    Ready:          True
    Restart Count:  9
    Limits:
      cpu:     10
      memory:  20Gi
    Requests:
      cpu:     1
      memory:  1Gi
    Environment Variables from:
      mycluster-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mycluster-mysql-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mycluster
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mycluster-mysql
      KB_CLUSTER_UID_POSTFIX_8:  50a12a39
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      MYSQL_ROOT_HOST:           %
      MYSQL_ROOT_USER:           <set to the key 'username' in secret 'mycluster-conn-credential'>  Optional: false
      MYSQL_ROOT_PASSWORD:       <set to the key 'password' in secret 'mycluster-conn-credential'>  Optional: false
      MYSQL_DATABASE:            mydb
      MYSQL_USER:                u1
      MYSQL_PASSWORD:            u1
      CLUSTER_ID:                1
      CLUSTER_START_INDEX:       1
      REPLICATION_USER:          replicator
      REPLICATION_PASSWORD:
      MYSQL_TEMPLATE_CONFIG:
      MYSQL_CUSTOM_CONFIG:
      MYSQL_DYNAMIC_CONFIG:
      KB_EMBEDDED_WESQL:         1
    Mounts:
      /data/mysql from data (rw)
      /etc/annotations from annotations (rw)
      /opt/mysql from mysql-config (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xp76w (ro)
  metrics:
    Container ID:  containerd://9bb16f0bc653cc60763c6d69daa907f0531a6c5d4f079baf4c8c2296015021b9
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto@sha256:cbab349b90490807a8d5039bf01bc7e37334f20c98c7dd75bc7fc4cf9e5b10ee
    Port:          9104/TCP
    Host Port:     0/TCP
    Command:
      /bin/agamotto
      --config=/opt/agamotto/agamotto-config.yaml
    State:          Running
      Started:      Wed, 12 Jul 2023 18:17:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mycluster-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mycluster-mysql-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mycluster
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mycluster-mysql
      KB_CLUSTER_UID_POSTFIX_8:  50a12a39
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      DB_TYPE:                   MySQL
      ENDPOINT:                  localhost:3306
      MYSQL_USER:                <set to the key 'username' in secret 'mycluster-conn-credential'>  Optional: false
      MYSQL_PASSWORD:            <set to the key 'password' in secret 'mycluster-conn-credential'>  Optional: false
    Mounts:
      /data/mysql from data (rw)
      /opt/agamotto from agamotto-configuration (rw)
      /var/log/kubeblocks from log-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xp76w (ro)
  vttablet:
    Container ID:  containerd://dd6b6b68fd23c760882f88d5eb973e3fb175fae2f2f003bc9b9f514667b8a26d
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-scale:latest
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-scale@sha256:3d084792864fc3a2c3b470ebb5c8366db3fea40f74d7b3425783b26813615f86
    Ports:         15100/TCP, 16100/TCP, 40000/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /scripts/vttablet.sh
    State:          Running
      Started:      Wed, 12 Jul 2023 18:17:26 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mycluster-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mycluster-mysql-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mycluster
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mycluster-mysql
      KB_CLUSTER_UID_POSTFIX_8:  50a12a39
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      CELL:                      zone1
      ETCD_SERVER:               $(KB_CLUSTER_NAME)-etcd-headless
      ETCD_PORT:                 2379
      TOPOLOGY_FLAGS:            --topo_implementation etcd2 --topo_global_server_address $(ETCD_SERVER):$(ETCD_PORT) --topo_global_root /vitess/global
      VTTABLET_PORT:             15100
      VTTABLET_GRPC_PORT:        16100
      VTCTLD_HOST:               $(KB_CLUSTER_NAME)-vtctld-headless
      VTCTLD_WEB_PORT:           15000
      MYSQL_ROOT_USER:           <set to the key 'username' in secret 'mycluster-conn-credential'>  Optional: false
      MYSQL_ROOT_PASSWORD:       <set to the key 'password' in secret 'mycluster-conn-credential'>  Optional: false
    Mounts:
      /conf from mysql-scale-config (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xp76w (ro)
  kb-checkrole:
    Container ID:  containerd://c3f2f093783180c6c415fd87b29909fabdef6fbc6bf3a3e5ee8375e14fdb984b
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-alpha.34
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools@sha256:f33759f2607a8e50fe8b63e181c41e492d536a25a066e853ee08db3a40ffb92f
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      probe
      --app-id
      batch-sdk
      --dapr-http-port
      3501
      --dapr-grpc-port
      50001
      --log-level
      info
      --config
      /config/probe/config.yaml
      --components-path
      /config/probe/components
    State:          Running
      Started:      Wed, 12 Jul 2023 18:17:26 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Readiness:  http-get http://:3501/v1.0/bindings/mysql%3Foperation=checkRole delay=0s timeout=1s period=1s #success=1 #failure=2
    Startup:    tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      mycluster-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                mycluster-mysql-0 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_CLUSTER_NAME:            mycluster
      KB_COMP_NAME:               mysql
      KB_CLUSTER_COMP_NAME:       mycluster-mysql
      KB_CLUSTER_UID_POSTFIX_8:   50a12a39
      KB_POD_FQDN:                $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_USER:            <set to the key 'username' in secret 'mycluster-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:        <set to the key 'password' in secret 'mycluster-conn-credential'>  Optional: false
      KB_SERVICE_PORT:            3306
      KB_SERVICE_ROLES:           {"follower":"Readonly","leader":"ReadWrite","learner":"Readonly"}
      KB_SERVICE_CHARACTER_TYPE:  mysql
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xp76w (ro)
  config-manager:
    Container ID:  containerd://5902854e683f293d5371394d74a2817ae3f45e57fe8216f4c5eadb5799f6dc11
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-alpha.34
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools@sha256:f33759f2607a8e50fe8b63e181c41e492d536a25a066e853ee08db3a40ffb92f
    Port:          <none>
    Host Port:     <none>
    Command:
      env
    Args:
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
      /bin/reloader
      --log-level
      info
      --volume-dir
      /conf
      --operator-update-enable
      --tcp
      9901
      --config
      /opt/config-manager/config-manager.yaml
    State:          Running
      Started:      Wed, 12 Jul 2023 18:17:27 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mycluster-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mycluster-mysql-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mycluster
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mycluster-mysql
      KB_CLUSTER_UID_POSTFIX_8:  50a12a39
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      CONFIG_MANAGER_POD_IP:      (v1:status.podIP)
      DB_TYPE:                   mysql
      MYSQL_USER:                <set to the key 'username' in secret 'mycluster-conn-credential'>  Optional: false
      MYSQL_PASSWORD:            <set to the key 'password' in secret 'mycluster-conn-credential'>  Optional: false
      DATA_SOURCE_NAME:          $(MYSQL_USER):$(MYSQL_PASSWORD)@(localhost:3306)/
      TOOLS_PATH:                /opt/kb-tools/reload/mysql-consensusset-config:/opt/config-manager
    Mounts:
      /conf from mysql-scale-config (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/mysql-consensusset-config from cm-script-mysql-consensusset-config (rw)
      /opt/mysql from mysql-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xp76w (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-mycluster-mysql-0
    ReadOnly:   false
  log-data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/kubeblocks
    HostPathType:  DirectoryOrCreate
  annotations:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations['cs.apps.kubeblocks.io/leader'] -> leader
      metadata.annotations['apps.kubeblocks.io/component-replicas'] -> component-replicas
  agamotto-configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mycluster-mysql-agamotto-configuration
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mycluster-mysql-apecloud-mysql-scripts
    Optional:  false
  mysql-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mycluster-mysql-mysql-consensusset-config
    Optional:  false
  mysql-scale-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mycluster-mysql-mysql-scale-config
    Optional:  false
  cm-script-mysql-consensusset-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-mysql-reload-script-mycluster
    Optional:  false
  config-manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-mycluster-mysql-config-manager-config
    Optional:  false
  kube-api-access-xp76w:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 kb-data=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Normal   NotTriggerScaleUp       43m                  cluster-autoscaler       pod didn't trigger scale-up (it wouldn't fit if a new node is added):
  Warning  FailedScheduling        43m (x4 over 43m)    default-scheduler        0/4 nodes are available: 1 node(s) had untolerated taint {kb-controller: true}, 3 node(s) had untolerated taint {node.kubernetes.io/memory-pressure: }. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..
  Normal   Scheduled               39m                  default-scheduler        Successfully assigned default/mycluster-mysql-0 to gke-yjtest-default-pool-50025fd7-t1wz
  Warning  FailedMount             39m                  kubelet                  MountVolume.SetUp failed for volume "mysql-config" : failed to sync configmap cache: timed out waiting for the condition
  Warning  FailedMount             39m                  kubelet                  MountVolume.SetUp failed for volume "scripts" : failed to sync configmap cache: timed out waiting for the condition
  Normal   SuccessfulAttachVolume  39m                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-f5fcacf2-56ea-48c5-9963-32d149029ddd"
  Normal   Created                 39m                  kubelet                  Created container metrics
  Normal   Pulled                  39m                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-scale:latest" already present on machine
  Normal   Started                 39m                  kubelet                  Started container metrics
  Normal   Pulled                  39m                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1" already present on machine
  Normal   Pulled                  39m                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-alpha.34" already present on machine
  Normal   Started                 39m                  kubelet                  Started container kb-checkrole
  Normal   Pulled                  39m                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-alpha.34" already present on machine
  Normal   Created                 39m                  kubelet                  Created container vttablet
  Normal   Started                 39m                  kubelet                  Started container vttablet
  Normal   Created                 39m                  kubelet                  Created container kb-checkrole
  Normal   Started                 39m                  kubelet                  Started container config-manager
  Normal   Created                 39m                  kubelet                  Created container config-manager
  Normal   checkRole               38m                  sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"","role":"Leader"}
  Normal   checkRole               33m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Leader"}
  Normal   checkRole               33m                  sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"Leader","role":"Follower"}
  Normal   checkRole               31m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   Created                 31m (x3 over 39m)    kubelet                  Created container mysql
  Normal   Started                 31m (x3 over 39m)    kubelet                  Started container mysql
  Normal   Pulled                  31m (x3 over 39m)    kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-server:8.0.30-5.alpha9.20230606.gf80d546.9" already present on machine
  Normal   checkRole               31m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   checkRole               30m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   checkRole               26m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   checkRole               23m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   checkRole               17m                  sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Normal   checkRole               6m39s                sqlchannel               {"event":"Failed","message":"error executing select CURRENT_LEADER, ROLE, SERVER_ID  from information_schema.wesql_cluster_local: dial tcp 127.0.0.1:3306: connect: connection refused","operation":"checkRole","originalRole":"Follower"}
  Warning  BackOff                 4m2s (x91 over 31m)  kubelet                  Back-off restarting failed container mysql in pod mycluster-mysql-0_default(8dd4fd67-9ae6-4de0-81e0-1b4dc960ac97)
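
The `mysql` container state in the describe output shows how long the container stayed down between its last crash and its latest start. The sketch below computes that gap from the two timestamps and, for comparison, lists the restart back-off schedule documented for the kubelet (10s, 20s, 40s, ... capped at five minutes). Whether that back-off is what actually delays recovery here is an assumption and is not confirmed in this issue; the helper `backoff_delays` is hypothetical and only illustrates the documented schedule.

```python
from datetime import datetime

FMT = "%a, %d %b %Y %H:%M:%S %z"

# Timestamps copied from the `mysql` container section of the describe output.
last_finished = datetime.strptime("Wed, 12 Jul 2023 18:49:50 +0800", FMT)
restarted = datetime.strptime("Wed, 12 Jul 2023 18:54:17 +0800", FMT)

# Time the container was down before the kubelet started it again.
restart_gap = int((restarted - last_finished).total_seconds())


def backoff_delays(restarts: int, base: int = 10, cap: int = 300) -> list[int]:
    """Documented kubelet schedule: the delay doubles per restart up to the cap.

    Hypothetical helper for illustration; the real kubelet also resets the
    back-off after a container has run without problems for 10 minutes.
    """
    return [min(base * 2 ** i, cap) for i in range(restarts)]


print(restart_gap)
print(backoff_delays(9))
```

The gap is 267 seconds, close to the 273s fullRecoveryTime, and with a restart count of 9 the documented schedule would already be at its 300s cap. That makes the restart back-off a plausible contributor worth ruling in or out, but it is not proof of the cause.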

k logs mycluster-mysql-0 -c kb-checkrole > kb-checkrole.txt
(attached: kb-checkrole.txt)

ahjing99 • Jul 12 '23 11:07

Let me know if you need the shell script used to track the time or to run the test case.

ahjing99 • Jul 12 '23 11:07

This issue has been marked as stale because it has been open for 30 days with no activity

github-actions[bot] • Aug 14 '23 00:08

It seems the role can't be probed because MySQL itself is failing.

free6om • Apr 17 '24 03:04