[BUG] redis-cluster fails to recover after running a benchmark and deleting all pods

Open: JashBook opened this issue 3 months ago • 1 comment

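Describe the bug: After running redis-benchmark against a redis-cluster and then deleting all of the cluster's pods, one shard (rediscl-wizpai-shard-shw) fails to recover. Its redis-cluster container stays in CrashLoopBackOff on both pods and the component is reported as Failed, while the other two shards return to Running.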

kbcli version
Kubernetes: v1.30.4-vke.4
KubeBlocks: 1.0.1
kbcli: 1.0.1

To Reproduce Steps to reproduce the behavior:

  1. create cluster
kubectl apply -f -<<EOF
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: rediscl-wizpai
  namespace: default
spec:
  terminationPolicy: Delete
  shardings:
  - name: shard
    shards: 3
    template:
      name: redis
      componentDef: redis-cluster-7-1.0.1
      serviceVersion: 7.2.4
      replicas: 2
      services:
      - name: redis-advertised
        serviceType: NodePort
        podService: true
      systemAccounts:
      - name: default
        passwordConfig:
          length: 10
          numDigits: 5
          numSymbols: 0
          letterCase: MixedCases
          seed: rediscl-wizpai
      resources:
        limits:
          cpu: 100m
          memory: 0.5Gi
        requests:
          cpu: 100m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: 
            accessModes:
              - ReadWriteOnce
            resources:
               requests:
                 storage: 20Gi
EOF
  2. run redis-benchmark against the cluster
kubectl create -f -<<EOF
apiVersion: v1
kind: Pod
metadata:
  name: benchtest-rediscl-wizpai
  namespace: default
spec:
  containers:
    - name: test-benchmark
      imagePullPolicy: IfNotPresent
      image: apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-benchmark:latest
      args:
        - "-h"
        - "rediscl-wizpai-shard-shw-redis-advertised-0.default.svc.cluster.local"
        - "-p"
        - "6379"
        - "-a"
        - "4xI7V26Z4b"
        - "-n"
        - "5000"
        - "-c"
        - "10"

        - "--cluster"
        - "-q"
      
  restartPolicy: Never
EOF
  3. delete all cluster pods
  4. See error
kubectl get cmp -l app.kubernetes.io/instance=rediscl-wizpai
NAME                       DEFINITION              SERVICE-VERSION   STATUS    AGE
rediscl-wizpai-shard-cbv   redis-cluster-7-1.0.1   7.2.4             Running   30m
rediscl-wizpai-shard-shw   redis-cluster-7-1.0.1   7.2.4             Failed    30m
rediscl-wizpai-shard-svb   redis-cluster-7-1.0.1   7.2.4             Running   30m
kubectl get pod -l app.kubernetes.io/instance=rediscl-wizpai
NAME                         READY   STATUS             RESTARTS      AGE
rediscl-wizpai-shard-cbv-0   3/3     Running            0             18m
rediscl-wizpai-shard-cbv-1   3/3     Running            0             18m
rediscl-wizpai-shard-shw-0   2/3     CrashLoopBackOff   8 (64s ago)   18m
rediscl-wizpai-shard-shw-1   2/3     CrashLoopBackOff   8 (16s ago)   18m
rediscl-wizpai-shard-svb-0   3/3     Running            0             18m
rediscl-wizpai-shard-svb-1   3/3     Running            0             18m
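
The pod-deletion step above does not show the command that was used. A minimal sketch of one way to do it, assuming the pods were deleted by the same instance label that the `kubectl get` commands use (the function name `delete_cluster_pods` is hypothetical, and the reporter may have deleted the pods differently):

```shell
# Delete every pod that carries the instance label of one cluster.
# Assumption: the label app.kubernetes.io/instance=<cluster> selects all cluster pods,
# as it does in the `kubectl get pod -l ...` output above.
delete_cluster_pods() {
  local namespace=$1
  local cluster=$2
  kubectl delete pod -n "$namespace" -l "app.kubernetes.io/instance=$cluster"
}

# Example (not executed here):
#   delete_cluster_pods default rediscl-wizpai
```

Deleting by label removes the pods of all three shards at about the same time, which matches the identical 18m pod ages in the listing above.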

Describe the crashing pod:

kubectl describe pod rediscl-wizpai-shard-shw-0
Name:             rediscl-wizpai-shard-shw-0
Namespace:        default
Priority:         0
Service Account:  kb-redis-cluster-7-1.0.1
Node:             192.168.0.250/192.168.0.250
Start Time:       Mon, 15 Sep 2025 15:43:19 +0800
Labels:           app.kubernetes.io/component=redis-cluster-7-1.0.1
                  app.kubernetes.io/instance=rediscl-wizpai
                  app.kubernetes.io/managed-by=kubeblocks
                  apps.kubeblocks.io/component-name=shard-shw
                  apps.kubeblocks.io/pod-name=rediscl-wizpai-shard-shw-0
                  apps.kubeblocks.io/release-phase=stable
                  apps.kubeblocks.io/service-version=7.2.4
                  apps.kubeblocks.io/sharding-name=shard
                  controller-revision-hash=544655fc48
                  kubeblocks.io/role=primary
                  workloads.kubeblocks.io/instance=rediscl-wizpai-shard-shw
                  workloads.kubeblocks.io/managed-by=InstanceSet
Annotations:      apps.kubeblocks.io/last-role-snapshot-version: 1757923255263080
                  vke.volcengine.com/cello-pod-evict-policy: allow
Status:           Running
IP:               192.168.0.17
IPs:
  IP:           192.168.0.17
Controlled By:  InstanceSet/rediscl-wizpai-shard-shw
Init Containers:
  init-dbctl:
    Container ID:  containerd://3e253d9d9d37ee69c483a0d339b3a5278b1ea11e72ce84eae0de6ca0a65f4363
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.8
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl@sha256:af3024b9bf44b353b670938fb490b9f1e651f52785036895fed69a6bf62e9feb
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /bin/dbctl
      /config
      /tools/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:24 +0800
      Finished:     Mon, 15 Sep 2025 15:43:24 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  init-kbagent:
    Container ID:  containerd://6ebecc27b398ebc034f8b0b4d45f3ddb66404482652a5691c2da45048d36c1ce
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools@sha256:6e0084ec006f707226b29e30b9e6e81d3a2d454152e1a6b4bf5dfdc60edf17c8
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /bin/kbagent
      /kubeblocks/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:25 +0800
      Finished:     Mon, 15 Sep 2025 15:43:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /kubeblocks from kubeblocks (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  kbagent-worker:
    Container ID:  containerd://15b2bb1b9f9060a15433f13a33456bd3185aed34b72887b3b64f01d00096b597
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Port:          <none>
    Host Port:     <none>
    Command:
      /kubeblocks/kbagent
    Args:
      --server=false
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:25 +0800
      Finished:     Mon, 15 Sep 2025 15:43:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:        <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:          rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:             (v1:status.podIP)
      CURRENT_POD_HOST_IP:        (v1:status.hostIP)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_FQDN:               $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
      KB_CLUSTER_COMP_NAME:      $(CURRENT_SHARD_COMPONENT_NAME)
      REDIS_LB_ADVERTISED_HOST:  $(CURRENT_SHARD_LB_ADVERTISED_HOST)
      KB_AGENT_NAMESPACE:        default (v1:metadata.namespace)
      KB_AGENT_POD_NAME:         rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      KB_AGENT_POD_UID:           (v1:metadata.uid)
      KB_AGENT_NODE_NAME:         (v1:spec.nodeName)
      KB_AGENT_ACTION:           [{"name":"postProvision","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --post-provision  \u003e /tmp/post-provision.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"preTerminate","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --pre-terminate \u003e /tmp/pre-terminate.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"switchover","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-switchover.sh  \u003e /tmp/switchover.log 2\u003e\u00261"]}},{"name":"memberLeave","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-replica-member-leave.sh \u003e /tmp/member-leave.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"roleProbe","exec":{"command":["/tools/dbctl","--config-path","/tools/config/dbctl/components","redis","getrole"]},"timeoutSeconds":1}]
      KB_AGENT_PROBE:            [{"instance":"rediscl-wizpai-shard-shw","action":"roleProbe","periodSeconds":1}]
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kubeblocks from kubeblocks (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  install-config-manager-tool:
    Container ID:  containerd://9d5ea6fe2aa1b36668f99906f4ecda4724e8c162c4449e97aad695b6afc072ee
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools@sha256:6e0084ec006f707226b29e30b9e6e81d3a2d454152e1a6b4bf5dfdc60edf17c8
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      /bin/reloader
      /kb_tools
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:27 +0800
      Finished:     Mon, 15 Sep 2025 15:43:27 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /etc/conf from redis-cluster-config (rw)
      /kb_tools from kb-tools (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/redis-cluster-config from cm-script-redis-cluster-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
Containers:
  redis-cluster:
    Container ID:  containerd://87a35cd5d22c4863697fe255fa5be167a45a2df07709ccb91913d21162af569d
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Ports:         6379/TCP, 16379/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /scripts/redis-cluster-server-start.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 16:00:54 +0800
      Finished:     Mon, 15 Sep 2025 16:01:02 +0800
    Ready:          False
    Restart Count:  8
    Limits:
      cpu:                        100m
      memory:                     512Mi
      vke.volcengine.com/eni-ip:  1
    Requests:
      cpu:                        100m
      memory:                     512Mi
      vke.volcengine.com/eni-ip:  1
    Readiness:                    exec [sh -c /scripts/redis-ping.sh] delay=10s timeout=5s period=5s #success=1 #failure=5
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:        rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:           (v1:status.podIP)
      CURRENT_POD_HOST_IP:      (v1:status.hostIP)
      POD_FQDN:                $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kb_tools from kb-tools (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  kbagent:
    Container ID:  containerd://46152cb497d10e12ec44b87c24c484e21aa48da4e2bee816162aa65701de52ef
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Ports:         3501/TCP, 3502/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /kubeblocks/kbagent
    Args:
      --port
      3501
      --streaming-port
      3502
    State:          Running
      Started:      Mon, 15 Sep 2025 15:43:28 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Startup:   tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:        <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:          rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:             (v1:status.podIP)
      CURRENT_POD_HOST_IP:        (v1:status.hostIP)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_FQDN:               $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
      KB_CLUSTER_COMP_NAME:      $(CURRENT_SHARD_COMPONENT_NAME)
      REDIS_LB_ADVERTISED_HOST:  $(CURRENT_SHARD_LB_ADVERTISED_HOST)
      KB_AGENT_NAMESPACE:        default (v1:metadata.namespace)
      KB_AGENT_POD_NAME:         rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      KB_AGENT_POD_UID:           (v1:metadata.uid)
      KB_AGENT_NODE_NAME:         (v1:spec.nodeName)
      KB_AGENT_ACTION:           [{"name":"postProvision","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --post-provision  \u003e /tmp/post-provision.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"preTerminate","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --pre-terminate \u003e /tmp/pre-terminate.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"switchover","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-switchover.sh  \u003e /tmp/switchover.log 2\u003e\u00261"]}},{"name":"memberLeave","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-replica-member-leave.sh \u003e /tmp/member-leave.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"roleProbe","exec":{"command":["/tools/dbctl","--config-path","/tools/config/dbctl/components","redis","getrole"]},"timeoutSeconds":1}]
      KB_AGENT_PROBE:            [{"instance":"rediscl-wizpai-shard-shw","action":"roleProbe","periodSeconds":1}]
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kb_tools from kb-tools (rw)
      /kubeblocks from kubeblocks (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  config-manager:
    Container ID:  containerd://76798595c2e4f6cd804deada953e93a8e71ec8742f92eb5bb5a7c12579f25569
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v18
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:02c6fbcb146638f2943b501ab184d5ebff2c11838c6057f2838d92bd0ab9ee9d
    Port:          9901/TCP
    Host Port:     0/TCP
    Command:
      env
    Args:
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
      /kb_tools/reloader
      --log-level
      info
      --operator-update-enable
      --tcp
      9901
      --config
      /opt/config-manager/config-manager.yaml
    State:          Running
      Started:      Mon, 15 Sep 2025 15:43:28 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CONFIG_MANAGER_POD_IP:    (v1:status.podIP)
      TOOLS_PATH:              /opt/kb-tools/reload/redis-cluster-config:/opt/config-manager:/kb_tools
    Mounts:
      /etc/conf from redis-cluster-config (rw)
      /kb_tools from kb-tools (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/redis-cluster-config from cm-script-redis-cluster-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-rediscl-wizpai-shard-shw-0
    ReadOnly:   false
  redis-conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  tools:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kubeblocks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  redis-cluster-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-cluster-config
    Optional:  false
  redis-metrics-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-metrics-config
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-cluster-scripts
    Optional:  false
  cm-script-redis-cluster-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-redis-reload-tools-script-rediscl-wizpai
    Optional:  false
  config-manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-rediscl-wizpai-shard-shw-config-manager-config
    Optional:  false
  kb-tools:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-fdblv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  19m                   default-scheduler  Successfully assigned default/rediscl-wizpai-shard-shw-0 to 192.168.0.250
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.8" already present on machine
  Normal   Created    19m                   kubelet            Created container init-dbctl
  Normal   Started    19m                   kubelet            Started container init-dbctl
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1" already present on machine
  Normal   Created    19m                   kubelet            Created container init-kbagent
  Normal   Started    19m                   kubelet            Started container init-kbagent
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    19m                   kubelet            Created container kbagent-worker
  Normal   Started    19m                   kubelet            Started container kbagent-worker
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1" already present on machine
  Normal   Created    19m                   kubelet            Created container install-config-manager-tool
  Normal   Started    19m                   kubelet            Started container install-config-manager-tool
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v18" already present on machine
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    19m                   kubelet            Created container kbagent
  Normal   Started    19m                   kubelet            Started container kbagent
  Normal   Created    19m                   kubelet            Created container config-manager
  Normal   Started    19m                   kubelet            Started container config-manager
  Normal   roleProbe  19m (x10 over 2m15s)  kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":0,"output":"cHJpbWFyeQ=="}
  Normal   roleProbe  19m                   kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":0,"output":"c2Vjb25kYXJ5"}
  Normal   Pulled     18m (x2 over 19m)     kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    18m (x2 over 19m)     kubelet            Created container redis-cluster
  Normal   Started    18m (x2 over 19m)     kubelet            Started container redis-cluster
  Normal   roleProbe  18m (x1080 over 0s)   kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":-1,"output":"cHJpbWFyeQ==","message":"exit code: 1: failed"}
  Warning  BackOff    4m46s (x67 over 18m)  kubelet            Back-off restarting failed container redis-cluster in pod rediscl-wizpai-shard-shw-0_default(d16ed53d-7cd1-4cdd-a488-bacbf9180a89)
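
The `output` field of the roleProbe events above is base64-encoded. A small sketch for decoding it when reading the events (the helper name `decode_probe_output` is hypothetical, and `base64 -d` is assumed to be available, as it is with GNU coreutils):

```shell
# Decode the base64 "output" value of a kbagent roleProbe event.
decode_probe_output() {
  printf '%s' "$1" | base64 -d
}

decode_probe_output 'cHJpbWFyeQ=='; echo   # primary
decode_probe_output 'c2Vjb25kYXJ5'; echo   # secondary
```

So the pod reported `primary`, briefly `secondary`, and then the probe kept returning the last known `primary` value with `code: -1` once the redis-cluster container started crashing.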

Logs from the failing pods:

kubectl logs rediscl-wizpai-shard-shw-0 --tail 50
Defaulted container "redis-cluster" out of: redis-cluster, kbagent, config-manager, init-dbctl (init), init-kbagent (init), kbagent-worker (init), install-config-manager-tool (init)
+ send_cluster_meet_with_retry 192.168.0.250 31267 192.168.0.226 30228 30325
+ local primary_endpoint=192.168.0.250
+ local primary_port=31267
+ local announce_ip=192.168.0.226
+ local announce_port=30228
+ local announce_bus_port=30325
++ call_func_with_retry 3 2 send_cluster_meet 192.168.0.250 31267 192.168.0.226 30228 30325
++ local max_retries=3
++ local retry_interval=2
++ local function_name=send_cluster_meet
++ shift 3
++ local retries=0
++ true
++ send_cluster_meet 192.168.0.250 31267 192.168.0.226 30228 30325
++ local primary_endpoint=192.168.0.250
++ local primary_port=31267
++ local announce_ip=192.168.0.226
++ local announce_port=30228
++ local announce_bus_port=30325
++ unset_xtrace_when_ut_mode_false
++ '[' false == false ']'
++ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 1 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 2 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed after 3 retries.
+ send_cluster_meet_result='check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325'
+ status=1
+ '[' 1 -ne 0 ']'
+ echo 'Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry'
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry
+ return 1
+ echo 'Failed to meet the node 192.168.0.226'
Failed to meet the node 192.168.0.226
+ shutdown_redis_server 6379
+ local service_port=6379
+ unset_xtrace_when_ut_mode_false
+ '[' false == false ']'
+ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
kubectl logs rediscl-wizpai-shard-shw-1 --tail 50
Defaulted container "redis-cluster" out of: redis-cluster, kbagent, config-manager, init-dbctl (init), init-kbagent (init), kbagent-worker (init), install-config-manager-tool (init)
+ send_cluster_meet_with_retry 192.168.0.85 30656 192.168.0.226 30228 30325
+ local primary_endpoint=192.168.0.85
+ local primary_port=30656
+ local announce_ip=192.168.0.226
+ local announce_port=30228
+ local announce_bus_port=30325
++ call_func_with_retry 3 2 send_cluster_meet 192.168.0.85 30656 192.168.0.226 30228 30325
++ local max_retries=3
++ local retry_interval=2
++ local function_name=send_cluster_meet
++ shift 3
++ local retries=0
++ true
++ send_cluster_meet 192.168.0.85 30656 192.168.0.226 30228 30325
++ local primary_endpoint=192.168.0.85
++ local primary_port=30656
++ local announce_ip=192.168.0.226
++ local announce_port=30228
++ local announce_bus_port=30325
++ unset_xtrace_when_ut_mode_false
++ '[' false == false ']'
++ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 1 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 2 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed after 3 retries.
+ send_cluster_meet_result='check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325'
+ status=1
+ '[' 1 -ne 0 ']'
+ echo 'Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry'
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry
+ return 1
+ echo 'Failed to meet the node 192.168.0.226'
Failed to meet the node 192.168.0.226
+ shutdown_redis_server 6379
+ local service_port=6379
+ unset_xtrace_when_ut_mode_false
+ '[' false == false ']'
+ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
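
The xtrace lines above show a helper invoked as `call_func_with_retry 3 2 send_cluster_meet ...`. The sketch below is a reconstruction inferred only from that trace and its messages, not the actual KubeBlocks script. It illustrates why each container start gives up after three attempts spaced 2 seconds apart. `flaky_meet` is a hypothetical stand-in for `send_cluster_meet`.

```shell
# Hypothetical reconstruction of a retry helper, based only on the trace output.
# Usage: call_func_with_retry <max_retries> <retry_interval_seconds> <function> [args...]
call_func_with_retry() {
  local max_retries=$1
  local retry_interval=$2
  local function_name=$3
  shift 3
  local retries=0
  while true; do
    # Run the wrapped function; stop as soon as it succeeds.
    if "$function_name" "$@"; then
      return 0
    fi
    retries=$((retries + 1))
    if [ "$retries" -ge "$max_retries" ]; then
      echo "Function '$function_name' failed after $max_retries retries." >&2
      return 1
    fi
    echo "Function '$function_name' failed in $retries times. Retrying in $retry_interval seconds..." >&2
    sleep "$retry_interval"
  done
}

# Stand-in for send_cluster_meet: fails twice, succeeds on the third attempt.
attempts=0
flaky_meet() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

# Interval 0 keeps the demo fast; the trace above uses 2 seconds.
call_func_with_retry 3 0 flaky_meet
```

With the values from the trace (3 attempts, 2 seconds apart), the target endpoint has only a few seconds to come back before the script shuts Redis down and the container restarts.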
kubectl logs rediscl-wizpai-shard-shw-0 --tail 10 kbagent
2025-09-15T08:04:51Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:52Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:53Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:54Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:55Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:56Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:57Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:58Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:59Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:05:00Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
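
In the redis-cluster logs above, each pod fails against a different endpoint (192.168.0.250:31267 for shw-0 and 192.168.0.85:30656 for shw-1). A small sketch for pulling the refused endpoints out of a log stream, assuming the `Could not connect to Redis at ...: Connection refused` line format shown above (the helper name `refused_endpoints` is hypothetical):

```shell
# Print each unique host:port that redis-cli reported as refusing connections.
# Reads log text on stdin.
refused_endpoints() {
  sed -n 's/^Could not connect to Redis at \([^ ]*\): Connection refused$/\1/p' | sort -u
}

# Example (not executed here):
#   kubectl logs rediscl-wizpai-shard-shw-0 -c redis-cluster --tail 50 | refused_endpoints
```

Comparing those endpoints with the NodePort services of the cluster may help show which peer each pod is trying to reach.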

Expected behavior: After all pods are deleted, every shard recovers and the cluster returns to Running.

JashBook, Sep 15 '25 08:09

This issue has been marked as stale because it has been open for 30 days with no activity.

github-actions[bot], Oct 20 '25 00:10