[BUG] redis cluster hscale out shards post-provision pod Error after benchmark

Open JashBook opened this issue 1 year ago • 0 comments

Describe the bug The issue reproduces intermittently in minikube.

To Reproduce Steps to reproduce the behavior:

  1. Create the cluster
kubectl apply -f -<<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: redisc-ozklak
  namespace: default
spec:
  terminationPolicy: DoNotTerminate
  shardingSpecs:
    - name: shard
      shards: 3
      template:
        name: shard-cxk
        componentDef: redis-cluster
        replicas: 1
        switchPolicy:
          type: Noop
        resources:
          limits:
            cpu: 100m
            memory: 0.5Gi
          requests:
            cpu: 100m
            memory: 0.5Gi
        volumeClaimTemplates:
          - name: data
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
EOF
  2. Run the Redis benchmark
kubectl apply -f -<<EOF
apiVersion: v1
kind: Pod
metadata:
  name: benchtest-redisc-ozklak
  namespace: default
spec:
  containers:
    - name: test-benchmark
      imagePullPolicy: IfNotPresent
      image: docker.io/apecloud/redis-benchmark:latest
      args:
        - "-h"
        - "redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless.default.svc"
        - "-p"
        - "6379"
        - "-a"
        - "O3605v7HsS"
        - "-n"
        - "1000"
        - "-c"
        - "2"
        - "--cluster"
        - "-q"
  restartPolicy: Never
EOF
  3. Scale out horizontally to 4 shards
kubectl patch cluster redisc-ozklak --namespace default --type json \
  -p '[{"op": "replace", "path": "/spec/shardingSpecs/0/shards", "value": 4}]'
  4. See the error
kubectl get pod
NAME                                                  READY   STATUS    RESTARTS   AGE
kb-post-provision-job-redisc-ozklak-shard-twh-pjjc8   0/1     Error     0          10m
kb-post-provision-job-redisc-ozklak-shard-twh-tj5bg   0/1     Error     0          9m57s
kb-post-provision-job-redisc-ozklak-shard-twh-vnnds   0/1     Error     0          9m40s
kb-post-provision-job-redisc-ozklak-shard-vvn-bjwm5   0/1     Error     0          9m42s
kb-post-provision-job-redisc-ozklak-shard-vvn-jgp6n   0/1     Error     0          10m
kb-post-provision-job-redisc-ozklak-shard-vvn-p5cdn   0/1     Error     0          10m
redisc-ozklak-shard-27h-0                             3/3     Running   0          8m18s
redisc-ozklak-shard-6s8-0                             3/3     Running   0          10m
redisc-ozklak-shard-twh-0                             3/3     Running   0          10m
redisc-ozklak-shard-vvn-0                             3/3     Running   0          10m

➜  ~ kubectl get cluster
NAME            CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    AGE
redisc-ozklak 
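
To collect the failing pods from the listing above without copying names by hand, the `kubectl get pod` table can be filtered on its STATUS column. This is only a minimal sketch: the helper name `failed_pods` is hypothetical (it is not part of kubectl or KubeBlocks), and it assumes the default table output with STATUS as the third column, as shown above.

```shell
# failed_pods: read `kubectl get pod` table output on stdin and print the
# names of pods whose STATUS column (third field) is "Error".
# Hypothetical helper for illustration; skips the header row.
failed_pods() {
  awk 'NR > 1 && $3 == "Error" { print $1 }'
}

# Example usage (requires access to the cluster):
#   kubectl get pod --namespace default | failed_pods |
#     while read -r pod; do
#       kubectl logs --namespace default "$pod" | tail -n 5
#     done
```

Printing only the last few log lines per pod makes it easier to see whether every failed job stops at the same command.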

Logs from one of the failing pods:

kubectl logs kb-post-provision-job-redisc-ozklak-shard-twh-pjjc8
+ declare -gA initialize_redis_cluster_primary_nodes
+ declare -gA initialize_redis_cluster_secondary_nodes
+ declare -gA initialize_pod_name_to_advertise_host_port_map
+ declare -gA scale_out_shard_default_primary_node
+ declare -gA scale_out_shard_default_other_nodes
+ '[' 1 -eq 1 ']'
+ case $1 in
+ initialize_or_scale_out_redis_cluster
+ wait_random_second 10 1
+ local max_time=10
+ local min_time=1
+ local random_time=10
+ echo 'Sleeping for 10 seconds'
+ sleep 10
Sleeping for 10 seconds
+ is_redis_cluster_initialized
+ '[' -z 10.244.2.26,10.244.2.27,10.244.2.25 ']'
+ local initialized=false
++ echo 10.244.2.26,10.244.2.27,10.244.2.25
++ tr , ' '
+ for pod_ip in $(echo "$KB_CLUSTER_POD_IP_LIST" | tr ',' ' ')
++ redis-cli -h 10.244.2.26 -a O3605v7HsS cluster info
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0
+ cluster_info='cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
+ echo 'cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
++ echo 'cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
++ grep -oP '(?<=cluster_state:)[^\s]+'
+ cluster_state=fail
+ '[' -z fail ']'
+ '[' fail == ok ']'
+ for pod_ip in $(echo "$KB_CLUSTER_POD_IP_LIST" | tr ',' ' ')
++ redis-cli -h 10.244.2.27 -a O3605v7HsS cluster info
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
+ cluster_info='cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
+ echo 'cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0
++ echo 'cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
++ grep -oP '(?<=cluster_state:)[^\s]+'
+ cluster_state=fail
+ '[' -z fail ']'
+ '[' fail == ok ']'
+ for pod_ip in $(echo "$KB_CLUSTER_POD_IP_LIST" | tr ',' ' ')
++ redis-cli -h 10.244.2.25 -a O3605v7HsS cluster info
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
+ cluster_info='cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
+ echo 'cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
cluster_info cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0
++ echo 'cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
'otal_cluster_links_buffer_limit_exceeded:0
++ grep -oP '(?<=cluster_state:)[^\s]+'
+ cluster_state=fail
+ '[' -z fail ']'
+ '[' fail == ok ']'
+ '[' false = true ']'
+ echo 'Redis Cluster not initialized, initializing...'
+ initialize_redis_cluster
+ gen_initialize_redis_cluster_primary_node
+ gen_initialize_redis_cluster_node true
+ local is_primary=true
+ '[' -z redisc-ozklak-shard-vvn-0,redisc-ozklak-shard-twh-0,redisc-ozklak-shard-6s8-0 ']'
+ local shard_name
+ local shard_advertised_infos
+ local shard_advertised_svc
+ local shard_advertised_port
+ local shard_advertised_svc_ordinal
+ local pod_host_ip
Redis Cluster not initialized, initializing...
++ echo redisc-ozklak-shard-vvn-0,redisc-ozklak-shard-twh-0,redisc-ozklak-shard-6s8-0
++ tr , ' '
+ for pod_name in $(echo "$KB_CLUSTER_POD_NAME_LIST" | tr ',' ' ')
++ extract_ordinal_from_object_name redisc-ozklak-shard-vvn-0
++ local object_name=redisc-ozklak-shard-vvn-0
++ local ordinal=0
++ echo 0
+ pod_name_ordinal=0
+ '[' true = true ']'
+ '[' 0 -ne 0 ']'
+ '[' true = false ']'
+ '[' -n '' ']'
+ local port=6379
++ extract_pod_name_prefix redisc-ozklak-shard-vvn-0
++ local pod_name=redisc-ozklak-shard-vvn-0
+++ echo redisc-ozklak-shard-vvn-0
+++ sed 's/-[0-9]\+$//'
++ prefix=redisc-ozklak-shard-vvn
++ echo redisc-ozklak-shard-vvn
+ pod_name_prefix=redisc-ozklak-shard-vvn
+ local pod_fqdn=redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless
+ '[' true = true ']'
+ initialize_redis_cluster_primary_nodes["$pod_name"]=redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379
+ initialize_pod_name_to_advertise_host_port_map["$pod_name"]=redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379
+ for pod_name in $(echo "$KB_CLUSTER_POD_NAME_LIST" | tr ',' ' ')
++ extract_ordinal_from_object_name redisc-ozklak-shard-twh-0
++ local object_name=redisc-ozklak-shard-twh-0
++ local ordinal=0
++ echo 0
+ pod_name_ordinal=0
+ '[' true = true ']'
+ '[' 0 -ne 0 ']'
+ '[' true = false ']'
+ '[' -n '' ']'
+ local port=6379
++ extract_pod_name_prefix redisc-ozklak-shard-twh-0
++ local pod_name=redisc-ozklak-shard-twh-0
+++ echo redisc-ozklak-shard-twh-0
+++ sed 's/-[0-9]\+$//'
++ prefix=redisc-ozklak-shard-twh
++ echo redisc-ozklak-shard-twh
+ pod_name_prefix=redisc-ozklak-shard-twh
+ local pod_fqdn=redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless
+ '[' true = true ']'
+ initialize_redis_cluster_primary_nodes["$pod_name"]=redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379
+ initialize_pod_name_to_advertise_host_port_map["$pod_name"]=redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379
+ for pod_name in $(echo "$KB_CLUSTER_POD_NAME_LIST" | tr ',' ' ')
++ extract_ordinal_from_object_name redisc-ozklak-shard-6s8-0
++ local object_name=redisc-ozklak-shard-6s8-0
++ local ordinal=0
++ echo 0
+ pod_name_ordinal=0
+ '[' true = true ']'
+ '[' 0 -ne 0 ']'
+ '[' true = false ']'
+ '[' -n '' ']'
+ local port=6379
++ extract_pod_name_prefix redisc-ozklak-shard-6s8-0
++ local pod_name=redisc-ozklak-shard-6s8-0
+++ echo redisc-ozklak-shard-6s8-0
+++ sed 's/-[0-9]\+$//'
++ prefix=redisc-ozklak-shard-6s8
++ echo redisc-ozklak-shard-6s8
initialize_command: redis-cli --cluster create redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379 redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379 redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379  -a O3605v7HsS --cluster-yes
+ pod_name_prefix=redisc-ozklak-shard-6s8
+ local pod_fqdn=redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless
+ '[' true = true ']'
+ initialize_redis_cluster_primary_nodes["$pod_name"]=redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379
+ initialize_pod_name_to_advertise_host_port_map["$pod_name"]=redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379
+ '[' 3 -eq 0 ']'
+ primary_nodes=
+ for primary_pod_name in "${!initialize_redis_cluster_primary_nodes[@]}"
+ primary_nodes+='redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379 '
+ for primary_pod_name in "${!initialize_redis_cluster_primary_nodes[@]}"
+ primary_nodes+='redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379 '
+ for primary_pod_name in "${!initialize_redis_cluster_primary_nodes[@]}"
+ primary_nodes+='redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379 '
+ '[' -z O3605v7HsS ']'
+ initialize_command='redis-cli --cluster create redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379 redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379 redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379  -a O3605v7HsS --cluster-yes'
+ echo 'initialize_command: redis-cli --cluster create redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379 redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379 redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379  -a O3605v7HsS --cluster-yes'
+ redis-cli --cluster create redisc-ozklak-shard-6s8-0.redisc-ozklak-shard-6s8-headless:6379 redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379 redisc-ozklak-shard-vvn-0.redisc-ozklak-shard-vvn-headless:6379 -a O3605v7HsS --cluster-yes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at redisc-ozklak-shard-twh-0.redisc-ozklak-shard-twh-headless:6379: Name or service not known
+ echo 'Failed to create Redis Cluster'
+ exit 1
Failed to create Redis Cluster
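
The trace ends with `redis-cli --cluster create` failing on a DNS lookup (`Name or service not known`) for one of the headless-service names. One way to narrow this down is to check whether each node name resolves before the create command runs. The following is a minimal sketch under stated assumptions, not the logic of the KubeBlocks script: `resolve_host` and `wait_for_dns` are hypothetical helpers, and it assumes `getent` is available in the container image.

```shell
# resolve_host HOST: succeed if HOST resolves.
# Assumption: `getent` exists in the image; swap in another lookup if not.
resolve_host() {
  getent hosts "$1" >/dev/null 2>&1
}

# wait_for_dns NODE [RETRIES] [DELAY]
# NODE may be "host" or "host:port"; the port is stripped before the lookup.
# Tries up to RETRIES times, sleeping DELAY seconds after each failed attempt.
wait_for_dns() {
  local node=$1 retries=${2:-30} delay=${3:-2}
  local host=${node%:*}
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if resolve_host "$host"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "DNS lookup for $host still failing after $retries attempts" >&2
  return 1
}

# Example usage with the node list built in the trace above:
#   for node in $primary_nodes; do
#     wait_for_dns "$node" || exit 1
#   done
```

If the lookup never succeeds for a given name, that points at the headless service or pod record rather than at Redis itself.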

kbcli version
Kubernetes: v1.26.3
KubeBlocks: 0.9.0-beta.15
kbcli: 0.9.0-beta.4

JashBook, Apr 29 '24 07:04