kubeblocks
[BUG] kb-post-provision-job fails with "ERR Duplicate master name."
Describe the bug
mark@L-R910LPKW:~$ k get pod
NAME READY STATUS RESTARTS AGE
aida-dev-xyz-mining-redis-0 3/3 Running 0 27h
aida-dev-xyz-mining-redis-1 3/3 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-0 1/1 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-1 1/1 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-2 1/1 Running 0 27h
kb-post-provision-job-aida-dev-xyz-mining-redis-6l9gq 0/1 Error 0 3m
kb-post-provision-job-aida-dev-xyz-mining-redis-7qjhh 0/1 Error 0 3m25s
kb-post-provision-job-aida-dev-xyz-mining-redis-dcxt8 0/1 Error 0 3m41s
mark@L-R910LPKW:~$
To Reproduce
Not sure, but for me it reproduces very easily: I just need to delete the job so that it is created again, and the new job errors out.
Expected behavior
No errors.
Additional context
I have 5 Redis instances deployed with KB, each with sentinels, with 2 replicas for the database and 3 for the sentinels. Only one instance exhibits the problematic behavior:
mark@L-R910LPKW:~$ k get job
NAME STATUS COMPLETIONS DURATION AGE
kb-post-provision-job-aida-dev-xyz-mining-redis Failed 0/1 6m53s 6m53s
mark@L-R910LPKW:~$ k delete job --all
job.batch "kb-post-provision-job-aida-dev-xyz-mining-redis" deleted
mark@L-R910LPKW:~$ k get job
NAME STATUS COMPLETIONS DURATION AGE
kb-post-provision-job-aida-dev-xyz-mining-redis Running 0/1 2s 2s
mark@L-R910LPKW:~$ sleep 30
mark@L-R910LPKW:~$ k get job
NAME STATUS COMPLETIONS DURATION AGE
kb-post-provision-job-aida-dev-xyz-mining-redis Running 0/1 41s 41s
mark@L-R910LPKW:~$ k get pod
NAME READY STATUS RESTARTS AGE
aida-dev-xyz-mining-redis-0 3/3 Running 0 27h
aida-dev-xyz-mining-redis-1 3/3 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-0 1/1 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-1 1/1 Running 0 27h
aida-dev-xyz-mining-redis-sentinel-2 1/1 Running 0 27h
kb-post-provision-job-aida-dev-xyz-mining-redis-7dhpq 0/1 Error 0 6s
kb-post-provision-job-aida-dev-xyz-mining-redis-b89f8 0/1 Error 0 47s
kb-post-provision-job-aida-dev-xyz-mining-redis-ffmpj 0/1 Error 0 31s
mark@L-R910LPKW:~$ k logs kb-post-provision-job-aida-dev-xyz-mining-redis-b89f8
+ declare -g default_initialize_pod_ordinal
+ declare -g redis_advertised_svc_host_value
+ declare -g redis_advertised_svc_port_value
+ declare -g headless_postfix=headless
+ declare -g redis_default_service_port=6379
+ echo 'redis sentinel component replicas found, register to sentinel.'
+ register_to_sentinel_wrapper
+ '[' -z aida-dev-xyz-mining-redis-sentinel-0,aida-dev-xyz-mining-redis-sentinel-1,aida-dev-xyz-mining-redis-sentinel-2 ']'
+ '[' -z aida-dev-xyz-mining-redis-sentinel-headless ']'
+ get_minimum_initialize_pod_ordinal
+ '[' -z aida-dev-xyz-mining-redis-0,aida-dev-xyz-mining-redis-1 ']'
+ IFS=,
+ read -ra pod_list
+ for pod in "${pod_list[@]}"
+ '[' -z '' ']'
redis sentinel component replicas found, register to sentinel.
++ extract_ordinal_from_object_name aida-dev-xyz-mining-redis-0
++ local object_name=aida-dev-xyz-mining-redis-0
++ local ordinal=0
++ echo 0
+ default_initialize_pod_ordinal=0
+ continue
+ for pod in "${pod_list[@]}"
+ '[' -z 0 ']'
++ extract_ordinal_from_object_name aida-dev-xyz-mining-redis-1
++ local object_name=aida-dev-xyz-mining-redis-1
++ local ordinal=1
++ echo 1
+ pod_ordinal=1
+ '[' 1 -lt 0 ']'
+ default_redis_primary_pod_name=aida-dev-xyz-mining-redis-0
+ redis_default_primary_pod_headless_fqdn=aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc
+ init_redis_service_port
+ '[' -n 6379 ']'
+ redis_default_service_port=6379
+ parse_redis_advertised_svc_if_exist aida-dev-xyz-mining-redis-0
+ local pod_name=aida-dev-xyz-mining-redis-0
+ [[ -z '' ]]
+ echo 'Environment variable REDIS_ADVERTISED_PORT not found. Ignoring.'
Environment variable REDIS_ADVERTISED_PORT not found. Ignoring.
+ return 0
+ old_ifs='
'
+ IFS=,
+ set -f
+ read -ra sentinel_pod_list
+ set +f
+ IFS='
'
+ for sentinel_pod in "${sentinel_pod_list[@]}"
+ sentinel_pod_fqdn=aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless
+ '[' -n '' ']'
+ echo 'register to sentinel:aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless with ClusterIP service: redis_default_primary_pod_fqdn=aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc, redis_default_service_port=6379'
register to sentinel:aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless with ClusterIP service: redis_default_primary_pod_fqdn=aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc, redis_default_service_port=6379
+ register_to_sentinel aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless aida-dev-xyz-mining-redis aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc 6379
+ local sentinel_host=aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless
+ local master_name=aida-dev-xyz-mining-redis
+ local sentinel_port=26379
+ local redis_primary_host=aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc
+ local redis_primary_port=6379
+ local timeout=600
++ date +%s
+ local start_time=1725899217
+ local current_time
+ set +x
Checking connectivity to aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless on port 26379 using redis-cli...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
aida-dev-xyz-mining-redis-sentinel-0.aida-dev-xyz-mining-redis-sentinel-headless is reachable on port 26379.
Checking connectivity to aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc on port 6379 using redis-cli...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
aida-dev-xyz-mining-redis-0.aida-dev-xyz-mining-redis-headless.system-d-redis-aida-dev-xyz-mining.svc is reachable on port 6379.
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
ERR Duplicate master name.
Command failed with status 0 or output not OK.
mark@L-R910LPKW:~$ k get job
NAME STATUS COMPLETIONS DURATION AGE
kb-post-provision-job-aida-dev-xyz-mining-redis Failed 0/1 71s 71s
mark@L-R910LPKW:~$
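A possible direction for a fix, sketched under assumptions: the log hides the actual command behind `set +x`, but "ERR Duplicate master name." is the reply Redis Sentinel gives to `SENTINEL MONITOR` when the master name is already registered. If that is what the job runs, a re-run could treat this reply as "already registered" instead of as a failure. The function names `classify_monitor_reply` and `register_idempotent`, and the `quorum` parameter, are hypothetical and not taken from the KubeBlocks scripts; authentication is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the KubeBlocks scripts): classify the reply
# of `SENTINEL MONITOR` so that re-running the job tolerates an existing entry.
classify_monitor_reply() {
  local reply="$1"
  case "$reply" in
    OK) echo "registered" ;;
    *"ERR Duplicate master name"*) echo "already-registered" ;;
    *) echo "failed" ;;
  esac
}

# Hypothetical wrapper: the argument order mirrors the trace above, plus an
# assumed quorum argument. Auth is omitted; redis-cli would also need the password.
register_idempotent() {
  local sentinel_host="$1" master_name="$2" primary_host="$3" primary_port="$4" quorum="$5"
  local reply
  reply=$(redis-cli -h "$sentinel_host" -p 26379 \
    SENTINEL MONITOR "$master_name" "$primary_host" "$primary_port" "$quorum")
  case "$(classify_monitor_reply "$reply")" in
    registered|already-registered) return 0 ;;
    *) echo "registration failed: $reply" >&2; return 1 ;;
  esac
}
```

Whether accepting the duplicate is safe depends on the existing entry pointing at the right primary; comparing it first with `SENTINEL GET-MASTER-ADDR-BY-NAME <master-name>` would be a reasonable extra check.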