[BUG] starrocks ce: query after be hscale fails with ERROR 1064 (HY000): Backend node not found
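Describe the bug
After the be component of a starrocks ce cluster is scaled out from 2 to 3 replicas and then scaled back in to 2, queries fail with ERROR 1064 (HY000): Backend node not found. The FE still lists the removed backend strce-nlbjad-be-2 with alive: false and inBlacklist: true.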
kbcli version
Kubernetes: v1.30.4-vke.4
KubeBlocks: 0.9.5-beta.15
kbcli: 0.9.5-beta.7
ERROR 1064 (HY000): Backend node not found. Check if any backend node is down.backend: [strce-nlbjad-be-1.strce-nlbjad-be-headless.default.svc.cluster.local alive: true inBlacklist: false] [strce-nlbjad-be-0.strce-nlbjad-be-headless.default.svc.cluster.local alive: true inBlacklist: false] [strce-nlbjad-be-2.strce-nlbjad-be-headless.default.svc.cluster.local alive: false inBlacklist: true]
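The error text packs one bracketed entry per backend, each with its alive and inBlacklist flags. As a minimal triage sketch, assuming the message keeps exactly the format shown above, the following shell function prints the hosts reported as not alive. The name dead_backends is hypothetical and is not part of kbcli or StarRocks.

```shell
# Hypothetical helper: read the error message on stdin and print the host of
# every backend entry that contains "alive: false".
dead_backends() {
  # Put each "[host alive: ... inBlacklist: ...]" entry on its own line,
  # then keep the first field (the host) of the entries that are not alive.
  tr '[]' '\n\n' | awk '/ alive: false / { print $1 }'
}
```

Piping the message above through dead_backends should leave only the strce-nlbjad-be-2 host.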
To Reproduce
Steps to reproduce the behavior:
- create cluster
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: strce-nlbjad
  namespace: default
spec:
  terminationPolicy: Delete
  componentSpecs:
    - componentDef: starrocks-ce-fe
      name: fe
      replicas: 1
      resources:
        requests:
          cpu: 1000m
          memory: 1Gi
        limits:
          cpu: 1000m
          memory: 1Gi
    - name: be
      componentDef: starrocks-ce-be
      replicas: 2
      resources:
        requests:
          cpu: 1000m
          memory: 1Gi
        limits:
          cpu: 1000m
          memory: 1Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
kubectl get cluster strce-nlbjad
NAME           CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    AGE
strce-nlbjad                                  Delete               Running   72s
kubectl exec -it strce-nlbjad-fe-0 --namespace default -- bash
root@strce-nlbjad-fe-0:/opt/starrocks# mysql -P9030 -hstrce-nlbjad-fe-fe.default.svc.cluster.local -uroot --prompt='StarRocks > '
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.1.0 3.3.0-19a3f66
Copyright (c) 2000, 2024, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
StarRocks > SHOW PROC '/frontends'\G;
*************************** 1. row ***************************
Name: strce-nlbjad-fe-0.strce-nlbjad-fe-headless.default.svc.cluster.local_9010_1755743297978
IP: strce-nlbjad-fe-0.strce-nlbjad-fe-headless.default.svc.cluster.local
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
Role: LEADER
ClusterId: 985742647
Join: true
Alive: true
ReplayedJournalId: 58
LastHeartbeat: 2025-08-21 10:30:33
IsHelper: true
ErrMsg:
StartTime: 2025-08-21 10:28:25
Version: 3.3.0-19a3f66
1 row in set (0.02 sec)
ERROR:
No query specified
StarRocks > SHOW PROC '/backends'\G;
*************************** 1. row ***************************
BackendId: 10002
IP: strce-nlbjad-be-0.strce-nlbjad-be-headless.default.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-08-21 10:28:38
LastHeartbeat: 2025-08-21 10:30:43
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 58
DataUsedCapacity: 0.000 B
AvailCapacity: 19.429 GB
TotalCapacity: 19.518 GB
UsedPct: 0.46 %
MaxDiskUsedPct: 0.46 %
ErrMsg:
Version: 3.3.0-19a3f66
Status: {"lastSuccessReportTabletsTime":"2025-08-21 10:30:39"}
DataTotalCapacity: 19.429 GB
DataUsedPct: 0.00 %
CpuCores: 1
NumRunningQueries: 0
MemUsedPct: 14.04 %
CpuUsedPct: 0.1 %
DataCacheMetrics: Status: Normal, DiskUsage: 0B/0B, MemUsage: 0B/0B
Location:
*************************** 2. row ***************************
BackendId: 10001
IP: strce-nlbjad-be-1.strce-nlbjad-be-headless.default.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-08-21 10:28:38
LastHeartbeat: 2025-08-21 10:30:43
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 58
DataUsedCapacity: 0.000 B
AvailCapacity: 19.429 GB
TotalCapacity: 19.518 GB
UsedPct: 0.46 %
MaxDiskUsedPct: 0.46 %
ErrMsg:
Version: 3.3.0-19a3f66
Status: {"lastSuccessReportTabletsTime":"2025-08-21 10:30:39"}
DataTotalCapacity: 19.429 GB
DataUsedPct: 0.00 %
CpuCores: 1
NumRunningQueries: 0
MemUsedPct: 14.24 %
CpuUsedPct: 0.0 %
DataCacheMetrics: Status: Normal, DiskUsage: 0B/0B, MemUsage: 0B/0B
Location:
2 rows in set (0.00 sec)
ERROR:
No query specified
- insert data
kubectl exec -it strce-nlbjad-fe-0 --namespace default -- bash
root@strce-nlbjad-fe-0:/opt/starrocks# mysql -P9030 -hstrce-nlbjad-fe-fe.default.svc.cluster.local -uroot --prompt='StarRocks > '
# create database
StarRocks > create database executions_loop;
Query OK, 0 rows affected (0.00 sec)
# create table
StarRocks > CREATE TABLE IF NOT EXISTS executions_loop.executions_loop_table (id INT, value VARCHAR(255), tinyint_col TINYINT, smallint_col SMALLINT, int_col INT, bigint_col BIGINT, float_col FLOAT, double_col DOUBLE, decimal_col DECIMAL(10, 2), date_col DATE, datetime_col DATETIME, char_col CHAR(10), text_col TEXT, binary_col BINARY(30), varbinary_col VARBINARY(255), enum_col VARCHAR(20), set_col VARCHAR(50) ) ENGINE=OLAP DUPLICATE KEY(id) DISTRIBUTED BY HASH(id) BUCKETS 3 PROPERTIES ( 'replication_num' = '1' );
Query OK, 0 rows affected (0.01 sec)
# insert data
StarRocks > INSERT INTO executions_loop.executions_loop_table (value, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, decimal_col, date_col, datetime_col, char_col, text_col, binary_col, varbinary_col, enum_col, set_col) VALUES ('executions_loop_test_1', 3, 12669, 677080715, 997585623426478613, 0.5926235, 0.5697104875201761, 90.3266010729513, '2025-08-21', '2025-08-21 02:15:36.442', 'c4AlQnX6jO', 'u8pbpGlM7D8kB39YIIgSLD3dPa8Fr72pm8GBprPG2gBLeJ9nuheed950WvDL0CKujT5P8RF2uoRuIkkryaSqJsOK4kjhXWsO9C2Cj6debYTIBX5YKQoWUMCULpWxZ3KsJPRXG5bALYhanWd1K5QdHeBvK7QO4CSoOIQ4Mo82wh41pOII0mUefzJjDin2NZ1qobXW8HppqufJAGctcrq21PEOlBB50pqBLS4cdOuWVtUPDkMXQ4kIgWQUW8WC8PN', TO_BINARY('base64,Vh9ehIlaNPPkyA=='), TO_BINARY('base64,XTvDKwk6Bb673l91VtFra5OxPtwPIGNIkUEAYKgsUPu3Q6MDC0joZa99B9A+brOWspMYczFyeLA1kyP8uO1cArVyc7b3aUpUZLn7989BzMd9pb95DK+V+IeG3CRkFI1Y2Cjrvok5vRoQMiRn1qPrQWe6Qh9mzeXfAFUu+ZOFX96kqSf5LpHZDIzMkHbxfUf5JjVjuvbbAYT9onE='), 'Option3', 'Value1' );
Query OK, 1 row affected (2.08 sec)
{'label':'insert_09639ed8-7e37-11f0-a14f-00163e43bcc5', 'status':'VISIBLE', 'txnId':'2'}
StarRocks > select count(*) from executions_loop.executions_loop_table;
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.01 sec)
- hscale be
kbcli cluster hscale strce-nlbjad --auto-approve --force=true --components be --replicas 3
kbcli cluster hscale strce-nlbjad --auto-approve --force=true --components be --replicas 2
kubectl get cluster strce-nlbjad
NAME           CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    AGE
strce-nlbjad                                  Delete               Running   15m
kubectl get pod -l app.kubernetes.io/instance=strce-nlbjad
NAME                READY   STATUS    RESTARTS   AGE
strce-nlbjad-be-0   1/1     Running   0          15m
strce-nlbjad-be-1   1/1     Running   0          15m
strce-nlbjad-fe-0   1/1     Running   0          15m
- See error
kubectl exec -it strce-nlbjad-fe-0 --namespace default -- bash
root@strce-nlbjad-fe-0:/opt/starrocks# mysql -P9030 -hstrce-nlbjad-fe-fe.default.svc.cluster.local -uroot --prompt='StarRocks > '
StarRocks > select count(*) from executions_loop.executions_loop_table;
ERROR 1064 (HY000): Backend node not found. Check if any backend node is down.backend: [strce-nlbjad-be-1.strce-nlbjad-be-headless.default.svc.cluster.local alive: true inBlacklist: false] [strce-nlbjad-be-0.strce-nlbjad-be-headless.default.svc.cluster.local alive: true inBlacklist: false] [strce-nlbjad-be-2.strce-nlbjad-be-headless.default.svc.cluster.local alive: false inBlacklist: true]
StarRocks > SHOW PROC '/backends'\G;
*************************** 1. row ***************************
BackendId: 10002
IP: strce-nlbjad-be-0.strce-nlbjad-be-headless.default.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-08-21 10:28:38
LastHeartbeat: 2025-08-21 10:44:48
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 44
DataUsedCapacity: 21.008 KB
AvailCapacity: 19.429 GB
TotalCapacity: 19.518 GB
UsedPct: 0.46 %
MaxDiskUsedPct: 0.46 %
ErrMsg:
Version: 3.3.0-19a3f66
Status: {"lastSuccessReportTabletsTime":"2025-08-21 10:44:39"}
DataTotalCapacity: 19.429 GB
DataUsedPct: 0.00 %
CpuCores: 1
NumRunningQueries: 0
MemUsedPct: 15.24 %
CpuUsedPct: 0.0 %
DataCacheMetrics: Status: Normal, DiskUsage: 0B/0B, MemUsage: 0B/0B
Location:
*************************** 2. row ***************************
BackendId: 10001
IP: strce-nlbjad-be-1.strce-nlbjad-be-headless.default.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-08-21 10:28:38
LastHeartbeat: 2025-08-21 10:44:48
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 44
DataUsedCapacity: 15.900 KB
AvailCapacity: 19.429 GB
TotalCapacity: 19.518 GB
UsedPct: 0.46 %
MaxDiskUsedPct: 0.46 %
ErrMsg:
Version: 3.3.0-19a3f66
Status: {"lastSuccessReportTabletsTime":"2025-08-21 10:44:39"}
DataTotalCapacity: 19.429 GB
DataUsedPct: 0.00 %
CpuCores: 1
NumRunningQueries: 0
MemUsedPct: 15.51 %
CpuUsedPct: 0.0 %
DataCacheMetrics: Status: Normal, DiskUsage: 0B/0B, MemUsage: 0B/0B
Location:
*************************** 3. row ***************************
BackendId: 10210
IP: strce-nlbjad-be-2.strce-nlbjad-be-headless.default.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-08-21 10:36:33
LastHeartbeat: 2025-08-21 10:39:13
Alive: false
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 36
DataUsedCapacity: 12.607 KB
AvailCapacity: 19.429 GB
TotalCapacity: 19.518 GB
UsedPct: 0.46 %
MaxDiskUsedPct: 0.46 %
ErrMsg: java.net.UnknownHostException: strce-nlbjad-be-2.strce-nlbjad-be-headless.default.svc.cluster.local
Version: 3.3.0-19a3f66
Status: {"lastSuccessReportTabletsTime":"2025-08-21 10:38:34"}
DataTotalCapacity: 19.429 GB
DataUsedPct: 0.00 %
CpuCores: 1
NumRunningQueries: 0
MemUsedPct: 14.40 %
CpuUsedPct: 0.0 %
DataCacheMetrics: Status: Normal, DiskUsage: 0B/0B, MemUsage: 0B/0B
Location:
3 rows in set (0.01 sec)
ERROR:
No query specified
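In the output above, strce-nlbjad-be-2 is still registered with Alive: false and TabletNum: 36 even though its pod no longer exists, and the test table was created with 'replication_num' = '1'. That would be consistent with some tablets having their only replica on the removed backend, though this report does not confirm the root cause. A minimal sketch for spotting such entries, assuming the vertical (\G) layout shown above where Alive precedes TabletNum in each row; the name stale_backends is hypothetical.

```shell
# Hypothetical helper: read `SHOW PROC '/backends'\G` output on stdin and
# print "<host> <tablet count>" for each backend whose Alive field is false.
# Assumes the field order shown in this report: IP, then Alive, then TabletNum.
stale_backends() {
  awk '
    $1 == "IP:"        { host = $2 }    # remember the host of the current row
    $1 == "Alive:"     { alive = $2 }   # remember whether it is alive
    $1 == "TabletNum:" { if (alive == "false") print host, $2 }
  '
}
```

It could be fed from inside the FE pod with something like `mysql -P9030 -hstrce-nlbjad-fe-fe.default.svc.cluster.local -uroot -e "SHOW PROC '/backends'\G" | stale_backends`. A dead backend that still reports a non-zero tablet count is worth checking before any manual cleanup.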
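Expected behavior
After the be component is scaled in from 3 to 2 replicas, the removed backend strce-nlbjad-be-2 should no longer be registered in the FE, and queries against existing tables should still succeed.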
This issue has been marked as stale because it has been open for 30 days with no activity