kubeblocks
[BUG] StarRocks cluster: after restarting fe, the pod status stays ContainerCreating
Describe the bug After restarting the fe component of a StarRocks cluster, the fe-0 pod stays in ContainerCreating and the cluster remains in Updating.
To Reproduce Steps to reproduce the behavior:
- create cluster
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: strsent-blmdvx
  namespace: default
spec:
  terminationPolicy: WipeOut
  componentSpecs:
    - name: cn
      componentDef: starrocks-cn
      replicas: 2
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          cpu: 200m
          memory: 1Gi
    - name: fe
      componentDef: starrocks-fe-sd
      replicas: 2
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          cpu: 200m
          memory: 1Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
- restart fe
kbcli cluster restart strsent-blmdvx --auto-approve --components fe
- See error
kubectl get cluster
NAME CLUSTER-DEFINITION VERSION TERMINATION-POLICY STATUS AGE
strsent-blmdvx WipeOut Updating 14m
kubectl get pod
NAME READY STATUS RESTARTS AGE
strsent-blmdvx-cn-0 1/1 Running 2 (12m ago) 14m
strsent-blmdvx-cn-1 1/1 Running 2 (12m ago) 14m
strsent-blmdvx-fe-0 0/1 ContainerCreating 0 5m37s
strsent-blmdvx-fe-1 1/1 Running 1 (12m ago) 14m
➜ ~ kubectl get ops
NAME TYPE CLUSTER STATUS PROGRESS AGE
strsent-blmdvx-restart-4vjjv Restart strsent-blmdvx Running 0/2 5m45s
describe cluster
kubectl describe cluster strsent-blmdvx
Name: strsent-blmdvx
Namespace: default
Labels: app.kubernetes.io/instance=strsent-blmdvx
Annotations: kubeblocks.io/ops-request: [{"name":"strsent-blmdvx-restart-4vjjv","type":"Restart"}]
kubeblocks.io/reconcile: 2024-04-08T11:32:33.513823465Z
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Metadata:
Creation Timestamp: 2024-04-08T11:30:29Z
Finalizers:
cluster.kubeblocks.io/finalizer
Generation: 6
Managed Fields:
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2024-04-08T11:30:29Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
f:kubeblocks.io/ops-request:
f:kubeblocks.io/reconcile:
f:finalizers:
.:
v:"cluster.kubeblocks.io/finalizer":
f:spec:
f:componentSpecs:
f:monitor:
f:resources:
.:
f:cpu:
f:memory:
f:services:
f:storage:
.:
f:size:
Manager: manager
Operation: Update
Time: 2024-04-08T11:39:40Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
.:
f:app.kubernetes.io/instance:
f:spec:
f:terminationPolicy:
Manager: kbcli
Operation: Update
Time: 2024-04-08T11:44:27Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:components:
.:
f:cn:
.:
f:message:
.:
f:Pod/strsent-blmdvx-cn-0:
f:Pod/strsent-blmdvx-cn-1:
f:phase:
f:podsReady:
f:podsReadyTime:
f:fe:
.:
f:message:
.:
f:Pod/strsent-blmdvx-fe-1:
f:phase:
f:podsReady:
f:podsReadyTime:
f:conditions:
f:observedGeneration:
f:phase:
Manager: manager
Operation: Update
Subresource: status
Time: 2024-04-08T11:44:27Z
Resource Version: 316951856
UID: a859091c-ef2b-4cfd-9fd7-7d44d6b222a2
Spec:
Component Specs:
Component Def: starrocks-cn
Monitor: false
Name: cn
Replicas: 2
Resources:
Limits:
Cpu: 200m
Memory: 1Gi
Requests:
Cpu: 200m
Memory: 1Gi
Service Version: 3.2.2
Component Def: starrocks-fe-sd
Monitor: false
Name: fe
Replicas: 2
Resources:
Limits:
Cpu: 200m
Memory: 1Gi
Requests:
Cpu: 200m
Memory: 1Gi
Service Version: 3.2.2
Volume Claim Templates:
Name: data
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 20Gi
Monitor:
Resources:
Cpu: 0
Memory: 0
Services:
Annotations:
networking.gke.io/load-balancer-type: Internal
Component Selector: fe
Name: fe-vpc
Service Name: fe-vpc
Spec:
Ports:
Name: fe-http
Node Port: 30217
Port: 8030
Protocol: TCP
Target Port: http-port
Name: fe-mysql
Node Port: 32740
Port: 9030
Protocol: TCP
Target Port: query-port
Type: LoadBalancer
Storage:
Size: 0
Termination Policy: WipeOut
Status:
Components:
Cn:
Message:
Pod/strsent-blmdvx-cn-0:
Pod/strsent-blmdvx-cn-1:
Phase: Running
Pods Ready: true
Pods Ready Time: 2024-04-08T11:44:27Z
Fe:
Message:
Pod/strsent-blmdvx-fe-1:
Phase: Updating
Pods Ready: false
Pods Ready Time: 2024-04-08T11:38:10Z
Conditions:
Last Transition Time: 2024-04-08T11:30:29Z
Message: The operator has started the provisioning of Cluster: strsent-blmdvx
Observed Generation: 6
Reason: PreCheckSucceed
Status: True
Type: ProvisioningStarted
Last Transition Time: 2024-04-08T11:30:29Z
Message: Successfully applied for resources
Observed Generation: 6
Reason: ApplyResourcesSucceed
Status: True
Type: ApplyResources
Last Transition Time: 2024-04-08T11:39:40Z
Message: pods are not ready in Components: [fe], refer to related component message in Cluster.status.components
Reason: ReplicasNotReady
Status: False
Type: ReplicasReady
Last Transition Time: 2024-04-08T11:39:40Z
Message: pods are unavailable in Components: [fe], refer to related component message in Cluster.status.components
Reason: ComponentsNotReady
Status: False
Type: Ready
Observed Generation: 6
Phase: Updating
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ComponentPhaseTransition 15m (x2 over 15m) cluster-controller component is Creating
Warning Unhealthy 14m (x9 over 15m) event-controller Pod strsent-blmdvx-cn-0: Startup probe failed: %!T(MISSING)otal %!R(MISSING)eceived %!X(MISSING)ferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (7) Failed to connect to 10.128.1.105 port 8040 after 0 ms: Connection refused
Warning Unhealthy 14m (x6 over 14m) event-controller Pod strsent-blmdvx-fe-1: Startup probe failed: %!T(MISSING)otal %!R(MISSING)eceived %!X(MISSING)ferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (7) Failed to connect to 10.128.1.106 port 8030 after 0 ms: Connection refused
Warning Unhealthy 14m (x10 over 15m) event-controller Pod strsent-blmdvx-cn-1: Startup probe failed: %!T(MISSING)otal %!R(MISSING)eceived %!X(MISSING)ferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (7) Failed to connect to 10.128.1.104 port 8040 after 0 ms: Connection refused
Normal ComponentPhaseTransition 13m (x3 over 14m) cluster-controller component is Failed
Warning Failed 13m cluster-controller Cluster: strsent-blmdvx is Failed, check according to the components message
Warning Abnormal 13m (x3 over 14m) cluster-controller Cluster: strsent-blmdvx is Abnormal, check according to the components message
Normal ComponentPhaseTransition 13m (x3 over 14m) cluster-controller component is Updating
Warning ReplicasNotReady 12m cluster-controller pods are not ready in Components: [fe], refer to related component message in Cluster.status.components
Warning ComponentsNotReady 12m cluster-controller pods are unavailable in Components: [fe], refer to related component message in Cluster.status.components
Normal ClusterReady 11m cluster-controller Cluster: strsent-blmdvx is ready, current phase is Running
Normal ComponentPhaseTransition 11m (x2 over 12m) cluster-controller component is Running
Normal Running 11m cluster-controller Cluster: strsent-blmdvx is ready, current phase is Running
Normal AllReplicasReady 11m cluster-controller all pods of components are ready, waiting for the probe detection successful
Warning NotFound 8m7s (x25 over 11m) system-account-controller ClusterDefinition.apps.kubeblocks.io "" not found
Normal ApplyResourcesSucceed 8m7s (x3 over 15m) cluster-controller Successfully applied for resources
Normal PreCheckSucceed 109s (x6 over 15m) cluster-controller The operator has started the provisioning of Cluster: strsent-blmdvx
describe pod
kubectl describe pod strsent-blmdvx-fe-0
Name: strsent-blmdvx-fe-0
Namespace: default
Priority: 0
Node: gke-infracreate-gke-kbdata-e2-standar-25c8fd47-ovfq/10.10.0.36
Start Time: Mon, 08 Apr 2024 19:39:43 +0800
Labels: app.kubernetes.io/component=starrocks-fe-sd
app.kubernetes.io/instance=strsent-blmdvx
app.kubernetes.io/managed-by=kubeblocks
app.kubernetes.io/name=starrocks-fe-sd
app.kubernetes.io/version=starrocks-fe-sd
apps.kubeblocks.io/cluster-uid=a859091c-ef2b-4cfd-9fd7-7d44d6b222a2
apps.kubeblocks.io/component-name=fe
apps.kubeblocks.io/service-version=3.2.2
componentdefinition.kubeblocks.io/name=starrocks-fe-sd
controller-revision-hash=8d4dfcf7b
Annotations: apps.kubeblocks.io/component-replicas: 2
kubeblocks.io/restart: 2024-04-08T11:39:40Z
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicatedStateMachine/strsent-blmdvx-fe
Containers:
fe:
Container ID:
Image: docker.io/starrocks/fe-ubuntu:3.2.2
Image ID:
Ports: 8030/TCP, 9020/TCP, 9030/TCP, 9010/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
bash
-c
# FIXME temporary workaround that will be removed in the future when the FE supports the IPv6
POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
HOST_TYPE=IP POD_IP=${POD_IP_V4} /opt/starrocks/fe_entrypoint.sh ${FE_DISCOVERY_SERVICE_NAME}
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 200m
memory: 1Gi
Requests:
cpu: 200m
memory: 1Gi
Liveness: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=3
Readiness: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=3
Startup: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=60
Environment Variables from:
strsent-blmdvx-fe-env ConfigMap Optional: false
strsent-blmdvx-fe-rsm-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
KB_POD_NAME: strsent-blmdvx-fe-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-blmdvx-fe-headless.$(KB_NAMESPACE).svc
TZ: Asia/Shanghai
POD_NAME: strsent-blmdvx-fe-0 (v1:metadata.name)
POD_IP: (v1:status.podIP)
HOST_IP: (v1:status.hostIP)
POD_NAMESPACE: default (v1:metadata.namespace)
HOST_TYPE: FQDN
COMPONENT_NAME: fe
CONFIGMAP_MOUNT_PATH: /etc/starrocks/fe/conf
SERVICE_PORT: 8030
Mounts:
/opt/starrocks/fe/conf from fe-cm (rw)
/opt/starrocks/fe/log from log (rw)
/opt/starrocks/fe/meta from data (rw)
/scripts from scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9r8gj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
log:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
fe-cm:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: strsent-blmdvx-fe-fe-cm
Optional: false
scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: strsent-blmdvx-fe-scripts
Optional: false
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-strsent-blmdvx-fe-0
ReadOnly: false
kube-api-access-9r8gj:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: kb-data=true:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m42s default-scheduler Successfully assigned default/strsent-blmdvx-fe-0 to gke-infracreate-gke-kbdata-e2-standar-25c8fd47-ovfq
Normal Pulled 7m38s kubelet Container image "docker.io/starrocks/fe-ubuntu:3.2.2" already present on machine
Normal Created 7m38s kubelet Created container fe
Normal Started 7m38s kubelet Started container fe
The CPU resource is insufficient. Please increase it to at least 1 core and retry.
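One way to apply that suggestion is a vertical scale of the fe component. The sketch below only prints the command so it can be reviewed before running; `vscale_cmd` is a hypothetical helper, and it assumes the installed kbcli provides `cluster vscale` with the `--components`, `--cpu` and `--memory` flags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds (but does not run) a kbcli vertical-scale command.
# Arguments: cluster name, component name, CPU value, memory value.
vscale_cmd() {
  local cluster=$1 component=$2 cpu=$3 memory=$4
  printf 'kbcli cluster vscale %s --components %s --cpu %s --memory %s\n' \
    "$cluster" "$component" "$cpu" "$memory"
}

# Raise the fe component of the cluster from this report to 1 core.
vscale_cmd strsent-blmdvx fe 1 1Gi
```

Copy the printed line into a shell to execute it, then check the result with kubectl get ops.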
After increasing the CPU to 1 core and restarting the fe component, the fe-0 pod goes into Error:
➜ ~ kubectl get pod
NAME READY STATUS RESTARTS AGE
strsent-blmdvx-cn-0 2/2 Running 0 14m
strsent-blmdvx-cn-1 2/2 Running 0 14m
strsent-blmdvx-fe-0 1/2 Error 0 14m
strsent-blmdvx-fe-1 1/2 Running 0 4m11s
➜ ~ kbcli cluster list-instances strsent-blmdvx
NAME NAMESPACE CLUSTER COMPONENT STATUS ROLE ACCESSMODE AZ CPU(REQUEST/LIMIT) MEMORY(REQUEST/LIMIT) STORAGE NODE CREATED-TIME
strsent-blmdvx-cn-0 default strsent-blmdvx cn Running <none> <none> <none> 1 / 1 1Gi / 1Gi <none> minikube/192.168.49.2 May 07,2024 14:41 UTC+0800
strsent-blmdvx-cn-1 default strsent-blmdvx cn Running <none> <none> <none> 1 / 1 1Gi / 1Gi <none> minikube/192.168.49.2 May 07,2024 14:41 UTC+0800
strsent-blmdvx-fe-0 default strsent-blmdvx fe Running <none> <none> <none> 1 / 1 1Gi / 1Gi data:20Gi minikube/192.168.49.2 May 07,2024 14:41 UTC+0800
strsent-blmdvx-fe-1 default strsent-blmdvx fe Running <none> <none> <none> 1 / 1 1Gi / 1Gi data:20Gi minikube/192.168.49.2 May 07,2024 14:52 UTC+0800
logs pod
kubectl logs strsent-blmdvx-fe-0 fe
[Tue May 7 14:41:59 CST 2024] /etc/starrocks/fe/conf not exist or not a directory, ignore ...
[Tue May 7 14:41:59 CST 2024] first start fe with meta not exist.
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:00 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:02 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:04 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:06 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:08 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:10 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:12 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:14 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:16 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:18 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:20 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:22 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:24 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:26 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:28 CST 2024] No leader yet, has_member: false ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsent-blmdvx-fe-fe:9030' (111)
[Tue May 7 14:42:30 CST 2024] No leader yet, has_member: false ...
[Tue May 7 14:42:30 CST 2024] Timed out, no members detected ever, assume myself is the first node ..
[Tue May 7 14:42:30 CST 2024] first start with no meta run start_fe.sh with additional options: ' --host_type IP'
kubectl logs strsent-blmdvx-fe-1 fe
[Tue May 7 14:57:44 CST 2024] /etc/starrocks/fe/conf not exist or not a directory, ignore ...
[Tue May 7 14:57:44 CST 2024] start fe with exist meta.
[Tue May 7 14:57:44 CST 2024] start with meta run start_fe.sh with additional options: ' --host_type IP'
describe pod
kubectl describe pod strsent-blmdvx-fe-1
Name: strsent-blmdvx-fe-1
Namespace: default
Priority: 0
Node: minikube/192.168.49.2
Start Time: Tue, 07 May 2024 14:52:36 +0800
Labels: app.kubernetes.io/component=starrocks-fe-sd
app.kubernetes.io/instance=strsent-blmdvx
app.kubernetes.io/managed-by=kubeblocks
app.kubernetes.io/name=starrocks-fe-sd
app.kubernetes.io/version=starrocks-fe-sd
apps.kubeblocks.io/cluster-uid=231d9573-3125-49a0-82eb-21b031492685
apps.kubeblocks.io/component-name=fe
apps.kubeblocks.io/pod-name=strsent-blmdvx-fe-1
componentdefinition.kubeblocks.io/name=starrocks-fe-sd
controller-revision-hash=7987578947
workloads.kubeblocks.io/instance=strsent-blmdvx-fe
workloads.kubeblocks.io/managed-by=InstanceSet
Annotations: apps.kubeblocks.io/component-replicas: 2
kubeblocks.io/restart: 2024-05-07T06:52:34Z
Status: Running
IP: 10.244.4.152
IPs:
IP: 10.244.4.152
Controlled By: InstanceSet/strsent-blmdvx-fe
Init Containers:
starrocks-tools:
Container ID: docker://f25ee7cf366fe04e37571b3de22d7a06e526edc8f5825e858c6f660982217873
Image: docker.io/apecloud/starrocks-tools:3.2.2
Image ID: docker-pullable://apecloud/starrocks-tools@sha256:fd9b4e989932b172368cdd1de986845ea96c0d5c19efd4c7fe3bea11bd7aa0f5
Port: <none>
Host Port: <none>
Command:
cp
/bin/mysql
/kb_tools/mysql
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 07 May 2024 14:52:44 +0800
Finished: Tue, 07 May 2024 14:52:44 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Environment Variables from:
strsent-blmdvx-fe-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
KB_POD_NAME: strsent-blmdvx-fe-1 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-blmdvx-fe-headless.$(KB_NAMESPACE).svc
TOOLS_SCRIPTS_PATH: /opt/kb-tools/reload/fe-cm
Mounts:
/kb_tools from kb-tools (rw)
/opt/config-manager from config-manager-config (rw)
/opt/kb-tools/reload/fe-cm from cm-script-fe-cm (rw)
/opt/starrocks/fe/conf from fe-cm (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v5qp4 (ro)
Containers:
fe:
Container ID: docker://311820a1d144f843bfd7f0579b3919c2c079e50efe1c9f990f32702a9d1ee5dc
Image: docker.io/starrocks/fe-ubuntu:3.2.2
Image ID: docker-pullable://starrocks/fe-ubuntu@sha256:6446acb1a16ce103476b17c3844e9f7e12cd09ac188cfe8ff01aad56ca87e612
Ports: 8030/TCP, 9020/TCP, 9030/TCP, 9010/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
bash
-c
# FIXME temporary workaround that will be removed in the future when the FE supports the IPv6
POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
HOST_TYPE=IP POD_IP=${POD_IP_V4} /opt/starrocks/fe_entrypoint.sh ${FE_DISCOVERY_SERVICE_NAME}
State: Running
Started: Tue, 07 May 2024 14:57:44 +0800
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Tue, 07 May 2024 14:52:44 +0800
Finished: Tue, 07 May 2024 14:57:44 +0800
Ready: False
Restart Count: 1
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Liveness: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=3
Readiness: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=3
Startup: exec [/bin/bash -c POD_IP_V4=
ips=$(echo $KB_POD_IPS | tr "," "\n")
for ip in $ips; do
if [[ $ip == *":"* ]]; then
continue
fi
POD_IP_V4=$ip
break
done
if [[ -z $POD_IP_V4 ]]; then
echo "Failed to get IPv4 POD_IP from KB_POD_IPS"
exit 1
fi
curl --fail http://$POD_IP_V4:8030/api/health
] delay=0s timeout=1s period=5s #success=1 #failure=60
Environment Variables from:
strsent-blmdvx-fe-env ConfigMap Optional: false
strsent-blmdvx-fe-its-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
KB_POD_NAME: strsent-blmdvx-fe-1 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-blmdvx-fe-headless.$(KB_NAMESPACE).svc
TZ: Asia/Shanghai
POD_NAME: strsent-blmdvx-fe-1 (v1:metadata.name)
POD_IP: (v1:status.podIP)
HOST_IP: (v1:status.hostIP)
POD_NAMESPACE: default (v1:metadata.namespace)
HOST_TYPE: FQDN
COMPONENT_NAME: fe
CONFIGMAP_MOUNT_PATH: /etc/starrocks/fe/conf
SERVICE_PORT: 8030
Mounts:
/kb_tools from kb-tools (rw)
/opt/starrocks/fe/conf from fe-cm (rw)
/opt/starrocks/fe/log from log (rw)
/opt/starrocks/fe/meta from data (rw)
/scripts from scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v5qp4 (ro)
config-manager:
Container ID: docker://2e9631debd2c56a85d57dda0109c31264178ce90687d6d124f60575e5f19be00
Image: docker.io/apecloud/kubeblocks-tools:0.9.0-beta.18
Image ID: docker-pullable://apecloud/kubeblocks-tools@sha256:24b7a15e6391c331b506a04c7653cd75ed5c2423e4d7a5dbefc7b52b67210d2a
Port: <none>
Host Port: <none>
Command:
env
Args:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
/bin/reloader
--log-level
info
--operator-update-enable
--tcp
9901
--config
/opt/config-manager/config-manager.yaml
State: Running
Started: Tue, 07 May 2024 14:52:44 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Environment Variables from:
strsent-blmdvx-fe-env ConfigMap Optional: false
strsent-blmdvx-fe-its-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-blmdvx-fe-account-root'> Optional: false
KB_POD_NAME: strsent-blmdvx-fe-1 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-blmdvx-fe-headless.$(KB_NAMESPACE).svc
CONFIG_MANAGER_POD_IP: (v1:status.podIP)
TOOLS_PATH: /opt/kb-tools/reload/fe-cm:/opt/config-manager:/kb_tools
Mounts:
/kb_tools from kb-tools (rw)
/opt/config-manager from config-manager-config (rw)
/opt/kb-tools/reload/fe-cm from cm-script-fe-cm (rw)
/opt/starrocks/fe/conf from fe-cm (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v5qp4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
log:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
fe-cm:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: strsent-blmdvx-fe-fe-cm
Optional: false
scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: strsent-blmdvx-fe-scripts
Optional: false
cm-script-fe-cm:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: sidecar-starrocks-scripts-strsent-blmdvx
Optional: false
config-manager-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: sidecar-strsent-blmdvx-fe-config-manager-config
Optional: false
kb-tools:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-strsent-blmdvx-fe-1
ReadOnly: false
kube-api-access-v5qp4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: kb-data=true:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m36s default-scheduler Successfully assigned default/strsent-blmdvx-fe-1 to minikube
Normal Pulled 7m29s kubelet Container image "docker.io/apecloud/starrocks-tools:3.2.2" already present on machine
Normal Created 7m29s kubelet Created container starrocks-tools
Normal Started 7m28s kubelet Started container starrocks-tools
Normal Pulled 7m28s kubelet Container image "docker.io/starrocks/fe-ubuntu:3.2.2" already present on machine
Normal Created 7m28s kubelet Created container fe
Normal Started 7m28s kubelet Started container fe
Normal Pulled 7m28s kubelet Container image "docker.io/apecloud/kubeblocks-tools:0.9.0-beta.18" already present on machine
Normal Created 7m28s kubelet Created container config-manager
Normal Started 7m28s kubelet Started container config-manager
Warning Unhealthy 6m9s (x16 over 7m24s) kubelet Startup probe failed: % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (7) Failed to connect to 10.244.4.152 port 8030 after 0 ms: Connection refused
Warning FailedPreStopHook 2m28s kubelet PreStopHook failed
To work in IPv6 environments, StarRocks FE switched to using IP addresses as unique node identifiers. However, when a Pod is rebuilt its IP changes, so the old IP is no longer reachable. As a result, the FE nodes cannot reach consensus and the cluster fails to start.
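The container command in the pod descriptions above derives that identifier by taking the first IPv4 entry from KB_POD_IPS. Isolated as a function, the selection looks roughly like this. It is a sketch of the same logic, not the chart's actual script; `first_ipv4` is a hypothetical name and the IPv6 address in the example is made up.

```shell
#!/usr/bin/env bash
# Return the first IPv4 address from a comma-separated list of pod IPs
# (the format of KB_POD_IPS). Fails if the list holds no IPv4 address.
first_ipv4() {
  local ips ip
  IFS=',' read -ra ips <<< "$1"
  for ip in "${ips[@]}"; do
    # Any address containing ":" is treated as IPv6 and skipped.
    if [[ $ip == *:* ]]; then
      continue
    fi
    printf '%s\n' "$ip"
    return 0
  done
  echo "Failed to get IPv4 POD_IP from KB_POD_IPS" >&2
  return 1
}

first_ipv4 "fd00::12,10.244.4.152"   # prints 10.244.4.152
```

Because the value comes from the Pod's current status, a rebuilt Pod passes a different address to the entrypoint than the one the other FE nodes remember.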
The solution is to add an ipFamily option to the values.yaml file, set to either IPv4 or IPv6, indicating the primary protocol stack of the environment. With IPv4, the Pod's headless service domain name is used as the unique identifier, which stays stable across rebuilds. With IPv6, since the StarRocks kernel does not yet support IPv6, the readinessProbe and livenessProbe are adapted to use IPv4 so that the cluster can start properly. A known issue remains in the IPv6 case: the IP still changes after a Pod is rebuilt, so the old nodes must be removed and the new ones added manually.
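A minimal sketch of both halves of that description, under stated assumptions. `fe_identifier` and `replace_follower_sql` are hypothetical helpers, not part of the chart. The IPv6 branch reflects one reading of the comment above, namely that the Pod's IPv4 address remains the identifier. The SQL assumes the FE nodes are followers and listen on 9010, the StarRocks default edit_log_port.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the ipFamily handling are illustrative.

# Choose the identifier an FE node registers with, based on ipFamily.
fe_identifier() {
  local ip_family=$1 pod_fqdn=$2 pod_ipv4=$3
  case "$ip_family" in
    IPv4) printf '%s\n' "$pod_fqdn" ;;   # headless-service name, stable across rebuilds
    IPv6) printf '%s\n' "$pod_ipv4" ;;   # pod IP, changes when the pod is rebuilt
    *)    echo "unsupported ipFamily: $ip_family" >&2; return 1 ;;
  esac
}

# Print the SQL that swaps a stale FE follower address for the new one.
replace_follower_sql() {
  local old_ip=$1 new_ip=$2 port=${3:-9010}
  printf 'ALTER SYSTEM DROP FOLLOWER "%s:%s";\n' "$old_ip" "$port"
  printf 'ALTER SYSTEM ADD FOLLOWER "%s:%s";\n' "$new_ip" "$port"
}

fe_identifier IPv4 strsent-blmdvx-fe-0.strsent-blmdvx-fe-headless.default.svc 10.244.4.152
# 10.244.4.151 stands in for the pod's previous address (made up for the example).
replace_follower_sql 10.244.4.151 10.244.4.152
```

The printed statements can be reviewed and then run through a MySQL client connected to a healthy FE on the query port (9030).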