[BUG] apecloud-mysql cluster hscale offline instance failed
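Describe the bug
Scaling an apecloud-mysql cluster in from 3 to 2 replicas with offlineInstances set to mysql-numdet-mysql-0 fails. The pod and PVC of mysql-numdet-mysql-2 are re-created as well, the new mysql-numdet-mysql-2 pod ends up in Error, and both the OpsRequest and the cluster go to Failed.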
To Reproduce Steps to reproduce the behavior:
- create cluster
kbcli cluster create mysql-numdet --termination-policy=DoNotTerminate --cluster-definition=apecloud-mysql --cluster-version=ac-mysql-8.0.30-1 --set cpu=100m,memory=0.5Gi,replicas=3,storage=1Gi
- hscale offline instance
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
generateName: mysql-numdet-hscaleoffinstance-
labels:
app.kubernetes.io/instance: mysql-numdet
app.kubernetes.io/managed-by: kubeblocks
namespace: default
spec:
clusterRef: mysql-numdet
horizontalScaling:
- componentName: mysql
replicas: 2
offlineInstances: ["mysql-numdet-mysql-0"]
ttlSecondsAfterSucceed: 0
type: HorizontalScaling
- See error: offlineInstances lists mysql-numdet-mysql-0, but the PVC of mysql-numdet-mysql-2 was also terminated and re-created (its age is 60s in the output below)
➜ ~ kubectl get cluster mysql-numdet
NAME CLUSTER-DEFINITION VERSION TERMINATION-POLICY STATUS AGE
mysql-numdet apecloud-mysql ac-mysql-8.0.30-1 DoNotTerminate Failed 11m
➜ ~ kubectl get pod,ops,pvc -l app.kubernetes.io/instance=mysql-numdet
NAME READY STATUS RESTARTS AGE
pod/mysql-numdet-mysql-1 5/5 Running 0 5m54s
pod/mysql-numdet-mysql-2 4/5 Error 3 (28s ago) 60s
NAME TYPE CLUSTER STATUS PROGRESS AGE
opsrequest.apps.kubeblocks.io/mysql-numdet-hscaleoffinstance-kfw8b HorizontalScaling mysql-numdet Failed 2/2 100s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/data-mysql-numdet-mysql-0 Bound pvc-9dc0e084-9339-41c0-a9af-9a57475f4b29 1Gi RWO csi-hostpath-sc 5m54s
persistentvolumeclaim/data-mysql-numdet-mysql-1 Bound pvc-c9a24f2c-579c-4ca8-ba12-d3595fedac5e 1Gi RWO csi-hostpath-sc 5m54s
persistentvolumeclaim/data-mysql-numdet-mysql-2 Bound pvc-6e46ec8c-f80c-4e33-a35a-4ef366aa0863 1Gi RWO csi-hostpath-sc 60s
➜ ~
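The surviving pod names above (mysql-numdet-mysql-1 and mysql-numdet-mysql-2) are consistent with ordinals being assigned in ascending order while skipping names listed in offlineInstances. A minimal sketch of that naming rule follows, as an assumption drawn from the observed output rather than from the KubeBlocks source; `expected_instances` is a hypothetical helper, not a kbcli or kubectl command.

```shell
#!/bin/sh
# Hypothetical helper: print the instance names expected after a scale-in,
# assuming ordinals ascend from 0 and offline names are skipped.
# $1 = name prefix, $2 = replica count, remaining args = offline instance names.
expected_instances() {
  prefix=$1
  replicas=$2
  shift 2
  ordinal=0
  count=0
  while [ "$count" -lt "$replicas" ]; do
    name="${prefix}-${ordinal}"
    ordinal=$((ordinal + 1))
    # Check whether this name is listed as offline.
    skip=no
    for offline in "$@"; do
      if [ "$offline" = "$name" ]; then
        skip=yes
      fi
    done
    if [ "$skip" = no ]; then
      echo "$name"
      count=$((count + 1))
    fi
  done
}

# The request in this report: 2 replicas, mysql-numdet-mysql-0 offline.
expected_instances mysql-numdet-mysql 2 mysql-numdet-mysql-0
```

Under this assumption the pod set itself is as expected, so the unexpected part is that mysql-numdet-mysql-2 and its PVC were deleted and re-created instead of being left untouched.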
describe ops
kubectl describe ops mysql-numdet-hscaleoffinstance-kfw8b
Name: mysql-numdet-hscaleoffinstance-kfw8b
Namespace: default
Labels: app.kubernetes.io/instance=mysql-numdet
app.kubernetes.io/managed-by=kubeblocks
ops.kubeblocks.io/ops-type=HorizontalScaling
Annotations: <none>
API Version: apps.kubeblocks.io/v1alpha1
Kind: OpsRequest
Metadata:
Creation Timestamp: 2024-04-10T08:00:50Z
Finalizers:
opsrequest.kubeblocks.io/finalizer
Generate Name: mysql-numdet-hscaleoffinstance-
Generation: 2
Managed Fields:
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:generateName:
f:labels:
.:
f:app.kubernetes.io/instance:
f:app.kubernetes.io/managed-by:
f:spec:
.:
f:clusterRef:
f:horizontalScaling:
.:
k:{"componentName":"mysql"}:
.:
f:componentName:
f:offlineInstances:
f:replicas:
f:ttlSecondsBeforeAbort:
f:type:
Manager: kubectl-create
Operation: Update
Time: 2024-04-10T08:00:50Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"opsrequest.kubeblocks.io/finalizer":
f:labels:
f:ops.kubeblocks.io/ops-type:
f:ownerReferences:
.:
k:{"uid":"257e6114-a734-4221-842b-f1e019cb40a7"}:
Manager: manager
Operation: Update
Time: 2024-04-10T08:00:50Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:clusterGeneration:
f:completionTimestamp:
f:components:
.:
f:mysql:
.:
f:lastFailedTime:
f:phase:
f:progressDetails:
f:conditions:
.:
k:{"type":"Failed"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
k:{"type":"HorizontalScaling"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
k:{"type":"Validated"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
k:{"type":"WaitForProgressing"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
f:lastConfiguration:
.:
f:components:
.:
f:mysql:
.:
f:replicas:
f:targetResources:
.:
f:pods:
f:phase:
f:progress:
f:startTimestamp:
Manager: manager
Operation: Update
Subresource: status
Time: 2024-04-10T08:01:33Z
Owner References:
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Name: mysql-numdet
UID: 257e6114-a734-4221-842b-f1e019cb40a7
Resource Version: 2885439
UID: a2fe0bfc-5495-4697-b2d4-6cf9229d48cb
Spec:
Cluster Ref: mysql-numdet
Horizontal Scaling:
Component Name: mysql
Offline Instances:
mysql-numdet-mysql-0
Replicas: 2
Ttl Seconds Before Abort: 0
Type: HorizontalScaling
Status:
Cluster Generation: 2
Completion Timestamp: 2024-04-10T08:01:33Z
Components:
Mysql:
Last Failed Time: 2024-04-10T08:01:03Z
Phase: Failed
Progress Details:
End Time: 2024-04-10T08:01:30Z
Message: Failed to re-create: Pod/mysql-numdet-mysql-2 in Component: mysql, message: Role probe timeout, check whether the application is available
Object Key: Pod/mysql-numdet-mysql-2
Start Time: 2024-04-10T08:00:58Z
Status: Failed
End Time: 2024-04-10T08:01:29Z
Message: Successfully delete pod: Pod/mysql-numdet-mysql-0 in Component: mysql
Object Key: Pod/mysql-numdet-mysql-0
Start Time: 2024-04-10T08:00:58Z
Status: Succeed
Conditions:
Last Transition Time: 2024-04-10T08:00:50Z
Message: wait for the controller to process the OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b in Cluster: mysql-numdet
Reason: WaitForProgressing
Status: True
Type: WaitForProgressing
Last Transition Time: 2024-04-10T08:00:50Z
Message: OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b is validated
Reason: ValidateOpsRequestPassed
Status: True
Type: Validated
Last Transition Time: 2024-04-10T08:00:50Z
Message: Start to horizontal scale replicas in Cluster: mysql-numdet
Reason: HorizontalScalingStarted
Status: True
Type: HorizontalScaling
Last Transition Time: 2024-04-10T08:01:33Z
Message: Failed to process OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b in cluster: mysql-numdet, more detailed informations in status.components
Reason: OpsRequestFailed
Status: False
Type: Failed
Last Configuration:
Components:
Mysql:
Replicas: 3
Target Resources:
Pods:
mysql-numdet-mysql-2
mysql-numdet-mysql-1
mysql-numdet-mysql-0
Phase: Failed
Progress: 2/2
Start Timestamp: 2024-04-10T08:00:50Z
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForProgressing 2m39s ops-request-controller wait for the controller to process the OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b in Cluster: mysql-numdet
Normal ValidateOpsRequestPassed 2m39s (x2 over 2m39s) ops-request-controller OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b is validated
Normal HorizontalScalingStarted 2m39s (x2 over 2m39s) ops-request-controller Start to horizontal scale replicas in Cluster: mysql-numdet
Normal Processing 2m31s ops-request-controller Start to delete pod: Pod/mysql-numdet-mysql-2 in Component: mysql
Normal Processing 2m31s ops-request-controller Start to delete pod: Pod/mysql-numdet-mysql-0 in Component: mysql
Normal Succeed 2m ops-request-controller Successfully delete pod: Pod/mysql-numdet-mysql-0 in Component: mysql
Warning Failed 119s ops-request-controller Failed to re-create: Pod/mysql-numdet-mysql-2 in Component: mysql, message: Role probe timeout, check whether the application is available
Warning OpsRequestFailed 116s (x2 over 116s) ops-request-controller Failed to process OpsRequest: mysql-numdet-hscaleoffinstance-kfw8b in cluster: mysql-numdet, more detailed informations in status.components
describe cluster
kubectl describe cluster mysql-numdet
Name: mysql-numdet
Namespace: default
Labels: clusterdefinition.kubeblocks.io/name=apecloud-mysql
clusterversion.kubeblocks.io/name=ac-mysql-8.0.30-1
Annotations: <none>
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Metadata:
Creation Timestamp: 2024-04-10T07:56:36Z
Finalizers:
cluster.kubeblocks.io/finalizer
Generation: 2
Managed Fields:
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:spec:
.:
f:affinity:
.:
f:podAntiAffinity:
f:tenancy:
f:clusterDefinitionRef:
f:clusterVersionRef:
f:monitor:
f:resources:
.:
f:cpu:
f:memory:
f:storage:
.:
f:size:
f:terminationPolicy:
Manager: kbcli
Operation: Update
Time: 2024-04-10T07:56:36Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"cluster.kubeblocks.io/finalizer":
f:labels:
.:
f:clusterdefinition.kubeblocks.io/name:
f:clusterversion.kubeblocks.io/name:
f:spec:
f:componentSpecs:
Manager: manager
Operation: Update
Time: 2024-04-10T08:00:50Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:clusterDefGeneration:
f:components:
.:
f:mysql:
.:
f:message:
.:
f:Pod/mysql-numdet-mysql-0:
f:Pod/mysql-numdet-mysql-1:
f:Pod/mysql-numdet-mysql-2:
f:phase:
f:podsReady:
f:podsReadyTime:
f:conditions:
f:observedGeneration:
f:phase:
Manager: manager
Operation: Update
Subresource: status
Time: 2024-04-10T08:01:03Z
Resource Version: 2885438
UID: 257e6114-a734-4221-842b-f1e019cb40a7
Spec:
Affinity:
Pod Anti Affinity: Preferred
Tenancy: SharedNode
Cluster Definition Ref: apecloud-mysql
Cluster Version Ref: ac-mysql-8.0.30-1
Component Specs:
Component Def Ref: mysql
Enabled Logs:
auditlog
error
general
slow
Monitor: false
Name: mysql
Offline Instances:
mysql-numdet-mysql-0
Replicas: 2
Resources:
Limits:
Cpu: 100m
Memory: 512Mi
Requests:
Cpu: 100m
Memory: 512Mi
Service Account Name: kb-mysql-numdet
Volume Claim Templates:
Name: data
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 1Gi
Monitor:
Resources:
Cpu: 0
Memory: 0
Storage:
Size: 0
Termination Policy: DoNotTerminate
Status:
Cluster Def Generation: 2
Components:
Mysql:
Message:
Pod/mysql-numdet-mysql-0: Role probe timeout, check whether the application is available
Pod/mysql-numdet-mysql-1: Role probe timeout, check whether the application is available
Pod/mysql-numdet-mysql-2: Role probe timeout, check whether the application is available
Phase: Failed
Pods Ready: false
Pods Ready Time: 2024-04-10T07:59:21Z
Conditions:
Last Transition Time: 2024-04-10T07:56:36Z
Message: The operator has started the provisioning of Cluster: mysql-numdet
Observed Generation: 2
Reason: PreCheckSucceed
Status: True
Type: ProvisioningStarted
Last Transition Time: 2024-04-10T07:56:36Z
Message: Successfully applied for resources
Observed Generation: 2
Reason: ApplyResourcesSucceed
Status: True
Type: ApplyResources
Last Transition Time: 2024-04-10T08:00:58Z
Message: pods are not ready in Components: [mysql], refer to related component message in Cluster.status.components
Reason: ReplicasNotReady
Status: False
Type: ReplicasReady
Last Transition Time: 2024-04-10T08:00:58Z
Message: pods are unavailable in Components: [mysql], refer to related component message in Cluster.status.components
Reason: ComponentsNotReady
Status: False
Type: Ready
Observed Generation: 2
Phase: Failed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ComponentPhaseTransition 7m10s cluster-controller component is Creating
Normal AllReplicasReady 4m27s cluster-controller all pods of components are ready, waiting for the probe detection successful
Normal ComponentPhaseTransition 4m27s cluster-controller component is Running
Normal Running 4m27s cluster-controller Cluster: mysql-numdet is ready, current phase is Running
Normal ClusterReady 4m27s cluster-controller Cluster: mysql-numdet is ready, current phase is Running
Normal ComponentPhaseTransition 4m27s cluster-controller component is Abnormal
Warning Abnormal 4m27s cluster-controller Cluster: mysql-numdet is Abnormal, check according to the components message
Normal ApplyResourcesSucceed 2m58s (x2 over 7m12s) cluster-controller Successfully applied for resources
Normal PreCheckSucceed 2m58s (x2 over 7m12s) cluster-controller The operator has started the provisioning of Cluster: mysql-numdet
Normal HorizontalScale 2m50s component-controller start horizontal scale component mysql of cluster mysql-numdet from 3 to 2
Normal ComponentPhaseTransition 2m50s cluster-controller component is Updating
Warning ReplicasNotReady 2m50s cluster-controller pods are not ready in Components: [mysql], refer to related component message in Cluster.status.components
Warning ComponentsNotReady 2m50s cluster-controller pods are unavailable in Components: [mysql], refer to related component message in Cluster.status.components
Warning Failed 2m45s (x2 over 4m28s) cluster-controller Cluster: mysql-numdet is Failed, check according to the components message
Normal ComponentPhaseTransition 2m45s (x2 over 4m28s) cluster-controller component is Failed
logs of the error pod
kubectl logs mysql-numdet-mysql-2 mysql
mv: cannot stat '/data/mysql/plugin/audit_log.so': No such file or directory
+ rmdir /docker-entrypoint-initdb.d
+ mkdir -p /data/mysql/auditlog
+ mkdir -p /data/mysql/binlog
+ mkdir -p /data/mysql/docker-entrypoint-initdb.d
+ ln -s /data/mysql/docker-entrypoint-initdb.d /docker-entrypoint-initdb.d
KB_MYSQL_N=2
KB_MYSQL_CLUSTER_UID=a2052974-5450-4254-86e2-453ad1aec9a6
KB_0_HOSTNAME=mysql-numdet-mysql-0.mysql-numdet-mysql-headless
+ generate_cluster_info
+ local pod_name=mysql-numdet-mysql-2
+ local cluster_members=
+ export MYSQL_PORT=3306
+ MYSQL_PORT=3306
+ export MYSQL_CONSENSUS_PORT=13306
+ MYSQL_CONSENSUS_PORT=13306
+ export KB_MYSQL_VOLUME_DIR=/data/mysql/
+ KB_MYSQL_VOLUME_DIR=/data/mysql/
+ export KB_MYSQL_CONF_FILE=/opt/mysql/my.cnf
+ KB_MYSQL_CONF_FILE=/opt/mysql/my.cnf
+ '[' -z 2 ']'
+ echo KB_MYSQL_N=2
+ '[' -z a2052974-5450-4254-86e2-453ad1aec9a6 ']'
+ echo KB_MYSQL_CLUSTER_UID=a2052974-5450-4254-86e2-453ad1aec9a6
+ (( i = 0 ))
+ (( i < KB_REPLICA_COUNT ))
+ '[' 0 -gt 0 ']'
+ host=KB_0_HOSTNAME
+ echo KB_0_HOSTNAME=mysql-numdet-mysql-0.mysql-numdet-mysql-headless
+ cluster_members=mysql-numdet-mysql-0.mysql-numdet-mysql-headless:13306
+ export KB_MYSQL_0_HOSTNAME=mysql-numdet-mysql-0.mysql-numdet-mysql-headless
+ KB_MYSQL_0_HOSTNAME=mysql-numdet-mysql-0.mysql-numdet-mysql-headless
+ (( i++ ))
+ (( i < KB_REPLICA_COUNT ))
+ '[' 1 -gt 0 ']'
+ cluster_members='mysql-numdet-mysql-0.mysql-numdet-mysql-headless:13306;'
+ host=KB_1_HOSTNAME
+ echo KB_1_HOSTNAME=mysql-numdet-mysql-1.mysql-numdet-mysql-headless
+ cluster_members='mysql-numdet-mysql-0.mysql-numdet-mysql-headless:13306;mysql-numdet-mysql-1.mysql-numdet-mysql-headless:13306'
+ export KB_MYSQL_1_HOSTNAME=mysql-numdet-mysql-1.mysql-numdet-mysql-headless
+ KB_MYSQL_1_HOSTNAME=mysql-numdet-mysql-1.mysql-numdet-mysql-headless
+ (( i++ ))
+ (( i < KB_REPLICA_COUNT ))
+ export 'KB_MYSQL_CLUSTER_MEMBERS=mysql-numdet-mysql-0.mysql-numdet-mysql-headless:13306;mysql-numdet-mysql-1.mysql-numdet-mysql-headless:13306'
+ KB_MYSQL_CLUSTER_MEMBERS='mysql-numdet-mysql-0.mysql-numdet-mysql-headless:13306;mysql-numdet-mysql-1.mysql-numdet-mysql-headless:13306'
+ export KB_MYSQL_CLUSTER_MEMBER_INDEX=2
+ KB_MYSQL_CLUSTER_MEMBER_INDEX=2
+ local pod_host=KB_2_HOSTNAME
/scripts/setup.sh: line 39: !pod_host: missing current member hostname
KB_1_HOSTNAME=mysql-numdet-mysql-1.mysql-numdet-mysql-headless
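The trace suggests that the hostname variables are generated for ordinals 0 through KB_REPLICA_COUNT-1 (here KB_0_HOSTNAME and KB_1_HOSTNAME), while the pod looks up the variable for its own ordinal (KB_2_HOSTNAME), which is never set once ordinal 0 is offline and ordinal 2 stays in use. A minimal sketch of that mismatch follows, assuming this reading of the trace is right; `member_hostname` is a hypothetical function written for illustration and is not the code in /scripts/setup.sh.

```shell
#!/bin/sh
# Hypothetical reproduction of the lookup seen in the trace; not the real setup.sh.
# Environment copied from the trace: replicas=2, variables exist only for ordinals 0 and 1.
KB_REPLICA_COUNT=2
KB_0_HOSTNAME=mysql-numdet-mysql-0.mysql-numdet-mysql-headless
KB_1_HOSTNAME=mysql-numdet-mysql-1.mysql-numdet-mysql-headless

# Print the hostname stored in KB_<ordinal>_HOSTNAME for the given pod name,
# or fail with a message if that variable is unset or empty.
member_hostname() {
  ordinal=${1##*-}                      # "mysql-numdet-mysql-2" -> "2"
  case $ordinal in
    ''|*[!0-9]*)
      echo "cannot parse ordinal from pod name: $1" >&2
      return 2
      ;;
  esac
  eval "host=\${KB_${ordinal}_HOSTNAME:-}"
  if [ -z "$host" ]; then
    echo "KB_${ordinal}_HOSTNAME: missing current member hostname" >&2
    return 1
  fi
  echo "$host"
}

member_hostname mysql-numdet-mysql-1                                   # resolves
member_hostname mysql-numdet-mysql-2 || echo "lookup failed for ordinal 2"
```

In this sketch the lookup for ordinal 2 fails for the same reason the container does: the member list is built by counting replicas, not by enumerating the instances that actually remain after mysql-numdet-mysql-0 goes offline.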
Desktop (please complete the following information):
kbcli version
Kubernetes: v1.26.3
KubeBlocks: 0.9.0-beta.4
kbcli: 0.9.0-beta.1
Additional context
A mongodb cluster hscale with an offline instance failed too:
- create cluster
kbcli cluster create mongo-nnwhrm --termination-policy=WipeOut --cluster-definition=mongodb --cluster-version=mongodb-4.2 --set cpu=100m,memory=0.5Gi,replicas=5,storage=3Gi
- hscale offline instance
kubectl create -f -<<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
generateName: mongo-nnwhrm-hscaleoffinstance-
labels:
app.kubernetes.io/instance: mongo-nnwhrm
app.kubernetes.io/managed-by: kubeblocks
namespace: default
spec:
clusterRef: mongo-nnwhrm
horizontalScaling:
- componentName: mongodb
replicas: 4
offlineInstances: ["mongo-nnwhrm-mongodb-1"]
ttlSecondsAfterSucceed: 0
type: HorizontalScaling
EOF
- see error
➜ ~ kubectl get cluster mongo-nnwhrm
NAME CLUSTER-DEFINITION VERSION TERMINATION-POLICY STATUS AGE
mongo-nnwhrm mongodb mongodb-4.2 WipeOut Updating 18m
➜ ~
➜ ~ kubectl get pod,ops,pvc -l app.kubernetes.io/instance=mongo-nnwhrm
NAME READY STATUS RESTARTS AGE
pod/mongo-nnwhrm-mongodb-0 3/3 Running 0 15m
pod/mongo-nnwhrm-mongodb-2 3/3 Running 0 16m
pod/mongo-nnwhrm-mongodb-3 3/3 Running 0 15m
pod/mongo-nnwhrm-mongodb-4 3/3 Running 0 15m
NAME TYPE CLUSTER STATUS PROGRESS AGE
opsrequest.apps.kubeblocks.io/mongo-nnwhrm-hscaleoffinstance-9lsnc HorizontalScaling mongo-nnwhrm Running 5/5 16m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/data-mongo-nnwhrm-mongodb-0 Bound pvc-3d32c833-f14a-4630-a89f-6b3c80b427ea 3Gi RWO csi-hostpath-sc 18m
persistentvolumeclaim/data-mongo-nnwhrm-mongodb-1 Bound pvc-f31bcbeb-60e5-448c-89a8-191e98906bac 3Gi RWO csi-hostpath-sc 18m
persistentvolumeclaim/data-mongo-nnwhrm-mongodb-2 Bound pvc-ea1ade6a-d40c-4d5f-a8bf-cee0451bb23f 3Gi RWO csi-hostpath-sc 18m
persistentvolumeclaim/data-mongo-nnwhrm-mongodb-3 Bound pvc-850e61de-f71f-4a3d-966f-86dff4e4f062 3Gi RWO csi-hostpath-sc 18m
persistentvolumeclaim/data-mongo-nnwhrm-mongodb-4 Bound pvc-261ec477-d7e3-419e-87df-58511fcd59a6 3Gi RWO csi-hostpath-sc 15m
Note that the role of mongo-nnwhrm-mongodb-4 is empty (<none>):
kbcli cluster list-instances mongo-nnwhrm
NAME NAMESPACE CLUSTER COMPONENT STATUS ROLE ACCESSMODE AZ CPU(REQUEST/LIMIT) MEMORY(REQUEST/LIMIT) STORAGE NODE CREATED-TIME
mongo-nnwhrm-mongodb-0 default mongo-nnwhrm mongodb Running primary <none> <none> 100m / 100m 512Mi / 512Mi data:3Gi minikube/192.168.49.2 Apr 10,2024 16:17 UTC+0800
mongo-nnwhrm-mongodb-2 default mongo-nnwhrm mongodb Running secondary <none> <none> 100m / 100m 512Mi / 512Mi data:3Gi minikube/192.168.49.2 Apr 10,2024 16:16 UTC+0800
mongo-nnwhrm-mongodb-3 default mongo-nnwhrm mongodb Running secondary <none> <none> 100m / 100m 512Mi / 512Mi data:3Gi minikube/192.168.49.2 Apr 10,2024 16:17 UTC+0800
mongo-nnwhrm-mongodb-4 default mongo-nnwhrm mongodb Running <none> <none> <none> 100m / 100m 512Mi / 512Mi data:3Gi minikube/192.168.49.2 Apr 10,2024 16:17 UTC+0800
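To spot instances without a role quickly, the listing can be filtered on its ROLE column. A small sketch, assuming NAME is column 1 and ROLE is column 6 as in the listing above; `instances_without_role` is a hypothetical helper, and the sample rows are copied from the listing with the trailing columns omitted.

```shell
#!/bin/sh
# Hypothetical helper: print the names of instances whose ROLE column is "<none>".
# Reads `kbcli cluster list-instances` output on stdin and skips the header row.
instances_without_role() {
  awk 'NR > 1 && $6 == "<none>" { print $1 }'
}

# Sample rows from the listing above (columns after ROLE omitted).
instances_without_role <<'EOF'
NAME NAMESPACE CLUSTER COMPONENT STATUS ROLE
mongo-nnwhrm-mongodb-0 default mongo-nnwhrm mongodb Running primary
mongo-nnwhrm-mongodb-2 default mongo-nnwhrm mongodb Running secondary
mongo-nnwhrm-mongodb-4 default mongo-nnwhrm mongodb Running <none>
EOF
```

Against a live cluster the same filter would be used as `kbcli cluster list-instances mongo-nnwhrm | instances_without_role`.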
This issue has been marked as stale because it has been open for 30 days with no activity