kubeblocks
[BUG] PG restored cluster is always in Creating status due to Readiness probe failed
Describe the bug
Kubernetes: v1.29.7-gke.1274000
KubeBlocks: 0.9.1-beta.25
kbcli: 0.9.1-beta.10
To Reproduce
Steps to reproduce the behavior:
- Create pg cluster
  kbcli cluster create postgres-icpzcz --termination-policy=DoNotTerminate --cluster-definition=postgresql --enable-all-logs=false --cluster-version=postgresql-14.8.0 --set cpu=100m,memory=0.5Gi,replicas=2,storage=3Gi --namespace default
- Create backup
  kbcli cluster backup postgres-icpzcz --method wal-g --namespace default
- Restore cluster
  kbcli cluster restore postgres-icpzcz-backup --backup backup-default-postgres-icpzcz-20240914124229 --namespace default
- Check cluster status
  tianyue@localhost kbcli % k get cluster -A | grep postgres
  default postgres-icpzcz postgresql postgresql-14.8.0 DoNotTerminate Running 61m
  default postgres-icpzcz-backup postgresql postgresql-14.8.0 DoNotTerminate **Creating** 38m
- See error
tianyue@localhost kbcli % k describe cluster postgres-icpzcz-backup
Name: postgres-icpzcz-backup
Namespace: default
Labels: clusterdefinition.kubeblocks.io/name=postgresql
        clusterversion.kubeblocks.io/name=postgresql-14.8.0
Annotations: kubeblocks.io/ops-request: [{"name":"postgres-icpzcz-backup","type":"Restore"}]
             kubeblocks.io/reconcile: 2024-09-14T05:19:49.986719383Z
             kubeblocks.io/restore-from-backup:
               {"postgresql":{"connectionPassword":"EHhYeZrgFEC+x5rv7D+WRo9kZNvT2sIqM40QfqndWwQQIx94","doReadyRestoreAfterClusterRunning":"false","name":...
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Metadata:
  Creation Timestamp: 2024-09-14T04:44:04Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generation: 1
  Resource Version: 9186555
  UID: c84b3271-c513-450b-b3ef-55c9d37dd378
Spec:
  Affinity:
    Pod Anti Affinity: Preferred
    Tenancy: SharedNode
  Cluster Definition Ref: postgresql
  Cluster Version Ref: postgresql-14.8.0
  Component Specs:
    Component Def Ref: postgresql
    Disable Exporter: true
    Enabled Logs:
      running
    Name: postgresql
    Replicas: 2
    Resources:
      Limits:
        Cpu: 100m
        Memory: 512Mi
      Requests:
        Cpu: 100m
        Memory: 512Mi
    Service Account Name: kb-postgres-icpzcz
    Switch Policy:
      Type: Noop
    Volume Claim Templates:
      Name: data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage: 3Gi
  Resources:
    Cpu: 0
    Memory: 0
  Storage:
    Size: 0
  Termination Policy: DoNotTerminate
Status:
  Cluster Def Generation: 2
  Components:
    Postgresql:
      Phase: Creating
      Pods Ready: false
  Conditions:
    Last Transition Time: 2024-09-14T04:44:04Z
    Message: The operator has started the provisioning of Cluster: postgres-icpzcz-backup
    Observed Generation: 1
    Reason: PreCheckSucceed
    Status: True
    Type: ProvisioningStarted
    Last Transition Time: 2024-09-14T04:44:04Z
    Message: Successfully applied for resources
    Observed Generation: 1
    Reason: ApplyResourcesSucceed
    Status: True
    Type: ApplyResources
    Last Transition Time: 2024-09-14T04:44:04Z
    Message: pods are not ready in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason: ReplicasNotReady
    Status: False
    Type: ReplicasReady
    Last Transition Time: 2024-09-14T04:44:04Z
    Message: pods are unavailable in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason: ComponentsNotReady
    Status: False
    Type: Ready
  Observed Generation: 1
  Phase: Creating
Events:
  Type     Reason                    Age                   From                  Message
  ----     ------                    ----                  ----                  -------
  Normal   PreCheckSucceed           38m                   cluster-controller    The operator has started the provisioning of Cluster: postgres-icpzcz-backup
  Normal   ApplyResourcesSucceed     38m                   cluster-controller    Successfully applied for resources
  Normal   NeedWaiting               38m (x6 over 38m)     component-controller  waiting for restore "postgres-icpzcz-backup-postgresql-c84b3271-preparedata" successfully
  Normal   ComponentPhaseTransition  38m                   cluster-controller    component is Creating
  Warning  Unhealthy                 2m48s (x14 over 37m)  event-controller      Pod postgres-icpzcz-backup-postgresql-0: **Readiness probe failed**: 127.0.0.1:5432 - no response
  Warning  Unhealthy                 2m41s (x12 over 36m)  event-controller      Pod postgres-icpzcz-backup-postgresql-1: **Readiness probe failed**: 127.0.0.1:5432 - no response
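The events above only say that nothing answered on 127.0.0.1:5432 inside the pods. To narrow that down, a small helper like the following could collect container states, recent logs, and a manual readiness check for one pod. This is a sketch under assumptions that are not confirmed by the report: the database container is assumed to be named `postgresql`, and `pg_isready` is assumed to be available inside it (the probe message has the shape that `pg_isready` prints). The function name `diagnose_pod` is made up for illustration.

```shell
#!/usr/bin/env bash
# Collect basic diagnostics for a pod whose readiness probe reports "no response".
# Assumptions (not from the report): the container is named "postgresql"
# and pg_isready exists inside it. diagnose_pod is a hypothetical helper name.
diagnose_pod() {
  local pod="$1" namespace="${2:-default}" container="${3:-postgresql}"

  echo "== container states for $pod (name, ready, restarts) =="
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\t"}{.restartCount}{"\n"}{end}'

  echo "== last 50 log lines from container $container =="
  kubectl logs "$pod" -n "$namespace" -c "$container" --tail=50

  echo "== manual readiness check =="
  # The function returns the exit status of this last command;
  # pg_isready exits non-zero when the server does not accept connections.
  kubectl exec "$pod" -n "$namespace" -c "$container" -- pg_isready -h 127.0.0.1 -p 5432
}
```

For the failing pods in this report it would be called as `diagnose_pod postgres-icpzcz-backup-postgresql-0 default` and again for `postgres-icpzcz-backup-postgresql-1`. The log section is likely the most useful part, since it should show whether PostgreSQL was ever started after the restore.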
Expected behavior
The restored cluster postgres-icpzcz-backup should pass its readiness probes and reach Running status, like the source cluster postgres-icpzcz.
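One way to check whether a restored cluster ever leaves Creating, instead of re-running `k get cluster` by hand, is to poll the cluster's `.status.phase` with a timeout. The sketch below assumes `kubectl` is on the PATH and that the phase is exposed at `.status.phase` (it appears as `Status: Phase:` in the describe output). The function name `wait_for_running` and its default timeout and interval are illustrative choices, not part of kbcli or KubeBlocks.

```shell
#!/usr/bin/env bash
# Poll a KubeBlocks Cluster until .status.phase is Running or a timeout expires.
# wait_for_running is a hypothetical helper; timeout and interval are in seconds.
wait_for_running() {
  local name="$1" namespace="${2:-default}" timeout="${3:-600}" interval="${4:-10}"
  local elapsed=0 phase=""

  while [ "$elapsed" -lt "$timeout" ]; do
    phase="$(kubectl get cluster "$name" -n "$namespace" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Running" ]; then
      echo "cluster $name is Running after ${elapsed}s"
      return 0
    fi
    echo "cluster $name phase: ${phase:-unknown} (${elapsed}s elapsed)"
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done

  # Timed out: report the last phase seen and signal failure to the caller.
  echo "cluster $name still ${phase:-unknown} after ${timeout}s" >&2
  return 1
}
```

For this report, `wait_for_running postgres-icpzcz-backup default 1800 30` would be expected to time out with the cluster still in Creating, while the same call against `postgres-icpzcz` should return immediately.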