[BUG] PG restored cluster is always in Creating status due to Readiness probe failed

Open tianyue86 opened this issue 5 months ago • 2 comments

Describe the bug

Kubernetes: v1.29.7-gke.1274000
KubeBlocks: 0.9.1-beta.25
kbcli: 0.9.1-beta.10

To Reproduce

Steps to reproduce the behavior:

  1. Create a PG cluster:

     kbcli cluster create postgres-icpzcz --termination-policy=DoNotTerminate --cluster-definition=postgresql --enable-all-logs=false --cluster-version=postgresql-14.8.0 --set cpu=100m,memory=0.5Gi,replicas=2,storage=3Gi --namespace default

  2. Create a backup:

     kbcli cluster backup postgres-icpzcz --method wal-g --namespace default

  3. Restore the cluster:

     kbcli cluster restore postgres-icpzcz-backup --backup backup-default-postgres-icpzcz-20240914124229 --namespace default

  4. Check cluster status

tianyue@localhost kbcli % k get cluster -A | grep postgres
default     postgres-icpzcz          postgresql            postgresql-14.8.0     DoNotTerminate       Running    61m
default     postgres-icpzcz-backup   postgresql            postgresql-14.8.0     DoNotTerminate       **Creating**   38m
  5. See error
tianyue@localhost kbcli % k describe cluster postgres-icpzcz-backup
Name:         postgres-icpzcz-backup
Namespace:    default
Labels:       clusterdefinition.kubeblocks.io/name=postgresql
              clusterversion.kubeblocks.io/name=postgresql-14.8.0
Annotations:  kubeblocks.io/ops-request: [{"name":"postgres-icpzcz-backup","type":"Restore"}]
              kubeblocks.io/reconcile: 2024-09-14T05:19:49.986719383Z
              kubeblocks.io/restore-from-backup:
                {"postgresql":{"connectionPassword":"EHhYeZrgFEC+x5rv7D+WRo9kZNvT2sIqM40QfqndWwQQIx94","doReadyRestoreAfterClusterRunning":"false","name":...
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2024-09-14T04:44:04Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generation:        1
  Resource Version:  9186555
  UID:               c84b3271-c513-450b-b3ef-55c9d37dd378
Spec:
  Affinity:
    Pod Anti Affinity:     Preferred
    Tenancy:               SharedNode
  Cluster Definition Ref:  postgresql
  Cluster Version Ref:     postgresql-14.8.0
  Component Specs:
    Component Def Ref:  postgresql
    Disable Exporter:   true
    Enabled Logs:
      running
    Name:      postgresql
    Replicas:  2
    Resources:
      Limits:
        Cpu:     100m
        Memory:  512Mi
      Requests:
        Cpu:               100m
        Memory:            512Mi
    Service Account Name:  kb-postgres-icpzcz
    Switch Policy:
      Type:  Noop
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  3Gi
  Resources:
    Cpu:     0
    Memory:  0
  Storage:
    Size:              0
  Termination Policy:  DoNotTerminate
Status:
  Cluster Def Generation:  2
  Components:
    Postgresql:
      Phase:       Creating
      Pods Ready:  false
  Conditions:
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               The operator has started the provisioning of Cluster: postgres-icpzcz-backup
    Observed Generation:   1
    Reason:                PreCheckSucceed
    Status:                True
    Type:                  ProvisioningStarted
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               Successfully applied for resources
    Observed Generation:   1
    Reason:                ApplyResourcesSucceed
    Status:                True
    Type:                  ApplyResources
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               pods are not ready in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason:                ReplicasNotReady
    Status:                False
    Type:                  ReplicasReady
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               pods are unavailable in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason:                ComponentsNotReady
    Status:                False
    Type:                  Ready
  Observed Generation:     1
  Phase:                   Creating
Events:
  Type     Reason                    Age                   From                  Message
  ----     ------                    ----                  ----                  -------
  Normal   PreCheckSucceed           38m                   cluster-controller    The operator has started the provisioning of Cluster: postgres-icpzcz-backup
  Normal   ApplyResourcesSucceed     38m                   cluster-controller    Successfully applied for resources
  Normal   NeedWaiting               38m (x6 over 38m)     component-controller  waiting for restore "postgres-icpzcz-backup-postgresql-c84b3271-preparedata" successfully
  Normal   ComponentPhaseTransition  38m                   cluster-controller    component is Creating
  Warning  Unhealthy                 2m48s (x14 over 37m)  event-controller      Pod postgres-icpzcz-backup-postgresql-0: **Readiness probe failed**: 127.0.0.1:5432 - no response
  Warning  Unhealthy                 2m41s (x12 over 36m)  event-controller      Pod postgres-icpzcz-backup-postgresql-1: **Readiness probe failed**: 127.0.0.1:5432 - no response
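
For anyone trying to confirm the same symptom, the checks above can be sketched as a few shell helpers. This is a rough sketch, not part of the original report: the function names are hypothetical, the container name `postgresql` is an assumption, and the `pg_isready` call is a guess based on the `127.0.0.1:5432 - no response` message, which resembles `pg_isready` output. The actual probe definition has not been verified here.

```shell
#!/usr/bin/env bash
# Sketch only. Cluster and pod names come from the report above;
# the container name "postgresql" is an assumption.

# Print the phase of a Cluster resource (e.g. Creating, Running).
cluster_phase() {
  local name="$1" ns="${2:-default}"
  kubectl get cluster "$name" -n "$ns" -o jsonpath='{.status.phase}'
}

# Poll the phase until it is Running.
# Returns 0 once Running is seen, 1 if all attempts are used up.
wait_for_running() {
  local name="$1" ns="${2:-default}" attempts="${3:-30}" delay="${4:-10}"
  local i=1 phase
  while [ "$i" -le "$attempts" ]; do
    phase="$(cluster_phase "$name" "$ns")"
    echo "attempt $i: phase=${phase:-unknown}"
    if [ "$phase" = "Running" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Run pg_isready inside a pod against the address shown in the event
# message. Assumes pg_isready is available in the container image.
probe_pod() {
  local pod="$1" ns="${2:-default}" container="${3:-postgresql}"
  kubectl exec "$pod" -n "$ns" -c "$container" -- pg_isready -h 127.0.0.1 -p 5432
}
```

Possible usage after sourcing the file: `wait_for_running postgres-icpzcz-backup default 30 10` to watch the restored cluster, and `probe_pod postgres-icpzcz-backup-postgresql-0` to repeat the connectivity check by hand.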


tianyue86, Sep 14 '24 06:09