
[BUG] KB1.0 cluster restart timeout

Open haowen159 opened this issue 5 months ago • 2 comments

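Describe the bug: Restarting an etcd cluster with kbcli cluster restart times out. The Restart OpsRequest stays in Running status and its progress never moves past 0/3.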

kbcli version
Kubernetes: v1.29.6-gke.1326000
KubeBlocks: 1.0.0-alpha.5
kbcli: 1.0.0-alpha.0

To Reproduce Steps to reproduce the behavior:

  1. Create an etcd cluster. Cluster YAML:
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  annotations:
    kubeblocks.io/ops-request: '[{"name":"etcdr-favclu-restart-crzd7","type":"Restart"}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps.kubeblocks.io/v1alpha1","kind":"Cluster","metadata":{"annotations":{},"name":"etcdr-favclu","namespace":"default"},"spec":{"componentSpecs":[{"componentDef":"etcd","name":"etcd","replicas":3,"resources":{"limits":{"cpu":"100m","memory":"0.5Gi"},"requests":{"cpu":"100m","memory":"0.5Gi"}},"volumeClaimTemplates":[{"name":"data","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}},"storageClassName":null}}]}],"terminationPolicy":"WipeOut"}}
  creationTimestamp: "2024-08-30T07:09:21Z"
  finalizers:
  - cluster.kubeblocks.io/finalizer
  generation: 3
  name: etcdr-favclu
  namespace: default
  resourceVersion: "52454470"
  uid: 7fc55f4a-4eda-4977-8cca-9179915006eb
spec:
  componentSpecs:
  - componentDef: etcd
    name: etcd
    replicas: 3
    resources:
      limits:
        cpu: 100m
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 512Mi
    serviceVersion: v3.5.15
    volumeClaimTemplates:
    - name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
  resources:
    cpu: "0"
    memory: "0"
  storage:
    size: "0"
  terminationPolicy: WipeOut
status:
  components:
    etcd:
      phase: Running
      podsReady: true
      podsReadyTime: "2024-08-30T07:28:09Z"
  conditions:
  - lastTransitionTime: "2024-08-30T07:09:21Z"
    message: 'The operator has started the provisioning of Cluster: etcdr-favclu'
    observedGeneration: 3
    reason: PreCheckSucceed
    status: "True"
    type: ProvisioningStarted
  - lastTransitionTime: "2024-08-30T07:09:21Z"
    message: Successfully applied for resources
    observedGeneration: 3
    reason: ApplyResourcesSucceed
    status: "True"
    type: ApplyResources
  - lastTransitionTime: "2024-08-30T07:28:09Z"
    message: all pods of components are ready, waiting for the probe detection successful
    reason: AllReplicasReady
    status: "True"
    type: ReplicasReady
  - lastTransitionTime: "2024-08-30T07:28:09Z"
    message: 'Cluster: etcdr-favclu is ready, current phase is Running'
    reason: ClusterReady
    status: "True"
    type: Ready
  observedGeneration: 3
  phase: Running
  2. Restart the cluster:
kbcli cluster restart etcdr-favclu --auto-approve --force=true
  3. See the error:
k get ops                                                     
NAME                         TYPE      CLUSTER        STATUS    PROGRESS   AGE
etcdr-favclu-restart-crzd7   Restart   etcdr-favclu   Running   0/3        13m

The OpsRequest times out, and its progress stays at 0/3.

  4. Check the logs. KubeBlocks log:

2024-08-30T07:28:09.548Z        INFO    status conditions, creating: false, available: false, its running: true, has failure: false, updating: false, config synced: true       {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"etcdr-favclu-etcd","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd", "reconcileID": "e6c1a991-db7b-4562-8cbf-2181a06ae9b6", "component": {"name":"etcdr-favclu-etcd","namespace":"default"}}
2024-08-30T07:28:13.405Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.5zxnn8tq6fbrv9rb","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.5zxnn8tq6fbrv9rb", "reconcileID": "e0b1f75e-e585-4f34-a7f8-5819147c5b8e", "event": {"name":"etcdr-favclu-etcd-1.5zxnn8tq6fbrv9rb","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:28:22.186Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.hl6rjdslpmqk4gbp","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.hl6rjdslpmqk4gbp", "reconcileID": "1172a100-4e56-4446-8585-1ee4a6e5d234", "event": {"name":"etcdr-favclu-etcd-2.hl6rjdslpmqk4gbp","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:28:47.340Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.lqtrrldrnf8hnvdz","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.lqtrrldrnf8hnvdz", "reconcileID": "3c3b1844-aeae-4594-a8ee-a132f3695ced", "event": {"name":"zkeeper-vblwev-zookeeper-0.lqtrrldrnf8hnvdz","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T07:29:04.527Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.47swk2f2lmn4p9xt","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.47swk2f2lmn4p9xt", "reconcileID": "125bb1cc-d32f-4e7b-bcd2-8a033b86e203", "event": {"name":"etcdr-favclu-etcd-0.47swk2f2lmn4p9xt","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:29:13.402Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.zzdqp7hhwhckx4rn","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.zzdqp7hhwhckx4rn", "reconcileID": "270b3d14-fec5-427c-b8ac-10abb5c7eb16", "event": {"name":"etcdr-favclu-etcd-1.zzdqp7hhwhckx4rn","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:29:22.190Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.pgx5d6sqsvq74nhv","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.pgx5d6sqsvq74nhv", "reconcileID": "2587aef7-cf7a-4884-9d3a-f344318d7c2b", "event": {"name":"etcdr-favclu-etcd-2.pgx5d6sqsvq74nhv","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:29:47.355Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.ptlfqfxwqs2bzw8p","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.ptlfqfxwqs2bzw8p", "reconcileID": "4d244e31-cdce-4c8e-954d-4832a3e70580", "event": {"name":"zkeeper-vblwev-zookeeper-0.ptlfqfxwqs2bzw8p","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T07:30:04.513Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.szlx529xztv4xpdz","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.szlx529xztv4xpdz", "reconcileID": "b138e84c-6c1c-4c8d-8562-9c18caa96dbb", "event": {"name":"etcdr-favclu-etcd-0.szlx529xztv4xpdz","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:30:13.400Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.xf962h9wv7ljt6f2","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.xf962h9wv7ljt6f2", "reconcileID": "057160fb-9c0f-45e8-a945-512e296d2ed2", "event": {"name":"etcdr-favclu-etcd-1.xf962h9wv7ljt6f2","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:30:22.201Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.5vnk278h4zl88jxc","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.5vnk278h4zl88jxc", "reconcileID": "960f961a-7099-453b-8b30-69778bf41149", "event": {"name":"etcdr-favclu-etcd-2.5vnk278h4zl88jxc","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T07:30:47.362Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.qdlsckt7wkkvg2rr","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.qdlsckt7wkkvg2rr", "reconcileID": "8e751acf-3547-4be5-b058-204dbb895f28", "event": {"name":"zkeeper-vblwev-zookeeper-0.qdlsckt7wkkvg2rr","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T07:31:04.514Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.s4ftnd2gwf7b4742","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.s4ftnd2gwf7b4742", "reconcileID": "8b9d1e4a-519c-4861-80ec-20b7ae0b5558", "event": {"name":"etcdr-favclu-etcd-0.s4ftnd2gwf7b4742","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
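
The repeated "probe event failed" entries above are easier to compare when grouped by pod. The following is a minimal sketch, not part of KubeBlocks or kbcli: `summarize_probe_failures` is a hypothetical helper that reads log text on stdin and assumes each Event name has the form `<pod>.<suffix>`, as in the entries shown here. It prints one line per pod with the failure count and the first line of the probe message.

```shell
#!/bin/sh
# Hypothetical helper (not a KubeBlocks/kbcli command): summarize
# "probe event failed" entries from a controller log read on stdin.
# Assumption: the Event name is "<pod>.<random suffix>", as in the log above.
summarize_probe_failures() {
  # Keep only the probe failure entries.
  grep 'probe event failed' |
    # Extract "<pod> | <message>" from the JSON tail of each entry.
    sed -n 's/.*"Event": {"name":"\([^".]*\)\.[^"]*".*"message": "\(.*\)"}[[:space:]]*$/\1 | \2/p' |
    # Keep only the first line of the message (drop from the first literal \n).
    sed 's/\\n.*//' |
    # Count identical "<pod> | <message>" pairs.
    awk '{ n[$0]++ } END { for (k in n) printf "%d %s\n", n[k], k }' |
    # Sort by pod name; LC_ALL=C keeps the order locale-independent.
    LC_ALL=C sort -k2
}
```

For example, saving the log to a file and running `summarize_probe_failures < kb.log` (the file name is an assumption) prints lines such as `3 etcdr-favclu-etcd-0 | grep: /var/run/etcd/etcd.conf: No such file or directory`. It also makes it visible that the zkeeper-vblwev-zookeeper-0 entries come from a different cluster than etcdr-favclu.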


haowen159, Aug 30 '24 07:08