[BUG] KB1.0 cluster start timeout

Open haowen159 opened this issue 5 months ago • 2 comments

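Describe the bug
After a ZooKeeper cluster is stopped and then started again, the cluster and its pod return to Running, but the Start ops stays in Running at 0/1 progress and never completes.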

kbcli version
Kubernetes: v1.29.6-gke.1326000
KubeBlocks: 1.0.0-alpha.5
kbcli: 1.0.0-alpha.0

To Reproduce Steps to reproduce the behavior:

  1. Create a ZooKeeper cluster. Cluster YAML:
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps.kubeblocks.io/v1alpha1","kind":"Cluster","metadata":{"annotations":{},"name":"zkeeper-vblwev","namespace":"default"},"spec":{"componentSpecs":[{"componentDef":"zookeeper","disableExporter":true,"env":[{"name":"ZOOKEEPER_IMAGE_VERSION","value":"3.4.14"}],"name":"zookeeper","replicas":1,"resources":{"limits":{"cpu":"500m","memory":"1Gi"},"requests":{"cpu":"500m","memory":"1Gi"}},"serviceVersion":"3.4.14","services":null,"volumeClaimTemplates":[{"name":"data","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"5Gi"}},"storageClassName":null}},{"name":"log","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"5Gi"}},"storageClassName":null}}]}],"terminationPolicy":"DoNotTerminate"}}
  creationTimestamp: "2024-08-30T07:22:24Z"
  finalizers:
  - cluster.kubeblocks.io/finalizer
  generation: 1
  labels:
    app.kubernetes.io/instance: zkeeper-vblwev
  name: zkeeper-vblwev
  namespace: default
  resourceVersion: "52451511"
  uid: 9462d162-9409-4189-9208-088b5f94fc73
spec:
  componentSpecs:
  - componentDef: zookeeper
    disableExporter: true
    env:
    - name: ZOOKEEPER_IMAGE_VERSION
      value: 3.4.14
    name: zookeeper
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    serviceVersion: 3.4.14
    volumeClaimTemplates:
    - name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    - name: log
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
  terminationPolicy: DoNotTerminate
status:
  components:
    zookeeper:
      phase: Running
      podsReady: true
      podsReadyTime: "2024-08-30T07:22:52Z"
  conditions:
  - lastTransitionTime: "2024-08-30T07:22:24Z"
    message: 'The operator has started the provisioning of Cluster: zkeeper-vblwev'
    observedGeneration: 1
    reason: PreCheckSucceed
    status: "True"
    type: ProvisioningStarted
  - lastTransitionTime: "2024-08-30T07:22:24Z"
    message: Successfully applied for resources
    observedGeneration: 1
    reason: ApplyResourcesSucceed
    status: "True"
    type: ApplyResources
  - lastTransitionTime: "2024-08-30T07:22:52Z"
    message: all pods of components are ready, waiting for the probe detection successful
    reason: AllReplicasReady
    status: "True"
    type: ReplicasReady
  - lastTransitionTime: "2024-08-30T07:22:52Z"
    message: 'Cluster: zkeeper-vblwev is ready, current phase is Running'
    reason: ClusterReady
    status: "True"
    type: Ready
  observedGeneration: 1
  phase: Running
  2. Stop the cluster
kbcli cluster stop zkeeper-vblwev --auto-approve --force=true

cluster status

kbcli cluster list
NAME             NAMESPACE   CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    CREATED-TIME                 
etcdr-favclu     default                                    WipeOut              Running   Aug 30,2024 15:09 UTC+0800   
zkeeper-vblwev   default                                    DoNotTerminate       Stopped   Aug 30,2024 15:22 UTC+0800
k get ops
NAME                         TYPE      CLUSTER          STATUS    PROGRESS   AGE
zkeeper-vblwev-stop-klhsg    Stop      zkeeper-vblwev   Succeed   1/1        30m

The stop ops succeeded.

  3. Start the cluster

 kbcli cluster start zkeeper-vblwev --force=true  

cluster status

kbcli cluster list zkeeper-vblwev
NAME             NAMESPACE   CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    CREATED-TIME                 
zkeeper-vblwev   default                                    DoNotTerminate       Running   Aug 30,2024 15:22 UTC+0800   
k get pod
NAME                         READY   STATUS    RESTARTS   AGE
zkeeper-vblwev-zookeeper-0   2/2     Running   0          8m13s
  4. See the error
k get ops
NAME                         TYPE      CLUSTER          STATUS    PROGRESS   AGE
zkeeper-vblwev-start-dgcch   Start     zkeeper-vblwev   Running   0/1        9m
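
To spot this condition without reading the table by eye, the following is a minimal sketch that parses the text of a `k get ops` listing and reports ops still in Running past a threshold. The function names and the 300-second default are hypothetical choices for illustration, not part of kbcli or KubeBlocks; the sample rows are copied from the two listings in this report.

```python
import re


def parse_age(age: str) -> int:
    """Convert a kubectl-style age such as '9m', '8m13s' or '1h5m' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject anything that is not entirely made of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * units[u] for n, u in parts)


def stuck_ops(table: str, max_seconds: int = 300) -> list[str]:
    """Return names of ops in Running whose AGE exceeds max_seconds.

    max_seconds is an arbitrary threshold chosen for this sketch.
    """
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    stuck = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        if row["STATUS"] == "Running" and parse_age(row["AGE"]) > max_seconds:
            stuck.append(row["NAME"])
    return stuck


# Rows copied from the `k get ops` listings in this report.
sample = """\
NAME                         TYPE      CLUSTER          STATUS    PROGRESS   AGE
zkeeper-vblwev-stop-klhsg    Stop      zkeeper-vblwev   Succeed   1/1        30m
zkeeper-vblwev-start-dgcch   Start     zkeeper-vblwev   Running   0/1        9m
"""

print(stuck_ops(sample))
```

With the sample above, only the Start ops is reported, since the Stop ops has already reached Succeed.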

The start ops stays in Running and never completes.

  5. Logs

2024-08-30T08:24:41.271Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx", "reconcileID": "63532e11-688c-439e-aa9c-fa658ea90609", "event": {"name":"zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T08:25:04.522Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.brqf2nqh2f2dwg28","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.brqf2nqh2f2dwg28", "reconcileID": "4e87a3ad-4cf6-42f5-884d-70add4ffe343", "event": {"name":"etcdr-favclu-etcd-0.brqf2nqh2f2dwg28","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:25:13.402Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.n7gj9lb8rtfmgfws","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.n7gj9lb8rtfmgfws", "reconcileID": "a44c5fba-68c9-4770-b659-885d756d244d", "event": {"name":"etcdr-favclu-etcd-1.n7gj9lb8rtfmgfws","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:25:22.195Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.2k8ghtx9t5d9vj82","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.2k8ghtx9t5d9vj82", "reconcileID": "288cc56e-24d2-45e2-82ea-ee471139035b", "event": {"name":"etcdr-favclu-etcd-2.2k8ghtx9t5d9vj82","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:25:41.276Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.ctw5blkhvgl7t5fz","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.ctw5blkhvgl7t5fz", "reconcileID": "5e129456-0e40-4fee-848b-e382bd8dc5e4", "event": {"name":"zkeeper-vblwev-zookeeper-0.ctw5blkhvgl7t5fz","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T08:26:04.514Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.cjqpn5xz5xh6wd9p","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.cjqpn5xz5xh6wd9p", "reconcileID": "870919fe-c709-4b05-b8bb-67841331e14e", "event": {"name":"etcdr-favclu-etcd-0.cjqpn5xz5xh6wd9p","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:26:13.394Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.sjxgrkz97mvv69vv","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.sjxgrkz97mvv69vv", "reconcileID": "95d98ec3-c67a-4037-9c6d-4edfa1bdd3c0", "event": {"name":"etcdr-favclu-etcd-1.sjxgrkz97mvv69vv","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:26:22.194Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.nscjfvbwtmxxsxww","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.nscjfvbwtmxxsxww", "reconcileID": "ab78ffa5-f556-4a5b-980b-d2b4992af973", "event": {"name":"etcdr-favclu-etcd-2.nscjfvbwtmxxsxww","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:26:41.277Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.8n7v7blkzjw5gf5w","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.8n7v7blkzjw5gf5w", "reconcileID": "0f319aca-d926-4ed3-a5ec-ea0a3a94e3ed", "event": {"name":"zkeeper-vblwev-zookeeper-0.8n7v7blkzjw5gf5w","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T08:27:04.515Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.8brr4tqhldg8n7fz","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.8brr4tqhldg8n7fz", "reconcileID": "d7a2e6c0-c147-4588-87e1-52d131ec08f1", "event": {"name":"etcdr-favclu-etcd-0.8brr4tqhldg8n7fz","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:27:13.393Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.jpktr7zvswr7nmzw","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.jpktr7zvswr7nmzw", "reconcileID": "0959b510-f2cc-4f7f-bf3c-aac42fb04361", "event": {"name":"etcdr-favclu-etcd-1.jpktr7zvswr7nmzw","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:27:22.198Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-2.65dtthck9gsn546z","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-2.65dtthck9gsn546z", "reconcileID": "f0c77f0f-f767-494b-8e6a-687a66e6f255", "event": {"name":"etcdr-favclu-etcd-2.65dtthck9gsn546z","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:27:41.279Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.hmcndk5mlhvhzdgh","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.hmcndk5mlhvhzdgh", "reconcileID": "df5224c9-3055-4068-b38b-527dfee5ec3e", "event": {"name":"zkeeper-vblwev-zookeeper-0.hmcndk5mlhvhzdgh","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T08:28:04.516Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.8zf2rb84n4nvkbtc","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.8zf2rb84n4nvkbtc", "reconcileID": "b26c3fef-6ad2-433e-bdfc-24132a68532a", "event": {"name":"etcdr-favclu-etcd-0.8zf2rb84n4nvkbtc","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
2024-08-30T08:28:13.401Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-1.66gbgtq2cdvtzflj","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-1.66gbgtq2cdvtzflj", "reconcileID": "1a7d3a41-c53f-4e39-9deb-de7b181e8441", "event": {"name":"etcdr-favclu-etcd-1.66gbgtq2cdvtzflj","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
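
The log above interleaves probe failures from two clusters (zkeeper-vblwev and etcdr-favclu). As a rough aid for separating them, here is a minimal sketch that groups `probe event failed` lines by pod and by the first line of the message. The function name is hypothetical, and the parsing assumes the line layout seen in this report (free text followed by a single JSON object); the two sample lines are copied verbatim from the log.

```python
import json
from collections import Counter


def summarize_probe_failures(log_text: str) -> Counter:
    """Count 'probe event failed' lines per (pod, first message line)."""
    counts = Counter()
    for line in log_text.splitlines():
        if "probe event failed" not in line:
            continue
        brace = line.find("{")  # the JSON payload starts at the first brace
        if brace == -1:
            continue
        try:
            fields = json.loads(line[brace:])
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        # Event names look like '<pod>.<suffix>'; keep only the pod part.
        pod = fields.get("name", "").rsplit(".", 1)[0]
        message = fields.get("message", "")
        first_line = message.splitlines()[0] if message else ""
        counts[(pod, first_line)] += 1
    return counts


# Two lines copied from the log in this report (raw string keeps the \n escapes).
sample_log = r"""
2024-08-30T08:24:41.271Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx","namespace":"default"}, "namespace": "default", "name": "zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx", "reconcileID": "63532e11-688c-439e-aa9c-fa658ea90609", "event": {"name":"zkeeper-vblwev-zookeeper-0.9kfbtj7tfm2c2dpx","namespace":"default"}, "message": "failed to start command: fork/exec /bin/bash: no such file or directory: failed"}
2024-08-30T08:25:04.522Z        INFO    probe event failed      {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"etcdr-favclu-etcd-0.brqf2nqh2f2dwg28","namespace":"default"}, "namespace": "default", "name": "etcdr-favclu-etcd-0.brqf2nqh2f2dwg28", "reconcileID": "4e87a3ad-4cf6-42f5-884d-70add4ffe343", "event": {"name":"etcdr-favclu-etcd-0.brqf2nqh2f2dwg28","namespace":"default"}, "message": "grep: /var/run/etcd/etcd.conf: No such file or directory\n/bin/sh: 59: [: =: unexpected operator\n/bin/sh: 61: [: =: unexpected operator\n: failed"}
"""

for (pod, message), count in summarize_probe_failures(sample_log).items():
    print(f"{count}x {pod}: {message}")
```

Run over the full log, this should show that the ZooKeeper pod fails every time with the `fork/exec /bin/bash` message, while the `etcd.conf` messages all come from the unrelated etcdr-favclu pods.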

Expected behavior
The start ops should reach Succeed once the cluster is Running again.

haowen159, Aug 30 '24 08:08