cluster-api-provider-proxmox

Some VMs Boot into initramfs

Open 3deep5me opened this issue 1 year ago • 1 comments

/kind bug

What steps did you take and what happened: I created a cluster as described in the quick-start. I changed the replicas to 3 for md and controlplane. I also changed the storage for the VM disks to an lvm-thin storage. Some VMs (in this example two and three) do not boot correctly; instead they drop into the initramfs. As a result, the machines never reach the Running state. Which VMs hit the error is random. Sometimes it is even the first controlplane VM.


first try

k get machines
NAME                           CLUSTER       NODENAME                         PROVIDERID                                       PHASE         AGE     VERSION
cappx-test-6dqst               cappx-test    cappx-test-controlplane-llwfx    proxmox://c6244ce6-be79-4073-a378-8b0fd75a79cf   Running       6m47s   v1.27.3
cappx-test-kw42b               cappx-test    cappx-test-controlplane-zj9b7    proxmox://81616cfd-df41-4d1a-aeab-c0fdb76c2987   Running       10m     v1.27.3
cappx-test-md-0-cnslp-45tvt    cappx-test                                     proxmox://a15e2fb3-5d6f-4b1f-84fd-7fb49c91abf8   Provisioned   13m     v1.27.3
cappx-test-md-0-cnslp-c28nt    cappx-test                                     proxmox://1c38b356-fa5a-4343-8cd9-e28f013cd55f   Provisioned   13m     v1.27.3
cappx-test-md-0-cnslp-td5hh    cappx-test    cappx-test-md-0-2fmq7            proxmox://6cf85bbb-c1d8-46df-a38e-b5b6f25dbf68   Running       13m     v1.27.3
cappx-test-tnrmr               cappx-test    cappx-test-controlplane-4fds9    proxmox://dc4da772-23a6-4446-80e8-40527dc8ac55   Running    

second try

k get machines
NAME                           CLUSTER       NODENAME                         PROVIDERID                                       PHASE         AGE     VERSION
cappx-test-cbbp8               cappx-test    cappx-test-controlplane-gkpbx    proxmox://cea53be2-dd58-4902-80c5-3eab7ba9e3d5   Running       3m17s   v1.27.3
cappx-test-md-0-85954-9ctbs    cappx-test                                     proxmox://3c92d086-7bce-4f4e-8986-2e1d43629aa5   Provisioned   6m51s   v1.27.3
cappx-test-md-0-85954-g6lnk    cappx-test                                     proxmox://0d6f79a8-867d-4a79-af75-3b7970dc498d   Provisioned   6m51s   v1.27.3
cappx-test-md-0-85954-j82jv    cappx-test                                     proxmox://5764bd21-660c-42aa-9466-bf4fe321cbeb   Provisioned   6m51s   v1.27.3
cappx-test-qdpkj               cappx-test    cappx-test-controlplane-v92sr    proxmox://3e631789-3aa5-4e05-846f-31e3a6a0a857   Running       6m50s   v1.27.3

What did you expect to happen: The VMs should boot like the others.

Anything else you would like to add: I'm not sure whether this is cappx-related. If it is not, it might be something a MachineHealthCheck can catch. (I will investigate this.)
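As a stopgap for spotting the affected machines, here is a minimal sketch (not part of cappx or Cluster API) that flags Machines which still have no node after a timeout. It assumes the empty NODENAME column above corresponds to a missing `status.nodeRef` in the output of `kubectl get machines -o json`. The function name `stuck_machines` and the 10-minute threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def stuck_machines(machine_list, now, startup_timeout=timedelta(minutes=10)):
    """Return names of Machines that have no nodeRef after startup_timeout.

    machine_list is the parsed JSON of `kubectl get machines -o json`.
    The 10-minute default is an assumption, not a value from the issue.
    """
    stuck = []
    for item in machine_list.get("items", []):
        status = item.get("status", {})
        if status.get("nodeRef"):
            # A node registered for this Machine, so the VM booted far enough.
            continue
        created = datetime.strptime(
            item["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        # Only report Machines that have had enough time to join the cluster.
        if now - created > startup_timeout:
            stuck.append(item["metadata"]["name"])
    return stuck
```

To use it, load the JSON from `kubectl get machines -o json` with `json.load` and pass `datetime.now(timezone.utc)` as `now`. This only detects the symptom (no node joined); it does not tell an initramfs drop apart from other boot failures.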

Environment:

  • Cluster-api-provider-proxmox version: 0.3.3
  • Proxmox VE version: Virtual Environment 8.0.4
  • Kubernetes version: (use kubectl version):
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.3
  • OS (e.g. from /etc/os-release): Debian (client), Ubuntu (default, server)

3deep5me, Nov 14 '23 11:11