cluster-api-provider-proxmox

add MachineHealthCheck

Open 3deep5me opened this issue 1 year ago • 5 comments

Hi @sp-yduck,

this adds a MachineHealthCheck covering all Machines of a new cluster. It can help when a node never reaches the Running state, e.g. #145, network problems during startup, or other causes.

3deep5me avatar Nov 14 '23 12:11 3deep5me
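For reference, a MachineHealthCheck of the kind this PR describes might look roughly like the following. This is a hedged sketch based on the upstream Cluster API v1beta1 API, not the PR's actual diff; the name and timeout values are placeholders:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: ${CLUSTER_NAME}-mhc     # hypothetical name
spec:
  clusterName: ${CLUSTER_NAME}
  maxUnhealthy: 100%            # remediate even if every Machine is unhealthy
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 300s
    - type: Ready
      status: Unknown
      timeout: 300s
```

With a selector on the cluster-name label, one MachineHealthCheck covers all Machines of the cluster, matching the intent stated above.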

Thank you for the PR! As you may know, this template file is used for the Quick Start, and I want to keep the Quick Start with a minimal setup. So if you want to include this, the option is:

  1. create a new template for it, so that users can choose a specific template to try specific features
     ref: https://cluster-api.sigs.k8s.io/clusterctl/commands/generate-cluster.html?highlight=flavor#flavors
     ref: https://cluster-api.sigs.k8s.io/clusterctl/commands/generate-cluster.html?highlight=flavor#alternative-source-for-cluster-templates

sp-yduck avatar Nov 15 '23 01:11 sp-yduck

I also changed the location of the quick start template; I hope that is fine for you. I don't know how clusterctl finds the cluster templates. Do I have to change anything else to make this work?

3deep5me avatar Nov 16 '23 09:11 3deep5me

clusterctl checks the assets of the release, so the file changes are OK. The thing is, I am using make release to output these assets for each release.

  1. make release-templates https://github.com/sp-yduck/cluster-api-provider-proxmox/blob/bbcdd56993d21f9e3581eed558806fc909b71cec/Makefile#L219-L220

  2. make generate-e2e-templates https://github.com/sp-yduck/cluster-api-provider-proxmox/blob/bbcdd56993d21f9e3581eed558806fc909b71cec/Makefile#L111-L113

I think you can use kustomize build template/base (or something similar) for both of them.

sp-yduck avatar Nov 17 '23 03:11 sp-yduck
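One way the two Makefile targets above could pick up the new manifest is a kustomize overlay on top of the shared base. This is only a sketch: the directory names templates/base and templates/mhc and the file name machinehealthcheck.yaml are assumptions, not the repo's actual layout:

```yaml
# templates/mhc/kustomization.yaml (hypothetical overlay)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base                    # shared quick-start manifests
  - machinehealthcheck.yaml    # the MachineHealthCheck added by this PR
```

Rendering it with kustomize build templates/mhc > cluster-template-mhc.yaml would then produce a separate flavor asset without touching the minimal Quick Start template.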

btw, I believe MHC does not help with

E.g. https://github.com/sp-yduck/cluster-api-provider-proxmox/issues/145 or network problems during startup or other reasons.

since MHC checks the Machine and Node objects to confirm that the Node (in the workload cluster) is ready. So in a case like issue #145, if the VM becomes unhealthy before it joins the k8s cluster, MHC cannot find a Node associated with that unhealthy VM and cannot remediate it.

sp-yduck avatar Nov 20 '23 01:11 sp-yduck

I'm not sure about the detailed mechanics, but I can confirm that if a VM does not boot, it is deleted and recreated. (Though at the moment no VM boots, because I get the error every time 😢)

I think MHC also checks the status.conditions[].type: Ready field by default. That is the only way I can explain the behavior.

3deep5me avatar Nov 23 '23 17:11 3deep5me
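The delete-and-recreate behavior for VMs that never boot may be explained by the MachineHealthCheck controller's nodeStartupTimeout rather than the Ready condition: upstream Cluster API also treats a Machine as unhealthy when no Node registers within that window (10 minutes by default), which covers machines that fail before joining the cluster. A hedged sketch of setting it explicitly (names and values are placeholders):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: ${CLUSTER_NAME}-mhc     # hypothetical name
spec:
  clusterName: ${CLUSTER_NAME}
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  # Remediate Machines whose Node never registers, e.g. a VM that fails to boot.
  nodeStartupTimeout: 10m
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
```

If that is what is happening here, the observed recreation of never-booting VMs and the earlier point that MHC cannot match a Node to such a VM are both consistent: the Node-condition checks don't fire, but the startup timeout does.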