Unable to create a cluster on a PVE node with sufficient memory

Open chengleqi opened this issue 1 year ago • 16 comments
What steps did you take and what happened:

I followed the quickstart guide step by step: I used the Proxmox VE builder to create a PVE template, configured ~/.cluster-api/clusterctl.yaml, and then ran the following commands to create the cluster:

clusterctl generate cluster proxmox-quickstart \
    --infrastructure proxmox \
    --kubernetes-version v1.27.8 \
    --control-plane-machine-count 1 \
    --worker-machine-count 3 > cluster.yaml

kubectl apply -f cluster.yaml

Then I received an error message:

E0120 05:46:40.670202       1 controller.go:324] "Reconciler error" err="failed to reconcile VM: cannot reserve 2147483648B of memory on node newpve: 0B available memory left" controller="proxmoxmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ProxmoxMachine" ProxmoxMachine="default/proxmox-quickstart-control-plane-gnjd5" namespace="default" name="proxmox-quickstart-control-plane-gnjd5" reconcileID="bd2f39b8-51fb-4cce-bd4d-429d596a8e31"

What did you expect to happen: Successfully created the cluster.

Anything else you would like to add: https://github.com/ionos-cloud/cluster-api-provider-proxmox/issues/36#issuecomment-1901789546

Environment:

  • Cluster-api-provider-proxmox version: 0.1.1
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release): Ubuntu 22.04

chengleqi avatar Jan 20 '24 06:01 chengleqi

Did you set the Variable MEMORY_MIB ?

mcbenjemaa avatar Jan 20 '24 10:01 mcbenjemaa

Did you set the Variable MEMORY_MIB ?

I tried both setting the MEMORY_MIB variable and commenting it out. Either way, I get the same error reporting 0B of available memory.

chengleqi avatar Jan 20 '24 11:01 chengleqi

for _, vm := range vms {
	// Ignore VM Templates, as they can't be started.
	if vm.Template {
		continue
	}
	if reservableMemory < vm.MaxMem {
		reservableMemory = 0
	} else {
		reservableMemory -= vm.MaxMem
	}
}

I think the issue is in the code above: there are already some virtual machines on my PVE node, and the total memory configured for them exceeds the node's total memory, so reservableMemory ends up clamped to 0.

chengleqi avatar Jan 20 '24 11:01 chengleqi

Try adding ProxmoxCluster.spec.schedulerHints.memoryAdjustment: 300. Note that you can also disable scheduling by setting it to 0.

Please read more about this here: https://github.com/ionos-cloud/cluster-api-provider-proxmox/blob/main/docs/advanced-setups.md#node-over--underprovisioning
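
As a rough illustration of what this setting changes, the sketch below assumes memoryAdjustment is read as a percentage of the node's total memory, with 0 switching the memory check off as described above. The function names and figures are hypothetical and are not taken from the provider's code.

```go
package main

import "fmt"

// canReserve reports whether a request fits on a node.
// Assumptions (not confirmed by the provider's source): percent scales the
// node's total memory (100 = no adjustment, 300 = allow 3x overprovisioning),
// and percent == 0 disables the memory check entirely.
func canReserve(totalBytes, usedBytes, requestBytes, percent uint64) bool {
	if percent == 0 {
		return true // memory check disabled
	}
	capacity := totalBytes / 100 * percent
	if usedBytes >= capacity {
		return false // nothing left, even after adjustment
	}
	return capacity-usedBytes >= requestBytes
}

func main() {
	const gib = uint64(1) << 30
	// 32 GiB node, 40 GiB already configured for existing VMs, 2 GiB requested.
	fmt.Println(canReserve(32*gib, 40*gib, 2*gib, 100)) // false
	fmt.Println(canReserve(32*gib, 40*gib, 2*gib, 300)) // true
	fmt.Println(canReserve(32*gib, 40*gib, 2*gib, 0))   // true
}
```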

mcbenjemaa avatar Jan 20 '24 12:01 mcbenjemaa

Try adding ProxmoxCluster.spec.schedulerHints.memoryAdjustment: 300. Note that you can also disable scheduling by setting it to 0.

Please read more about this here: https://github.com/ionos-cloud/cluster-api-provider-proxmox/blob/main/docs/advanced-setups.md#node-over--underprovisioning

Thank you, I'll take a look.

chengleqi avatar Jan 20 '24 12:01 chengleqi

I added spec.schedulerHints to the ProxmoxCluster as follows:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  schedulerHints:
    memoryAdjustment: 0

But I got an error:

Error from server (BadRequest): error when creating "cluster.yaml": ProxmoxCluster in version "v1alpha1" cannot be handled as a ProxmoxCluster: strict decoding error: unknown field "spec.schedulerHints"

I found that the proxmoxclusters.infrastructure.cluster.x-k8s.io/v1alpha1 CRD installed with the command clusterctl init --infrastructure proxmox --ipam in-cluster --core cluster-api:v1.5.3 does not have the schedulerHints field, but the schedulerHints field is included in the project's config/crd/base/infrastructure.cluster.x-k8s.io_proxmoxclusters.yaml.

The configuration file fetched by clusterctl init --infrastructure proxmox --ipam in-cluster --core cluster-api:v1.5.3 is https://github.com/ionos-cloud/cluster-api-provider-proxmox/releases/latest/infrastructure-components.yaml, but the proxmoxclusters CRD in this file does not have the schedulerHints field. However, the proxmoxclusters CRD in the out/infrastructure-components.yaml file generated by make release-manifests does include the schedulerHints field. I suspect that the https://github.com/ionos-cloud/cluster-api-provider-proxmox/releases/latest/infrastructure-components.yaml file has not been updated. Could there be an issue with the GitHub Actions CI pipeline?

chengleqi avatar Jan 20 '24 14:01 chengleqi

You're right, this feature is not yet released. We will publish the next release in the coming days.

mcbenjemaa avatar Jan 20 '24 15:01 mcbenjemaa

You're right, this feature is not yet released. We will publish the next release in the coming days.

Thank you for your kind help, looking forward to the next release.

chengleqi avatar Jan 21 '24 03:01 chengleqi

v0.2.0 is now released: https://github.com/ionos-cloud/cluster-api-provider-proxmox/releases/tag/v0.2.0

mcbenjemaa avatar Jan 25 '24 14:01 mcbenjemaa

@chengleqi did it work for you?

mcbenjemaa avatar Jan 28 '24 13:01 mcbenjemaa

Hi @mcbenjemaa,

I had the same error and was able to fix it by upgrading to v0.2.0 and using this configuration too:

I added spec.schedulerHints to the ProxmoxCluster as follows:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  schedulerHints:
    memoryAdjustment: 0

Thanks :ok_hand:

EBMBA avatar Apr 21 '24 19:04 EBMBA

You should upgrade to the latest version, and it will work.

mcbenjemaa avatar Apr 21 '24 20:04 mcbenjemaa

Try

clusterctl init --ipam in-cluster --infrastructure proxmox:v0.4.0

Or v0.3.0

Both should work.

mcbenjemaa avatar Apr 21 '24 20:04 mcbenjemaa

Is this still an issue with v0.4.0?

wikkyk avatar Apr 22 '24 15:04 wikkyk

Hi @wikkyk, I'm on v0.4.0 and I still need to add schedulerHints to the ProxmoxCluster spec manually.

Generated:

  - lastTransitionTime: "2024-05-25T07:33:11Z"
    reason: WaitingForNodeRef
    severity: Info
    status: "False"
    type: NodeHealthy
  failureMessage: 'Failure detected from referenced resource infrastructure.cluster.x-k8s.io/v1alpha1,
    Kind=ProxmoxMachine with name "proxmox-quickstart-control-plane-t9hjg": cannot
    reserve 8438939648B of memory on node host01: 0B available memory left'
  failureReason: InsufficientResources
  lastUpdated: "2024-05-25T07:33:12Z"
  observedGeneration: 2
  phase: Failed

Adding manually:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  schedulerHints:
    memoryAdjustment: 0

Result:

  - lastTransitionTime: "2024-05-25T07:39:59Z"
    reason: WaitingForNodeRef
    severity: Info
    status: "False"
    type: NodeHealthy

pasettifabio avatar May 25 '24 07:05 pasettifabio

@pasettifabio We've had #47 since CAPMOX v0.2.0. Does it still not work if you follow https://github.com/ionos-cloud/cluster-api-provider-proxmox/blob/main/docs/advanced-setups.md#node-over--underprovisioning?

wikkyk avatar Jun 10 '24 09:06 wikkyk

I'm closing this issue because I believe there is nothing left for us to do here. Feel free to reopen if it's still a problem.

wikkyk avatar Mar 14 '25 08:03 wikkyk