feat: support for externally managed control plane
Issue #95
Description of changes:
Support an empty Control Plane endpoint when the ProxmoxCluster is used by an externally managed Control Plane.
The ProxmoxCluster controller will wait for a valid IP before proceeding to mark the infrastructure ready.
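For illustration, a minimal sketch of the relaxed validation this describes, using stand-in types rather than the provider's actual API (a fully empty endpoint passes, since an external provider will fill it in later; a half-filled one is still rejected so mistakes surface early):

```go
package v1alpha1

import "fmt"

// Stand-in type for illustration only; the real ProxmoxCluster API types
// live in the provider's api package.
type APIEndpoint struct {
	Host string
	Port int32
}

// validateControlPlaneEndpoint tolerates a fully empty endpoint, since an
// externally managed control plane (e.g. Kamaji) will populate it later,
// but still rejects a half-filled one.
func validateControlPlaneEndpoint(ep APIEndpoint) error {
	if (ep.Host == "") != (ep.Port == 0) {
		return fmt.Errorf("spec.controlPlaneEndpoint: host and port must be set together, got %q:%d", ep.Host, ep.Port)
	}
	return nil
}
```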
Testing performed:
N.A.
Before providing more test coverage, may I ask for a quick review of the proposed changes from a workflow perspective?
It's not clear to me whether the maintainers are open to skipping the validation of the ProxmoxCluster.spec.controlPlaneEndpoint field.
Hi @prometherion, thank you for your contribution. Skipping validation for optional fields is fine, but please make sure your changes don't alter the validation behavior of other fields. Users should see errors as early as possible. The controller's check for an empty control plane endpoint should be done after IPAM has been reconciled, so that users aren't left waiting for the external process to set the control plane endpoint.
Does Kamaji set the port and address separately? I'm asking because my understanding was that the endpoint is always written in full or not at all. This would mean that you could merge the two conditions into one MissingControlPlaneEndpoint.
Does Kamaji set the port and address separately?
Kamaji does it in a single transaction, yes; I can unify those checks.
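To make the discussed flow concrete, here is a minimal controller-runtime sketch of the ordering (IPAM first, endpoint check after) with the single merged check; all type and helper names are stand-ins, not the provider's actual code:

```go
package controllers

import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// Stand-in types for illustration; the real ones live in the provider.
type apiEndpoint struct {
	Host string
	Port int32
}

type proxmoxCluster struct {
	ControlPlaneEndpoint apiEndpoint
	Ready                bool
}

type reconciler struct{}

// reconcileIPAM is a hypothetical helper standing in for the provider's
// IPAM reconciliation (allocating addresses from in-cluster pools).
func (r *reconciler) reconcileIPAM(ctx context.Context, c *proxmoxCluster) error {
	return nil
}

// reconcileNormal sketches the ordering discussed above: IPAM first, then a
// single merged check on the endpoint (Kamaji writes host and port in one
// transaction), requeueing until the external provider has set it.
func (r *reconciler) reconcileNormal(ctx context.Context, c *proxmoxCluster) (ctrl.Result, error) {
	if err := r.reconcileIPAM(ctx, c); err != nil {
		return ctrl.Result{}, err
	}

	if c.ControlPlaneEndpoint.Host == "" || c.ControlPlaneEndpoint.Port == 0 {
		// One MissingControlPlaneEndpoint-style condition is enough here;
		// requeue until the externally managed control plane fills it in.
		return ctrl.Result{RequeueAfter: 15 * time.Second}, nil
	}

	c.Ready = true
	return ctrl.Result{}, nil
}
```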
Anything needed here?
Thanks for the heads up @mcbenjemaa, I'm planning to work on this and make the PR ready for review by the end of this week or the next.
Pretty busy days, sorry.
take your time mate
@prometherion Are there any updates on this, or can I give it a try?
Finally, I'm revamping it; sorry for being late, @mcbenjemaa.
Let me know if we're ready to get this tested.
Can you provide a use case to test it with Kamaji?
However, I don't think I like the 'magic' behaviour of empty host and(?) port meaning externally managed. Someone could omit host/port by accident without actually intending to use Kamaji or so. This needs very clear documentation. (Technically, this doesn't actually make the fields optional (-:)
Documentation is a great place, of course; I'm wondering if we could implement this kind of check in a different way.
The Cluster API has a contract for an externally managed control plane thanks to the status key externalManagedControlPlane.
When a user opts to use Kamaji, or any other Control Plane provider that satisfies that status key contract, we could skip the validation requiring a filled ControlPlaneEndpoint struct. With this, we'd create more guardrails for users who could otherwise shoot themselves in the foot.
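For reference, a sketch of how an infrastructure provider could read that contract key from the referenced control plane object. external.Get and unstructured.NestedBool are real CAPI/apimachinery helpers; the surrounding wiring and function name are assumptions:

```go
package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/controllers/external"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// isExternalManaged fetches the Cluster's controlPlaneRef and checks the
// status.externalManagedControlPlane contract key that providers like
// Kamaji set on their control plane objects.
func isExternalManaged(ctx context.Context, c client.Client, cluster *clusterv1.Cluster) (bool, error) {
	if cluster.Spec.ControlPlaneRef == nil {
		return false, nil
	}
	cp, err := external.Get(ctx, c, cluster.Spec.ControlPlaneRef, cluster.Namespace)
	if err != nil {
		return false, err
	}
	managed, found, err := unstructured.NestedBool(cp.Object, "status", "externalManagedControlPlane")
	if err != nil || !found {
		return false, err
	}
	return managed, nil
}
```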
Two points, then:
- We should set externalManagedControlPlane in the status when selected
- I like it when things are explicit (or I don't like magic values) so I'd opt for a new key in ProxmoxClusterSpec (we can name it ExternalManagedControlPlane to stay consistent) and the user either sets it to true or populates ControlPlaneEndpoint, and we validate that only one of those is set.
- We should set externalManagedControlPlane in the status when selected
A bit confused here: for our controller (Kamaji) it's already done; nothing's expected from the Proxmox CAPI provider.
2. I like it when things are explicit (or I don't like magic values) so I'd opt for a new key in ProxmoxClusterSpec (we can name it ExternalManagedControlPlane to stay consistent) and the user either sets it to true or populates ControlPlaneEndpoint, and we validate that only one of those is set.
Working on this.
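For illustration, a minimal sketch of the mutual-exclusivity check proposed above, using a stand-in spec type (only the ExternalManagedControlPlane field name comes from the proposal; everything else is hypothetical):

```go
package v1alpha1

import "errors"

// Stand-in spec for illustration; ExternalManagedControlPlane is the new
// field proposed above.
type ProxmoxClusterSpec struct {
	ExternalManagedControlPlane bool
	ControlPlaneEndpoint        struct {
		Host string
		Port int32
	}
}

// validateControlPlane enforces the rule above: exactly one of
// externalManagedControlPlane or controlPlaneEndpoint must be provided.
func validateControlPlane(spec ProxmoxClusterSpec) error {
	endpointSet := spec.ControlPlaneEndpoint.Host != "" || spec.ControlPlaneEndpoint.Port != 0
	switch {
	case spec.ExternalManagedControlPlane && endpointSet:
		return errors.New("externalManagedControlPlane and controlPlaneEndpoint are mutually exclusive")
	case !spec.ExternalManagedControlPlane && !endpointSet:
		return errors.New("either set externalManagedControlPlane to true or populate controlPlaneEndpoint")
	}
	return nil
}
```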
Dunno who to ping here; I think I addressed all the required changes.
Quality Gate passed
Issues: 0 new, 0 accepted
Measures: 0 security hotspots, 85.7% coverage on new code, 0.0% duplication on new code
@wikkyk no worries, no pressure at all; I truly understand the effort that goes into maintaining open source, especially with summer approaching.
I addressed the changes you requested, and noticed the e2e tests failed, though; could you help me understand what's broken here?
@prometherion the e2e failures are on us. We'll move them to a more reliable infra Soon™ :)
What is missing here, and what can we do?
@mcbenjemaa I just fixed the broken generated files; an e2e run would be cool, although it seems a bit flaky according to the latest runs.
I can also try to provide a small recorded smoke test showing the integration between Proxmox and Kamaji: unfortunately, providing a proper test is a bit complicated given the dependencies between the moving parts.
Quality Gate passed
Issues: 0 new, 0 accepted
Measures: 0 security hotspots, 83.6% coverage on new code, 0.0% duplication on new code
Can you share with me the manifest used to provision a Proxmox cluster?
Quality Gate passed
Issues: 0 new, 0 accepted
Measures: 0 security hotspots, 82.3% coverage on new code, 0.0% duplication on new code
@prometherion, we are happy to add support for this; can you help us document how to provide a cluster template and the technical considerations for the Kamaji CP provider?
Finally! Thank you for your contribution! :)
@mcbenjemaa absolutely, it's on my todo list!
Also sharing here a reference manifest to get this working with Kamaji as an externally managed Control Plane:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - REDACTED/REDACTED
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
    kind: KamajiControlPlane
    name: proxmox-quickstart
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
    kind: ProxmoxCluster
    name: proxmox-quickstart
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
kind: KamajiControlPlane
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  dataStoreName: default
  addons:
    coreDNS: { }
    kubeProxy: { }
  kubelet:
    cgroupfs: systemd
    preferredAddressTypes:
      - InternalIP
  network:
    serviceType: LoadBalancer
  deployment:
    replicas: 2
  version: 1.29.7
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: proxmox-quickstart
  namespace: default
spec:
  allowedNodes:
    - pve
  dnsServers:
    - REDACTED
    - REDACTED
  externalManagedControlPlane: true
  ipv4Config:
    addresses:
      - REDACTED-REDACTED
    gateway: REDACTED
    prefix: REDACTED
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: proxmox-quickstart-workers
  namespace: default
spec:
  clusterName: proxmox-quickstart
  replicas: 2
  selector:
    matchLabels: null
  template:
    metadata:
      labels:
        node-role.kubernetes.io/node: ""
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: proxmox-quickstart-worker
      clusterName: proxmox-quickstart
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
        kind: ProxmoxMachineTemplate
        name: proxmox-quickstart-worker
      version: v1.29.7
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxMachineTemplate
metadata:
  name: proxmox-quickstart-worker
  namespace: default
spec:
  template:
    spec:
      disks:
        bootVolume:
          disk: scsi0
          sizeGb: REDACTED
      format: qcow2
      full: true
      memoryMiB: REDACTED
      network:
        default:
          bridge: REDACTED
          model: virtio
      numCores: REDACTED
      numSockets: REDACTED
      sourceNode: pve
      templateID: REDACTED
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: proxmox-quickstart-worker
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            provider-id: proxmox://'{{ ds.meta_data.instance_id }}'
      users:
        - name: root
          sshAuthorizedKeys:
            - REDACTED
I wasn't able to get worker nodes to join the Control Plane, mostly because I'm working in a kind environment with Proxmox VE deployed in a Virtual Machine via Vagrant.
But overall everything looks good from the clusterctl describe command output:
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/proxmox-quickstart True 7m30s
├─ClusterInfrastructure - ProxmoxCluster/proxmox-quickstart True 7m43s
├─ControlPlane - KamajiControlPlane/proxmox-quickstart
└─Workers
└─MachineDeployment/proxmox-quickstart-workers False Warning WaitingForAvailableMachines 7m43s Minimum availability requires 2 replicas, current 0 available
├─Machine/proxmox-quickstart-workers-ft5tv-gfxf8 False Error VMProvisionFailed 35s 1 of 2 completed
└─Machine/proxmox-quickstart-workers-ft5tv-gjnmf False Info PoweringOn 5m14s 1 of 2 completed
