cluster-api-provider-proxmox
Towards Beta
Please find a draft PR with our current variation/extension of this provider. With these changes we are closer to being able to consider using it outside of purely experimental setups:
- Run CAPPX with a non-privileged PVE user / permissions
- Work around Proxmox shortcomings (VNC shell login, storage folder permissions)
- Add VMs to a Proxmox resource pool (CAPPX only has permission to manage VMs in that resource pool)
- Reworked storage approach to use `import-from`, allowing different storage to be used for VMs (related to #93)
- Control over VM IDs & node placement
  - Configure nodes on `ProxmoxCluster` and `ProxmoxMachineTemplate` level
  - Initial support for CAPI Failure Domains (Node as Failure Domain Strategy)
  - Configure VM ID ranges
- Network Config:
  - Configurable bridge & VLAN tag
  - Initial support for CAPI IPAM (#4)
- Various smaller fixes / changes
We are currently testing with Cluster API Provider RKE2, so we also have an example for #68.
Sample using new features & RKE2
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}
  namespace: ${NAMESPACE}
spec:
  controlPlaneEndpoint:
    host: ${CONTROLPLANE_ADDRESS}
    port: 6443
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: RKE2ControlPlane
    name: ${CLUSTER_NAME}
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ProxmoxCluster
    name: ${CLUSTER_NAME}
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: ProxmoxCluster
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}
  namespace: ${NAMESPACE}
spec:
  controlPlaneEndpoint:
    host: ${CONTROLPLANE_ADDRESS}
    port: 6443
  serverRef:
    endpoint: https://${PROXMOX_ADDRESS}:8006/api2/json
    secretRef:
      name: ${CLUSTER_NAME}
      namespace: ${NAMESPACE}
  resourcePool: capi-proxmox
  failureDomain:
    nodeAsFailureDomain: true
  nodes:
  - node1
  - node2
  - node3
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
kind: RKE2ControlPlane
metadata:
  name: ${CLUSTER_NAME}
  namespace: ${NAMESPACE}
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ProxmoxMachineTemplate
    name: ${CLUSTER_NAME}-controlplane
  replicas: 3
  agentConfig:
    version: ${K8S_VERSION}+rke2r1
    kubelet:
      extraArgs:
      - 'cloud-provider=external'
  nodeDrainTimeout: 2m
  registrationMethod: "address"
  registrationAddress: "${CONTROLPLANE_ADDRESS}"
  postRKE2Commands:
  - curl https://kube-vip.io/manifests/rbac.yaml > /var/lib/rancher/rke2/server/manifests/kube-vip-rbac.yaml
  - /var/lib/rancher/rke2/bin/crictl -r "unix:///run/k3s/containerd/containerd.sock" pull ghcr.io/kube-vip/kube-vip:latest
  - CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock /var/lib/rancher/rke2/bin/ctr -n k8s.io run --rm --net-host ghcr.io/kube-vip/kube-vip:latest vip /kube-vip manifest daemonset --arp --address ${CONTROLPLANE_ADDRESS} --controlplane --leaderElection --taint --services --inCluster | tee /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: ProxmoxMachineTemplate
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}-controlplane
  namespace: ${NAMESPACE}
spec:
  vmIDs:
    start: 1000
    end: 1010
  template:
    spec:
      hardware:
        cpu: 4
        memory: 8192
        disk: 16G
        storage: ${PROXMOX_VM_STORAGE}
      image:
        checksum: ${CLOUD_IMAGE_HASH_SHA256}
        checksumType: sha256
        url: ${CLOUD_IMAGE_URL}
      network:
        bridge: ${VM_BRIDGE}
        vlanTag: ${VM_VLAN}
        nameServer: ${IPV4_NAMESEVER}
        ipConfig:
          IPv4FromPoolRef:
            name: ${IPV4_POOL_NAME}
            apiGroup: ipam.cluster.x-k8s.io
            kind: InClusterIPPool
      options:
        onBoot: true
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}-md-0
  namespace: ${NAMESPACE}
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: 3
  selector:
    matchLabels: {}
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha1
          kind: RKE2ConfigTemplate
          name: ${CLUSTER_NAME}-md-0
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: ProxmoxMachineTemplate
        name: ${CLUSTER_NAME}-md-0
      version: ${K8S_VERSION}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha1
kind: RKE2ConfigTemplate
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}-md-0
  namespace: ${NAMESPACE}
spec:
  template:
    spec:
      # preRKE2Commands:
      # - sleep 30 # fix to give OS time to become ready
      agentConfig:
        version: ${K8S_VERSION}+rke2r1
        kubelet:
          extraArgs:
          - "cloud-provider=external"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: ProxmoxMachineTemplate
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}-md-0
  namespace: ${NAMESPACE}
spec:
  nodes:
  - node1
  - node2
  vmIDs:
    start: 1010
    end: 1019
  template:
    spec:
      hardware:
        cpu: 4
        memory: 8192
        disk: 16G
        storage: ${PROXMOX_VM_STORAGE}
      image:
        checksum: ${CLOUD_IMAGE_HASH_SHA256}
        checksumType: sha256
        url: ${CLOUD_IMAGE_URL}
      network:
        bridge: ${VM_BRIDGE}
        vlanTag: ${VM_VLAN}
        nameServer: ${IPV4_NAMESEVER}
        ipConfig:
          IPv4FromPoolRef:
            name: ${IPV4_POOL_NAME}
            apiGroup: ipam.cluster.x-k8s.io
            kind: InClusterIPPool
      options:
        onBoot: true
---
apiVersion: v1
kind: Secret
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}
  namespace: ${NAMESPACE}
stringData:
  PROXMOX_PASSWORD: "${PROXMOX_PASSWORD}"
  PROXMOX_SECRET: ""
  PROXMOX_TOKENID: ""
  PROXMOX_USER: "${PROXMOX_USER}"
type: Opaque
---
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  name: ${CLUSTER_NAME}-crs-0
  namespace: ${NAMESPACE}
spec:
  clusterSelector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  resources:
  - kind: ConfigMap
    name: ${CLUSTER_NAME}-cloud-controller-manager
  strategy: Reconcile
---
apiVersion: v1
data:
  cloud-controller-manager.yaml: |
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: proxmox-cloud-controller-manager
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:proxmox-cloud-controller-manager
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: proxmox-cloud-controller-manager
      namespace: kube-system
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      labels:
        k8s-app: cloud-controller-manager
      name: cloud-controller-manager
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          k8s-app: cloud-controller-manager
      template:
        metadata:
          labels:
            k8s-app: cloud-controller-manager
        spec:
          serviceAccountName: proxmox-cloud-controller-manager
          containers:
          - name: cloud-controller-manager
            image: ghcr.io/sp-yduck/cloud-provider-proxmox:latest
            command:
            - /usr/local/bin/cloud-controller-manager
            - --cloud-provider=proxmox
            - --cloud-config=/etc/proxmox/config.yaml
            - --leader-elect=true
            - --use-service-account-credentials
            - --controllers=cloud-node,cloud-node-lifecycle
            volumeMounts:
            - name: cloud-config
              mountPath: /etc/proxmox
              readOnly: true
            livenessProbe:
              httpGet:
                path: /healthz
                port: 10258
                scheme: HTTPS
              initialDelaySeconds: 20
              periodSeconds: 30
              timeoutSeconds: 5
          volumes:
          - name: cloud-config
            secret:
              secretName: cloud-config
          tolerations:
          - key: node.cloudprovider.kubernetes.io/uninitialized
            value: "true"
            effect: NoSchedule
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: NoSchedule
          nodeSelector:
            node-role.kubernetes.io/control-plane: "true"
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: cloud-config
      namespace: kube-system
    stringData:
      config.yaml: |
        proxmox:
          url: https://${PROXMOX_ADDRESS}:8006/api2/json
          user: ""
          password: ""
          tokenID: "${PROXMOX_CCM_TOKENID}"
          secret: "${PROXMOX_CCM_SECRET}"
kind: ConfigMap
metadata:
  name: ${CLUSTER_NAME}-cloud-controller-manager
  namespace: ${NAMESPACE}
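For completeness: the ipConfig sections above reference an InClusterIPPool named ${IPV4_POOL_NAME} that is not part of the sample. A minimal pool for the in-cluster CAPI IPAM provider could look roughly like the sketch below; the v1alpha2 API version and field names depend on the IPAM provider release you run, and the addresses/gateway are placeholder values.

apiVersion: ipam.cluster.x-k8s.io/v1alpha2
kind: InClusterIPPool
metadata:
  name: ${IPV4_POOL_NAME}
  namespace: ${NAMESPACE}
spec:
  # example range: adjust to the network behind ${VM_BRIDGE} / ${VM_VLAN}
  addresses:
  - 10.0.10.100-10.0.10.150
  prefix: 24
  gateway: 10.0.10.1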
Beyond these current changes, our assessment is that the existing CRDs would require some bigger refactoring to enable other use cases we need, such as support for multiple network devices & disks. Those would definitely be breaking changes.
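To make that concrete, here is a purely hypothetical sketch of how a multi-device spec could look; none of these fields (disks, networks, or their children) exist in the current CRDs, and the placeholder names are only meant to illustrate the direction:

# hypothetical only: these fields are not part of the current CRDs
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: ProxmoxMachineTemplate
metadata:
  name: example-multi-device
spec:
  template:
    spec:
      disks:
      - name: os
        size: 16G
        storage: ${PROXMOX_VM_STORAGE}
      - name: data
        size: 100G
        storage: ${PROXMOX_DATA_STORAGE}   # hypothetical second storage
      networks:
      - name: net0
        bridge: ${VM_BRIDGE}
        vlanTag: ${VM_VLAN}
      - name: net1
        bridge: vmbr1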
Any suggestions & feedback welcome