aerospike-kubernetes-operator
Karpenter scaling with k8sNodeBlockList throws errors
Folks,
I updated a static AerospikeCluster manifest with a bunch of EKS nodes on k8sNodeBlockList. This triggered an update as expected:
NAME            READY   STATUS    RESTARTS   AGE
aerospike-1-0   2/2     Running   0          7h17m
aerospike-1-1   2/2     Running   0          7h16m
aerospike-1-2   0/2     Pending   0          82s
aerospike-2-0   2/2     Running   0          4h17m
aerospike-2-1   2/2     Running   0          4h17m
aerospike-2-2   2/2     Running   0          4h17m
aerospike-3-0   2/2     Running   0          27h
aerospike-3-1   2/2     Running   0          28h
aerospike-3-2   2/2     Running   0          27h
However, pod aerospike-1-2 stays in Pending forever. This is the error message from Karpenter:
2024-07-24T20:46:27.479Z DEBUG controller.provisioner ignoring pod, label kubernetes.io/hostname is restricted; specify a well known label: [karpenter.k8s.aws/instance-category karpenter.k8s.aws/instance-cpu karpenter.k8s.aws/instance-encryption-in-transit-supported karpenter.k8s.aws/instance-family karpenter.k8s.aws/instance-generation karpenter.k8s.aws/instance-gpu-count karpenter.k8s.aws/instance-gpu-manufacturer karpenter.k8s.aws/instance-gpu-memory karpenter.k8s.aws/instance-gpu-name karpenter.k8s.aws/instance-hypervisor karpenter.k8s.aws/instance-local-nvme karpenter.k8s.aws/instance-memory karpenter.k8s.aws/instance-network-bandwidth karpenter.k8s.aws/instance-pods karpenter.k8s.aws/instance-size karpenter.sh/capacity-type karpenter.sh/provisioner-name kubernetes.io/arch kubernetes.io/os node.kubernetes.io/instance-type topology.kubernetes.io/region topology.kubernetes.io/zone], or a custom label that does not use a restricted domain: [k8s.io karpenter.k8s.aws karpenter.sh kubernetes.io] {"commit": "dc3af1a", "pod": "datastore-shared/aerospike-1-2"}
Basically, Karpenter doesn't allow kubernetes.io/hostname in nodeAffinity. This is the affinity that results from that flag:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
        - key: kubernetes.io/hostname
          operator: NotIn
          values:
          - <list of nodes>
I found an issue in the Karpenter repo describing the same problem, where they say this usage is wrong.
Could you please share your thoughts on whether this can be improved somehow? Thanks.
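
For anyone trying to reproduce this, a minimal sketch of the kind of manifest change described above might look like the following. Only the k8sNodeBlockList field name and the datastore-shared namespace come from this report. The apiVersion, the cluster name (guessed from the pod name prefix), and the node hostnames are assumptions or placeholders, and the rest of the spec is omitted.

```yaml
# Hypothetical sketch, not the actual manifest from this report.
apiVersion: asdb.aerospike.com/v1   # assumed API group/version for the operator CRD
kind: AerospikeCluster
metadata:
  name: aerospike                   # assumed from the pod name prefix "aerospike-"
  namespace: datastore-shared       # namespace shown in the Karpenter log line
spec:
  # Nodes listed here should no longer host Aerospike pods.
  # Adding entries results in the kubernetes.io/hostname NotIn
  # node affinity shown above.
  k8sNodeBlockList:
    - ip-10-0-1-23.ec2.internal     # placeholder node name
    - ip-10-0-4-56.ec2.internal     # placeholder node name
  # ... remaining cluster spec unchanged
```

Because the block list is expressed through kubernetes.io/hostname, any pod that gets rescheduled after this change carries the affinity term that Karpenter rejects in the log above.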