Enable topology spread constraints on `tcp` deployment
Feature description
Support topology spread constraints to control how `tcp` replicas are spread across the admin cluster among failure domains such as zones, racks, hosts, and other user-defined topology domains. This helps to achieve more robust high availability of the Tenant Control Planes as well as efficient resource utilisation.
A simple proposal:
```yaml
apiVersion: kamaji.clastix.io/v1alpha1
kind: TenantControlPlane
metadata:
  name: tenant-00
  namespace: default
spec:
  controlPlane:
    deployment:
      replicas: 3
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: tcp
...
```
According to the definition above, the pods running the `tcp` will have:
```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: tcp
```
Configure topology spread constraints by assigning the topology key label `topology.kubernetes.io/zone` to the Kamaji admin cluster nodes hosting the tenants' `tcp` pods:
```
kubectl get nodes --show-labels

NAME              STATUS   ROLES    AGE   VERSION   LABELS
kamaji-infra-00   Ready    <none>   15h   v1.23.9   topology.kubernetes.io/zone=zone-a
kamaji-infra-01   Ready    <none>   15h   v1.23.9   topology.kubernetes.io/zone=zone-b
kamaji-infra-02   Ready    <none>   15h   v1.23.9   topology.kubernetes.io/zone=zone-c
```
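If the admin cluster nodes are not labelled yet, a sketch of how the zones could be assigned with `kubectl label` (the zone names simply follow the listing above):

```
# assign a zone label to each admin cluster node hosting tcp pods
kubectl label node kamaji-infra-00 topology.kubernetes.io/zone=zone-a
kubectl label node kamaji-infra-01 topology.kubernetes.io/zone=zone-b
kubectl label node kamaji-infra-02 topology.kubernetes.io/zone=zone-c
```

Note that on managed clouds this label is usually set automatically by the cloud provider.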
To achieve pod placement according to the topology constraints, we can also set the constraints at the cluster level by creating a global scheduler configuration:
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  pluginConfig:
  - name: PodTopologySpread
    args:
      defaultConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
      defaultingType: List
```
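As a minimal sketch, assuming the file is saved at a hypothetical path such as `/etc/kubernetes/scheduler-config.yaml` on the admin cluster control plane nodes, the scheduler would pick it up via its `--config` flag:

```
kube-scheduler --config=/etc/kubernetes/scheduler-config.yaml
```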
Since topology constraints are specific to the admin cluster hosting the `tcp` and not to the specific `tcp`, this seems a wiser option. @prometherion
> Since topology constraints are specific to the admin cluster hosting the `tcp` and not to the specific `tcp`, this seems a wiser option.
This would be applied to all the Pods in the cluster, and not only to the TenantControlPlane ones, wouldn't it?
I would suggest adding the Deployment `topologySpreadConstraints` proposed here, since it wouldn't imply the use of a global scheduler configuration.
> This would be applied to all the Pods in the cluster, and not only to the TenantControlPlane ones, wouldn't it?
Yes, it's a global behaviour unless the specific pod defines its own `topologySpreadConstraints`. For sure, having the `tcp` define its own setting would be a more flexible solution.
```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: tcp
```
Question on this: the `matchLabels` should be known in advance by the Cluster Administrator. If we want to make this feature totally transparent, I'd say we could replicate the same keys (`maxSkew`, `topologyKey`, `whenUnsatisfiable`, `minDomains`, `nodeAffinityPolicy`, `nodeTaintsPolicy`), omitting `labelSelector`, which would be computed by the Kamaji operator.
> omitting `labelSelector`, which would be computed by the Kamaji operator
Probably this might sound like an override of the general `topologySpreadConstraints` feature; let's say it is the responsibility of the admin to set the label properly. What are your thoughts?
I think the feature is precious; I'm just saying that when deploying a Tenant Control Plane the Cluster Administrator should know in advance the Pod labels used, because otherwise the spread constraint wouldn't work.
Actually, all the Control Plane Pods get the label `kamaji.clastix.io/soot=${tenantControlPlane.name}`, and it's non-intuitive for a newcomer. What I can suggest is the following approach:
- if no label selector in the `topologySpreadConstraints` is provided, use the default Kamaji labels
- otherwise, use the input ones.
With that said, the Cluster Administrator can also play with the additional metadata for the Deployment to add different labels, so there's no mandatory need to use the default ones.
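As an illustrative sketch of the proposed defaulting behaviour (not implemented yet; the rendered selector value for a `tenant-00` Tenant Control Plane is an assumption), a `TenantControlPlane` omitting the `labelSelector`:

```yaml
# proposed: labelSelector omitted, to be computed by the Kamaji operator
spec:
  controlPlane:
    deployment:
      replicas: 3
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
```

would end up on the `tcp` Pods with the selector defaulted to the Kamaji labels, e.g.:

```yaml
# constraint as rendered on the Control Plane Pods (selector computed by Kamaji)
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      kamaji.clastix.io/soot: tenant-00
```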
I can start working on this.