cloud-provider-vsphere

Expand failure domains beyond region/zone

akutz opened this issue • 8 comments

/kind feature

Failure domains should be expanded beyond region/zone tags to include the following:

  • Clusters
  • Datacenters
  • Datastores
  • Host groups
  • ResourcePools
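
For context, the CPI config today only understands the region and zone tag categories, roughly along these lines (a sketch; exact section/key names depend on the CPI version in use):

```ini
; vsphere.conf (sketch): the only topology knobs today are the two
; tag category names the CPI looks up on the vSphere inventory.
[Labels]
region = k8s-region
zone   = k8s-zone
```

The idea is to allow additional categories/levels of failure domain beyond these two.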

akutz avatar Aug 16 '19 18:08 akutz

This is tied to the following issue: https://github.com/kubernetes/cloud-provider-vsphere/issues/179

davidvonthenen avatar Aug 19 '19 16:08 davidvonthenen

@akutz can you please elaborate on this feature?

You mention "region/zone tags". Does this mean vSphere tags or Kubernetes labels? Are we saying that, besides the mapping of vSphere tags for zones/regions, we also want additional fields (in the CPI configuration file) for clusters, datacenters, etc.? I'm assuming this is for additional labels on the workers to enable better placement decisions, correct?
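
Purely to illustrate the question (the `example.vsphere.io/*` keys below are made up), I'm picturing something like the following on each worker, alongside the existing zone/region labels:

```sh
# Hypothetical output; the example.vsphere.io/* keys are invented for illustration.
kubectl get node worker-1 --show-labels
# ... failure-domain.beta.kubernetes.io/region=region-a,
#     failure-domain.beta.kubernetes.io/zone=zone-1,
#     example.vsphere.io/cluster=cluster-01,
#     example.vsphere.io/resource-pool=rp-gold ...
```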

embano1 avatar Aug 21 '19 19:08 embano1

@embano1 those are good questions. You can already apply region/zone tags to Clusters, Datacenters, Host groups (by way of folders), and ResourcePools. Is there something else you had in mind? You can also build a hierarchy of regions/zones, where the leaf-most nodes override higher-level constructs/nodes.

Datastores you can't tag this way, since regions/zones apply to where pods run (i.e., Clusters, Datacenters, Host groups, and ResourcePools); the admin needs to make sure those regions/zones have access to the datastores you need. Could you expand on this a little more?
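
For reference, the tagging itself happens on the vSphere side; with govc it looks roughly like this (object paths and names are just examples):

```sh
# Create the tag categories the CPI is configured to look for
govc tags.category.create k8s-region
govc tags.category.create k8s-zone

# Create tags in those categories and attach them to the inventory
# objects where the node VMs run
govc tags.create -c k8s-region region-a
govc tags.create -c k8s-zone   zone-1
govc tags.attach region-a /dc1                  # region on the Datacenter
govc tags.attach zone-1   /dc1/host/cluster-01  # zone on the Cluster
```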

davidvonthenen avatar Aug 30 '19 14:08 davidvonthenen

/priority important-longterm
/lifecycle frozen

frapposelli avatar Sep 04 '19 16:09 frapposelli

/lifecycle frozen

frapposelli avatar Sep 04 '19 16:09 frapposelli

ping @pdaigle

akutz avatar Dec 17 '19 23:12 akutz

This just came up in some discussions we were having internally. We would potentially like to be able to add something like a "host" parameter to the failure domain. The reason: say we have an edge cluster with 5 hosts running 5 VMs, one per host. One of those hosts fails and its VM is automatically brought back up on another host via HA rules. Now we have one host with 2 Kubernetes nodes on it. With the normal topology it is now possible for an application with 2 replicas to end up with both replicas on VMs sharing that single physical host. If that host also has an issue, the application fails completely until it is rescheduled. Being able to set a hardware host-level affinity would have safely redistributed that application instead.

We also have some scenarios using PCI devices where, for optimization, multiple VMs run on the same host while still being part of the same cluster; host-level separation would be useful there as well, while still maintaining parity with the zone and region concepts of the big clouds.
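
To make it concrete (the topology key below is made up; nothing publishes it today), a host-level node label would let us express that separation with ordinary scheduling constraints:

```yaml
# Hypothetical: assumes the CPI exposed a host-level label such as
# "example.vsphere.io/host" on each node (no such label exists today).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: example.vsphere.io/host   # made-up key for illustration
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: my-app:latest
```

Pod anti-affinity on the same hypothetical key would work just as well.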

jordanrinke avatar Feb 01 '20 03:02 jordanrinke

Another use case: when running OpenShift Data Foundation, it is recommended to use vSphere host anti-affinity rules to ensure the Ceph failure domains are distributed among different physical chassis/hypervisors. With rack topology keys (which may not match the zone/region tags), we could use a rack failure domain for Ceph: https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#osd-topology
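
For example, a sketch (node and rack names made up) of labelling nodes by physical chassis so Rook can build its CRUSH topology from them:

```sh
# Label each node with the rack/chassis it physically lives on; Rook reads
# topology.rook.io/* labels when building the OSD/CRUSH topology (see link above).
kubectl label node node-1 topology.rook.io/rack=rack-a
kubectl label node node-2 topology.rook.io/rack=rack-b
kubectl label node node-3 topology.rook.io/rack=rack-c
```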

QuingKhaos avatar May 11 '22 14:05 QuingKhaos