cluster-api-provider-openstack

Allow clusters without explicit availability zones

Open mkjpryor opened this issue 2 years ago • 4 comments

/kind feature

Describe the solution you'd like

Currently, CAPO machines require that a failure domain (availability zone) is specified. However, there are a number of cases where this results in a sub-optimal user experience, IMHO.

In my particular target case, we are using traits, flavors, host aggregates and project isolation filters to ensure that VMs land on hypervisors that the project is permitted to use and/or have particular specialist hardware and connectivity. This works best when an AZ is not specified, as that allows Nova to pick a suitable host without constraints.

However the operator still wants to use AZs as a housekeeping thing, e.g. corresponding to racks, so they can quickly see what is scheduled where.

Because CAPO requires AZs to be specified, users of CAPI/O have to make sure that the flavor and AZ they choose are compatible. This causes a lot of mistakes that result in "no valid host" errors, which we then have to deal with as support. We would prefer not to specify the AZ and let Nova deal with it.

We also have ambitions to use Blazar to control access to resources, in which case it will be even more important that Nova is in full control of the scheduling.

Anything else you would like to add:

We should still use a server group with soft anti-affinity to ensure control plane hosts are scheduled on different hypervisors when possible. This can be used whether specifying explicit AZs or not, but becomes more important in the non-explicit case.

mkjpryor avatar Jun 01 '22 09:06 mkjpryor

/assign mkjpryor

apricote avatar Jun 15 '22 13:06 apricote

Summary of a few days of discussion and discovery:

The failure domain model suggestion doesn't make sense. There is only 1 realistic failure domain model in OpenStack, and it's Availability Zones.

Cluster API already doesn't perform the problematic explicit failure domain spreading unless the provider has also marked at least 1 failure domain as ControlPlane. CAPO is currently marking all discovered AZs as ControlPlane by default if ControlPlaneAvailabilityZones is empty in the OpenStackCluster spec.

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/pull/1263 will shortly merge to fix the issue for non-control plane machines.


I'm currently thinking that we should control the default setting of the ControlPlane flag rather than not returning failure domains at all.

We have 3 options here that I can see:

  • Flip the default to false. If the user wants to spread across availability zones they must list them.

  • Add a ControlPlaneIgnoresAZs flag which defaults to false, compatible with the existing behaviour. If the user does not want the spreading behaviour then they can leave ControlPlaneAvailabilityZones empty and set this new flag to true.

  • Add a ControlPlaneUsesAllAZs flag which defaults to false, a change in the default behaviour. If the user does not want the spreading behaviour they just leave ControlPlaneAvailabilityZones empty. If they do want it they must also set ControlPlaneUsesAllAZs to true.

Essentially, ControlPlaneIgnoresAZs and ControlPlaneUsesAllAZs define the semantics of an empty ControlPlaneAvailabilityZones.

In all 3 cases we will still report failure domains in OpenStackCluster.Status.FailureDomains. The only difference will be whether we mark some of them with ControlPlane.
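As a rough illustration of the second option, the sketch below shows how an empty ControlPlaneAvailabilityZones could be interpreted. It is not CAPO's code: the `Spec` and `FailureDomain` types are trimmed-down stand-ins, `failureDomains` is a hypothetical helper, and giving an explicit AZ list precedence over the flag is an assumption that the discussion above does not settle.

```go
package main

import "fmt"

// FailureDomain is a simplified stand-in for a failure domain entry:
// ControlPlane says whether control plane machines may be placed there.
type FailureDomain struct {
	ControlPlane bool
}

// Spec is a hypothetical, trimmed-down stand-in for the OpenStackCluster spec.
type Spec struct {
	ControlPlaneAvailabilityZones []string
	ControlPlaneIgnoresAZs        bool // the flag proposed in option 2
}

// failureDomains always reports every discovered AZ and only varies
// which of them are marked as suitable for the control plane.
func failureDomains(spec Spec, discovered []string) map[string]FailureDomain {
	listed := make(map[string]bool, len(spec.ControlPlaneAvailabilityZones))
	for _, az := range spec.ControlPlaneAvailabilityZones {
		listed[az] = true
	}

	out := make(map[string]FailureDomain, len(discovered))
	for _, az := range discovered {
		var controlPlane bool
		if len(listed) > 0 {
			// Assumption: an explicit list wins over the flag.
			controlPlane = listed[az]
		} else {
			// Empty list: the flag defines the semantics.
			// false keeps the existing "mark all AZs" behaviour.
			controlPlane = !spec.ControlPlaneIgnoresAZs
		}
		out[az] = FailureDomain{ControlPlane: controlPlane}
	}
	return out
}

func main() {
	discovered := []string{"az1", "az2"}
	fmt.Println(failureDomains(Spec{}, discovered))
	fmt.Println(failureDomains(Spec{ControlPlaneIgnoresAZs: true}, discovered))
}
```

With the flag left at its default, every discovered AZ is marked ControlPlane, which matches the current behaviour. With the flag set to true and an empty list, all AZs are still reported but none is marked.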

mdbooth avatar Jun 16 '22 13:06 mdbooth

This makes a lot of sense. For me, I prefer a solution whose default behaviour is backwards compatible as it means we can merge it without making a new API version. I guess this means the “ignore” flag with a default of false. But I am happy to implement either version.

@mdbooth and others - let me know if you have a preference.

mkjpryor avatar Jun 16 '22 15:06 mkjpryor

@mdbooth

I'm still keen to progress this as it is a blocker for our use case. I am still happy to make the changes and, as stated above, I have a preference for maintaining backwards compatibility by default. Do you have an opinion or shall I just make the changes?

mkjpryor avatar Jun 28 '22 12:06 mkjpryor

However the operator still wants to use AZs as a housekeeping thing, e.g. corresponding to racks, so they can quickly see what is scheduled where. Because CAPO requires AZs to be specified, this means that when using CAPI/O users have to make sure that the flavor and AZ that they choose are compatible. This leads to a lot of mistakes leading to "no valid host" errors that we then have to deal with as support. We would prefer not to specify the AZ and let Nova deal with it.

@mkjpryor sorry for asking questions late here; I only focused on the conversion question at the beginning. If I understand you correctly, you want CAPO to ignore the AZ param while still letting the operator use AZs (otherwise you could disable the AZ filter and this PR would not be needed)?

jichenjc avatar Sep 14 '22 06:09 jichenjc

@jichenjc

What I want is to be able to make CAPO clusters where none of the machines specify an AZ, allowing Nova to pick suitable ones based on other scheduling constraints. I want to be able to do this on OpenStack clouds that I do not own (and hence cannot modify the Nova configuration).

#1263 made it possible to create machines without specifying a failureDomain, and this works for workers.

However CAPI will explicitly spread control plane nodes across the failure domains reported by the infracluster as being suitable for control plane nodes. In order for CAPI to create control plane nodes that do not specify a failure domain, the infracluster must report no suitable failure domains. #1318 adds an option to enable this behaviour, so it is still required.
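To make the spreading behaviour concrete, here is a minimal sketch of the selection logic described above. It is a simplification written for this thread, not Cluster API's actual implementation: `pickFailureDomain` and the `FailureDomain` type are hypothetical, and the alphabetical tie-break is an assumption added for determinism.

```go
package main

import (
	"fmt"
	"sort"
)

// FailureDomain is a simplified stand-in for a reported failure domain.
type FailureDomain struct {
	ControlPlane bool
}

// pickFailureDomain returns the control-plane-eligible domain that currently
// hosts the fewest machines, or nil when no domain is eligible. A nil result
// means the machine is created without a failure domain.
func pickFailureDomain(domains map[string]FailureDomain, machineAZs []string) *string {
	counts := map[string]int{}
	for name, fd := range domains {
		if fd.ControlPlane {
			counts[name] = 0
		}
	}
	if len(counts) == 0 {
		return nil
	}

	// Count existing control plane machines per eligible domain.
	for _, az := range machineAZs {
		if _, ok := counts[az]; ok {
			counts[az]++
		}
	}

	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names) // assumption: break ties alphabetically

	best := names[0]
	for _, name := range names[1:] {
		if counts[name] < counts[best] {
			best = name
		}
	}
	return &best
}

func main() {
	marked := map[string]FailureDomain{"az1": {ControlPlane: true}, "az2": {ControlPlane: true}}
	if az := pickFailureDomain(marked, []string{"az1"}); az != nil {
		fmt.Println("next control plane machine goes to", *az)
	}

	unmarked := map[string]FailureDomain{"az1": {}, "az2": {}}
	fmt.Println("failure domain set:", pickFailureDomain(unmarked, nil) != nil)
}
```

The second call is the case this issue cares about: when the infracluster marks no failure domain as suitable for the control plane, nothing is picked and Nova is free to choose the placement.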

Hope that makes sense?

mkjpryor avatar Sep 15 '22 15:09 mkjpryor

I want to be able to do this on OpenStack clouds that I do not own (and hence cannot modify Nova configuration).

Yes, this is the key thing :) As we can't modify the server side, CAPO can do something to make it happen.

jichenjc avatar Sep 16 '22 02:09 jichenjc