bosh-vsphere-cpi-release
Add opt-in support for per bosh deployment DRS rules
Feature Request
Detailed Description
Given dynamically created bosh deployments (e.g. mysql-cluster-1, mysql-cluster-2, ... mysql-cluster-N), each with an instance group "mysql" of 3 instances, I need the vSphere BOSH CPI to support a DRS rule per deployment, so that DRS avoids scheduling the mysql instances of a given deployment on the same ESX host.
Currently, either
- static DRS rules specified in the cloud config are supported, see https://bosh.io/docs/vsphere-cpi/#azs
https://github.com/cloudfoundry/docs-bosh/blob/96fdb6fff79d7eed1f78b6fb05ce064de2acfea0/content/vsphere-cpi.md?plain=1#L15-L17
* **drs_rules** [Array, optional]: Array of DRS rules applied to [constrain VM placement](vm-anti-affinity.md#vsphere). Must have only one.
    * **name** [String, required]: Name of a DRS rule that the Director will create.
    * **type** [String, required]: Type of a DRS rule. Currently only `separate_vms` is supported.
- type: replace
  path: /vm_extensions?/-
  value:
    name: drs-antiaffinity-r4
    cloud_properties:
      datacenters:
      - name: ((/secrets/vsphere_4_vcenter_dc))
        clusters:
        #r4-z1 cluster
        - ((/secrets/vsphere_4_1_vcenter_cluster)):
            drs_rules:
            - name: ((/secrets/site_type))-bosh-coab-drs-antiaffinity
              type: separate_vms
- or systematic dynamic DRS rules can be turned on for a given bosh director, in which case they apply to all deployments
https://github.com/cloudfoundry/bosh-vsphere-cpi-release/blob/da8f3fc281c8e8aecb972d7182f798a4f673c184/jobs/vsphere_cpi/spec#L26-L28
https://github.com/cloudfoundry/bosh-vsphere-cpi-release/blob/bf35f007bd40a42c2ca9474b673f780dab39b8f8/src/vsphere_cpi/lib/cloud/vsphere/vm_creator.rb#L364-L376
https://github.com/cloudfoundry/bosh-vsphere-cpi-release/blob/bf35f007bd40a42c2ca9474b673f780dab39b8f8/src/vsphere_cpi/lib/cloud/vsphere/vm_config.rb#L145-L151
Given that env.bosh.group is systematically defined by the bosh director in https://github.com/cloudfoundry/bosh/blob/dec31de320fcd29a574db8685f6abf697138f788/src/bosh-director/lib/bosh/director/deployment_plan/steps/create_vm_step.rb#L135, a DRS rule is created for each instance group of each deployment. The DRS rules are named from the template: <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>
This leads to a large number of auto-created DRS rules on bosh directors that already host a large number of deployments.
While there is theoretically no limit on the number of DRS rules, enabling this property does not seem recommended on a bosh director with a large number of deployments (unless every single instance group in all deployments requires an anti-affinity DRS rule); see https://communities.vmware.com/t5/VMware-vCenter-Discussions/Maximum-Number-of-DRS-Rules-per-Cluster/td-p/2744546
> It is recommended to use DRS rules sparingly, hence it is better not to use them unless it is absolutely required. As the number of rules gets increased, it will restrict DRS opportunities of balancing the cluster. It is operationally challenging in managing them as well.
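To give a sense of how quickly the rule count grows under the naming template described above, here is a minimal Ruby sketch. The helpers `drs_rule_name` and `auto_drs_rule_names`, the director name `my-bosh`, and the `proxy` instance group are hypothetical and only for illustration; the real director and CPI code may canonicalize or truncate names differently.

```ruby
# Hypothetical helper: builds a rule name from the template
# <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>.
def drs_rule_name(director, deployment, instance_group)
  [director, deployment, instance_group].join('-')
end

# Hypothetical inventory walk: `deployments` maps a deployment name to the
# list of its instance group names. One rule is produced per instance group.
def auto_drs_rule_names(director, deployments)
  deployments.flat_map do |deployment, groups|
    groups.map { |group| drs_rule_name(director, deployment, group) }
  end
end

# Three dynamically created deployments with two instance groups each
# (assumed values, not taken from a real director).
deployments = (1..3).to_h { |i| ["mysql-cluster-#{i}", %w[mysql proxy]] }
names = auto_drs_rule_names('my-bosh', deployments)

puts names.size   # prints 6
puts names.first  # prints my-bosh-mysql-cluster-1-mysql
```

With the global flag turned on, the count is the total number of instance groups across all deployments, whether or not each group actually needs anti-affinity.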
Context
Why is this change important to you? How would you use it?
In order to benefit from vSphere HA across distinct ESX hosts, I need DRS anti-affinity on the relevant instance groups of selected deployments. This is important for the many dynamic bosh deployments which cannot leverage static DRS rules declared in the cloud-config.
Alternative Implementations
VM Types / VM Extensions support for enable_auto_anti_affinity_drs_rules
In addition to supporting enable_auto_anti_affinity_drs_rules=true at the global level, this property would also be supported in a vm_types or vm_extensions block, overriding the global value.
Inspired by the similar property upgrade_hw_version:
https://github.com/cloudfoundry/bosh-vsphere-cpi-release/blob/bf35f007bd40a42c2ca9474b673f780dab39b8f8/src/vsphere_cpi/lib/cloud/vsphere/vm_creator.rb#L240-L242
https://github.com/cloudfoundry/bosh-vsphere-cpi-release/blob/bf35f007bd40a42c2ca9474b673f780dab39b8f8/src/vsphere_cpi/lib/cloud/vsphere/vm_config.rb#L13-L15
https://github.com/orange-cloudfoundry/bosh-vsphere-cpi-release/blob/87b8474f18046e6920d4c44478138f084cb3cdf3/src/vsphere_cpi/spec/unit/cloud/vsphere/vm_config_spec.rb#L24-L50
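The override semantics proposed in this section could be sketched in Ruby as follows. This is only an assumption about how the resolution might look, not the CPI's actual code: `auto_anti_affinity_enabled?` is a hypothetical helper, and `vm_type_properties` stands for the merged `cloud_properties` of the vm_type and vm_extensions.

```ruby
PROPERTY = 'enable_auto_anti_affinity_drs_rules'

# Hypothetical resolver: a value set in the vm_type / vm_extension
# cloud_properties wins; when the key is absent (nil), fall back to the
# global CPI setting. Checking for nil rather than using `||` lets an
# explicit `false` in a vm_extension override a global `true`.
def auto_anti_affinity_enabled?(vm_type_properties, global_value)
  override = vm_type_properties[PROPERTY]
  override.nil? ? global_value : override
end

# Opt-in: global flag off, one vm_extension turns it on.
puts auto_anti_affinity_enabled?({ PROPERTY => true }, false)  # prints true
# Opt-out: global flag on, one vm_extension turns it off.
puts auto_anti_affinity_enabled?({ PROPERTY => false }, true)  # prints false
# No override: the global value applies.
puts auto_anti_affinity_enabled?({}, false)                    # prints false
```

Existing directors that only set the global flag would keep their current behavior, since an absent per-VM property falls through to the global value.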
~~New cpi property~~
EDIT: this proposal is likely too complex
Add a new CPI flag vcenter.restrict_auto_anti_affinity_drs_rules_to_marked_instance_groups, which adds new opt-in behavior without introducing breaking changes to existing behavior
vcenter.enable_auto_anti_affinity_drs_rules:
  description: Creates a DRS rule for each instance group to place VMs on separate hosts. Conditional to the deployment manifest to set a non-nil `env.bosh.group` field in an instance group. The DRS rules are named from template: <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>
  default: false
vcenter.restrict_auto_anti_affinity_drs_rules_to_marked_instance_groups:
  description: When `enable_auto_anti_affinity_drs_rules=true`, restrict auto generated DRS rules to instance groups declaring `env.bosh.enable_auto_anti_affinity_drs_rules=true` in the deployment manifest
  default: false
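For reference, the way the two flags in this (withdrawn) proposal would combine can be sketched as a small Ruby predicate. `create_drs_rule?` is a hypothetical helper, and `bosh_env` stands for the `env.bosh` hash of an instance group; none of this is taken from the CPI code.

```ruby
# Hypothetical decision helper for the two-flag proposal.
#   global_enabled     -> vcenter.enable_auto_anti_affinity_drs_rules
#   restrict_to_marked -> vcenter.restrict_auto_anti_affinity_drs_rules_to_marked_instance_groups
#   bosh_env           -> the instance group's env.bosh hash
def create_drs_rule?(global_enabled:, restrict_to_marked:, bosh_env:)
  return false unless global_enabled          # feature switched off entirely
  return false if bosh_env['group'].nil?      # existing condition: env.bosh.group must be set
  return true unless restrict_to_marked       # existing behavior: a rule for every group
  # New opt-in behavior: only instance groups that mark themselves get a rule.
  bosh_env['enable_auto_anti_affinity_drs_rules'] == true
end

marked   = { 'group' => 'my-bosh-mysql-cluster-1-mysql',
             'enable_auto_anti_affinity_drs_rules' => true }
unmarked = { 'group' => 'my-bosh-mysql-cluster-1-mysql' }

puts create_drs_rule?(global_enabled: true, restrict_to_marked: true,  bosh_env: marked)    # prints true
puts create_drs_rule?(global_enabled: true, restrict_to_marked: true,  bosh_env: unmarked)  # prints false
puts create_drs_rule?(global_enabled: true, restrict_to_marked: false, bosh_env: unmarked)  # prints true
```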
Complexity
- [x] Low - Simple enhancement or bug fix, no architectural changes or refactoring
- [ ] Medium - Change requires some thought, but is relatively isolated
- [ ] High - Significant architectural change or large refactor
@selzoc would you accept a PR implementing this proposal?
> @selzoc would you accept a PR implementing this proposal?
Well, it's not up to me! But I see this issue is in the Waiting for Changes | Open for Contribution part of the working group project, so we'd probably review it.
@cunnie would you by chance have the historical background to review and comment on this updated proposal, in particular the "VM Types / VM Extensions support for enable_auto_anti_affinity_drs_rules" section above?