om
`om` CLI will use an incorrect guid when running `configure-director` if two vSphere clusters under an AZ have the same cluster name.
Overview
In a vSphere environment, each AZ can have multiple clusters defined underneath it. Three properties together define a cluster's uniqueness: `cluster`, `resource_pool`, and `host_group`. Multiple clusters can use the same `cluster` name, as long as the `resource_pool` or `host_group` differs between them, e.g.
az-configuration:
- name: puff-first-az
  iaas_configuration_name: default
  clusters:
  - cluster: ops_manager_cluster
    drs_rule: MUST
    host_group: ""
    resource_pool: ""
  - cluster: ops_manager_cluster
    drs_rule: MUST
    host_group: ""
    resource_pool: puff1
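To make the uniqueness rule concrete, here is a minimal Python sketch that treats the (`cluster`, `resource_pool`, `host_group`) triple as a cluster's identity. The `cluster_key` helper is hypothetical and is not part of `om` or Ops Manager; the data mirrors the example config above.

```python
# Hypothetical helper (not part of om): the identity of a cluster within an AZ,
# based on the three properties described in this issue.
def cluster_key(cluster):
    return (
        cluster.get("cluster", ""),
        cluster.get("resource_pool", ""),
        cluster.get("host_group", ""),
    )


# The two clusters from the example AZ above.
clusters = [
    {"cluster": "ops_manager_cluster", "drs_rule": "MUST",
     "host_group": "", "resource_pool": ""},
    {"cluster": "ops_manager_cluster", "drs_rule": "MUST",
     "host_group": "", "resource_pool": "puff1"},
]

names = {c["cluster"] for c in clusters}
keys = {cluster_key(c) for c in clusters}

print(len(names))  # 1: the cluster name alone does not distinguish them
print(len(keys))   # 2: the full triple does
```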
The `om` CLI attempts to add the `guid` property for each cluster by using the `/api/v0/staged/director/availability_zones` Ops Manager API endpoint. This ensures that the payload sent to the update AZ API endpoint is matched up with the existing AZ and cluster definitions. This is necessary because the fields are locked after BOSH and associated products are deployed, and Ops Manager protects against deletions or modifications to the AZs and clusters with an error like:
Cannot modify the cluster 'ops_manager_cluster' in the availability zone 'puff-first-az' of a deployed product
However, the logic the `om` CLI uses to look up the existing cluster only considers the `cluster` property, which may not be unique within a given AZ:
https://github.com/pivotal-cf/om/blob/ca9f0f846ec7510d4a7d638feb709715ccc05834/api/director_service.go#L485-L488
In examples like the above, this results in `om` reusing the same `guid` for two different clusters. The Ops Manager API does not currently prevent this (story to fix here: https://www.pivotaltracker.com/story/show/179348373). Once in this state, any attempt to modify the AZ definition, either in the Ops Manager UI or using the `om` CLI, will result in the previously mentioned 'Cannot modify the cluster ...' error.
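The difference between a name-only lookup and a lookup on all three properties can be sketched in Python as follows. This is an illustration of the behaviour described in this issue, not a port of `om`'s Go code: both functions, the data shapes, and the `guid-1` value are assumptions made up for the example, and it is not a claim about how any fix is actually implemented.

```python
def assign_guids_by_name(desired, existing):
    """Mimics the described behaviour: match existing clusters on name only."""
    result = []
    for d in desired:
        match = next((e for e in existing if e["cluster"] == d["cluster"]), None)
        out = dict(d)
        if match is not None:
            out["guid"] = match["guid"]
        result.append(out)
    return result


def assign_guids_by_identity(desired, existing):
    """Hypothetical alternative: match on cluster, resource_pool and host_group."""
    def key(c):
        return (c.get("cluster", ""), c.get("resource_pool", ""), c.get("host_group", ""))

    guid_by_key = {key(e): e["guid"] for e in existing}
    result = []
    for d in desired:
        out = dict(d)
        if key(d) in guid_by_key:
            out["guid"] = guid_by_key[key(d)]
        result.append(out)
    return result


# One cluster already exists in Ops Manager; "guid-1" is a placeholder value.
existing = [
    {"cluster": "ops_manager_cluster", "resource_pool": "", "host_group": "", "guid": "guid-1"},
]
# The updated config adds a second cluster with the same name.
desired = [
    {"cluster": "ops_manager_cluster", "resource_pool": "", "host_group": ""},
    {"cluster": "ops_manager_cluster", "resource_pool": "puff1", "host_group": ""},
]

by_name = assign_guids_by_name(desired, existing)
by_identity = assign_guids_by_identity(desired, existing)

print([c.get("guid") for c in by_name])      # ['guid-1', 'guid-1']
print([c.get("guid") for c in by_identity])  # ['guid-1', None]
```

With the name-only match, the new `puff1` cluster inherits the GUID of the original cluster; with the three-property match it is left without a GUID, so it would be treated as a new cluster.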
Once the API is updated to properly prevent using the same GUID for two different clusters, the `om` CLI will begin returning an error if this state is reached.
Reproduction steps
- Configure Ops Manager using `om configure-director --config director-config.yml`
- Apply Changes
- Update `director-config.yml` to add a new cluster to the AZ that has the same cluster name, but a different `resource_pool` or `host_group` property than the original cluster.
- Use `om configure-director` again to update the config in Ops Manager
- Use `om staged-director-config` to get the latest config from Ops Manager. You will see the same `guid` defined for both clusters.
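The check in the final step can be automated. The Python sketch below looks for GUIDs shared by more than one cluster in an already-parsed config. It assumes the staged config has the same `az-configuration` / `clusters` layout as the example above, with a `guid` field on each cluster; the function name and the `example-guid` value are hypothetical.

```python
from collections import defaultdict


def duplicate_cluster_guids(config):
    """Return {guid: [cluster descriptions]} for GUIDs used by more than one cluster.

    Assumes `config` is the parsed staged director config with the
    az-configuration/clusters layout shown in this issue.
    """
    seen = defaultdict(list)
    for az in config.get("az-configuration", []):
        for cluster in az.get("clusters", []):
            guid = cluster.get("guid")
            if guid:
                seen[guid].append((
                    az.get("name"),
                    cluster.get("cluster"),
                    cluster.get("resource_pool", ""),
                    cluster.get("host_group", ""),
                ))
    return {guid: found for guid, found in seen.items() if len(found) > 1}


# Placeholder data shaped like the broken state described above.
staged = {
    "az-configuration": [{
        "name": "puff-first-az",
        "clusters": [
            {"cluster": "ops_manager_cluster", "guid": "example-guid",
             "host_group": "", "resource_pool": ""},
            {"cluster": "ops_manager_cluster", "guid": "example-guid",
             "host_group": "", "resource_pool": "puff1"},
        ],
    }]
}

dups = duplicate_cluster_guids(staged)
for guid, found in dups.items():
    print(guid, "is shared by", len(found), "clusters")
```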
Workaround
There is no known workaround, other than using different cluster names (which is likely not possible since these are defined at the vSphere layer and would require vSphere configuration changes). Adding `guid` to the director config YML file does not seem to help, since the code to look up and assign `guid` always runs as part of the `om configure-director` command.
We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.
The labels on this github issue will be updated when the story is started.
Hi @ystros,
There is an existing PR #559 with the change. Could you check whether it fixes your problem?
@ystros Hey Brian,
Did #559 work to resolve this issue for you?