Create MDS unittest for specific scenario
We assume the following scenario can result in bad behavior when running the 'ensure_safety' check for a vdisk (a sketch of a unittest driving this scenario is given right after the list):
- Safety is configured at 2
- Volume V has master on node1, slave on node2
- Master node dies, HA kicks in, volume gets moved by volumedriver to node2
- Volumedriver sends owner_changed event and fwk runs ensure_safety for said volume
- The logging indicated 2 reasons for reconfiguration and an error:
  - Not enough safety
  - Not enough services in use in primary domain
  - Failed to update the metadata backend configuration
- The framework eventually configured node3 as master and node4 as slave, removing node2 from the config altogether
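Since the ticket asks for a unittest, below is a minimal, self-contained sketch of the expected call sequence for this scenario. The fake client, the apply_two_phase_config helper and the nodeX identifiers are illustrative stand-ins that only mirror the try/except quoted further down; they are not the framework's real test fixtures or API.

```python
import unittest


class FakeStorageDriverClient(object):
    """Stand-in for the real storagedriver client: records every config it is asked to apply."""
    def __init__(self):
        self.applied_configs = []

    def update_metadata_backend_config(self, volume_id, metadata_backend_config, req_timeout_secs):
        self.applied_configs.append(list(metadata_backend_config))


def apply_two_phase_config(client, volume_id, configs_no_ex_master, configs_all):
    """Mirrors the try/except quoted below: first the config without the ex-master, then the full config."""
    if len(configs_no_ex_master) != len(configs_all):
        client.update_metadata_backend_config(volume_id=volume_id,
                                               metadata_backend_config=configs_no_ex_master,
                                               req_timeout_secs=5)
    client.update_metadata_backend_config(volume_id=volume_id,
                                          metadata_backend_config=configs_all,
                                          req_timeout_secs=5)


class EnsureSafetyScenarioTest(unittest.TestCase):
    def test_post_ha_master_survives_reconfiguration(self):
        # Scenario: safety 2, node1 died, node2 is the post-HA master; the framework
        # reshuffles to node3/node4, but node2 must remain part of the final config.
        client = FakeStorageDriverClient()
        apply_two_phase_config(client, 'vol-1',
                               configs_no_ex_master=['node3', 'node4'],
                               configs_all=['node3', 'node4', 'node2'])
        self.assertIn('node2', client.applied_configs[-1])


if __name__ == '__main__':
    unittest.main()
```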
The reconfiguration is performed by the following fragment (quoted from MDSServiceController, reformatted here for readability):

```python
try:
    if len(configs_no_ex_master) != len(configs_all):
        # 1st update: push the configuration without the ex-master
        vdisk.storagedriver_client.update_metadata_backend_config(volume_id=str(vdisk.volume_id),
                                                                  metadata_backend_config=MDSMetaDataBackendConfig(configs_no_ex_master),
                                                                  req_timeout_secs=5)
    # 2nd update: push the full configuration (per the analysis below, the one that still contained node2)
    vdisk.storagedriver_client.update_metadata_backend_config(volume_id=str(vdisk.volume_id),
                                                              metadata_backend_config=MDSMetaDataBackendConfig(configs_all),
                                                              req_timeout_secs=5)
except Exception:
    MDSServiceController._logger.exception('MDS safety: vDisk {0}: Failed to update the metadata backend configuration'.format(vdisk.guid))
    raise Exception('MDS configuration for volume {0} with guid {1} could not be changed'.format(vdisk.name, vdisk.guid))
```
We assume that the 1st update_metadata_backend_config call was initiated and timed out after 5 seconds, so the 2nd update, which presumably contained the original master node (node2), was never executed. This cannot be verified from the available logging, however. On the voldrv side we did see that the actual updateMetadataBackendConfig succeeded, but it took 386s to complete.
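Under that assumption, the suspected failure mode could be pinned down in a similar sketch: the first update raises a (simulated) timeout, so the full config that still contains node2 is never sent. Again, all names are illustrative stand-ins, and RuntimeError merely simulates a timeout since the real client's timeout exception type is not visible in the quoted fragment.

```python
import unittest


class TimingOutClient(object):
    """Stand-in client whose first update 'times out' (simulated with RuntimeError)."""
    def __init__(self):
        self.calls = []

    def update_metadata_backend_config(self, volume_id, metadata_backend_config, req_timeout_secs):
        self.calls.append(list(metadata_backend_config))
        if len(self.calls) == 1:
            raise RuntimeError('request timed out after {0}s'.format(req_timeout_secs))


def apply_two_phase_config(client, volume_id, configs_no_ex_master, configs_all):
    # Same shape as the quoted ensure_safety fragment.
    if len(configs_no_ex_master) != len(configs_all):
        client.update_metadata_backend_config(volume_id=volume_id,
                                               metadata_backend_config=configs_no_ex_master,
                                               req_timeout_secs=5)
    client.update_metadata_backend_config(volume_id=volume_id,
                                          metadata_backend_config=configs_all,
                                          req_timeout_secs=5)


class EnsureSafetyTimeoutTest(unittest.TestCase):
    def test_timeout_on_first_update_prevents_second(self):
        client = TimingOutClient()
        with self.assertRaises(RuntimeError):
            apply_two_phase_config(client, 'vol-1',
                                   configs_no_ex_master=['node3', 'node4'],
                                   configs_all=['node3', 'node4', 'node2'])
        # Only the node2-less config was ever sent; if the volumedriver applies it
        # anyway once the slow call finishes (as seen after 386s), node2 is gone.
        self.assertEqual(client.calls, [['node3', 'node4']])


if __name__ == '__main__':
    unittest.main()
```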