Improve error reporting when instance fails to reconcile
Feature and motivation
When running the MAS CLI mas install command, if any existing MAS Suite CR in the cluster is in a problematic state, the installer will fail on Step 3. The error message will be:
3) Review MAS Instances
The following MAS intances are installed on the target cluster and will be affected by the catalog update:
Traceback (most recent call last):
  File "/opt/app-root/bin/mas-cli", line 62, in <module>
    app.update(argv[2:])
  File "/opt/app-root/lib64/python3.9/site-packages/mas/cli/update/app.py", line 94, in update
    self.reviewMASInstance()
  File "/opt/app-root/lib64/python3.9/site-packages/mas/cli/update/app.py", line 225, in reviewMASInstance
    suites = listMasInstances(self.dynamicClient)
  File "/opt/app-root/lib64/python3.9/site-packages/mas/devops/mas.py", line 124, in listMasInstances
    logger.info(f" * {suite['metadata']['name']} v{suite['status']['versions']['reconciled']}")
KeyError: 'reconciled'
This error message does not clearly explain what the problem is or which Suite CR caused it; the KeyError only shows that some Suite's status.versions has no reconciled entry. The proposed enhancement would be for the error message to clearly state which Suite CR needs to be reviewed/fixed before the installation can proceed.
Usage example
This would minimize the time needed to investigate installations that fail with this error, and reduce the number of support cases created by clients who do not understand what the current error means.
> The proposed enhancement would be for the error message to clearly state which Suite CR needs to be reviewed/fixed before the installation can proceed.
Agreed, we should be able to catch instances in an unexpected state and trigger a FatalError with a clear message indicating that we don't recommend (and thus don't support) installing a MAS instance on a cluster that is already unhealthy, and provide details of the MAS instance that's not in a good state.
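
A minimal sketch of how such a check might look, offered as an illustration rather than the actual fix. It assumes each Suite is available as a plain dict, as the subscripting in the traceback suggests. The FatalError class here is a stand-in (the CLI's real error type is not shown in this issue), and reconciled_version and review_mas_instances are hypothetical helper names.

```python
class FatalError(Exception):
    """Stand-in for the CLI's fatal error type (assumption: real class not shown here)."""


def reconciled_version(suite: dict):
    """Return status.versions.reconciled, or None if any level is missing."""
    status = suite.get("status") or {}
    versions = status.get("versions") or {}
    return versions.get("reconciled")


def review_mas_instances(suites: list) -> list:
    """Return one display line per healthy Suite; raise FatalError naming the rest."""
    lines = []
    unreconciled = []
    for suite in suites:
        name = suite.get("metadata", {}).get("name", "<unknown>")
        version = reconciled_version(suite)
        if version is None:
            # Collect every problem instance so the user sees them all at once
            unreconciled.append(name)
        else:
            lines.append(f" * {name} v{version}")
    if unreconciled:
        raise FatalError(
            "The following MAS instances have no reconciled version and must be "
            "reviewed before continuing: " + ", ".join(unreconciled)
        )
    return lines
```

With this approach, a Suite whose status has no versions.reconciled entry produces a message naming that instance instead of a bare KeyError, while healthy instances are still listed in the same " * name vX.Y.Z" format as today.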