bundle-kubeflow
Units get stuck in error state when unsupported relation is joined and then removed
None of the CKF charms handle unsupported relations, which can leave units stuck in an error state after an unsupported relation is joined and then removed. If you accidentally relate two charms that are not supposed to be related, they go into an error state because of the unexpected relation. If you then remove the relation, it is deleted, but the units remain in the error state.
Steps to reproduce
I used mlflow and admission-webhook, but this may be the case for other charms.
1. Deploy Charmed Kubeflow v1.4 in MicroK8s 1.21 and wait for all charms in the bundle to be active.
2. Deploy mlflow:
   juju deploy mlflow
3. Relate mlflow and admission-webhook:
   juju relate mlflow admission-webhook
4. Wait for units to crash:
   juju status mlflow-server admission-webhook
   should show errors.
5. Remove the offending relation:
   juju remove-relation mlflow admission-webhook
6. Units remain in the same error status as in step 4:
   juju status mlflow-server admission-webhook
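
For convenience, the steps above can be collected into one small script. This is only a sketch: it assumes Charmed Kubeflow is already deployed and active, and the `run` helper, the `repro_steps` function, the `DRY_RUN` switch and the 60-second `WAIT_SECONDS` pause are my own additions for illustration, not part of the original reproduction. The juju commands themselves are exactly the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. Assumes Charmed Kubeflow is already
# deployed and all charms are active.

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the commands from the steps above.
repro_steps() {
    run juju deploy mlflow
    run juju relate mlflow admission-webhook
    # Assumed pause so the units have time to hit the error; adjust as needed.
    run sleep "${WAIT_SECONDS:-60}"
    run juju status mlflow-server admission-webhook
    run juju remove-relation mlflow admission-webhook
    # Expected per this report: same error status as before the removal.
    run juju status mlflow-server admission-webhook
}
```

After sourcing the script, `DRY_RUN=1 repro_steps` prints the commands without touching the model, and plain `repro_steps` runs them against the current model.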