bundle-kubeflow

Units get stuck in error state when unsupported relation is joined and then removed

Open DnPlas opened this issue 3 years ago • 0 comments

None of the CKF charms handle unsupported relations, so units can stay in an error state after an unsupported relation is joined and then removed. If you accidentally relate two charms that are not supposed to be related, their units go into an error state because of the unexpected relation. Removing the relation deletes it, but the units remain in the error state.

Steps to reproduce

I used mlflow and admission-webhook, but this may be the case for other charms.

  1. Deploy Charmed Kubeflow v1.4 in MicroK8s 1.21 and wait for all charms in the bundle to be active
  2. Deploy mlflow: juju deploy mlflow
  3. Relate mlflow and admission-webhook: juju relate mlflow admission-webhook
  4. Wait for units to crash: juju status mlflow-server admission-webhook should show errors
  5. Remove the offending relation: juju remove-relation mlflow admission-webhook
  6. Units remain in the same error status as in step 4: juju status mlflow-server admission-webhook
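
For convenience, steps 2 to 6 can be collected into one script. This is only a sketch built from the commands listed above: it assumes step 1 is already done (Charmed Kubeflow v1.4 deployed and active). The `run` and `reproduce` helpers and the `DRY_RUN` variable are hypothetical names introduced here, not part of Juju. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Reproduction sketch for steps 2-6; the juju commands are copied from the steps above.
# Assumes Charmed Kubeflow v1.4 is already deployed and active (step 1).
set -u

# DRY_RUN is a hypothetical helper variable (not a Juju feature):
# when it is 1 (the default), commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reproduce() {
  # Step 2: deploy mlflow
  run juju deploy mlflow
  # Step 3: add the unsupported relation
  run juju relate mlflow admission-webhook
  # Step 4: units are expected to show errors here
  run juju status mlflow-server admission-webhook
  # Step 5: remove the offending relation
  run juju remove-relation mlflow admission-webhook
  # Step 6: units are expected to still show the same errors
  run juju status mlflow-server admission-webhook
}

reproduce
```

To run the commands for real, set `DRY_RUN=0` in the environment. The script does not wait between steps, so in a real run you may need to repeat the `juju status` calls until the units settle.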

DnPlas avatar Feb 04 '22 22:02 DnPlas