
Workspaces can get stuck in failed state if DevWorkspaceRouting cannot be processed

Open amisevsk opened this issue 3 years ago • 0 comments

Describe the bug

Due to a bug [1] in the DevWorkspace Operator, it's possible for workspaces to get stuck in a failed state if the Che Operator encounters a temporary issue in reconciling DevWorkspaceRoutings. Once a DevWorkspaceRouting has entered the failed state, subsequent reconciles exit early and cannot clear the failed status.

This issue is for tracking in the Che repo; the fix will have to come in the DevWorkspace Operator. There are workarounds listed in the DWO issue.

[1] - https://github.com/devfile/devworkspace-operator/issues/923

Che version

next (development version)

Steps to reproduce

  1. Install Che as normal
  2. Create a second CheCluster in another namespace. This causes all workspace starts to fail with the following error:
    Unable to provision networking for DevWorkspace: workspace routing is invalid: the routing does not specify any Che manager in its configuration but there are 2 Che managers in the cluster
    
  3. Create a DevWorkspace and wait for it to enter the failed state
  4. Remove the second CheCluster from step 2.
  5. New workspaces or workspaces that didn't enter the failed state due to the second CheCluster can be started as normal, but any workspaces that failed cannot be started.

Expected behavior

Failed status should be cleared when a workspace is restarted.

Runtime

Kubernetes (vanilla)

Screenshots

No response

Installation method

other (please specify in additional context)

Environment

Linux

Eclipse Che Logs

No response

Additional context

No response

amisevsk, Sep 10 '22 01:09