
Conductor workflow stalled after a sub-workflow

Open rajeshwar-nu opened this issue 1 year ago • 3 comments

Hi Team, I am experiencing the issue described in https://github.com/Netflix/conductor/issues/3491 in the latest version of Conductor.

Stack

  1. orkesio/orkes-conductor-community:1.1.11
  2. Redis for workflow execution - docker.io/bitnami/redis:7.0.8-debian-11-r0
  3. Postgres for workflow persistence - ghcr.io/cloudnative-pg/postgresql:15.3

Description of issue

A workflow gets stuck in the RUNNING state right after a SUBWORKFLOW completes. We observed this in multiple workflows, all of which contain a sub-workflow. The issue is intermittent: it happens for only a few executions.

I have attached 3 images, one for each of 3 sample failures:

[Screenshots: workflow1, workflow2, workflow3]

The problem is fixed when we pause and resume the workflow, after which it completes normally.
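For anyone who wants to script the pause/resume workaround, here is a minimal sketch. It assumes the Conductor server is reachable at `http://localhost:8080` and exposes the `PUT /api/workflow/{workflowId}/pause` and `PUT /api/workflow/{workflowId}/resume` endpoints; the base URL and the helper names `workflow_action_url` and `pause_and_resume` are illustrative, not taken from our setup.

```python
import urllib.request

# Assumption: Conductor's REST API base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080/api"


def workflow_action_url(workflow_id: str, action: str, base_url: str = BASE_URL) -> str:
    """Build the URL for a pause or resume call on one workflow execution."""
    if action not in ("pause", "resume"):
        raise ValueError(f"unsupported action: {action}")
    return f"{base_url.rstrip('/')}/workflow/{workflow_id}/{action}"


def pause_and_resume(workflow_id: str, base_url: str = BASE_URL) -> None:
    """Pause, then resume, a stuck workflow. Both calls are PUT requests."""
    for action in ("pause", "resume"):
        req = urllib.request.Request(
            workflow_action_url(workflow_id, action, base_url), method="PUT"
        )
        # urlopen raises urllib.error.HTTPError on a 4xx/5xx response.
        with urllib.request.urlopen(req, timeout=10) as resp:
            print(action, resp.status)
```

Calling `pause_and_resume("<workflowId>")` with the ID of the stuck parent workflow would issue the two requests in order. This only automates the workaround; it does not address why the workflow stalls.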

Slack Message

rajeshwar-nu • Dec 07 '23 10:12

Hi @rajeshwar-nu, are these sub-workflows retried or restarted?

manan164 • Dec 07 '23 12:12

Hey @manan164, no, they are not.

rajeshwar-nu • Dec 07 '23 14:12

@rajeshwar-nu do they have a double underscore in the name?

appunni-old • Dec 12 '23 16:12