conductor
DO_WHILE loop does not restart after WAIT task update
Describe the bug
After a task inside a DO_WHILE loop is updated, it takes a long time (2+ minutes) for the next loop iteration to start.
Details
Conductor version:
Persistence implementation: MySQL
Queue implementation: MySQL
Detailed issue description
We have the following setup:
Main WF
  task A
  nested WF
    Loop
      some task A
      some task B
      WAIT task
This WAIT task is being updated via the API, and gets completed almost instantly.
But some task A doesn't start for a while. You can see there's about a 2 minute delay, sometimes more.
During my debugging, I noticed that the loop does not restart until the next sweep cycle. Since we have only 2 sweeper threads, the MySQL queue implementation limits itself to selecting only 2 messages per query, so this takes some time.
Now, my question is - judging by the code, am I correct to assume that the workflow should actually restart immediately?
My task is indeed a loop task
Task: TaskModeltaskType='WAIT', status=IN_PROGRESS, inputData=until=2023-08-18 20:59,
referenceTaskName='delay_wait_task__16', retryCount=0, seq=117 ... **iteration=16**, subWorkflowId='null',
subworkflowChanged=false belonging to Workflow generic_workflow.1/88001d3b-6398-424f-b556-6dae8849919d.RUNNING
being updated[spanId=c256afeb36cb6fe0, traceId=c256afeb36cb6fe0]
Am I missing something here?
This should really be marked as help_wanted instead of bug
I was able to figure it out. It turns out Conductor will update/complete the task regardless of which workflowInstanceId you pass as a parameter, as long as that workflow exists. My mistake was that I was passing the Main WF instance id instead of the nested WF instance id. If I pass the nested one, everything executes as expected.
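For anyone hitting the same thing, here is a minimal sketch of the update call with the nested workflow's id. It assumes the task is completed through the `POST /api/tasks` endpoint with a TaskResult-style body; the base URL, the helper names, and the ids are placeholders I made up for illustration, not values from our setup.

```python
import json
import urllib.request

# Assumption: a Conductor server reachable at this base URL.
CONDUCTOR_API = "http://localhost:8080/api"


def build_task_result(workflow_instance_id, task_id, output=None):
    """Build a TaskResult-style payload that marks a task as COMPLETED.

    workflow_instance_id must be the id of the workflow that directly
    owns the task (here: the nested WF, not the Main WF).
    """
    return {
        "workflowInstanceId": workflow_instance_id,
        "taskId": task_id,
        "status": "COMPLETED",
        "outputData": output or {},
    }


def complete_wait_task(nested_workflow_id, task_id, base_url=CONDUCTOR_API):
    """POST the task update to the server (hypothetical helper)."""
    payload = build_task_result(nested_workflow_id, task_id)
    request = urllib.request.Request(
        f"{base_url}/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The only thing that changed on my side was the value passed as `workflowInstanceId`: the nested WF instance id instead of the Main WF one.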
I don't know if there is any kind of use case where this would be necessary, but this seems like a missing validation issue?
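To illustrate the kind of check I mean, here is a rough sketch. It is not Conductor code: the function name and its arguments are hypothetical, and it only shows the idea of rejecting an update whose workflowInstanceId does not match the workflow that owns the task.

```python
def validate_task_update(task_workflow_id, requested_workflow_id):
    """Reject an update whose workflow id does not own the task.

    task_workflow_id: id of the workflow the stored task belongs to.
    requested_workflow_id: id sent by the caller in the update request.
    """
    if task_workflow_id != requested_workflow_id:
        raise ValueError(
            f"Task belongs to workflow {task_workflow_id}, "
            f"but the update was sent for workflow {requested_workflow_id}"
        )
```

With a check like this, passing the Main WF instance id for a task owned by the nested WF would fail immediately instead of leaving the loop waiting for the next sweep.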