conductor
Human task is scheduled twice even if I set redis lock
Describe the bug I set up a workflow with several simple tasks and one human task. Several workers poll and update the simple tasks, and one worker updates the human task. However, I found that the human task is scheduled twice after the preceding simple task finishes.
Details
- Conductor version: 3.16.0
- Persistence implementation: Postgres
- Queue implementation: Redis
- Lock: Redis
- Workflow definition: see attachment workflow_definition.json
- Task definition: see attachment task_definition.json tasks_definition.json
Below is the result returned by api/workflow/8f4d7300-5dd8-42dd-a58b-aadbc68db157?includeTasks=true
Below is the properties we use: conductor-config.properties.log
Below is the env setup:
1) Redis (1 primary + 1 replica): AWS ElastiCache, cache.t4g.micro
2) PostgreSQL (1 writer + 1 read-only): AWS Aurora RDS, db.t4g.medium
3) Conductor service (2 pods/replicas): AWS EC2, m7a.xlarge
To Reproduce Steps to reproduce the behavior:
- Create the workflow definition.
- Create the task definitions.
- Start 50 workers for each of the simple tasks, with a poll interval of 200 ms.
- Start one process that completes the human task whenever a workflow instance has a human task in progress. Once a worker updates task "saveDbWithWorkflowDummy_0" with a finished status, it inserts the workflow instance into a local in-memory table. A separate process checks the table every 200 ms, takes each instance ID, and checks whether a human task has been scheduled for it. If so, it updates the human task with COMPLETED status and deletes the instance from the table; otherwise it waits for the next poll.
- Start 2000 workflow instances. (If the issue does not reproduce, wait for them to finish and start another 2000; it usually reproduces within 4 rounds.)
- See the duplicate human tasks in the screenshot.
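For readers trying to reproduce this, the checker process from the steps above could look roughly like the sketch below. It is not the reporter's actual code: the function names and the two injected callables (`find_human_task_id`, `complete_task`) are hypothetical stand-ins for whatever calls the Conductor API in your setup.

```python
import time
from typing import Callable, List, Optional, Set


def complete_pending_human_tasks(
    pending: Set[str],
    find_human_task_id: Callable[[str], Optional[str]],
    complete_task: Callable[[str, str], None],
) -> List[str]:
    """Run one pass over the in-memory table of workflow instance IDs.

    Both callables are hypothetical: find_human_task_id returns the ID of a
    scheduled human task (or None), and complete_task marks it COMPLETED.
    Returns the workflow IDs that were handled in this pass.
    """
    handled = []
    for workflow_id in list(pending):  # copy, because the set is mutated below
        task_id = find_human_task_id(workflow_id)
        if task_id is None:
            continue  # human task not scheduled yet; retry on the next pass
        complete_task(workflow_id, task_id)
        pending.discard(workflow_id)
        handled.append(workflow_id)
    return handled


def run_checker(pending, find_human_task_id, complete_task,
                interval_s: float = 0.2, max_passes: Optional[int] = None) -> None:
    """Repeat the pass every interval_s seconds (200 ms, as in the steps above)."""
    passes = 0
    while max_passes is None or passes < max_passes:
        complete_pending_human_tasks(pending, find_human_task_id, complete_task)
        passes += 1
        time.sleep(interval_s)
```

Note that a checker like this completes only the first human task it finds per instance, so a second, duplicate human task would stay in progress.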
Expected behavior Human task is only scheduled once
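One way to check this expectation automatically is to scan the JSON returned by api/workflow/<instance_id>?includeTasks=true for human tasks that share a reference name. The helper below is only a sketch: the function name is made up, and it assumes the human task is neither retried nor inside a loop, since both would legitimately produce several entries with the same reference name.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_human_tasks(workflow: dict) -> Dict[str, List[str]]:
    """Map each HUMAN task reference name to its task IDs when scheduled more than once.

    `workflow` is the parsed workflow JSON with its "tasks" list included.
    Assumption: no retries or loops apply to the human task.
    """
    by_ref = defaultdict(list)
    for task in workflow.get("tasks", []):
        if task.get("taskType") == "HUMAN":
            by_ref[task.get("referenceTaskName")].append(task.get("taskId"))
    # Keep only reference names that appear more than once
    return {ref: ids for ref, ids in by_ref.items() if len(ids) > 1}
```

An empty result means each human task was scheduled once; a non-empty result lists the task IDs to inspect.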
Screenshots
@JenniferZh90, at least the human task in your workflow seems to be working? In my workflow, the human task gets stuck and cannot be updated to finished. Could you share how you update the human task to finish it? Thank you!
@Dyson-Ido, I first get the human task ID, then update it with COMPLETED status: POST "http://localhost:8080/api/tasks" with body: { "taskId": <task_id>, "workflowInstanceId": <instance_id>, "status": "COMPLETED", "outputData": <output_data> }
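As a rough illustration of that request, here is a minimal sketch using only the Python standard library. The function name and the base URL argument are placeholders; the endpoint and the body fields are the ones quoted in the comment above.

```python
import json
import urllib.request


def build_complete_task_request(base_url: str, task_id: str,
                                workflow_instance_id: str,
                                output_data: dict = None) -> urllib.request.Request:
    """Build (but do not send) the POST /api/tasks request that completes a task."""
    payload = {
        "taskId": task_id,
        "workflowInstanceId": workflow_instance_id,
        "status": "COMPLETED",
        "outputData": output_data or {},
    }
    return urllib.request.Request(
        url=f"{base_url}/api/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it against a running server (not executed here):
# urllib.request.urlopen(build_complete_task_request(
#     "http://localhost:8080", "<task_id>", "<instance_id>"))
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the server.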
@JenniferZh90, is it actually updated to COMPLETED status? I mean, is the human task completed?
@Dyson-Ido Yes. All tasks are eventually marked "COMPLETED" once they finish.
@v1r3n I was able to reproduce this issue, and it seems to be a bug in the system. With conductor client v3.9 it occurred a few times under a load of 2-4k, while with conductor client v3.19 (with batch poll & execute using CompletableFuture) it is very evident. This hangs the entire system if your next workflow trigger is waiting for this workflow to complete.
The workflow is also not marked as completed after a pause and resume, because one of the task instances is still in progress.
Have a work around to handle this.