Task completion event lost
Describe the bug
We are facing an issue where a Conductor task remains IN_PROGRESS. The task runs inside a do-while loop along with other tasks. The tasks in the loop run in this order: UploadPrepare -> Upload_collectItem_Output -> Upload_item_start -> Upload -> Upload_item_end
In the attached screenshot, for iteration 135, Upload_item_start__135 is IN_PROGRESS even though we had already marked it as COMPLETED. The completion triggered the next task of the same iteration, Upload__135, which is also COMPLETED. This looks like a lost update. As a result, the workflow never completes.
Details
- Conductor version: 3.18
- Persistence implementation: Postgres
- Queue implementation: Dynoqueues
- Lock: Redis
- Workflow definition:
- Task definition:
- Event handler definition:
Expected behavior
The task and the workflow should have completed.
Screenshots
Hi @ravig-kant what database backend are you using?
We are using Postgres as the backend, @v1r3n.
This is not a race condition within the persistence engine being used, but rather one in the general design. In this example, the task emits a Kafka message, and the response that marks the task as COMPLETED arrives before the task has been marked as IN_PROGRESS. The remaining code on the original thread then runs, marks the task as IN_PROGRESS, and so moves it from COMPLETED back to IN_PROGRESS.
This behaviour would be the same with any persistence engine. It can only be fixed if the update logic itself handles this case, potentially through conditional updates.
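To make the ordering concrete, here is a minimal sketch that replays it against an in-memory store. This is not Conductor's code: `TaskStore`, `update_unconditional`, `update_guarded`, and `replay` are hypothetical names, and the set of terminal statuses is an assumption made for illustration.

```python
# Assumed set of terminal statuses, for illustration only.
TERMINAL = {"COMPLETED", "FAILED", "CANCELED"}


class TaskStore:
    """Hypothetical in-memory stand-in for a task persistence layer."""

    def __init__(self):
        self.status = {}

    def update_unconditional(self, task_id, new_status):
        # Last writer wins: a late IN_PROGRESS write overwrites COMPLETED.
        self.status[task_id] = new_status

    def update_guarded(self, task_id, new_status):
        # Refuse to move a task out of a terminal status.
        if self.status.get(task_id) in TERMINAL:
            return False
        self.status[task_id] = new_status
        return True


def replay(update):
    """Replay the ordering from the comment above: completion lands first."""
    task_id = "Upload_item_start__135"
    update(task_id, "SCHEDULED")
    update(task_id, "COMPLETED")    # fast callback arrives first
    update(task_id, "IN_PROGRESS")  # original thread resumes late


lossy = TaskStore()
replay(lossy.update_unconditional)
print(lossy.status)    # the task ends up IN_PROGRESS

guarded = TaskStore()
replay(guarded.update_guarded)
print(guarded.status)  # the task stays COMPLETED
```

The guard here is a plain read followed by a write, so it only illustrates the rule. With concurrent writers, the check and the write would need to happen atomically, for example inside a single conditional update in the database.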
👋 Hi @ravig-kant @aradu-atlassian
We're currently reviewing open issues in the Conductor OSS backlog, and noticed that this issue hasn't been addressed.
To help us keep the backlog focused and actionable, we’d love your input:
- Is this issue still relevant?
- Has the problem been resolved in the latest version v3.21.12?
- Do you have any additional context or updates to provide?
If we don’t hear back in the next 14 days, we’ll assume this issue is no longer active and will close it for housekeeping. Of course, if it's still a valid issue, just let us know and we’ll keep it open!
Thanks for contributing to Conductor OSS! We appreciate your support. 🙌
Jeff Bull
Developer Community Manager | Orkes
Hi @jeffbulltech
This issue still exists. As a workaround, we duplicated the updateTask code in PostgresExecutionDAO and added a conditional status check to the update query, so that a task only moves to IN_PROGRESS from a valid status.
However, ideally this should be resolved in OSS itself.
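For readers who want to see what a conditional status check in an update query can look like, here is a rough sketch. It is not the patched PostgresExecutionDAO query, which is not shown in this thread. The `task` table, its columns, the helper names, and the choice of valid source statuses are all assumptions, and SQLite is used only so the example runs without a Postgres server.

```python
import sqlite3

# Hypothetical schema, not Conductor's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (task_id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO task VALUES ('Upload_item_start__135', 'SCHEDULED')")
conn.commit()


def mark_in_progress(conn, task_id):
    """Move to IN_PROGRESS only if the task is still SCHEDULED (assumed rule)."""
    cur = conn.execute(
        "UPDATE task SET status = 'IN_PROGRESS' "
        "WHERE task_id = ? AND status = 'SCHEDULED'",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # False means the row was not in a valid status


def mark_completed(conn, task_id):
    """Move to COMPLETED from any non-terminal status used in this sketch."""
    cur = conn.execute(
        "UPDATE task SET status = 'COMPLETED' "
        "WHERE task_id = ? AND status IN ('SCHEDULED', 'IN_PROGRESS')",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1


task_id = "Upload_item_start__135"
completed = mark_completed(conn, task_id)  # completion arrives first
late = mark_in_progress(conn, task_id)     # late write matches no row
final_status = conn.execute(
    "SELECT status FROM task WHERE task_id = ?", (task_id,)
).fetchone()[0]
print(completed, late, final_status)
```

Because the status check is part of the `WHERE` clause, the check and the write happen in one statement, and the caller can tell from the affected row count whether the transition was applied.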