flow-framework
[DISCUSS] How should we handle a successful deprovision with a failure in the workflow state update?
Coming from https://github.com/opensearch-project/flow-framework/pull/689#discussion_r1585173319
When deprovisioning, successfully deleting all resources gives a successful response to the user.
After the resources are deleted, the state document is either reset to NOT_STARTED (if the template exists) or deleted (if the template doesn't exist). An unexpected failure is possible in that step. Presently, such a state index failure is only logged.
What solution would you like?
Keep this status quo, as the failures are extremely unlikely to occur, and can be corrected with another deprovisioning.
What alternatives have you considered?
Waiting to return a result to the user until the state document is updated, and returning a more verbose error message of the form "resources were successfully deprovisioned, but the workflow state update failed, try deprovisioning again." A repeated deprovisioning may or may not succeed, and the message may be confusing.
Writing REST integration tests for this was a challenge, because the REST request returned successfully while the state deletion was still processing. I'm leaning heavily toward waiting to return to the user until the state document update or deletion has either completed or failed. The user can still get a verbose response describing the problem.
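To make the "wait before responding" option concrete, here is a minimal, language-neutral sketch in Python. It is not the plugin's actual code: `DeprovisionResponse`, `deprovision`, `delete_resources`, and `update_state` are hypothetical names chosen for illustration, and the two callables stand in for the resource deletion and the state document reset or deletion.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DeprovisionResponse:
    """Hypothetical response shape; not the plugin's real response type."""
    resources_deleted: bool
    state_updated: bool
    message: str


def deprovision(delete_resources: Callable[[], None],
                update_state: Callable[[], None]) -> DeprovisionResponse:
    # Step 1: delete the resources. A failure here propagates to the
    # caller unchanged; this sketch only covers the state-update case.
    delete_resources()

    # Step 2: wait for the state document reset/deletion to finish
    # before building the response, instead of only logging a failure.
    try:
        update_state()
    except Exception as exc:
        return DeprovisionResponse(
            resources_deleted=True,
            state_updated=False,
            message=(
                "Resources were successfully deprovisioned, but the "
                f"workflow state update failed ({exc}). "
                "Try deprovisioning again."
            ),
        )

    return DeprovisionResponse(
        resources_deleted=True,
        state_updated=True,
        message="Resources were successfully deprovisioned.",
    )
```

Because the response is only built after the state update finishes or fails, a test that issues the request and then immediately inspects the state document no longer races with the update.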