workflow-core icon indicating copy to clipboard operation
workflow-core copied to clipboard

Idempotency when creating workflows

Open creyke opened this issue 3 years ago • 16 comments

Is your feature request related to a problem? Please describe. As a client of an IWorkflowHost host I can start a workflow instance with a reference, which could be my idempotency key, but I cannot check on the status of this workflow instance unless I successfully receive the workflow id back. I am also able to start more than one workflow instance with the same reference. This means that in some cases I am unable to validate that my initial request was unsuccessful and I risk starting multiple workflows where I only intended to start one.

Describe the solution you'd like The ability to define the workflow id as a client to an idempotent implementation of IWorkflowHost.StartWorkflow(). In the event this workflow already exists, this call would not kick off another workflow.

Describe alternatives you've considered Returning the workflow id, although this doesn't solve the idempotency issues.

Additional context N/A

creyke avatar May 06 '21 15:05 creyke

Not sure this is something the workflow engine should take responsibility for, have you considered implementing idempotency checks in the calling process?

danielgerlag avatar May 08 '21 18:05 danielgerlag

Hi Daniel thank you for the quick response. The Web API in front of the ‘IWorkflowHost’ is already idempotent but this will not guarantee at most once instancing of a workflow, as if the process is terminated before the response is received by the client of the API, the client must assume the request was not received and duplicate the request.

creyke avatar May 09 '21 09:05 creyke

I need to prevent multiple active instances of some workflows. To achieve this I didn't find anything better than the following: public bool IsWorkflowActivated(string workflowId) { var runnableInstances = _workflowRepository.GetRunnableInstances(DateTime.Now); var workflowInstances = _workflowRepository.GetWorkflowInstances(runnableInstances); return workflowInstances.Any(wi => wi.WorkflowDefinitionId == workflowId); } My understanding is that this code makes huge load on the persistence tier and CPU as well (as long as it jsondeserializes all the instances).

IWorkflowRepository used to provide another method, but it is marked as obsolete now. [Obsolete] Task<IEnumerable<WorkflowInstance>> GetWorkflowInstances(WorkflowStatus? status, string type, DateTime? createdFrom, DateTime? createdTo, int skip, int take);

Is there any better approach (better than GetWorkflowInstances as well, because deserialization of data is redundant here)? Or in case of 'no' maybe it's worth considering adding special method to API? Or IQueriable?

vladimir-kovalyuk avatar May 12 '21 10:05 vladimir-kovalyuk

You could store the workflow instance ID and your reference in your own storage and check against that?

danielgerlag avatar May 12 '21 14:05 danielgerlag

Would that not be a two phase commit? Paradoxically that would arguably require a workflow to orchestrate. Would it be possible to supply the workflow id when creating it, and then it cannot be created twice as storage will be the single source of truth?

On Wed, 12 May 2021 at 15:52, Daniel Gerlag @.***> wrote:

You could store the workflow instance ID and your reference in your own storage and check against that?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/danielgerlag/workflow-core/issues/828#issuecomment-839837490, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANY3DIUIMY4EHBNRM6BQUDTNKI4JANCNFSM44HJWLNA .

creyke avatar May 12 '21 15:05 creyke

You could store the workflow instance ID and your reference in your own storage and check against that?

Why? IWorkflowRepository already provides a solution. The problem is that it is just not efficient and marked as obsolete w/o providing an alternative. A parameter that prevents Json conversion would be enough to make it way more performant (it is like requesting messages w/o attachments to render the inbox). And packing parameters into a class would help with open-closed principle.

And BTW what method you suggest to use to build up UI that renders workflow instances? It does not require workflow data be deserialized.

vladimir-kovalyuk avatar May 13 '21 05:05 vladimir-kovalyuk

Would that not be a two phase commit?

That depends on how you implement it... there are many solutions to this problem. That being said, it might be useful to implement a method to retrieve a workflow by the reference field, but I don't think building in the idempotency functionality should be in the scope of this library, rather give the tools to the consumer of the library to achieve that. Ideally, IWorkflowRepository should remain an internal concern, and we expose the functionality via WorkflowController

danielgerlag avatar May 15 '21 21:05 danielgerlag

I am looking into something similar. I want to be able to skip steps that were successfully executed..

ekjuanrejon avatar Jun 26 '21 08:06 ekjuanrejon

I have faced the same issue, so I have created a PR that implements idempotency for workflows creation, see https://github.com/danielgerlag/workflow-core/pull/921.

SergiiKram avatar Sep 22 '21 13:09 SergiiKram

I have faced the same issue, so I have created a PR that implements idempotency for workflows creation, see #921.

First off I agree with @danielgerlag concerning tracking active workflow instances separately. As long as we had to create sophisticated mechanism of activating workflows we had to add persistence for tracking active workflows and I'm pretty sure each solution would have its own mechanism for this which wouldn't suit the other needs. Second off workflow does not have definition of idempotency on its own and the possible definition would depend on the context it is used in. For example different workflows may have multiple instances within time slot, extent of the entities, or may be prohibited at all. Correlation is smth different in this regard and I personally don't see the relation between idempotency and correlation as notions. In my world correlation has to do with multiple entities and because of this I see a great mismatch. Third off I would like to continue debugging WFC based solutions w/o breaking at WorklfowExistsException. I don't like when library rudely intervenes into my normal execution flow by throwing exception as an execution control.

vladimir-kovalyuk avatar Sep 23 '21 17:09 vladimir-kovalyuk

tracking active workflow instances separately

The PR solves a different problem and does not encourage using WFC as a business/domain-specific job tracking etc.

For the context - imagine you have a REST API that client code invokes to create a workflow instance. After the workflow instance is created many things might go wrong, e.g. network connectivity drops, client crashes before recording the workflow instance id, the WCF crashes before sending the response.

Now a client would need to retry a call because it does not know whether the workflow instance was created and what is the id so it can track it. Current implementation would simply create a duplicate instance. With the correlation id generated at a client, the retry would just return an already created workflow instance id.

The same problem exists when you start a workflow in response to a message or an event - you might start workflow but fail to publish the id or fail to mark the message as completed and it would become visible in the queue again.

So a well-architected distributed system requires idempotency implemented to be resilient and free from side effects.

I personally don't see the relation between idempotency and correlation as notions

I see what you are saying, I thought for quite a while for a good name here :) I ended up with the Correlation Id because I saw that name used in other workflows/orchestrators solutions. I'm open to good suggestions here, e.g. we can use Idempotency Key if that would be easier to grasp.

throwing exception as an execution control

I understand your point and sometimes I also feel pain from unnecessary exceptions.

That said, as a library user you would not see the WorklfowExistsException - it is thrown by the persistence provider and then caught by WorkflowController and the id of the already created workflow instance is returned to your code.

In this case, the exception approach was chosen for the next reasons:

  • It's a cold path, it should be really exceptional situation.
  • Single Responsibility Principle - I want the persistence layer to do just one job - persist new workflow or fail to do it and then let the Workflow Controller decide what to do with it.
  • Common approach - in any persistence provider I've dealt with, the constraints violations throw an exception and in our case, we also catch the exception from the underlying provider.

SergiiKram avatar Sep 24 '21 08:09 SergiiKram

That said, as a library user you would not see the WorklfowExistsException - it is thrown by the persistence provider and then caught by WorkflowController and the id of the already created workflow instance is returned to your code.

Of course I would! I catch all exceptions in debugger so it would just break on it. People use TrySmth methods for achieving the result without throwing exceptions.

vladimir-kovalyuk avatar Sep 24 '21 13:09 vladimir-kovalyuk

For the context - imagine you have a REST API that client code invokes to create a workflow instance. After the workflow instance is created many things might go wrong, e.g. network connectivity drops, client crashes before recording the workflow instance id, the WCF crashes before sending the response.

So this PR implies that in distributed system multiple nodes that try to activate workflow would generate the same idempotency id? Let's think about how we would achieve that. For instance, I could build a compound key for the context the workflow is starting in. For instance the process is working around an entity instance in some business system, so I could build $"{entityName}-{entityId}". In this case multiple nodes would generate the same idempotency id without necessity to collaborate to achieve the result. But what if the job the workflow performs is idempotent on its own? What if we need to repeat the step on the same entity and this is essential part of real process? In this case I wouldn't be able to create ANOTHER instance of this workflow once the first completed. From which I conclude that UNIQUENESS constraint is too strict for the implementation within WFC.

And I still think that the task is easy to achieve without making changes in WorkflowCore. Just create entity for tracking activation via API controller and do the same with this entity what you've done in your PR. I've done it. It is ez.

vladimir-kovalyuk avatar Sep 24 '21 14:09 vladimir-kovalyuk

Of course I would! I catch all exceptions in debugger so it would just break on it. People use TrySmth methods for achieving the result without throwing exceptions.

Is it a correct assumption that you are debugging your app that uses WCF nuget and not the WCF itself? If so then you would not see any exceptions unless you enabled the First Chance Exceptions. Even more, if I remove the WorkflowExistsException the DbUpdateException exception still would be thrown by the EF.

People use TrySmth methods for achieving the result without throwing exceptions.

True, it depends on the expectations though, how often it would fail. E.g. DbContext.SaveChangesAsync throws exceptions though there are clearly expected failures, just the rate is low and failures a very different, and TryMethoid does not give us details of what exactly went wrong.

That said, if repository owners would prefer to use the TryCreateNewWorkflow method instead I would rename and change the behavior. I obviously don't know the strategy and conventions here.

SergiiKram avatar Sep 24 '21 15:09 SergiiKram

this PR implies that in distributed system multiple nodes that try to activate workflow would generate the same idempotency id?

No, each node would generate a unique idempotency key for each individual call but would use the same key to retry workflow creation.

But what if the job the workflow performs is idempotent on its own?

That's would be a very lucky scenario, just do not provide an idempotency key if the workflow itself is idempotent and you are fine with duplicates.

And I still think that the task is easy to achieve without making changes in WorkflowCore. Just create entity for tracking activation via API controller and do the same with this entity what you've done in your PR. I've done it. It is ez.

The problem I see is that workflow instance is created in one database transaction and your tracking entity would record the workflow id in a different transaction.

Now, what if the process crashes or forcibly terminates after the first transaction but before the second transaction is committed? You would end up with duplicate workflows after retry.

It is a very critical issue for some systems/scenarios while it is an acceptable risk for others.

SergiiKram avatar Sep 24 '21 15:09 SergiiKram

Quick update: I've created a nuget for PostgreSQL with Reference as an idempotency key. See https://github.com/SergiiKram/workflow-core#idempotent-workflow-core

The diff SergiiKram@b1b54c9 I'm closing the PR mentioned above until better times.

Thanks, guys!

SergiiKram avatar Oct 04 '21 12:10 SergiiKram