elsa-core
elsa-core copied to clipboard
Retention Module Job could cause timeout when deleting large data (or small db)
Issue detected using SQL Azure Standard S1: 20 DTUs.
I've dig into the retention module and see that no data was really deleted from the Db. After going step by step , I've seen different things :
https://github.com/elsa-workflows/elsa-core/blob/e4bab8810afc201cd7b474fccb1e0757e448d528/src/modules/retention/Elsa.Retention/Jobs/CleanupJob.cs#L52-L69
Record are paged for retrieval but they will be deleted in batch. This can cause timeout when deleting with silent error, and a retry only after the SweepInterval

Following the code :
await _workflowInstanceStore.DeleteManyAsync(specification, cancellationToken);
goes through (EventPublishingWorkflowInstanceStore.cs)
public async Task<int> DeleteManyAsync(ISpecification<WorkflowInstance> specification, CancellationToken cancellationToken = default)
{
var instances = await FindManyAsync(specification, cancellationToken: cancellationToken).ToList();
var count = await _store.DeleteManyAsync(specification, cancellationToken);
if (instances.Any())
{
foreach (var instance in instances)
await _mediator.Publish(new WorkflowInstanceDeleted(instance), cancellationToken);
await _mediator.Publish(new ManyWorkflowInstancesDeleted(instances), cancellationToken);
}
return count;
}
Then in EntityFrameworkWorkflowInstanceStore.cs, a new time FindManyAsync
public override async Task<int> DeleteManyAsync(ISpecification<WorkflowInstance> specification, CancellationToken cancellationToken = default)
{
var workflowInstances = (await FindManyAsync(specification, cancellationToken: cancellationToken)).ToList();
var workflowInstanceIds = workflowInstances.Select(x => x.Id).ToArray();
await DeleteManyByIdsAsync(workflowInstanceIds, cancellationToken);
return workflowInstances.Count;
}
to finaly DeleteManyByIdsAsync with dependencies.
I think we loose the effort of using the pagination and I think we should delete records for each page instead of using page only for first retrieval
We can also change the way we select data, from older to newer.
https://github.com/elsa-workflows/elsa-core/blob/e4bab8810afc201cd7b474fccb1e0757e448d528/src/modules/retention/Elsa.Retention/Jobs/CleanupJob.cs#L49
to
var orderBy = new OrderBy<WorkflowInstance>(x => x.CreatedAt, SortDirection.asc);
@jdevillard Closing this since your PR has been merged. Thank you for the improvements!