elsa-core
elsa-core copied to clipboard
Retention Module
I've ported the retention module from Elsa V2 to V3.
Usage is similar to before:
elsa
.UseRetention(options =>
{
options.ConfigureCleanupOptions = cleanUp =>
{
cleanUp.TimeToLive = TimeSpan.FromDays(7);
cleanUp.SweepInterval = TimeSpan.FromHours(12);
cleanUp.WorkflowInstanceFilter = new WorkflowInstanceFilter
{
WorkflowStatus = WorkflowStatus.Finished
};
};
})
To be able to change the configured WorkflowInstanceFilter to add the time to live filter, I also made WorkflowInstanceFilter
implement IClonable
to make a deep clone of the filter.
Hello,
thanks for this PR , I think we need to check some elements because the different persistence stores are not managed in the same way as in the v2.
For example, I don't think the WorkflowInstanceStore delete all the dependency logs or events around the instance.
A WorkflowInstanceManager was introduced and used for eg by the Rest Api to delete Instance. This Manager trigger event that are then handle by other services to remove dependency.
Ah, I missed that! I've changed the job to use the IWorkflowInstanceManager
. Now It's similar to how it's done in the WorkflowInstance delete endpoint.
I wonder if there is a better way to combine 2 workflow instance filters, maybe using ICloneable
interface is not the best way.
Hello @Sverre-W,
I do have a same requirement of usage of Retention Module is Elsa 3.0. So can I contribute if there is any pending activity to complete this task or do you have any tentative plan to merge this PR into master branch.
Thank You!
@jayachandra21 I've updated it to be current with the main branch.
I also moved the cloning of the WorkflowInstanceFilter
to an extension method inside the module, so no code changes outside the Retention module. I think It's ready to be merged now.
@jayachandra21 I've updated it to be current with the main branch.
I also moved the cloning of the
WorkflowInstanceFilter
to an extension method inside the module, so no code changes outside the Retention module. I think It's ready to be merged now.
Thank You very much! is it merged already with Master branch?
@jayachandra21 I've updated it to be current with the main branch.
I also moved the cloning of the
WorkflowInstanceFilter
to an extension method inside the module, so no code changes outside the Retention module. I think It's ready to be merged now.Thank You very much! is it merged already with Master branch?
No not yet, this I can't do myself.
@Sverre-W Thank you for the initiative! In general, we definitely want this module. There are some good suggestions from @raymonddenhaan that would be good to review. In addition, I would like to see the following enhancement:
Currently, we're doing a bulk delete on workflow instances that match the filter. However, it would be good to have the option of customizing the cleanup job with something else, like for instance: archiving the workflow instances and related data by sending it to cheap storage, like Elastic Search or a local DB file.
To that end, it might be good to introduce a strategy for the cleanup job, e.g. IRetentionCleanupStrategy
and have a default implementation called e.g. DeleteCleanupStrategy
that performs the bulk delete that you have now. Later, we, or the application developer himself, can then implement an archiving strategy.
What do you think?
Sounds good, I'll make the requested changes. Do you have in mind a contract for IRetentionCleanupStrategy
Something along the lines of:
interface IRetentionCleanupStrategy {
public Task Handle(IWorkflowInstanceFilter filter, CancellationToken CancellationToken);
}
Or should we already load the workflow instances that match the filter? In this case, we would need to have something that calls IRetentionCleanupStrategy
in paged batches.
Awesome! I believe the cleanup strategies should be responsible only for handling the items they receive. This means the hosted service will manage the querying and batching, while the cleanup strategy processes each batch (deleting, archiving, etc.).
The hosted service should load not only the workflow instance to be cleaned up but also related entities such as workflow execution log records, activity execution records, bookmarks, and potentially third-party tables dependent on a workflow instance.
The challenge may be implementing support for third-party modules that want to hook into the cleanup process. Currently, we have the following known entities eligible for cleanup:
- WorkflowInstance
- WorkflowExecutionLogRecord
- ActivityExecutionRecord
- Bookmark
Ideally, we should design an API that also accommodates third-party entities, for example:
- CustomWorkflowInstanceLookup (an entity that links a customer to a workflow instance)
One solution could be to raise events and let individual handlers perform the cleanup. This approach would allow third-party modules to handle the cleanup event as well.
These handlers should ideally receive the objects to be cleaned up without having to query them first. This would reduce code repetition when supporting both deletion and archiving to Elasticsearch and other platforms. Furthermore, this approach would enable scenarios where data is sent to multiple destinations.
Perhaps this could be handled nicely with generic type interfaces, for example:
/// Gets entities related to a given batch of workflow instances.
interface IRelatedEntityCollector<TEntity>
{
// Returns an async enumerable stream of entities to be cleaned up.
IAsyncEnumerable<TEntity> GetRelatedEntities(ICollection<WorkflowInstance> workflowInstances);
}
class BookmarkCollector : IRelatedEntityCollector<Bookmark>
{
IAsyncEnumerable<Bookmark> GetRelatedEntities(ICollection<WorkflowInstance> workflowInstances)
{
var workflowInstanceIds = workflowInstances.Select(x => x.Id).ToList();
// Load bookmarks in batches
// For each batch:
yield return bookmark;
}
}
// A strategy responsible for cleaning up a set of entities.
interface ICleanupStrategy<TEntity>
{
Task Cleanup(ICollection<TEntity> entities);
}
class DeleteBookmarkCleanupStrategy : ICleanupStrategy<Bookmark>
{
Task Cleanup(ICollection<Bookmark> entities)
{
// Delete bookmarks
}
}
class ElasticsearchBookmarkCleanupStrategy : ICleanupStrategy<Bookmark>
{
Task Cleanup(ICollection<Bookmark> entities)
{
// Send bookmarks to Elasticsearch
}
}
These are just initial ideas that serve as an illustration and may not be entirely viable. But please consider them and see if they capture the outlined requirements, which, I think, should be:
- Having an extensible cleanup process that supports cleaning up related entities and enables third-party modules to hook into it (so they can clean up associated entities).
- Having an extensible approach to what happens when something gets "cleaned up," e.g., delete, send to Elasticsearch, or other actions.
We might start simple by just sending a notification and letting each handler in the enabled modules figure out what to do. We can refine the API from there if we start to see too much code repetition.
This approach might work well. I will also check if this approach can be used if we would introduce a RetentionPolicy
which would be a combination of a IRetentionInstanceFilter
and a ICleanupStrategy
. This would allow retaining different workflows at different intervals or moving some workflow instances to an archive while deleting others.
I just stumbled upon this old thread that may be interesting to consider as part of this PR: https://github.com/elsa-workflows/elsa-core/issues/412
Even if not in one go, it could be part of one of the next steps.
I like the retention policy idea π
If we allow a custom clean-up strategy which in the end does not enforce the deletion of workflow, we could have some scenarios that are not really retention focused.
For example, I could create a clean-up strategy that just cancels a workflow and use a filter that targets long running suspended workflows.
I wonder if deleting a workflow is some concrete job that you would schedule using Elsa scheduling module. Maybe where you have some generic helpers around the principle of scheduling a job that filters on some workflows and then does some transformation on the filtered workflows, like deleting, archiving, cancelling, restarting, ....
β οΈ GitGuardian has uncovered 1 secret following the scan of your pull request.
Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.
Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.
π Detected hardcoded secret in your pull request
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
13223926 | Triggered | Generic Password | e2014d3bd0ad434559c9931e60c38ef49fe372d9 | test/component/Elsa.Workflows.ComponentTests/Helpers/Fixtures/Infrastructure.cs | View secret |
π Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
π¦ GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
@sfmskywalker, I've updated the module with the following usage:
elsa.UseRetention(retention =>
{
retention.ConfigureCleanupOptions = options =>
{
options.SweepInterval = TimeSpan.FromHours(4);
options.PageSize = 25;
};
retention.AddDeletePolicy("Delete completed workflows", sp =>
{
ISystemClock systemClock = sp.GetRequiredService<ISystemClock>();
RetentionWorkflowInstanceFilter filter = new()
{
TimestampFilters = new List<TimestampFilter>(),
WorkflowStatus = WorkflowStatus.Finished
};
filter.TimestampFilters.Add(
new TimestampFilter
{
Column = nameof(WorkflowInstance.CreatedAt),
Operator = TimestampFilterOperator.LessThanOrEqual,
Timestamp = systemClock.UtcNow.Subtract(TimeSpan.FromDays(1))
}
);
return filter;
});
});
To implement the policy concept, I used a marker interface to determine which ICleanupStrategy<T>
to apply. For example, the DeleteRetentionPolicy
uses IDeletionCleanupStrategy
to identify all ICleanupStrategy<>
implementations for deleting workflows and related items.
You can view the implementation here: https://github.com/Sverre-W/elsa-core/blob/f7bb304643a281d23c1253fffe32741ec87f8efd/src/modules/Elsa.Retention/Policies/DeletionRetentionPolicy.cs#L9-L21
@sfmskywalker Hi, could you let me know in which version of Elsa this feature will be included?
Thank you!
@af-git-64 Yes, version 3.3 ππ»