[Core feature] Support for Postgres database pruning / data retention
Motivation: Why do you think this is important?
As Flyte runs, it accumulates rows in the Postgres database. To keep Postgres disk usage under control, it would be helpful to have an option to configure database pruning or a data retention policy. This may also help preserve query performance over time.
Goal: What should the final outcome look like, ideally?
I imagine there could be a pruning Go binary, run as a Kubernetes CronJob, that would delete executions (or other items) if they are in a terminal state and older than a configurable threshold.
Describe alternatives you've considered
None yet
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
- [x] Yes
Have you read the Code of Conduct?
- [x] Yes
Flyte currently does not have a built-in feature for automatic database pruning or data retention policies. However, there is a related discussion about implementing a cleanup feature in Flyteadmin to automatically clean up database records older than a specified number of days. This feature would allow Flyteadmin to clear database records via a command, such as `flyteadmin cleanup -d 90`, which could be scheduled to run daily using a cron job (source).
Would you like more detail on how to implement a custom solution for database pruning in Flyte?
This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.
We don't have the bandwidth to jump on this now, but before we talk about the mechanism to delete (either via a cron job or a managed solution provided by Flyte), we'll need to set up cascade constraints in the GORM models, which will require a migration.
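For readers unfamiliar with GORM, a cascade constraint is declared with a struct tag on the association. The sketch below shows the tag shape only. The `Execution` and `NodeExecution` models are hypothetical, simplified stand-ins and not Flyte's actual models. The snippet deliberately avoids importing GORM so it compiles on its own, and the tag has no effect on an existing database until a migration creates the foreign key.

```go
package main

import (
	"fmt"
	"reflect"
)

// Execution is a hypothetical parent model. The tag on NodeExecutions asks
// GORM to create the foreign key with ON DELETE CASCADE when migrating.
type Execution struct {
	ID             uint            `gorm:"primaryKey"`
	NodeExecutions []NodeExecution `gorm:"foreignKey:ExecutionID;constraint:OnDelete:CASCADE;"`
}

// NodeExecution is a hypothetical child model that references Execution.
type NodeExecution struct {
	ID          uint `gorm:"primaryKey"`
	ExecutionID uint
}

// cascadeTag returns the gorm tag declared on Execution.NodeExecutions.
func cascadeTag() string {
	field, _ := reflect.TypeOf(Execution{}).FieldByName("NodeExecutions")
	return field.Tag.Get("gorm")
}

func main() {
	fmt.Println(cascadeTag())
}
```

With such a constraint in place, deleting a parent row lets Postgres remove the dependent rows itself, so a pruner would not need to delete child tables in a specific order.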
#take
Let's just do this as part of 2.0.