[Feature] Export Data Based on Backup S3 Bucket
Search before asking
- [X] I had searched in the issues and found no similar feature requirement.
Description
This feature relies on the Backup Agent functionality outlined in #12876. The backup agent transfers data to an S3 bucket. The export command-line tool will export data from this bucket, allowing users to extract and use specific sets of data efficiently.
Requirements
- Integration with Backup Snapshots:
  - Leverage the S3 bucket created by the Backup Agent feature as the data source for export.
  - Ensure the export process cleans up any snapshot created for the export.
- Export Configuration:
  - Time Range Export: Allow users to define a specific time range from the snapshot for export.
  - Query-Based Export: Allow users to filter and export data using specific query criteria (e.g., tags, metrics, identifiers).
- Export Formats:
  - Support commonly used data formats like JSON and CSV.
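To make the requirements above concrete, here is a minimal sketch of the time-range filter, the tag-based filter, and the two output formats. It assumes records have already been decoded from a snapshot into plain dictionaries; the record layout and the names `SAMPLE`, `filter_records`, `to_json_lines`, and `to_csv` are hypothetical and not part of BanyanDB or bydbctl.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical decoded records; the real snapshot layout is not defined in this issue.
SAMPLE = [
    {"timestamp": datetime(2025, 1, 1, 0, 10, tzinfo=timezone.utc),
     "tags": {"service": "cart"}, "fields": {"latency_ms": 12}},
    {"timestamp": datetime(2025, 1, 1, 0, 20, tzinfo=timezone.utc),
     "tags": {"service": "order"}, "fields": {"latency_ms": 30}},
    {"timestamp": datetime(2025, 1, 2, 0, 0, tzinfo=timezone.utc),
     "tags": {"service": "cart"}, "fields": {"latency_ms": 15}},
]


def filter_records(records, start, end, tags=None):
    """Keep records with start <= timestamp < end whose tags equal every given key/value."""
    tags = tags or {}
    return [
        r for r in records
        if start <= r["timestamp"] < end
        and all(r["tags"].get(k) == v for k, v in tags.items())
    ]


def to_json_lines(records):
    """Serialize records as one JSON object per line, with ISO 8601 timestamps."""
    return "\n".join(
        json.dumps({**r, "timestamp": r["timestamp"].isoformat()}, sort_keys=True)
        for r in records
    )


def to_csv(records, tag_names, field_names):
    """Serialize records as CSV with one column per requested tag and field."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp", *tag_names, *field_names])
    for r in records:
        writer.writerow([
            r["timestamp"].isoformat(),
            *(r["tags"].get(t, "") for t in tag_names),
            *(r["fields"].get(f, "") for f in field_names),
        ])
    return buf.getvalue()
```

For example, `filter_records(SAMPLE, start, end, {"service": "cart"})` with a one-day window starting at 2025-01-01 UTC keeps only the first record, which can then be passed to `to_csv` or `to_json_lines`.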
Use case
No response
Related issues
No response
Are you willing to submit a pull request to implement this on your own?
- [ ] Yes I am willing to submit a pull request on my own!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Is Export Manager going to be a separate role node?
It is a command-line tool provided by bydbctl, which can be scheduled to run automatically using cron on a daily basis or at specified intervals.
So, the tool is going to grab snapshots from multiple data nodes (through liaison)? Where does the filtering happen? Is the data processed on the data nodes?
About the feature, the export should support tag/field selection if possible. This could reduce the size, and therefore the storage cost, of the exported files.
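As a rough illustration of the tag/field selection idea, the sketch below projects a record down to a chosen subset of tags and fields before it is written out. The record layout and the `project` helper are hypothetical, used only to show why selection shrinks the output.

```python
import json


def project(record, tags=None, fields=None):
    """Return a copy of record keeping only the named tags and fields.

    Passing None for tags or fields keeps all of them.
    """
    return {
        "timestamp": record["timestamp"],
        "tags": {k: v for k, v in record["tags"].items()
                 if tags is None or k in tags},
        "fields": {k: v for k, v in record["fields"].items()
                   if fields is None or k in fields},
    }


# Hypothetical record with more tags and fields than the consumer needs.
full = {
    "timestamp": "2025-01-01T00:10:00+00:00",
    "tags": {"service": "cart", "instance": "cart-0", "region": "eu"},
    "fields": {"latency_ms": 12, "cpu": 0.4, "mem_mb": 512},
}
slim = project(full, tags={"service"}, fields={"latency_ms"})

# The projected record serializes to fewer bytes than the full one.
full_size = len(json.dumps(full))
slim_size = len(json.dumps(slim))
```

Applying the projection before serialization means unselected columns are never written to the export file.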
I want to create a design to explain the details. Perhaps we can hook exporting into the backup process.
Exporting could be a separate tool, and it is not critical for the whole ecosystem. This tool will read the whole snapshot from S3 or EFS and filter the data to generate the export result. We will reserve this for new developer programs, such as GSoC or OSPP 2025.