[Feature] Export Data Based on Backup S3 Bucket

Open hanahmily opened this issue 1 year ago • 6 comments

Search before asking

  • [X] I had searched in the issues and found no similar feature requirement.

Description

This feature relies on the Backup Agent functionality outlined in #12876. The Backup Agent transfers data to an S3 bucket. The export command-line tool will export data from this bucket, allowing users to extract and use specific sets of data efficiently.

Requirements

  1. Integration with Backup Snapshots:

    • Leverage the S3 bucket created by the Backup Agent feature as the data source for export.
    • Ensure the export process cleans up any snapshot created for the export.
  2. Export Configuration:

    • Time Range Export: Allow users to define a specific time range from the snapshot for export.
    • Query-Based Export: Allow users to filter and export data using specific query criteria (e.g., tags, metrics, identifiers).
  3. Export Formats:

    • Support commonly used data formats like JSON and CSV.
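
As a rough illustration of requirements 2 and 3, here is a minimal Python sketch of a time-range and query-based filter followed by JSON and CSV output. Everything in it is an assumption made for illustration: the record layout, the `select_records`, `to_json`, and `to_csv` helpers, and the in-memory `RECORDS` list standing in for rows read from a snapshot. It is not the bydbctl implementation and does not read from S3.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical records standing in for rows read from a backup snapshot.
RECORDS = [
    {"timestamp": "2024-11-29T22:00:00+00:00", "service": "api", "latency_ms": 12},
    {"timestamp": "2024-11-30T00:30:00+00:00", "service": "db", "latency_ms": 48},
    {"timestamp": "2024-11-30T04:15:00+00:00", "service": "api", "latency_ms": 20},
]


def select_records(records, start, end, criteria=None):
    """Keep records with start <= timestamp < end whose fields equal the criteria."""
    criteria = criteria or {}
    selected = []
    for record in records:
        ts = datetime.fromisoformat(record["timestamp"])
        if not (start <= ts < end):
            continue  # outside the requested time range
        if all(record.get(key) == value for key, value in criteria.items()):
            selected.append(record)
    return selected


def to_json(records):
    """Serialize the selected records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the selected records as CSV, using the first record's keys as header."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    start = datetime(2024, 11, 30, tzinfo=timezone.utc)
    end = datetime(2024, 12, 1, tzinfo=timezone.utc)
    rows = select_records(RECORDS, start, end, {"service": "api"})
    print(to_json(rows))
    print(to_csv(rows))
```

The time range is treated as half-open (start inclusive, end exclusive) so that consecutive daily exports would not overlap; that choice is also an assumption, not something the requirements specify.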

Use case

No response

Related issues

No response

Are you willing to submit a pull request to implement this on your own?

  • [ ] Yes I am willing to submit a pull request on my own!

Code of Conduct

hanahmily avatar Nov 29 '24 22:11 hanahmily

Is Export Manager going to be a separate role node?

wu-sheng avatar Nov 30 '24 00:11 wu-sheng

> Is Export Manager going to be a separate role node?

It is a command-line tool provided by bydbctl. It can be scheduled with cron to run automatically, either daily or at other specified intervals.

hanahmily avatar Nov 30 '24 00:11 hanahmily

So, the tool is going to grab snapshots from multiple data nodes (through liaison)? Where does the filtering happen? Is the data processed on the data nodes?

wu-sheng avatar Nov 30 '24 00:11 wu-sheng

About the feature: the export should support tag/field selection if possible. This could reduce the size, and therefore the storage cost, of the exported files.

wu-sheng avatar Nov 30 '24 04:11 wu-sheng

I want to write a design to explain the details. Perhaps we can set up exporting as part of the backup process.

hanahmily avatar Nov 30 '24 09:11 hanahmily

Exporting could be a separate tool, and it is not critical for the whole ecosystem. This tool will read the whole snapshot from S3 or EFS and filter the data to generate the export result. We will reserve this for new developer programs, such as GSoC or OSPP 2025.

wu-sheng avatar Feb 08 '25 06:02 wu-sheng